We’re trying to scale process mining across our org after a successful pilot in order-to-cash, but we’re hitting a wall on event log quality. Our initial analysis looks great until we dig deeper and find missing case IDs, timestamp inconsistencies across different systems, and duplicate records that inflate activity counts. The worst part is zero timestamps: we’ve got events recorded as 1970 (the Unix epoch, i.e. a literal zero value) or 2100, which makes case durations look like decades.
We’re pulling data from ERP, a couple of CRM instances, and some legacy systems that don’t talk to each other well. Each system uses its own ID scheme, so tracking a single process instance end-to-end is proving really difficult. We’re spending more time cleaning data than actually analyzing processes, and I’m worried we’re going to lose executive support if we can’t show value faster.
How are others tackling this? Are you building dedicated data prep pipelines, or is there a governance approach that’s worked? Also curious if anyone has found ways to automate the detection of these quality issues before they distort the analysis.
Governance saved us here. We set up a data quality framework with documented standards for timestamps (everything gets converted to UTC), activity naming conventions, and case ID formats before data even hits the process mining tool. We also run automated profiling scripts that compare event log characteristics against known operational metrics—if case counts or durations are way off, we know something’s wrong before analysis starts. It’s boring work but absolutely necessary.
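For anyone wanting to try the profiling approach, here's a minimal sketch of what such a check might look like. The function name, tolerance, and baseline parameters are all hypothetical, not from any particular tool; the idea is just to compare case counts and median durations against known operational numbers before the log reaches the mining tool.

```python
from datetime import datetime, timedelta, timezone
from statistics import median

def profile_event_log(events, expected_cases, expected_median_hours, tolerance=0.2):
    """Hypothetical sanity check run before analysis.

    events: list of dicts with 'case_id' and a timezone-aware 'timestamp'.
    Returns a list of warning strings; an empty list means the log
    roughly matches the operational baselines.
    """
    # Group event timestamps by case so we can derive per-case durations.
    cases = {}
    for e in events:
        cases.setdefault(e["case_id"], []).append(e["timestamp"])

    warnings = []

    # Compare the number of cases against the known operational count.
    case_count = len(cases)
    if abs(case_count - expected_cases) / expected_cases > tolerance:
        warnings.append(
            f"case count {case_count} deviates from baseline {expected_cases}"
        )

    # Compare the median case duration (first event to last) against baseline.
    durations_h = [
        (max(ts) - min(ts)).total_seconds() / 3600 for ts in cases.values()
    ]
    med = median(durations_h)
    if abs(med - expected_median_hours) / expected_median_hours > tolerance:
        warnings.append(
            f"median duration {med:.1f}h deviates from baseline {expected_median_hours}h"
        )

    return warnings
```

A zero-timestamp placeholder would blow up the duration check immediately, which is exactly the point: the log fails loudly before anyone draws a process map from it.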
The cross-system case ID problem is brutal. We built a mapping table that links order IDs, requisition IDs, and invoice IDs so we can stitch together the full process. It’s manual work upfront but pays off when you can actually see end-to-end flow. The key was getting IT and the business units to agree on which identifier is the “golden” case ID for each process type. Without that alignment, you’re just guessing.
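To make the mapping-table idea concrete, here's one possible shape for it, sketched in Python. The table keys, ID formats, and function name are invented for illustration; the important design choice is keeping unmapped events visible for review instead of silently dropping them.

```python
# Hypothetical mapping table: (source system, system-local ID) -> golden case ID.
# In practice this would be maintained in a database, not a literal.
GOLDEN_ID_MAP = {
    ("erp", "ORD-1001"): "CASE-1001",
    ("crm", "REQ-77"): "CASE-1001",
    ("legacy", "INV-5"): "CASE-1001",
}

def stitch_case_ids(events, id_map):
    """Relabel each event with the agreed 'golden' case ID.

    events: list of dicts with 'system' and 'local_id' keys.
    Returns (stitched, unmapped): events that resolved to a golden ID,
    and events whose identifier has no mapping yet.
    """
    stitched, unmapped = [], []
    for e in events:
        key = (e["system"], e["local_id"])
        if key in id_map:
            # Copy the event and attach the unified case ID.
            stitched.append({**e, "case_id": id_map[key]})
        else:
            # Surface gaps in the mapping table rather than losing events.
            unmapped.append(e)
    return stitched, unmapped
```

The size of the `unmapped` pile is itself a useful quality metric: if it grows, a source system has started emitting identifiers the mapping table doesn't know about.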
We had almost the exact same issues pulling from SAP. The zero timestamp problem was killing us—turns out migration scripts from an upgrade years ago left placeholder dates all over the place. We ended up writing validation rules that flag any timestamp outside a reasonable range (say, 2015 to present) and then decide case-by-case whether to remove just those events or the whole case. If it’s a small percentage, we drop the cases. If it’s widespread, we remove only the bad events and keep the rest of the process instance intact.
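A rough sketch of that remove-events-vs-drop-cases rule, assuming the log is already grouped by case (function name and the 5% threshold are my own placeholders, not from the poster's actual scripts):

```python
from datetime import datetime, timezone

# Example bounds; pick whatever range is plausible for your process.
MIN_TS = datetime(2015, 1, 1, tzinfo=timezone.utc)
MAX_TS = datetime(2030, 1, 1, tzinfo=timezone.utc)

def clean_timestamps(cases, min_ts, max_ts, case_drop_threshold=0.05):
    """Handle out-of-range timestamps (e.g. 1970/2100 placeholders).

    cases: dict mapping case_id -> list of event dicts with 'timestamp'.
    If only a small share of cases is affected, drop those cases whole;
    if the problem is widespread, keep the cases and strip only the
    out-of-range events so the rest of each instance survives.
    """
    # Identify cases containing at least one out-of-range timestamp.
    bad_cases = {
        cid for cid, evs in cases.items()
        if any(not (min_ts <= e["timestamp"] <= max_ts) for e in evs)
    }

    if len(bad_cases) / len(cases) <= case_drop_threshold:
        # Small percentage affected: drop those cases entirely.
        return {cid: evs for cid, evs in cases.items() if cid not in bad_cases}

    # Widespread: remove only the bad events, keep every case.
    return {
        cid: [e for e in evs if min_ts <= e["timestamp"] <= max_ts]
        for cid, evs in cases.items()
    }
```

Either way, it's worth logging what was removed, since a sudden spike in flagged timestamps usually points back at a specific source system or migration job.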