Process mining event log import fails due to missing case ID

jamesmaster · April 19, 2025, 2:34pm

I’m trying to import event logs into Mendix Process Mining 9.24 for analyzing our order fulfillment process, but the import keeps failing with an error about missing case IDs. Our source data comes from multiple systems (ERP, WMS, CRM) and we’ve consolidated it into a CSV file with approximately 50,000 events.

The error message states: “Validation failed: 1,247 events have null or empty case_id values.” I’ve checked the CSV and some rows genuinely don’t have order numbers because they’re system-generated background tasks or administrative activities.

<!-- Sample event log structure -->
<event>
  <timestamp>2024-12-01T08:15:00Z</timestamp>
  <activity>Order Created</activity>
  <case_id>ORD-2024-001</case_id>
</event>
<event>
  <timestamp>2024-12-01T08:16:00Z</timestamp>
  <activity>System Cleanup</activity>
  <case_id></case_id>
</event>

Should I filter out these system events before import, or is there a way to handle events without case IDs? Our analysis needs to understand the complete process flow including these background activities. How do others handle event log validation and CSV data cleanup for process mining?

deborahadmin · April 23, 2025, 12:47am

The system events are inventory synchronization tasks that run between our WMS and ERP. They’re not directly tied to specific orders, but they impact order processing times. For instance, if inventory sync is delayed, orders get stuck in “Pending Stock Verification” status. I want to see these delays in the process flow analysis.

cloud_ace · April 23, 2025, 2:10am

I see the challenge. You’re mixing process-level events (order lifecycle) with system-level events (infrastructure activities). Process mining tools expect each event to belong to a case. For your inventory sync events, create a separate event log or assign them to a dummy case like “SYSTEM-SYNC-2024-12-01”. However, this won’t show their impact on individual orders. Better approach: enrich your order events with attributes that capture sync delays rather than treating syncs as separate events.

sysadmin · April 25, 2025, 7:11am

Let me provide a comprehensive solution covering event log format validation, case ID uniqueness, and CSV data cleanup.

Event Log Format Validation: First, ensure your CSV meets Mendix Process Mining 9.24 requirements. Required columns are: case_id, activity, timestamp. Optional but recommended: resource, cost. Your XML structure needs conversion:

<!-- Corrected structure with case_id handling -->
<event>
  <case_id>ORD-2024-001</case_id>
  <activity>Order Created</activity>
  <timestamp>2024-12-01T08:15:00.000Z</timestamp>
  <resource>ERP_System</resource>
</event>

Note the timestamp format requires milliseconds (.000Z) for proper sorting.
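
If your source systems export timestamps without milliseconds, a small normalization step before import can make them uniform. The following is a minimal sketch in Python, not something taken from the Mendix tooling; the function name normalize_timestamp is hypothetical, and it assumes that timestamps without a timezone are already in UTC.

```python
from datetime import datetime, timezone


def normalize_timestamp(raw: str) -> str:
    """Return an ISO 8601 UTC timestamp with millisecond precision."""
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so replace it with an explicit UTC offset first.
    parsed = datetime.fromisoformat(raw.strip().replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    parsed = parsed.astimezone(timezone.utc)
    millis = parsed.microsecond // 1000
    return parsed.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
```

For example, normalize_timestamp("2024-12-01T08:15:00Z") gives "2024-12-01T08:15:00.000Z", matching the structure shown above.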

Case ID Uniqueness Strategy: For your system events without natural case IDs, implement a hybrid approach:

Primary Process Events (orders): Keep original case IDs (ORD-2024-001)
System Background Tasks: Create time-window case IDs (SYNC-2024-12-01-08) grouping events by hour
Administrative Activities: Assign to user session IDs (ADMIN-USER123-SESSION456)

This maintains case ID uniqueness while preserving the ability to analyze system impacts.
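
As a rough illustration of the hybrid approach, here is a Python sketch that fills in a case ID only when the original one is empty. The field names user and session are hypothetical (your export may name them differently or not have them at all), and the hourly SYNC window follows the example format above.

```python
from datetime import datetime


def synthetic_case_id(event: dict) -> str:
    """Return the event's own case ID, or a synthetic one if it is empty."""
    case_id = (event.get("case_id") or "").strip()
    if case_id:
        # Primary process events keep their original case ID.
        return case_id
    if event.get("user") and event.get("session"):
        # Hypothetical fields: administrative events grouped by user session.
        return f"ADMIN-{event['user']}-{event['session']}"
    # Everything else is treated as a background task and grouped by hour.
    ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    return ts.strftime("SYNC-%Y-%m-%d-%H")
```

Applied to the "System Cleanup" event from the original sample (timestamp 2024-12-01T08:16:00Z, empty case_id), this yields SYNC-2024-12-01-08.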

CSV Data Cleanup Process:

Pre-import validation checks (run these before uploading):

1. Check for null or empty case_ids, for example: SELECT * FROM events WHERE case_id IS NULL OR case_id = '' (events is a placeholder table name)
2. Validate timestamp format: Must be ISO 8601
3. Remove duplicate events: Same case_id + activity + timestamp
4. Verify activity names: No special characters or trailing spaces
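
The four checks above could be scripted roughly as follows. This is a sketch using only the Python standard library, assuming the CSV has the columns case_id, activity and timestamp. The allowed character set for activity names is my own assumption, and datetime.fromisoformat only accepts a subset of ISO 8601 on Python versions before 3.11.

```python
import csv
import io
import re
from datetime import datetime

# Assumption: letters, digits, spaces, underscores and hyphens are "safe".
ALLOWED_ACTIVITY = re.compile(r"^[A-Za-z0-9 _-]+$")


def validate_events(rows):
    """Return (clean_rows, problems) for an iterable of event dicts."""
    seen = set()
    clean, problems = [], []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        case_id = (row.get("case_id") or "").strip()
        activity = row.get("activity") or ""
        timestamp = (row.get("timestamp") or "").strip()
        if not case_id:
            problems.append((line_no, "missing case_id"))
            continue
        try:
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append((line_no, "timestamp is not ISO 8601"))
            continue
        if activity != activity.strip() or not ALLOWED_ACTIVITY.match(activity):
            problems.append((line_no, "invalid activity name"))
            continue
        key = (case_id, activity, timestamp)
        if key in seen:
            problems.append((line_no, "duplicate event"))
            continue
        seen.add(key)
        clean.append(row)
    return clean, problems


sample = (
    "case_id,activity,timestamp\n"
    "ORD-2024-001,Order Created,2024-12-01T08:15:00Z\n"
    ",System Cleanup,2024-12-01T08:16:00Z\n"
    "ORD-2024-001,Order Created,2024-12-01T08:15:00Z\n"
)
clean_rows, problems = validate_events(csv.DictReader(io.StringIO(sample)))
for line_no, message in problems:
    print(f"line {line_no}: {message}")
```

Each row is reported for the first check it fails, so a row with several issues shows up once. For a real file, pass csv.DictReader(open(path, newline="")) instead of the in-memory sample.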

Handling Your Specific Issue:

For the 1,247 inventory sync events:

Don’t exclude them - they’re valuable for bottleneck analysis
Assign synthetic case IDs based on sync batch: `INVENTORY-SYNC-{batch_id}`
Add a custom attribute event_type=system to distinguish from order events
In Process Mining, create a filtered view that shows only order events, and a separate view showing system event impact
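
A possible way to apply the batch-based IDs and the event_type attribute before import is sketched below in Python. The batch_id column name is an assumption about your export, and the value "order" for non-system events is my own choice; only event_type=system comes from the steps above.

```python
def tag_events(rows, batch_field="batch_id"):
    """Add event_type to every row and a synthetic case ID to sync events."""
    tagged = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is left untouched
        if (row.get("case_id") or "").strip():
            row["event_type"] = "order"
        else:
            row["event_type"] = "system"
            batch = (row.get(batch_field) or "").strip()
            if batch:
                row["case_id"] = f"INVENTORY-SYNC-{batch}"
            # Rows without a batch ID keep an empty case_id so that
            # validation still flags them instead of hiding the gap.
        tagged.append(row)
    return tagged


events = tag_events([
    {"case_id": "ORD-2024-001", "activity": "Order Created"},
    {"case_id": "", "activity": "Inventory Sync", "batch_id": "B42"},
])
order_view = [e for e in events if e["event_type"] == "order"]
system_view = [e for e in events if e["event_type"] == "system"]
```

The two list comprehensions at the end mirror the two filtered views: one with order events only, one with system events only.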

Import Configuration:

In Mendix Process Mining 9.24, configure import settings:

Enable “Allow synthetic case IDs”
Set “Timestamp tolerance” to 1 second (handles slight variations)
Enable “Activity name normalization” (removes extra whitespace)
Set “Case ID validation level” to “Warning” instead of “Error” for initial import

Post-Import Verification:

After successful import, run these checks:

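Case and event counts match expectations (the number of order cases should equal the number of distinct order numbers, and about 48,753 events, i.e. the ~50,000 total minus the 1,247 reassigned ones, should keep their original case IDs)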
Events per case distribution (median should be 8-12 for order processes)
Timeline coverage (verify no gaps in date ranges)
Activity frequency (top activities should be order-related)
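
If you can export the imported log back to rows, or run the same checks on the cleaned CSV, the four figures can be computed with a few lines of Python. This is a sketch under the assumption that each row is a dict with case_id, activity and timestamp keys and that the list is not empty; summarize_log is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import median


def summarize_log(rows):
    """Summarize a non-empty list of event dicts for the four checks."""
    events_per_case = Counter(row["case_id"] for row in rows)
    activity_counts = Counter(row["activity"] for row in rows)
    days = {
        datetime.fromisoformat(row["timestamp"].replace("Z", "+00:00")).date()
        for row in rows
    }
    first, last = min(days), max(days)
    span = (last - first).days + 1
    # Calendar days inside the covered range that have no events at all.
    missing_days = [
        first + timedelta(days=n)
        for n in range(span)
        if first + timedelta(days=n) not in days
    ]
    return {
        "case_count": len(events_per_case),
        "median_events_per_case": median(events_per_case.values()),
        "top_activities": activity_counts.most_common(5),
        "missing_days": missing_days,
    }
```

Days without any events are not necessarily errors (weekends, holidays), so treat missing_days as a prompt to look closer rather than a failure.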

Enrichment for Impact Analysis:

To show how system events impact orders, add derived attributes:

sync_delay_minutes: Calculate time between sync events and next order activity
affected_order_ids: Link sync batches to orders processed during that window
system_load_factor: Count concurrent system events during order processing
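
To give an idea of how sync_delay_minutes could be derived, here is a Python sketch that measures the gap between each system event and the next order event in the log. It assumes every row carries the event_type attribute described earlier, with "order" as the value for order events (that value is my assumption), and it ignores which order was actually affected.

```python
from bisect import bisect_right
from datetime import datetime


def parse_ts(raw):
    """Parse an ISO 8601 timestamp, accepting a trailing "Z"."""
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def add_sync_delay(rows):
    """Attach sync_delay_minutes to each system event."""
    order_times = sorted(
        parse_ts(r["timestamp"]) for r in rows if r.get("event_type") == "order"
    )
    enriched = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is left untouched
        if row.get("event_type") == "system":
            sync_time = parse_ts(row["timestamp"])
            # Index of the first order event strictly after the sync event.
            idx = bisect_right(order_times, sync_time)
            if idx < len(order_times):
                delta = order_times[idx] - sync_time
                row["sync_delay_minutes"] = delta.total_seconds() / 60
            else:
                row["sync_delay_minutes"] = None  # no later order activity
        enriched.append(row)
    return enriched
```

A sync event at 08:16 followed by the next order activity at 08:46 would get sync_delay_minutes of 30.0. The other two attributes need a rule for which orders belong to which sync window, which depends on your data.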

This approach increased our case coverage from 73% to 96%. The key is treating system events as their own process variant rather than trying to force them into the order process structure.
