Event log import fails in process mining due to date format

I’m trying to import event logs into Pega Process Mining but keep getting ‘Invalid date format’ errors. Our source system exports timestamps in dd/MM/yyyy HH:mm:ss format, but the import keeps failing during validation.

The error occurs at the schema validation stage:


Error: Invalid date format in column 'EventTimestamp'
Expected: ISO 8601 format
Found: 15/03/2025 09:30:45

I’ve checked the event log schema requirements in the documentation, but it’s not clear how to handle date format conversion before import. The logs contain about 50,000 events from our case management system, and we need this analysis to identify workflow bottlenecks. Has anyone dealt with date format mismatches during import validation?

There’s no built-in date conversion in the import wizard unfortunately. Your best bet is preprocessing. If you’re comfortable with Excel, you can use a formula to convert the format. For larger datasets, I recommend a script approach. Also verify that your source data doesn’t have any null timestamps or malformed entries - those will cause the entire import to fail even if 99% of your dates are correct.
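If you go the script route, a minimal pre-import check like the one below can flag null or malformed timestamps before they sink the whole import. This is a sketch: the `EventTimestamp` column name and the dd/MM/yyyy HH:mm:ss format are taken from the original post; adjust both for your export.

```python
import pandas as pd

def check_timestamps(df, col='EventTimestamp', fmt='%d/%m/%Y %H:%M:%S'):
    """Return (null_count, malformed_count) for a timestamp column."""
    stripped = df[col].astype(str).str.strip()
    # Missing values or empty strings
    nulls = df[col].isna() | (stripped == '')
    # Values that fail to parse in the expected source format
    parsed = pd.to_datetime(df[col], format=fmt, errors='coerce')
    malformed = parsed.isna() & ~nulls
    return int(nulls.sum()), int(malformed.sum())
```

Run it on the raw export first; if either count is non-zero, fix or filter those rows before attempting the conversion.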

Let me provide a comprehensive solution covering all the key aspects:

Event Log Schema Requirements: Pega Process Mining mandates ISO 8601 format (yyyy-MM-dd'T'HH:mm:ss or yyyy-MM-dd'T'HH:mm:ss.SSSZ with timezone). The schema validator checks this before any data processing begins, which is why your import fails immediately.

Date Format Conversion Approach: For your dd/MM/yyyy HH:mm:ss format, here’s a reliable conversion method:


# Python preprocessing example: dd/MM/yyyy HH:mm:ss -> ISO 8601
import pandas as pd

df = pd.read_csv('event_log.csv')
# Parse with the source format, then reformat as ISO 8601
df['EventTimestamp'] = pd.to_datetime(df['EventTimestamp'],
    format='%d/%m/%Y %H:%M:%S').dt.strftime('%Y-%m-%dT%H:%M:%S')
df.to_csv('event_log_converted.csv', index=False)

Import Validation Best Practices:

  1. Validate all required columns exist: CaseID, Activity, EventTimestamp, Resource
  2. Check for null or empty timestamps - filter these out before import
  3. Ensure consistent date format across all rows (scan for anomalies)
  4. Include timezone offset if your data spans multiple regions: yyyy-MM-dd'T'HH:mm:ss+00:00
  5. Test with a small sample (100-200 rows) before importing the full 50K events
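Checks 1–3 above can be rolled into a single validation pass over the converted file. A sketch, assuming the standard column names from this thread and that conversion to ISO 8601 has already run:

```python
import pandas as pd

REQUIRED = ['CaseID', 'Activity', 'EventTimestamp', 'Resource']

def validate_event_log(df):
    """Return a list of problems found (empty list means ready to import)."""
    problems = []
    # 1. All required columns exist
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        problems.append(f'missing columns: {missing}')
    if 'EventTimestamp' in df.columns:
        ts = df['EventTimestamp']
        # 2. Null or empty timestamps
        n_null = int(ts.isna().sum() + (ts.astype(str).str.strip() == '').sum())
        if n_null:
            problems.append(f'{n_null} null/empty timestamps')
        # 3. Consistent ISO 8601 format across all rows
        parsed = pd.to_datetime(ts, format='%Y-%m-%dT%H:%M:%S', errors='coerce')
        n_bad = int(parsed.isna().sum()) - n_null
        if n_bad > 0:
            problems.append(f'{n_bad} timestamps not in ISO 8601 format')
    return problems
```

Running this on your 100–200 row test sample (step 5) before the full 50K import gives you a quick go/no-go signal.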

For your specific case, preprocess the CSV to convert dates, then verify the first few rows match the expected schema. The import validation will then pass, and you’ll be able to proceed with your workflow bottleneck analysis. If you’re doing regular imports, consider automating this conversion step in your data pipeline.

I had this exact issue last month. Process Mining requires ISO 8601 format (yyyy-MM-dd'T'HH:mm:ss). You need to transform your dates before import. I used a simple Python script to preprocess the CSV file and convert all date columns to the correct format.

Thanks for the responses. I’m looking at preprocessing options now. Should I convert the dates in the source system export, or is there a way to configure the import to handle the conversion automatically?

Check if your source system has multiple date formats mixed together. I’ve seen cases where manual entries had different formats than system-generated timestamps. Run a quick scan of your CSV to identify all unique date patterns before conversion. This saved me hours of troubleshooting when we had three different formats in one export.
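A quick way to run that scan is to collapse every timestamp to a digit-pattern skeleton and count the distinct shapes. A sketch, independent of any particular format:

```python
import re
from collections import Counter

def date_patterns(values):
    """Map each timestamp string to a digit skeleton and count occurrences.
    e.g. '15/03/2025 09:30:45' -> '##/##/#### ##:##:##'"""
    return Counter(re.sub(r'\d', '#', str(v).strip()) for v in values)
```

If the counter comes back with more than one pattern, you know up front which rows need separate handling instead of discovering them as parse failures mid-conversion.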

One more thing to watch for - timezone handling. If your source data doesn’t include timezone info, Process Mining will assume UTC by default, which might skew your timeline analysis if your operations span multiple timezones.
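If you need explicit offsets rather than the UTC default, one approach is to localize the parsed timestamps to the source system's zone before formatting. A sketch; the `'Europe/London'` default is a placeholder for whatever zone your operations actually run in:

```python
import pandas as pd

def to_iso_with_offset(series, source_tz='Europe/London'):
    """Parse dd/MM/yyyy timestamps, attach the source timezone,
    and emit ISO 8601 strings with an explicit UTC offset."""
    ts = pd.to_datetime(series, format='%d/%m/%Y %H:%M:%S')
    # isoformat() includes the offset, e.g. '+00:00'
    return ts.dt.tz_localize(source_tz).map(lambda t: t.isoformat())
```

This keeps each site's local event times intact while making the offset unambiguous in the import file.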

The event log schema is strict about date formats for good reason - it ensures consistent timeline analysis across different data sources. Besides the timestamp format, make sure your CSV headers match exactly what Process Mining expects: CaseID, Activity, EventTimestamp, Resource, and any custom attributes you’re tracking. Even small variations in header names can cause validation failures.
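A header check along these lines can catch those small variations, including case-only mismatches that are easy to miss by eye. A sketch; the expected header list comes from this thread, so extend it with your custom attributes:

```python
import pandas as pd

EXPECTED_HEADERS = ['CaseID', 'Activity', 'EventTimestamp', 'Resource']

def check_headers(df, expected=EXPECTED_HEADERS):
    """Report expected headers that are absent, and actual headers that
    differ from an expected one only by letter case."""
    actual = list(df.columns)
    expected_lower = {h.lower() for h in expected}
    missing = [h for h in expected if h not in actual]
    case_mismatch = [a for a in actual
                     if a not in expected and a.lower() in expected_lower]
    return {'missing': missing, 'case_mismatch': case_mismatch}
```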