We’re experiencing severe performance issues with our Process Mining dashboards in the San Diego release. When loading dashboards that analyze event logs with over 500K records, the load time exceeds 60 seconds, and sometimes the request times out completely.
The problem is particularly acute when filtering by date ranges or specific process variants. We’ve noticed the event log tables don’t seem to have proper indexing on timestamp and case_id columns, which are heavily queried by the dashboard widgets.
Our data volume is growing rapidly: we’re adding about 100K events per week. We’re also concerned about long-term data archiving strategies, since we need to retain historical data for compliance but can’t have it impact current dashboard performance.
Has anyone dealt with optimizing Process Mining performance for large datasets? Any recommendations on query optimization or data partitioning approaches?
We had similar issues on Tokyo and resolved them by implementing a three-tier archiving strategy. Active data (last 6 months) stays in the main tables with full indexing. Historical data (6-24 months) moves to archive tables with read-only access. Older data exports to cold storage with on-demand retrieval. This cut our dashboard load times from 45 seconds to under 8 seconds. The archiving job runs weekly during maintenance windows.
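The tiering rule itself is simple. Here's a minimal Python sketch of the classification logic behind the weekly archiving job described above; the 6-month and 24-month boundaries come from the post, but the function name and 30-day month approximation are my own illustration, not an actual ServiceNow API:

```python
from datetime import datetime

# Tier boundaries from the three-tier scheme: 6 months active, 24 months archive.
ACTIVE_MONTHS = 6
ARCHIVE_MONTHS = 24

def classify_tier(event_timestamp, now=None):
    """Return the storage tier an event belongs to (approximating months as 30 days)."""
    now = now or datetime.utcnow()
    age_days = (now - event_timestamp).days
    if age_days <= ACTIVE_MONTHS * 30:
        return "active"   # main tables, fully indexed
    if age_days <= ARCHIVE_MONTHS * 30:
        return "archive"  # read-only archive tables
    return "cold"         # exported to cold storage, retrieved on demand

now = datetime(2024, 1, 1)
print(classify_tier(datetime(2023, 10, 1), now))  # active
print(classify_tier(datetime(2022, 6, 1), now))   # archive
print(classify_tier(datetime(2021, 1, 1), now))   # cold
```

The weekly job would just scan for rows whose tier no longer matches their current table and move them during the maintenance window.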
The archiving approach sounds promising. How did you handle the historical data queries when users need to analyze trends across the full 24-month period? Do you have a separate dashboard for historical analysis, or does it dynamically query both active and archive tables?
Resource allocation is important, but indexing is critical here. The pa_event_log and pa_case tables need composite indexes on frequently queried column combinations. For date range queries, you want indexes on (timestamp, case_id) and (case_id, activity). If you’re filtering by process variants, add indexes on variant_id columns too.
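To see why the (timestamp, case_id) composite index pays off for date-range filters, here's a toy Python model: keep the key tuples sorted, and a range scan becomes two binary searches plus a contiguous slice instead of a full table scan. The sample rows are invented, not real pa_event_log data:

```python
import bisect

# A sorted list of (timestamp, case_id, activity) tuples stands in for a
# composite B-tree index keyed on (timestamp, case_id).
events = sorted([
    (20240105, "C1", "approve"),
    (20240101, "C2", "create"),
    (20240110, "C1", "close"),
    (20240103, "C3", "create"),
])

def date_range_scan(index, start_ts, end_ts):
    """Two binary searches bound the range; everything between is a hit."""
    lo = bisect.bisect_left(index, (start_ts,))
    hi = bisect.bisect_left(index, (end_ts + 1,))
    return index[lo:hi]

print(date_range_scan(events, 20240101, 20240105))
# [(20240101, 'C2', 'create'), (20240103, 'C3', 'create'), (20240105, 'C1', 'approve')]
```

Column order matters for the same reason: an index on (timestamp, case_id) supports a bare timestamp range, but an index on (case_id, activity) only helps once case_id is pinned.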
Also consider implementing data partitioning by month or quarter. This dramatically improves query performance when users filter by date ranges since the database only scans relevant partitions. ServiceNow’s Process Mining supports this through table partitioning strategies.
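Partition pruning is the whole point of that approach. A minimal Python sketch of the idea, with invented field names rather than the actual Process Mining schema:

```python
import calendar
from collections import defaultdict
from datetime import date

# Events are bucketed into monthly partitions; a date-range query scans only
# the partitions that overlap the range and skips (prunes) the rest.
partitions = defaultdict(list)

def insert_event(event):
    ts = event["timestamp"]
    partitions[(ts.year, ts.month)].append(event)

def query_range(start, end):
    hits = []
    for (year, month), rows in partitions.items():
        first = date(year, month, 1)
        last = date(year, month, calendar.monthrange(year, month)[1])
        if last < start or first > end:
            continue  # pruned: partition lies entirely outside the range
        hits.extend(r for r in rows if start <= r["timestamp"] <= end)
    return hits

insert_event({"case_id": "C1", "timestamp": date(2024, 1, 15)})
insert_event({"case_id": "C2", "timestamp": date(2024, 3, 2)})
insert_event({"case_id": "C3", "timestamp": date(2023, 11, 20)})
print(len(query_range(date(2024, 1, 1), date(2024, 3, 31))))  # 2
```

With ~100K events per week, monthly partitions stay around 400-450K rows each, so a typical one- or two-month dashboard filter touches a small, bounded slice of the data.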
For the 500K+ record datasets, make sure your dashboard widgets are using pagination and lazy loading rather than trying to render everything at once.
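The pagination point can be sketched in a few lines: the widget pulls one page at a time through a generator instead of materializing all 500K rows up front. The `fetch` callback here is a stand-in for whatever the real data source is:

```python
def lazy_pages(fetch, total, page_size=1000):
    """Yield result pages on demand so no caller ever holds all rows at once."""
    for offset in range(0, total, page_size):
        yield fetch(offset, page_size)

# Stand-in for a 500K-record data source.
records = list(range(500_000))
fetch = lambda offset, limit: records[offset:offset + limit]

first_page = next(lazy_pages(fetch, len(records)))
print(len(first_page))  # 1000
```

The dashboard renders the first page immediately and only fetches later pages as the user scrolls or pages forward.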
We created aggregated summary tables for historical trend analysis. The nightly ETL job calculates key metrics (average case duration, process variant distributions, bottleneck statistics) and stores them in summary tables. For detailed historical analysis, users can request specific date ranges, which triggers a background job to merge active and archive data; results are cached for 48 hours. Most users (90%+) only need the aggregated trends, so dashboard performance stays fast.
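For anyone building something similar, here's a minimal Python sketch of the two pieces described above: a rollup that computes the summary metrics, and a 48-hour TTL cache in front of the slow merged query. The metric and field names are illustrative, not the actual schema:

```python
import time
from collections import Counter
from statistics import mean

def summarize(cases):
    """Nightly rollup: the key metrics the dashboards actually need."""
    return {
        "case_count": len(cases),
        "avg_duration_s": mean(c["duration_s"] for c in cases),
        "variants": Counter(c["variant"] for c in cases),
    }

CACHE_TTL_S = 48 * 3600
_cache = {}

def cached_query(key, compute, now=None):
    """Serve a detailed merged query from cache if under 48 hours old."""
    now = time.time() if now is None else now
    entry = _cache.get(key)
    if entry and now - entry[0] < CACHE_TTL_S:
        return entry[1]
    result = compute()          # slow path: merge active + archive data
    _cache[key] = (now, result)
    return result

cases = [
    {"variant": "standard", "duration_s": 3600},
    {"variant": "standard", "duration_s": 5400},
    {"variant": "escalated", "duration_s": 10800},
]
print(summarize(cases)["avg_duration_s"])  # 6600
```

The cache key would encode the requested date range, so repeated requests for the same window within 48 hours never re-run the merge.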