We’re implementing process mining to analyze our order-to-cash process with 2+ million records. We need to extract event logs from Creatio for analysis in our process mining tool.
Two approaches are being debated:
- Use Creatio’s OData API for extraction - cleaner, respects business logic, but concerns about performance and API throttling with large datasets
- Direct SQL queries against the Creatio database - faster extraction, full control, but bypasses business logic and might miss calculated fields
For process mining specifically, where you need bulk extraction of historical data, which approach has worked better? Main concerns are extraction speed and whether OData throttling becomes a bottleneck for real-time analysis updates.
Good points on the business logic aspect. My concern with direct SQL is ongoing maintenance - if Creatio’s schema changes during upgrades, our extraction queries break. With OData, the API contract is more stable across versions. Has anyone experienced schema changes that broke their SQL-based extraction after Creatio upgrades?
OData throttling is definitely a concern for large datasets. Creatio’s OData implementation has request limits and pagination caps. For 2 million records, you’ll need thousands of paginated requests, which takes hours. We tried this initially, and extraction took 6-8 hours for a dataset of similar size. We switched to direct SQL and got it down to 15-20 minutes. The trade-off is that you need to understand Creatio’s database schema, but for process mining where you’re extracting raw event logs, that’s manageable.
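To make the "thousands of paginated requests" point concrete, here is a rough sketch of the arithmetic using the standard OData `$top` and `$skip` query options. The page size of 1000 and the base URL are assumptions for illustration only; the actual cap and endpoint depend on your Creatio instance.

```python
import math
from urllib.parse import urlencode


def request_count(total_records: int, page_size: int) -> int:
    """Number of paged requests needed to read every record once."""
    return math.ceil(total_records / page_size)


def page_urls(base_url: str, total_records: int, page_size: int):
    """Yield one URL per page using the OData $top/$skip query options."""
    for skip in range(0, total_records, page_size):
        # safe="$" keeps the "$" in the option names unescaped
        query = urlencode({"$top": page_size, "$skip": skip}, safe="$")
        yield f"{base_url}?{query}"


# Hypothetical endpoint and page size, not taken from any real instance.
BASE_URL = "https://example.invalid/0/odata/Order"
requests_needed = request_count(2_000_000, 1000)
first_pages = list(page_urls(BASE_URL, 2500, 1000))
```

With these assumed numbers, `requests_needed` comes out at 2000 sequential calls, so any per-request latency or throttling delay is multiplied accordingly.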
Real-time analysis updates are where the architecture choice really matters. If you’re doing a one-time historical analysis, even 8 hours with OData is acceptable. But for continuous process mining where you want dashboards updated hourly or daily, extraction speed is critical. We implemented incremental extraction using SQL with timestamp-based filtering: each run extracts only the records modified since the last extraction. This brings extraction time down to minutes even for large datasets. OData pagination makes incremental extraction more complex because you can’t efficiently filter and paginate simultaneously.
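A minimal sketch of the timestamp-based incremental extraction described above, using an in-memory SQLite database as a stand-in so it runs anywhere. The `OrderEvent` table, its columns, and the `extract_since` helper are hypothetical names for illustration; the real Creatio tables and your database driver will differ.

```python
import sqlite3

# Hypothetical event-log table; real Creatio table and column names will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderEvent (CaseId TEXT, Activity TEXT, ModifiedOn TEXT)")
conn.executemany(
    "INSERT INTO OrderEvent VALUES (?, ?, ?)",
    [
        ("ORD-1", "Order created", "2024-01-01T09:00:00"),
        ("ORD-1", "Invoice sent", "2024-01-02T10:30:00"),
        ("ORD-2", "Order created", "2024-01-03T08:15:00"),
    ],
)


def extract_since(conn, watermark):
    """Return rows modified after `watermark`, plus the new watermark."""
    rows = conn.execute(
        "SELECT CaseId, Activity, ModifiedOn FROM OrderEvent "
        "WHERE ModifiedOn > ? ORDER BY ModifiedOn",
        (watermark,),
    ).fetchall()
    # Keep the old watermark when nothing new arrived.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


# First run picks up everything after the stored watermark;
# the second run finds nothing new and leaves the watermark unchanged.
first_batch, watermark = extract_since(conn, "2024-01-01T12:00:00")
second_batch, watermark = extract_since(conn, watermark)
```

The watermark would normally be persisted between runs. One thing to watch with a strict `>` comparison is rows that share the watermark's exact timestamp but are committed later, so some setups re-read a small overlap window and deduplicate.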