Process analytics REST API export fails with timeout on large datasets

Our process analytics dashboard in Mendix 9.24 needs to export large datasets via REST API for external BI tools. The export works fine for small date ranges (under 10,000 records), but consistently times out when trying to export monthly or quarterly data (50,000+ records).

The REST endpoint is configured with a 60-second timeout, and we’re seeing:


HTTP 504 Gateway Timeout
Request exceeded maximum execution time

We’ve tried increasing the timeout to 180 seconds, but that just delays the inevitable failure. The export microflow retrieves all process instances with their metrics in a single query, formats them as JSON, and returns the response. Is there a better pattern for handling large dataset exports without hitting timeout limits?

Also consider async export with webhooks. For very large datasets, immediately return 202 Accepted with a job ID, process the export in the background, and send a webhook notification when complete. The client can then download the prepared file. This pattern handles datasets of any size without timeout issues and provides better user experience with progress tracking.
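On the wire, that async flow might look like the following sketch (paths, payload fields, and the callback URL are illustrative, not a fixed Mendix convention):

```
POST /api/analytics/export-jobs                         -> 202 Accepted
{ "jobId": "exp_20250208_001", "status": "queued" }

(later, server POSTs to the client's registered webhook URL)
POST https://client.example.com/webhooks/export-complete
{ "jobId": "exp_20250208_001", "status": "complete",
  "downloadUrl": "/api/analytics/export-jobs/exp_20250208_001/file" }

GET /api/analytics/export-jobs/exp_20250208_001/file    -> 200 OK (file download)
```

The client only needs to persist the job ID; everything else is driven by the webhook.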

For 20-30 fields per record, aim for 2000-3000 records per page. This typically results in 2-5MB response sizes which transfer quickly even on slower networks. More important than page size is query optimization - ensure your database queries use proper indexes on timestamp and ID columns to avoid full table scans on each page request.
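As a back-of-envelope check on those numbers (assuming roughly 25 fields at ~40 bytes of serialized JSON per field, both rough guesses you should measure against your own payloads):

```java
// Rough response-size estimate for one page of exported records.
public final class PageSizeEstimate {

    public static long estimateBytes(int records, int fieldsPerRecord, int avgBytesPerField) {
        return (long) records * fieldsPerRecord * avgBytesPerField;
    }
}
```

With 2,500 records at 25 fields × 40 bytes, that is about 2.5 MB per page, which lands in the middle of the 2-5 MB range above.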

You’re hitting a classic problem with synchronous REST exports. A 60-second timeout for 50k records is actually quite aggressive. The real solution isn’t increasing timeouts - it’s implementing pagination with cursor-based navigation. Your API should return chunks of data (1000-5000 records per request) with a continuation token for the next page.

Here’s a complete implementation strategy addressing REST API pagination, large dataset export, and timeout handling with partial data recovery:

1. Implement Cursor-Based Pagination:

Create a REST operation with pagination parameters:


GET /api/analytics/export?cursor={token}&limit=2500
Response includes: { "data": [...], "nextCursor": "abc123", "hasMore": true }

Microflow logic:


// Decode cursor to get lastProcessedId
if (cursor != empty) {
    lastId = decodeBase64Token(cursor);
} else {
    lastId = 0;
}

// Query with ID-based pagination
processList = SELECT * FROM ProcessInstance
              WHERE Id > $lastId
              ORDER BY Id ASC
              LIMIT $limit;
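The `decodeBase64Token` helper in the pseudocode above can be a plain Base64 wrapper around a tiny JSON payload. A minimal standalone sketch (the `last_id` field name matches the example tokens used later in this answer; error handling and a real JSON library are omitted):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Minimal cursor codec: the cursor is Base64 over {"last_id":<n>}.
public final class CursorCodec {

    // Encode the last processed ID into an opaque continuation token.
    public static String encode(long lastId) {
        String json = "{\"last_id\":" + lastId + "}";
        return Base64.getEncoder().encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a token back to the last processed ID; null/empty means "start from 0".
    public static long decode(String cursor) {
        if (cursor == null || cursor.isEmpty()) {
            return 0L;
        }
        String json = new String(Base64.getDecoder().decode(cursor), StandardCharsets.UTF_8);
        // Crude digit extraction keeps this dependency-free; use a JSON library in practice.
        return Long.parseLong(json.replaceAll("[^0-9]", ""));
    }
}
```

Because the token is opaque to the client, you can later change its contents (e.g. add a timestamp) without breaking callers.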

2. Large Dataset Export Strategy:

For datasets over 100k records, implement a three-tier approach:

  • Tier 1 (0-10k records): Direct synchronous export, single request
  • Tier 2 (10k-100k records): Paginated synchronous export, client handles pagination
  • Tier 3 (100k+ records): Async job with file generation

Detect dataset size before processing:


totalCount = COUNT ProcessInstance WHERE DateRange = $range;
if (totalCount > 100000) {
    return { "jobId": generateExportJob(), "estimatedTime": calculateTime(totalCount) };
}
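The tier boundaries above can be captured in a single helper, which keeps the routing decision testable (names and thresholds mirror the list above but are otherwise illustrative):

```java
// Illustrative tier selection for the three-tier export strategy.
public final class ExportTier {

    // 1 = direct sync, 2 = paginated sync, 3 = async job with file generation.
    public static int tierFor(long totalCount) {
        if (totalCount <= 10_000L) {
            return 1;
        }
        if (totalCount <= 100_000L) {
            return 2;
        }
        return 3;
    }
}
```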

3. Timeout and Partial Data Handling:

Implement resume capability using checkpoints:

// ExportCheckpoint entity stores progress
// (Mendix entity proxies are constructed with the current IContext)
ExportCheckpoint checkpoint = new ExportCheckpoint(context);
checkpoint.setExportId(exportId);
checkpoint.setLastProcessedId(currentId);
checkpoint.setRecordsProcessed(count);
checkpoint.setTotalRecords(total);
Core.commit(context, checkpoint.getMendixObject());

Client can resume from last checkpoint:


GET /api/analytics/export/resume/{exportId}
Returns: cursor pointing to last successfully processed record

4. Query Optimization for Performance:

Critical indexes for process analytics:

-- In Mendix, ensure indexes exist on:
ProcessInstance.Id (primary key - automatic)
ProcessInstance.StartDate (for date range queries)
ProcessInstance.Status (for filtering)

Use batch retrieval in microflow:


// Retrieve in batches to avoid memory issues
batchSize = 2500;
for (offset = 0; offset < totalCount; offset += batchSize) {
    batch = retrieveProcessBatch(offset, batchSize);
    processBatch(batch);
    commitTransaction(); // Free memory
}
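The batching pseudocode above can be expressed as a small reusable loop that is independent of the data source (the `BatchRunner` name and functional-interface shape are my own illustration, not a Mendix API):

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Offset/limit batch loop mirroring the pseudocode above.
// fetch(offset, batchSize) returns one page; handle processes it.
public final class BatchRunner {

    public static <T> long run(long totalCount, int batchSize,
                               BiFunction<Long, Integer, List<T>> fetch,
                               Consumer<List<T>> handle) {
        long processed = 0;
        for (long offset = 0; offset < totalCount; offset += batchSize) {
            List<T> batch = fetch.apply(offset, batchSize);
            if (batch.isEmpty()) {
                break; // data shrank since the count was taken
            }
            handle.accept(batch);
            processed += batch.size();
        }
        return processed;
    }
}
```

In a microflow-backed implementation, `fetch` would wrap the paginated retrieve and `handle` would serialize and commit, releasing memory between pages.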

5. Response Format with Metadata:

Include pagination metadata so clients can drive paging intelligently:

{
  "data": [...],
  "pagination": {
    "cursor": "eyJsYXN0X2lkIjoxMjM0NX0=",
    "hasMore": true,
    "currentPage": 3,
    "estimatedTotal": 45000,
    "pageSize": 2500
  },
  "metadata": {
    "exportId": "exp_20250208_001",
    "generatedAt": "2025-02-08T14:30:00Z",
    "dataRange": "2025-01-01 to 2025-01-31"
  }
}

6. Client Implementation Pattern:

For BI tools, provide a reference implementation:

async function exportAllData(baseUrl, params) {
  const allData = [];
  let cursor = null;
  let page;
  do {
    // Only send the cursor parameter once we have one.
    const query = new URLSearchParams(cursor ? { ...params, cursor } : params);
    const response = await fetch(`${baseUrl}?${query}`);
    page = await response.json();
    allData.push(...page.data);
    cursor = page.pagination.cursor;
  } while (page.pagination.hasMore);
  return allData;
}

7. Monitoring and Limits:

  • Set reasonable rate limits: 10 requests/minute per API key
  • Log slow queries (>5 seconds) for optimization
  • Alert on exports exceeding 50k records for capacity planning
  • Implement circuit breaker if database load exceeds 80%
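The per-key rate limit from the first bullet can be sketched as a fixed-window counter (illustrative only; a clustered deployment would keep this state in a shared store rather than in-process):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal fixed-window rate limiter, e.g. 10 requests per API key per minute.
public final class RateLimiter {

    private final int maxPerWindow;
    private final long windowMillis;
    private final Map<String, long[]> state = new HashMap<>(); // key -> {windowStart, count}

    public RateLimiter(int maxPerWindow, long windowMillis) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow(String apiKey, long nowMillis) {
        long[] s = state.computeIfAbsent(apiKey, k -> new long[] { nowMillis, 0 });
        if (nowMillis - s[0] >= windowMillis) {
            s[0] = nowMillis; // new window: reset the counter
            s[1] = 0;
        }
        if (s[1] >= maxPerWindow) {
            return false; // over the limit; respond with HTTP 429
        }
        s[1]++;
        return true;
    }
}
```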

Performance Results:

With this implementation:

  • 10k records: 3-5 seconds (single request)
  • 50k records: 15-25 seconds (20 paginated requests)
  • 200k records: 60-90 seconds (80 paginated requests)
  • No timeouts, consistent memory usage under 500MB

The key is moving from “export everything at once” to “stream data in manageable chunks” which scales linearly with dataset size.

That makes sense. How do I implement cursor-based pagination in Mendix REST services? We’re using the standard REST Publish module. Do I need to maintain state between requests, or can this be stateless?

Keep it stateless for scalability. Use offset/limit parameters initially, but for large datasets, encode the last record ID in a token. Your API should accept ?cursor=eyJsYXN0X2lkIjoxMjM0NX0 and decode it to determine where to continue. This avoids issues with data changes between requests that offset-based pagination suffers from. Each response includes the cursor for the next page until you reach the end.

The async approach sounds robust, but our BI tool expects synchronous responses. We’ll need to stick with pagination. What’s a reasonable page size for process analytics data that typically includes 20-30 fields per record?