We’re experiencing severe performance degradation in our case management application when creating 100+ cases simultaneously during peak hours. The system becomes unresponsive and we’re seeing SLA breaches with case creation times exceeding 5 minutes.
Our current setup uses Power Automate flows triggered on case creation to populate related records and send notifications. During morning rush (8-10 AM), when field agents submit batch reports, we’re hitting what appears to be API throttling limits. The flows queue up and some timeout completely.
We’ve tried increasing the flow concurrency settings from default to 50, but that only made things worse. Error logs show “429 Too Many Requests” responses. We need guidance on proper batch processing strategies to handle this volume without compromising response times or hitting platform limits.
Has anyone dealt with similar high-volume scenarios in case management apps? What’s the recommended approach for batch case creation that respects Power Platform throttling while maintaining acceptable performance?
Let me provide a comprehensive solution addressing all three critical areas:
API Throttling Limits Management:
Power Platform enforces service protection limits on Dataverse (roughly 6,000 requests per user, per web server, within a 5-minute sliding window) in addition to flow-level limits. Your current approach triggers an individual flow per case, which quickly exhausts these limits. Monitor your flow analytics to identify which operations consume the most API calls. Run backend flows under a service principal (application user) instead of a user context: the per-user throttling bucket then belongs to that application user, isolating batch operations from interactive users’ quotas.
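A quick back-of-envelope check helps size your batches against that bucket. This sketch assumes only the 6,000-requests-per-5-minutes figure above; the calls-per-case number is a hypothetical estimate you should replace with figures from your own flow analytics:

```python
# Budget check against the Dataverse service protection limit
# (6,000 requests per user per 5-minute sliding window).
LIMIT_PER_WINDOW = 6000
WINDOW_SECONDS = 300
calls_per_case = 8           # hypothetical: creates, lookups, notifications

# Reserve 20% headroom for other traffic sharing the same identity.
headroom = 0.20
safe_budget = int(LIMIT_PER_WINDOW * (1 - headroom))

max_cases_per_window = safe_budget // calls_per_case
cases_per_hour = max_cases_per_window * (3600 // WINDOW_SECONDS)

print(max_cases_per_window)  # cases the bucket can absorb per 5 minutes
print(cases_per_hour)        # sustained hourly throughput
```

With these assumed numbers the single bucket sustains roughly 600 cases per 5-minute window, which is why isolating batch traffic onto its own identity matters.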
Flow Concurrency Settings Optimization:
Reducing concurrency is actually the solution here, not increasing it. Set your flow concurrency to 1-5 maximum for batch operations. This seems counterintuitive, but it prevents the thundering herd problem you’re experiencing. High concurrency means multiple flow instances compete for the same throttling quota simultaneously, causing widespread failures. Lower concurrency with longer run times is more reliable than high concurrency with constant throttling.
Batch Processing Strategy Implementation:
Implement a three-tier architecture:
- Intake Layer: Quick case creation that writes minimal data to Dataverse and immediately returns the case ID to the user. This satisfies the immediate feedback requirement and completes in under 2 seconds.
- Queue Management: Create a “CaseProcessingQueue” table with fields: CaseID, Priority, Status, CreatedOn, ProcessedOn. The intake layer writes here after case creation.
- Batch Processor: Scheduled flow running every 3-5 minutes that:
- Queries queue for unprocessed items (Status = ‘Pending’)
- Orders by Priority DESC, CreatedOn ASC
- Processes in batches of 25 cases
- Uses “Apply to each” with concurrency of 1
- Implements try-catch with exponential backoff (30s, 60s, 120s delays)
- Updates queue status to ‘Completed’ or ‘Failed’
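The selection and ordering step of the batch processor is flow configuration rather than code, but the logic can be sketched in Python. The table and field names come from the queue design above; the records themselves are hypothetical stand-ins:

```python
BATCH_SIZE = 25

# Hypothetical in-memory stand-in for the CaseProcessingQueue table.
queue = [
    {"CaseID": "A", "Priority": 1, "Status": "Pending",   "CreatedOn": 3},
    {"CaseID": "B", "Priority": 5, "Status": "Pending",   "CreatedOn": 2},
    {"CaseID": "C", "Priority": 5, "Status": "Pending",   "CreatedOn": 1},
    {"CaseID": "D", "Priority": 9, "Status": "Completed", "CreatedOn": 0},
]

def next_batch(queue, size=BATCH_SIZE):
    """Select unprocessed items: Priority DESC, then CreatedOn ASC."""
    pending = [r for r in queue if r["Status"] == "Pending"]
    pending.sort(key=lambda r: (-r["Priority"], r["CreatedOn"]))
    return pending[:size]

batch = next_batch(queue)
print([r["CaseID"] for r in batch])  # → ['C', 'B', 'A']
```

Note that already-completed items are excluded and that equal-priority items are served oldest first, so no queued case starves behind newer high-priority arrivals forever.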
Additional Optimizations:
- Use Dataverse batch requests for related record creation (single API call for multiple records)
- Cache lookup data in flow variables to avoid repeated queries
- Implement flow result history cleanup to prevent storage bloat
- Set up Application Insights monitoring to track throttling patterns and optimize batch timing
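For the batch-request point above, this is roughly what a Dataverse Web API $batch body looks like when creating several records in one call. The payload builder uses only the standard library; the entity set name and record fields are placeholders, and the body would be POSTed to your org’s `/api/data/v9.2/$batch` endpoint:

```python
import json

def build_batch_payload(records, entity_set,
                        batch_id="batch_1", changeset_id="changeset_1"):
    """Assemble a multipart/mixed $batch body. Wrapping the creates in a
    single changeset makes them commit atomically in one API call."""
    parts = [f"--{batch_id}",
             f"Content-Type: multipart/mixed;boundary={changeset_id}", ""]
    for i, record in enumerate(records, start=1):
        parts += [
            f"--{changeset_id}",
            "Content-Type: application/http",
            "Content-Transfer-Encoding: binary",
            f"Content-ID: {i}",
            "",
            f"POST /api/data/v9.2/{entity_set} HTTP/1.1",
            "Content-Type: application/json",
            "",
            json.dumps(record),
        ]
    parts += [f"--{changeset_id}--", f"--{batch_id}--", ""]
    return "\r\n".join(parts)

body = build_batch_payload(
    [{"subject": "Follow-up call"}, {"subject": "Send welcome pack"}],
    entity_set="tasks",  # placeholder entity set name
)
# Send with header: Content-Type: multipart/mixed;boundary=batch_1
```

Each record inside the changeset still counts toward service protection request totals, but collapsing N HTTP round trips into one noticeably reduces flow run time and connector overhead.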
Expected Results:
This architecture should handle 500+ cases per hour while maintaining sub-3-second response times for case creation and processing all queued items within 10 minutes during peak load. You’ll eliminate SLA breaches and provide users with reliable, predictable performance.
Start with the queue table and batch processor, then gradually migrate your existing flows to this pattern. Test with 50 cases first, then scale to your full volume.
One more optimization - review what your flows are actually doing. Often I see flows making multiple API calls per case when they could batch operations. For example, if you’re creating related records, use the batch API endpoints where possible instead of individual create operations.
Also consider implementing exponential backoff in your flows. When you hit throttling, don’t just retry immediately. Combine a Delay action with the “Configure run after” settings on the failure path (or set an action’s built-in retry policy to exponential) to add delays between retries. Start with 30 seconds, then 1 minute, then 2 minutes. This gives the API time to recover and prevents cascading failures during peak load periods.
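The 30s/60s/120s schedule described above is classic exponential backoff; a minimal sketch, with the HTTP call stubbed out since the real one would go to your Dataverse endpoint:

```python
import time

def call_with_backoff(call, delays=(30, 60, 120)):
    """Retry `call` on HTTP 429, waiting 30s, 60s, then 120s between
    attempts, mirroring the schedule suggested above."""
    for delay in delays:
        status = call()
        if status != 429:
            return status
        time.sleep(delay)  # honour the Retry-After header here if present
    return call()          # final attempt after the last delay

# Hypothetical stub: throttled twice, then succeeds.
responses = iter([429, 429, 200])
status = call_with_backoff(lambda: next(responses), delays=(0, 0, 0))
print(status)  # → 200
```

In practice, prefer the Retry-After value the 429 response carries over a fixed schedule; the doubling delays are the fallback when no such hint is returned.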