We’re running unattended RPA bots in Pega Cloud for batch invoice processing, but they consistently time out after 30 minutes during large batch runs (500+ invoices). The bot connects to our SAP system via REST API to pull invoice data, validates it against our database, then posts the results back.
I’ve checked the session timeout settings in our cloud instance, and they’re set to 60 minutes. The connection pool appears healthy with maxActive=50, but I’m wondering if there’s a heartbeat or keep-alive configuration I’m missing.
Current runtime config:
rpa.session.timeout=3600
rpa.batch.size=500
connection.pool.maxIdle=20
connection.pool.maxWait=30000
The bot runs fine for smaller batches (under 200 records), but anything larger fails with “Session expired” errors. Has anyone dealt with batch chunking strategies or connection pool tuning for long-running cloud RPA scenarios?
Here’s an approach that covers the usual causes of timeouts in cloud RPA deployments:
1. Session Timeout Configuration
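Your application-level timeout is correct at 60 minutes (rpa.session.timeout=3600). Since the bot still fails at around 30 minutes, a shorter timeout at the cloud infrastructure level is the likely cause. Work with Pega Cloud support to increase the API gateway timeout to at least 90 minutes, or better yet, implement a session refresh pattern so the run no longer depends on a single long-lived session.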
2. Batch Chunking Strategy
Restructure your bot to process in manageable chunks with explicit session management:
// Pseudocode - batch processing with session refresh:
1. Split 500 invoices into chunks of 100
2. For each chunk:
   a. Process records with SAP REST calls
   b. Commit transaction to database
   c. Send heartbeat ping to maintain session
   d. Log checkpoint for resume capability
3. If timeout occurs, resume from last checkpoint
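The steps above could be sketched roughly as follows. This is a minimal illustration, not Pega or SAP API code: `run_batch`, `process_chunk`, `send_heartbeat` and the JSON checkpoint file are all hypothetical names, and the two callables stand in for whatever your bot actually uses to call SAP, commit to the database, and ping the session.

```python
import json
from pathlib import Path
from typing import Callable, Iterator, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_batch(
    invoice_ids: Sequence[str],
    process_chunk: Callable[[Sequence[str]], None],  # hypothetical: SAP calls + DB commit
    send_heartbeat: Callable[[], None],              # hypothetical: session keep-alive ping
    checkpoint_file: Path,
    chunk_size: int = 100,
) -> int:
    """Process invoices chunk by chunk, resuming after the last saved checkpoint.

    Returns the number of chunks processed in this run.
    """
    # Resume point: count of invoices already committed by an earlier run.
    done = 0
    if checkpoint_file.exists():
        done = json.loads(checkpoint_file.read_text())["done"]

    chunks_run = 0
    for chunk in chunked(invoice_ids[done:], chunk_size):
        process_chunk(chunk)   # steps a and b: process records, commit
        send_heartbeat()       # step c: keep the session alive
        done += len(chunk)
        # Step d: record progress only after the chunk has been committed.
        checkpoint_file.write_text(json.dumps({"done": done}))
        chunks_run += 1
    return chunks_run
```

If a chunk fails partway through, the checkpoint still points at the start of that chunk, so a rerun repeats it. That means the per-invoice processing should be safe to repeat (idempotent), or the checkpoint needs to be finer-grained than one chunk.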
3. Connection Pool Tuning
Update your connection pool configuration for cloud reliability:
connection.pool.maxActive=50
connection.pool.maxIdle=25
connection.pool.maxWait=60000
connection.pool.testOnBorrow=true
connection.pool.validationQuery=SELECT 1
The key changes: maxWait is doubled from 30000 to 60000 to allow for cloud latency, testOnBorrow and validationQuery add connection validation so stale connections are not handed to the bot, and maxIdle is raised from 20 to 25 to keep ready connections available without wasting resources.
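To show what validate-on-borrow does in practice, here is a small sketch. It is not how your pool is implemented: `ValidatingPool` is a hypothetical class, and an in-memory SQLite database stands in for the real database so the example can run on its own.

```python
import queue
import sqlite3


class ValidatingPool:
    """Tiny pool that checks each idle connection before handing it out."""

    def __init__(self, max_idle: int = 2) -> None:
        self._idle: "queue.Queue[sqlite3.Connection]" = queue.Queue(maxsize=max_idle)

    def _connect(self) -> sqlite3.Connection:
        # Stand-in for opening a connection to the real database.
        return sqlite3.connect(":memory:")

    def _is_valid(self, conn: sqlite3.Connection) -> bool:
        # Same idea as validationQuery=SELECT 1: a cheap round trip.
        try:
            conn.execute("SELECT 1").fetchone()
            return True
        except sqlite3.Error:
            return False

    def borrow(self) -> sqlite3.Connection:
        while True:
            try:
                conn = self._idle.get_nowait()
            except queue.Empty:
                return self._connect()  # nothing idle, open a fresh one
            if self._is_valid(conn):
                return conn
            # Stale connection: discard it and try the next idle one.

    def give_back(self, conn: sqlite3.Connection) -> None:
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            conn.close()  # over the idle limit, so close instead of keeping it
```

The trade-off is one extra query per borrow, which is usually cheap next to a failed SAP-to-database round trip caused by a dead connection.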
4. Heartbeat Interval Adjustment
Implement an active heartbeat mechanism that runs every 5 minutes during processing. This keeps both your Pega session and SAP connection alive. You can use a simple REST call to a lightweight Pega endpoint or a database ping query.
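A background heartbeat along these lines might look like the sketch below. The `Heartbeat` class and the `ping` callable are hypothetical; `ping` is a placeholder for whatever lightweight REST call or database query you choose, and the 300-second default simply mirrors the 5-minute interval suggested above.

```python
import threading
from typing import Callable


class Heartbeat:
    """Call `ping` every `interval_seconds` on a background thread until stopped."""

    def __init__(self, ping: Callable[[], None], interval_seconds: float = 300.0) -> None:
        self._ping = ping
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop pings once per interval and exits promptly on stop.
        while not self._stop.wait(self._interval):
            try:
                self._ping()
            except Exception as exc:
                # A failed ping should not kill the heartbeat thread.
                print(f"heartbeat failed: {exc}")

    def __enter__(self) -> "Heartbeat":
        self._thread.start()
        return self

    def __exit__(self, *exc_info: object) -> None:
        self._stop.set()
        self._thread.join()
```

Wrapping the batch run in `with Heartbeat(ping_session): ...` keeps the ping going even while a single slow SAP call is in flight, which a ping placed only between chunks would not do.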
Additional Recommendations:
- Implement exponential backoff retry logic for SAP API calls
- Add comprehensive logging at chunk boundaries for troubleshooting
- Monitor cloud resource utilization - CPU/memory spikes can cause unexpected timeouts
- Consider using Pega’s queue management for better batch orchestration
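For the first recommendation, a retry helper with exponential backoff could look something like this. It is a generic sketch: `call_with_backoff` is a hypothetical name, and treating `ConnectionError` as the retryable failure is an assumption, so substitute whatever exception your SAP client actually raises on timeouts.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on ConnectionError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries when several bots fail at once.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists only so the delays can be swapped out in tests; in the bot you would leave it at the default.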
With these changes, your bot should handle 500+ record batches reliably in the cloud environment. The chunking strategy provides natural breakpoints, the connection pool tuning handles cloud latency, and the heartbeat mechanism prevents premature session expiration.
Check your heartbeat interval settings. For unattended bots in cloud environments, you need an active heartbeat mechanism to keep the session alive during long processing runs. The default heartbeat might be too infrequent for your use case. Also verify that your SAP connection isn’t timing out independently - SAP has its own connection timeout settings that might be interfering.
Thanks for the suggestions. I checked the API gateway settings through Pega Cloud support - you’re right, there’s a 35-minute timeout at that level. I’m going to implement batch chunking, but I’m not clear on the best way to structure this. Should I create separate bot executions for each chunk, or handle the chunking logic within a single bot run with periodic reconnections?