Travel expense approval workflow times out during automated deployments

sandraking · August 12, 2025, 1:21pm

Our travel expense approval workflows are timing out consistently after automated deployments in ICS 2022. The issue manifests within 2-3 hours post-deployment when approval volumes pick up. We’re seeing database connection pool exhaustion and workflow batch optimization failures.

The timeout configuration appears correct at 120 seconds, but workflows are failing at around 90 seconds during the approval routing phase. Error logs show:


WorkflowTimeout: Approval routing exceeded 90s threshold
ConnectionPoolExhausted: Unable to acquire connection after 30s

We’ve run load testing pre-deployment that shows the workflow handles 200 concurrent approvals fine in staging, but production fails at around 150 concurrent workflows post-deployment. The database connection pooling settings seem adequate (maxActive=100, maxIdle=50), yet we’re hitting exhaustion.

Could this be related to how the deployment process reinitializes the workflow engine? Looking for insights on proper connection pool configuration for post-deployment workflow stability.

sandeep143 · August 28, 2025, 2:39pm

We had identical timeout issues in ICS 2022 travel expense workflows. The cause was that our workflow batch optimization settings were incompatible with our deployment process. The workflow engine batches approval requests for efficiency, but post-deployment the batch processor wasn’t scaling correctly. We reduced the batch size from 50 to 20 for the first two hours after deployment, then gradually increased it. We also implemented a connection pool pre-warming script that runs before the workflow engine fully initializes. These two changes eliminated our timeout issues completely.

ronald_tech · August 17, 2025, 4:44am

Your load testing methodology might be flawed. Testing 200 concurrent approvals in staging doesn’t replicate the post-deployment state where the workflow engine cache is cold. The first few hours after deployment are always the most vulnerable because workflow definitions, user role caches, and approval routing tables all need to be rebuilt in memory. Your 150-workflow failure threshold suggests the cache warming is causing the bottleneck, not the raw processing capacity. Try load testing immediately after a staging deployment to see if you can reproduce the issue.

marie_ops · September 8, 2025, 9:24am

Excellent troubleshooting suggestions from everyone. After systematically testing each theory, I’ve identified and resolved the issue. It was a combination of connection pool initialization problems and workflow batch optimization misconfiguration.

Root Cause Analysis:

The core issue was that our deployment process wasn’t properly coordinating the workflow engine restart with database connection pool initialization. The workflow engine was accepting approval requests before the connection pool had scaled to operational capacity.

Database Connection Pooling Solution:

Implemented a pre-warming strategy in our deployment automation:

-- Connection pool warm-up script
SELECT COUNT(*) FROM workflow_definitions;
SELECT COUNT(*) FROM approval_routing_rules;
SELECT COUNT(*) FROM user_role_assignments;

These queries force connection establishment and cache population before the workflow engine goes live. We also adjusted the pool configuration:

minIdle increased from 20 to 40
initialSize set to 40 (was defaulting to 10)
maxWait reduced from 30s to 10s (fail faster rather than queue)
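
For anyone wiring the warm-up into deployment automation, here is a rough Python sketch of the idea: open the initial number of connections and run the three warm-up queries on each before the engine takes traffic. The `warm_up` function and the `connect` factory are hypothetical names, not part of ICS; SQLite appears only as a local stand-in so the sketch runs anywhere. A real setup would borrow connections through the pool's own API so they stay pooled afterwards.

```python
import os
import sqlite3
import tempfile
import time

# The three warm-up queries from the post above.
WARMUP_QUERIES = [
    "SELECT COUNT(*) FROM workflow_definitions",
    "SELECT COUNT(*) FROM approval_routing_rules",
    "SELECT COUNT(*) FROM user_role_assignments",
]


def warm_up(connect, initial_size, queries=WARMUP_QUERIES):
    """Open `initial_size` connections and run every warm-up query on each.

    `connect` is any zero-argument callable returning a DB-API connection
    (hypothetical: substitute whatever your pool exposes).
    Returns the open connections and the elapsed time in seconds.
    """
    started = time.monotonic()
    connections = []
    try:
        for _ in range(initial_size):
            conn = connect()
            connections.append(conn)
            cursor = conn.cursor()
            for query in queries:
                cursor.execute(query)
                cursor.fetchone()
            cursor.close()
    except Exception:
        # Do not leak half-warmed connections if a query fails.
        for conn in connections:
            conn.close()
        raise
    return connections, time.monotonic() - started


if __name__ == "__main__":
    # Local demo only: a throwaway SQLite file stands in for the real database.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        setup = sqlite3.connect(path)
        for table in ("workflow_definitions", "approval_routing_rules",
                      "user_role_assignments"):
            setup.execute(f"CREATE TABLE {table} (id INTEGER)")
        setup.commit()
        setup.close()
        conns, elapsed = warm_up(lambda: sqlite3.connect(path), initial_size=40)
        print(f"warmed {len(conns)} connections in {elapsed:.3f}s")
        for conn in conns:
            conn.close()
```

Having the deployment script fail when `warm_up` raises keeps the engine from going live against a pool that could not be established.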

Workflow Batch Optimization:

Modified the post-deployment batch processing strategy. For the first 90 minutes after deployment:

Batch size: 15 (down from 50)
Batch interval: 10 seconds (up from 5 seconds)
Max concurrent batches: 5 (down from 10)

This gives the workflow engine breathing room to build its caches without overwhelming the connection pool.

Timeout Configuration:

We increased the timeout to 180 seconds for the first two hours post-deployment, after which it automatically reverts to 120 seconds. This accommodates the cold-cache performance profile without masking genuine issues once the system stabilizes.
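
The batch and timeout values above amount to a profile that depends on time since deployment. A minimal Python sketch of that lookup follows. `WorkflowSettings` and `settings_for` are made-up names for illustration; how the values are actually pushed into the workflow engine is not shown and will depend on your tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowSettings:
    batch_size: int
    batch_interval_s: int
    max_concurrent_batches: int
    timeout_s: int


BATCH_WINDOW_MIN = 90     # reduced batching applies for the first 90 minutes
TIMEOUT_WINDOW_MIN = 120  # extended timeout applies for the first two hours


def settings_for(minutes_since_deploy: float) -> WorkflowSettings:
    """Return the settings that apply at a given time after deployment."""
    in_batch_window = minutes_since_deploy < BATCH_WINDOW_MIN
    in_timeout_window = minutes_since_deploy < TIMEOUT_WINDOW_MIN
    return WorkflowSettings(
        batch_size=15 if in_batch_window else 50,
        batch_interval_s=10 if in_batch_window else 5,
        max_concurrent_batches=5 if in_batch_window else 10,
        timeout_s=180 if in_timeout_window else 120,
    )
```

Note the two windows differ: between 90 and 120 minutes the batch settings are already back to normal while the longer timeout still applies.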

Load Testing Improvements:

Revised our load testing to include a “post-deployment simulation” scenario. We now restart the workflow engine in staging, wait 60 seconds, then immediately hit it with production-level load. This revealed that our previous testing was unrealistic: we had been testing against a warm system that had been running for hours.
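
A rough Python sketch of that scenario, in case it helps others reproduce it. `restart_engine` and `submit_approval` are hypothetical callables you would supply for your own staging environment (for example a deployment hook and an HTTP call); nothing here is an ICS API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cold_start_load_test(restart_engine, submit_approval, concurrency,
                         settle_seconds=60, timeout_s=120, sleep=time.sleep):
    """Restart the engine, wait briefly, then fire `concurrency` approvals at once.

    `concurrency` must be at least 1. Returns a small summary dict.
    """
    restart_engine()
    sleep(settle_seconds)  # short settle time, so caches are still cold

    def timed_call(i):
        started = time.monotonic()
        try:
            submit_approval(i)
            ok = True
        except Exception:
            ok = False
        return ok, time.monotonic() - started

    # One worker per request so all approvals are in flight together.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(concurrency)))

    latencies = [elapsed for _, elapsed in results]
    return {
        "requests": len(results),
        "errors": sum(1 for ok, _ in results if not ok),
        "over_timeout": sum(1 for t in latencies if t > timeout_s),
        "max_latency_s": max(latencies),
    }
```

Comparing this report against the same run on a warm system makes the cold-start penalty visible directly.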

Results:

After implementing these changes across three deployments:

Zero workflow timeouts in the critical first 3 hours post-deployment
Connection pool utilization peaks at 65% (was hitting 100%)
Average approval routing time: 2.3 seconds (was 45+ seconds post-deployment)
Successfully handled 220 concurrent approvals 30 minutes after deployment

The key insight: automated deployments need deployment-aware configuration profiles, not just static production settings. The first few hours after deployment represent a distinct operational state that requires specific tuning.

marie_ops · September 4, 2025, 9:22pm

Check if your deployment is properly shutting down existing workflow instances before starting new ones. We discovered that our automated deployment was leaving orphaned workflow threads that held onto database connections. These zombie threads weren’t visible in normal monitoring but consumed connection pool resources. After adding explicit workflow engine shutdown commands to our deployment script with a 60-second drain period, the connection pool exhaustion disappeared. The drain period lets active workflows complete before the engine restarts.
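
The drain step can be sketched as a simple polling loop in Python. This assumes you have some way to ask the engine how many workflows are still active; `active_count` below is a hypothetical callable standing in for that, and the one-second poll interval is an arbitrary choice.

```python
import time


def drain(active_count, drain_seconds=60, poll_seconds=1.0,
          clock=time.monotonic, sleep=time.sleep):
    """Wait until no workflows are active or the drain period runs out.

    Returns the number of workflows still active when waiting stopped,
    so the caller can decide whether to force the shutdown.
    """
    deadline = clock() + drain_seconds
    remaining = active_count()
    while remaining > 0 and clock() < deadline:
        sleep(poll_seconds)
        remaining = active_count()
    return remaining
```

A non-zero return value means some workflows outlived the drain period; logging that number is a cheap way to spot instances that would otherwise keep holding connections.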

hanstech · August 15, 2025, 11:05am

Connection pool exhaustion post-deployment usually indicates the pool isn’t warming up properly. When the workflow engine restarts, it initializes with minimum connections (typically 10-20), not the configured maxActive. Under sudden load, the pool tries to scale up but can’t create connections fast enough. Add a post-deployment warm-up script that pre-establishes connections before enabling workflow processing. Also check your connection validation query - if it’s slow, connection acquisition becomes a bottleneck.
