I had the same issue and spent weeks troubleshooting. Here’s the complete solution that addresses all the key aspects:
SSO Token Refresh Configuration:
First, modify your SSO integration to support token refresh. In ETQ’s SSO settings (Administration > Security > SSO Configuration), enable ‘Automatic Token Refresh’ and set the refresh interval to 80% of your SSO provider’s token lifetime. For a 4-hour token, set refresh at 3.2 hours (192 minutes).
<sso:tokenRefresh enabled="true">
<refreshInterval>192</refreshInterval>
<useRefreshTokens>true</useRefreshTokens>
</sso:tokenRefresh>
Workflow State Persistence Mechanism:
Enable workflow-specific session management in System Settings > Workflow Engine. Set these parameters:
- workflow_maintains_session=true
- workflow_session_independent=true
- workflow_token_inheritance=service_account
This creates a separate authentication context for workflows that inherits from a service account rather than the initiating user’s SSO token.
Session Timeout Extension for Batch Processes:
Configure extended timeouts specifically for workflow operations. In your etq_config.xml:
- workflow_max_idle_time=259200000 (72 hours in milliseconds)
- workflow_approval_timeout=86400000 (24 hours)
- enable_workflow_session_extension=true
Role-Based Session Recovery:
Implement a recovery mechanism for stuck workflows. Create a scheduled task (Administration > Scheduled Tasks) that runs every 4 hours:
SELECT workflow_id, current_step
FROM workflow_instances
WHERE status='PENDING'
AND last_activity < NOW() - INTERVAL 5 HOUR;
For each stuck workflow, the task uses a service account to re-initialize the workflow context with a fresh token. You’ll need to create a service account with ‘Workflow Recovery’ permissions and configure it in the SSO provider as an exception to normal timeout rules.
Additional Configuration:
In your SSO provider (assuming SAML), configure:
- Session lifetime: 8 hours (double your workflow approval SLA)
- Refresh token validity: 7 days
- Allow refresh tokens for service accounts: enabled
After implementing these changes, our CAPA approval success rate went from 73% to 99.2%, and we eliminated all timeout-related workflow failures. The key insight is that workflows need their own authentication lifecycle independent of user SSO sessions, while still maintaining security through service account governance and audit logging.
Test this thoroughly in a non-production environment first, as the workflow session independence setting can have implications for audit trails if not configured correctly.