We’re running into a persistent issue with our Process Automation flows in the Utah release. The flow is designed to wait for a dependent incident record to reach the ‘Resolved’ state before proceeding to the next stage. However, the wait condition stays stuck indefinitely, even after the incident state changes.
The wait configuration checks for incident.state == 6 (Resolved) with a timeout of 48 hours. We’ve verified that the incident is resolved within minutes, but the flow remains in the waiting state. We’ve tried adjusting the timeout logic and even added error handling branches, but the flow still doesn’t resume. This is causing significant workflow delays across our IT operations team.
Has anyone encountered similar wait condition behavior in Flow Designer? Are there specific configurations or error handling patterns we should implement to prevent these stuck flows?
Based on your description, you’re hitting a known behavior with wait conditions on related records. Here’s an approach covering all three aspects: wait condition configuration, error handling, and timeout logic.
Wait Condition Configuration:
The core issue is that your wait condition monitors a related incident record, and Flow Designer’s wait mechanism doesn’t always pick up state changes on related tables reliably. Instead of having the wait check incident.state == 6 through the reference, restructure it: create a subflow that explicitly queries the incident table with a ‘Look Up Record’ action, then use that queried record in your wait condition. This ensures you’re checking against live data, not a stale reference.
Error Handling in Flows:
Implement a dual-path approach. Add an ‘If’ condition immediately after your wait that validates whether the wait actually succeeded or timed out. Create two branches: one for successful wait completion and another for timeout scenarios. In the timeout branch, add a ‘Look Up Record’ action to manually check the incident state. If it’s already resolved, proceed with the flow; if not, send a notification to the assignment group and either retry the wait or route to manual intervention. This prevents flows from permanently stalling.
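The branching described above can be sketched as plain decision logic. This is only an illustration in Python, not Flow Designer configuration: the names WaitOutcome, next_step, and retries_left, and the branch labels it returns, are all hypothetical. The only value taken from the thread is state 6 for Resolved.

```python
from dataclasses import dataclass

RESOLVED_STATE = 6  # incident.state value for Resolved, as given in the question


@dataclass
class WaitOutcome:
    """Hypothetical summary of what happened at the wait step."""
    timed_out: bool      # True when the wait ended because the timeout elapsed
    incident_state: int  # state read from a fresh lookup after the wait ended


def next_step(outcome: WaitOutcome, retries_left: int) -> str:
    """Return the (hypothetical) branch label the flow should take next."""
    if not outcome.timed_out:
        return "proceed"            # wait completed normally
    if outcome.incident_state == RESOLVED_STATE:
        return "proceed"            # incident was resolved; the wait just missed it
    if retries_left > 0:
        return "notify_and_retry"   # notify the assignment group, then wait again
    return "notify_and_manual"      # notify the assignment group, route to a person
```

For example, next_step(WaitOutcome(timed_out=True, incident_state=6), retries_left=2) returns "proceed", which is the case from the original question: the incident was resolved but the wait never noticed. Capping retries is a design choice added here so that the retry path cannot loop forever.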
Timeout Logic:
A 48-hour timeout on its own isn’t enough; add further safeguards. Set up a scheduled job that runs every 4 hours to identify flows stuck in the waiting state beyond their expected duration. Use a flow with a ‘Look Up Records’ action to find flow executions where state = ‘waiting’ and sys_created_on is older than your threshold. For each stuck flow, either resume it programmatically if its conditions are met, or cancel and restart it with the proper context. Also reduce your wait timeout to 24 hours and add escalation notifications at the 12-hour mark.
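As a rough sketch of the filtering the scheduled job would do, here is the same selection logic in Python. It works on plain dictionaries standing in for whatever the ‘Look Up Records’ action returns; the function names find_stuck and escalation_due are hypothetical, and the field names state and sys_created_on simply mirror the ones mentioned above rather than a confirmed schema.

```python
from datetime import datetime, timedelta


def find_stuck(executions, now, threshold=timedelta(hours=24)):
    """Return executions that are still waiting and older than `threshold`."""
    return [
        e for e in executions
        if e["state"] == "waiting" and now - e["sys_created_on"] > threshold
    ]


def escalation_due(execution, now, escalate_after=timedelta(hours=12)):
    """True when a waiting execution has reached the escalation mark."""
    age = now - execution["sys_created_on"]
    return execution["state"] == "waiting" and age >= escalate_after


# Example run with made-up records.
now = datetime(2024, 1, 3, 12, 0)
executions = [
    {"id": "a", "state": "waiting", "sys_created_on": datetime(2024, 1, 1, 9, 0)},
    {"id": "b", "state": "waiting", "sys_created_on": datetime(2024, 1, 3, 10, 0)},
    {"id": "c", "state": "complete", "sys_created_on": datetime(2024, 1, 1, 9, 0)},
]
stuck_ids = [e["id"] for e in find_stuck(executions, now)]
```

With these sample records, only execution "a" is both waiting and older than 24 hours, so it is the only one the job would pick up for a resume or restart decision.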
Implement proper flow versioning and test these changes in a sub-production instance first. Monitor your Flow Designer execution logs closely after deployment to validate the improvements.
Are you using dot-walking to check the incident state? I had similar timeout logic problems when referencing related records. The wait condition needs to be on the same table as your flow context, or you need to set up proper event subscriptions.
I’ve seen this before. Check if your wait condition is using the correct table reference. Sometimes the flow loses the record context if the wait is configured to monitor a related record rather than the flow’s primary record. Also verify that your flow trigger is set up to listen for the field updates properly.
I ran into this exact scenario last quarter. The issue often stems from a mismatch between how the wait condition is configured and how Flow Designer actually monitors state changes. Instead of using a simple ‘Wait for Condition’, consider implementing a more robust pattern with explicit event registration. You might also want to add a parallel timeout branch that fires after a reasonable period to handle edge cases.
This sounds like a polling interval issue. Flow Designer wait conditions check for state changes at specific intervals, not in real time. If your incident resolves between polling cycles and something then resets the state temporarily, the flow might miss it. I’d recommend adding explicit logging in your wait condition to track when it’s actually checking the state. You could also implement a custom event-based trigger instead of relying solely on the wait condition’s built-in polling mechanism.
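To make the hypothesis above concrete, here is a small Python simulation of a state that reaches Resolved and is then reset between two polls. It does not model Flow Designer internals; the function names, the time units, and the initial state of 2 are all assumptions made for illustration.

```python
def state_at(changes, t, initial=2):
    """State in effect at time t, given time-sorted (time, new_state) changes."""
    state = initial
    for when, new_state in changes:
        if when <= t:
            state = new_state
    return state


def polling_sees(changes, target, interval, horizon):
    """True if any poll at 0, interval, 2*interval, ... observes `target`."""
    return any(
        state_at(changes, t) == target
        for t in range(0, horizon + 1, interval)
    )


def events_see(changes, target):
    """True if any individual change sets the state to `target`."""
    return any(new_state == target for _, new_state in changes)


# Resolved (6) at t=3, then reset to 2 at t=7.
changes = [(3, 6), (7, 2)]
missed_by_slow_poll = not polling_sees(changes, target=6, interval=10, horizon=30)
caught_by_fast_poll = polling_sees(changes, target=6, interval=5, horizon=30)
caught_by_events = events_see(changes, target=6)
```

In this run, polls every 10 time units never observe state 6, polls every 5 units catch it at t=5, and reacting to each change catches it regardless of timing. That is the argument for an event-based trigger, if polling is in fact what is happening in your instance.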