After a network failover event at one of our remote manufacturing sites last week, several widgets on our IoT Operations dashboard stopped receiving updates. The dashboard shows stale data from before the failover, even though the backend systems are processing current telemetry normally. Refreshing the browser temporarily fixes it, but the widgets freeze again after 20-30 minutes. We’re running cciot-24 with the standard dashboard client. The network team confirmed the failover was clean with no packet loss during the transition. Has anyone experienced WebSocket reconnection issues after network changes? We need to understand if this is a dashboard client patching issue or something with how the system handles network failover events.
This sounds like a WebSocket connection state issue. When the network failover occurs, the WebSocket connections from the dashboard client to the backend likely aren’t being properly re-established. Check your browser’s developer console for WebSocket errors or connection state changes during the freeze periods.
The 1006 status indicates the connection closed without a proper WebSocket close handshake. It is a reserved code that is never sent in a Close frame; the browser reports it locally when the connection drops without one, which typically happens when network infrastructure changes underneath an open connection. The dashboard client in cciot-24 has a known issue where the reconnection logic doesn’t properly invalidate stale widget subscriptions. You’ll need to apply the dashboard client patch that was released in cciot-24.3. This patch includes enhanced WebSocket reconnection handling with automatic subscription refresh.
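If you want to confirm how often the abnormal closures happen before patching, a small helper like the one below can make the console output easier to read. This is only a sketch: `classifyClose` and `formatCloseLog` are hypothetical names of my own, not part of the cciot-24 dashboard client, and how you get a handle on the client's socket object is an assumption you would need to check in your environment.

```typescript
// Hypothetical helpers for reading WebSocket close events; not part of cciot-24.
type CloseKind = "normal" | "abnormal" | "other";

// 1000 is a normal closure. 1006 is reserved for "no Close frame received".
function classifyClose(code: number, wasClean: boolean): CloseKind {
  if (code === 1006 || !wasClean) return "abnormal";
  if (code === 1000) return "normal";
  return "other";
}

// Builds one timestamped log line per close event.
function formatCloseLog(code: number, wasClean: boolean, now: Date): string {
  const kind = classifyClose(code, wasClean);
  return `${now.toISOString()} ws close code=${code} clean=${wasClean} kind=${kind}`;
}

// Assumed wiring in the browser, given some WebSocket instance `socket`:
// socket.addEventListener("close", (e) =>
//   console.log(formatCloseLog(e.code, e.wasClean, new Date())));
```

Comparing the timestamps of the "abnormal" lines against the failover window should show whether the closures line up with the network event or keep recurring afterwards.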
Before patching, also verify your load balancer configuration. Some load balancers don’t handle WebSocket connections properly during failover events, especially if session affinity isn’t configured correctly. Make sure sticky sessions are enabled for the dashboard service endpoints.
Another consideration is implementing client-side connection health monitoring. Add a heartbeat mechanism that proactively detects stale connections and triggers full page reloads when necessary. This provides a better user experience than waiting for manual refresh.
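To make the heartbeat idea concrete, here is a minimal sketch of a stale-connection detector. The class name, the 45-second threshold, and the 5-second check interval are all assumptions for illustration; the real dashboard client may expose nothing like this, so treat it as a pattern rather than a drop-in.

```typescript
// Hypothetical stale-connection detector; timestamps are passed in so the
// logic can be exercised without real timers or a real socket.
class StaleConnectionDetector {
  private lastMessageAt: number;
  private readonly staleAfterMs: number;

  constructor(staleAfterMs: number, now: number) {
    this.staleAfterMs = staleAfterMs;
    this.lastMessageAt = now;
  }

  // Call whenever any message (data or pong) arrives on the socket.
  recordMessage(now: number): void {
    this.lastMessageAt = now;
  }

  // True once no message has arrived for longer than the threshold.
  isStale(now: number): boolean {
    return now - this.lastMessageAt > this.staleAfterMs;
  }
}

// Assumed browser wiring (threshold and interval are illustrative):
// const detector = new StaleConnectionDetector(45000, Date.now());
// socket.addEventListener("message", () => detector.recordMessage(Date.now()));
// setInterval(() => {
//   if (detector.isStale(Date.now())) window.location.reload();
// }, 5000);
```

A reload is the bluntest recovery option. If the client lets you reconnect and resubscribe without reloading, that is usually less disruptive for users.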
Here’s a complete solution addressing all three key areas:
WebSocket Reconnection: The core issue is that the cciot-24 dashboard client’s WebSocket reconnection logic is insufficient. After a network failover, the client attempts to reconnect but doesn’t properly re-establish widget data subscriptions. Apply the cciot-24.3 patch immediately; it includes the enhanced reconnection handler that:
- Detects abnormal closures (1006 status)
- Implements exponential backoff for reconnection attempts (1s, 2s, 4s, 8s intervals)
- Automatically resubscribes all active widgets after successful reconnection
- Provides visual indicators in the UI when the connection is lost
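For anyone who can't patch right away and wants to see what this behavior looks like, the backoff and resubscribe steps can be sketched as below. This is not the code from the cciot-24.3 patch. The function and class names are hypothetical, and capping the delay at 8 seconds is my assumption based on the intervals listed above.

```typescript
// Delay before reconnection attempt `attempt` (0-based): 1s, 2s, 4s, 8s,
// then held at the cap. The 8000 ms cap is an assumption.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Full delay schedule for a given number of attempts.
function backoffSchedule(maxAttempts: number): number[] {
  return Array.from({ length: maxAttempts }, (_, i) => backoffDelayMs(i));
}

// Hypothetical registry of widgets that currently hold a subscription.
class SubscriptionRegistry {
  private active = new Set<string>();

  add(widgetId: string): void {
    this.active.add(widgetId);
  }

  remove(widgetId: string): void {
    this.active.delete(widgetId);
  }

  // After a successful reconnect, re-send a subscribe request per widget.
  // Returns how many subscriptions were re-sent.
  resubscribeAll(send: (widgetId: string) => void): number {
    this.active.forEach((id) => send(id));
    return this.active.size;
  }
}
```

The important part is the last step: reconnecting the socket alone is not enough, because the server no longer knows which widgets the new connection should feed.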
Dashboard Client Patching: After applying cciot-24.3, configure the enhanced reconnection parameters in your dashboard configuration file:
- Set websocket.reconnect.max.attempts to 10 (up from default 5)
- Set websocket.ping.interval to 15000ms (down from 30000ms)
- Enable websocket.reconnect.preserve.state to maintain widget filter/sort settings across reconnections
The patch also includes a client-side connection health monitor that runs every 10 seconds and proactively detects zombie connections.
Network Failover Handling: On the infrastructure side, ensure your load balancer properly handles WebSocket connections during failover:
- Enable sticky sessions (session affinity) for the IoT Operations dashboard service
- Configure WebSocket-aware health checks that verify both HTTP and WebSocket endpoint availability
- Set connection draining timeout to at least 60 seconds during failover to allow graceful WebSocket closure
- Implement connection retry logic at the load balancer level, with 3 retry attempts before marking a backend as unhealthy
Additionally, configure your network monitoring to track WebSocket connection metrics (active connections, reconnection rate, abnormal closures) so you can proactively identify similar issues in the future. After implementing these changes, your dashboard widgets should maintain continuous updates even through network failover events, with automatic recovery in under 5 seconds.
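If your monitoring stack has no WebSocket metrics out of the box, the three counters mentioned above can be tracked on the client with something like the following sketch. The class and field names are hypothetical, and how you export the snapshot to your monitoring system is left open. Note that it counts reconnections; turning that into a rate means sampling the snapshot at a fixed interval.

```typescript
interface MetricsSnapshot {
  activeConnections: number;
  reconnections: number;
  abnormalClosures: number;
}

// Hypothetical in-memory counters for WebSocket connection health.
class WebSocketMetrics {
  private active = 0;
  private reconnections = 0;
  private abnormalClosures = 0;

  // Call on every successful open; pass true when it follows a drop.
  onOpen(isReconnect: boolean): void {
    this.active += 1;
    if (isReconnect) this.reconnections += 1;
  }

  // Call on every close event with the reported close code.
  onClose(code: number): void {
    this.active = Math.max(0, this.active - 1);
    if (code === 1006) this.abnormalClosures += 1;
  }

  snapshot(): MetricsSnapshot {
    return {
      activeConnections: this.active,
      reconnections: this.reconnections,
      abnormalClosures: this.abnormalClosures,
    };
  }
}
```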
Checked the console logs and found repeated WebSocket connection attempts closing with status 1006 (abnormal closure) about every 5 minutes after the widgets freeze. It looks like the client is trying to reconnect but failing silently without updating the UI.