Our monitoring agents deployed on edge devices disconnect from the Cisco IoT Monitoring platform precisely every 4 hours. The pattern is consistent across all 150+ deployed agents, suggesting a configuration issue rather than a network problem.
I suspect SSL certificate chain validation or certificate refresh timing, but I’m not seeing certificate errors in the logs. The agents use persistent connections with keep-alive enabled, but something is forcing reconnection every 240 minutes.
Agent Disconnect Event
Timestamp: 2025-06-11T04:52:33Z
Last Successful Heartbeat: 04:52:15Z
Connection Duration: 14,398 seconds (3h 59m 58s)
SSL Session: Terminated
The monitoring data gaps during reconnection (typically 30-45 seconds) are causing false alerts. Has anyone encountered similar periodic disconnection patterns? I’m running agent version 24.1.3.
Your keep-alive configuration might also need tuning. If the TCP keep-alive interval is too long, the platform might not detect stale connections promptly. Set TCP keep-alive to 60 seconds with 3 probes. That way dead connections are detected and cleaned up quickly, which should reduce false disconnection alerts.
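For reference, here is a minimal Python sketch of what those keep-alive values look like at the socket level. It is only an illustration and not the agent's actual configuration mechanism. The function name enable_keepalive is made up, and I am assuming "60 seconds" applies to both the idle time before the first probe and the interval between probes. The TCP_KEEP* constants are platform specific, so the sketch skips any the OS does not define.

```python
import socket


def enable_keepalive(sock, idle=60, interval=60, probes=3):
    """Turn on TCP keep-alive and, where the OS exposes them, tune the timers.

    idle/interval/probes are assumptions taken from the advice above:
    60 s before the first probe, 60 s between probes, 3 failed probes.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants exist on Linux; other platforms may lack some of them,
    # so only set the ones this Python build actually defines.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

Under those assumptions a dead peer would be noticed after roughly 60 + 3 x 60 = 240 seconds of silence.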
Good points. I checked the agent config and found connection_pool_max_lifetime=14400, which matches the disconnect interval. However, I’m not sure that simply increasing this value is the right solution. Should I disable the lifetime limit entirely or set it to something higher? I’m also concerned about the certificate chain: how do I verify that all intermediates are included?
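To sanity-check the match, here is the arithmetic as a short Python sketch. The numbers come from the disconnect event and the config value above. The helper names and the 5-second tolerance are my own assumptions for illustration.

```python
def seconds_to_hms(total):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return hours, minutes, seconds


def matches_lifetime(observed_s, lifetime_s, tolerance_s=5):
    """True if the observed duration is within tolerance of the configured lifetime."""
    return abs(observed_s - lifetime_s) <= tolerance_s


observed = 14398      # Connection Duration from the disconnect event
configured = 14400    # connection_pool_max_lifetime from the agent config

print(seconds_to_hms(observed))                 # (3, 59, 58)
print(configured // 60)                         # 240
print(matches_lifetime(observed, configured))   # True
```

The 2-second difference is small enough that the pool lifetime looks like the trigger, though that alone does not rule out a load balancer on the same schedule.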
I’ve seen this with connection pooling configurations. If your pool has a max connection lifetime set to 14400 seconds (4 hours), it will forcibly close and recreate connections. Check your agent’s connection pool settings. Also, verify the cloud platform’s load balancer isn’t rotating backend connections on a 4-hour schedule.
Don’t disable the connection lifetime entirely; stale connections can accumulate and cause other issues. Set it to 24 hours (86400 seconds) and implement graceful reconnection logic. For certificate chain validation, run openssl s_client -connect <host>:<port> -showcerts from one of the agent devices to see the chain exactly as the agent sees it. The platform endpoint should present the full chain, including intermediates.
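As one possible reading of "graceful reconnection", here is a minimal Python sketch that opens the replacement connection before closing the expired one, and shortens each lifetime by a random amount so that 150+ agents do not all reconnect in the same second. Everything here is hypothetical: the RotatingConnection class, the connect factory, and the 10% jitter are illustrative assumptions, not part of the agent or the platform API.

```python
import random
import time


class RotatingConnection:
    """Replace a connection once its lifetime expires, opening the new one first."""

    def __init__(self, connect, max_lifetime=86400, jitter=0.1, clock=time.monotonic):
        self._connect = connect            # hypothetical factory; result must have .close()
        self._max_lifetime = max_lifetime  # seconds, e.g. 24 hours
        self._jitter = jitter              # max fraction to shave off each lifetime
        self._clock = clock
        self._conn = None
        self._expires_at = 0.0

    def _open(self):
        conn = self._connect()
        # Randomly shorten the lifetime so agents spread out their reconnects.
        lifetime = self._max_lifetime * (1 - random.uniform(0, self._jitter))
        return conn, self._clock() + lifetime

    def get(self):
        """Return a live connection, rotating it if the lifetime has passed."""
        if self._conn is None:
            self._conn, self._expires_at = self._open()
        elif self._clock() >= self._expires_at:
            # Make-before-break: the old connection stays open until the
            # replacement exists, so there is no window without a connection.
            new_conn, new_expiry = self._open()
            old = self._conn
            self._conn, self._expires_at = new_conn, new_expiry
            old.close()
        return self._conn
```

The caller would fetch the connection through get() before each send. If opening the replacement fails, the exception propagates and the old connection is left in place, so the caller can retry later.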