Device shadow synchronization fails for offline devices when they reconnect to the network

Our field devices frequently go offline due to poor cellular connectivity in remote locations. When they reconnect, the device shadow state doesn’t synchronize properly: the shadow still shows stale data from before the disconnect. This causes our automation rules to fire incorrectly because they’re acting on outdated state.

We’re using MQTT for device communication, and the devices typically reconnect within 1-6 hours of going offline. During that time, the cloud-side shadow gets updated by our control system with desired states, but when the device comes back online, these desired states aren’t being applied to the device. The shadow reports show “last updated” timestamps from before the disconnect.

The automation rules misfire because they check shadow state to determine if a device needs configuration updates. With stale shadow data, devices don’t get the updates they need. We’ve verified that the devices are successfully reconnecting and can receive messages, but the shadow sync just doesn’t happen. Is there a configuration for MQTT persistent sessions or retained messages that we’re missing?

Beyond persistent sessions, you need to use retained messages for the shadow desired state topic. When you publish desired state changes to the shadow, set the MQTT retain flag to true. The broker keeps the last retained message on each topic and delivers it as soon as a client subscribes, so a reconnecting device gets the latest desired state even if it was offline when the update was published. On reconnection, the device should subscribe to both the shadow delta topic and the full shadow document topic.
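Once the device has the full shadow document after reconnecting, it still has to work out what changed and report back. Here is a minimal sketch of that reconcile step in plain Python. The document layout (a "desired" section, and a {"state": {"reported": ...}} reply) and the function names are assumptions for illustration, not your platform's actual API, so adjust them to whatever your shadow service uses.

```python
from typing import Any, Dict, Tuple


def compute_delta(desired: Dict[str, Any], reported: Dict[str, Any]) -> Dict[str, Any]:
    """Return the entries of `desired` that are missing from or differ in `reported`."""
    delta: Dict[str, Any] = {}
    for key, want in desired.items():
        have = reported.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse so only the changed leaves of a nested section are returned.
            nested = compute_delta(want, have)
            if nested:
                delta[key] = nested
        elif key not in reported or want != have:
            delta[key] = want
    return delta


def apply_delta(state: Dict[str, Any], delta: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `state` with `delta` merged in; untouched keys are kept."""
    merged = dict(state)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_delta(merged[key], value)
        else:
            merged[key] = value
    return merged


def reconcile_on_reconnect(
    shadow: Dict[str, Any], device_state: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Apply pending desired state and build the reported-state update to publish.

    Assumed (hypothetical) shadow layout: {"desired": {...}, "reported": {...}}.
    """
    delta = compute_delta(shadow.get("desired", {}), device_state)
    new_state = apply_delta(device_state, delta)
    return new_state, {"state": {"reported": new_state}}
```

The device would call reconcile_on_reconnect with the document it receives on the full shadow topic and publish the second return value, so the "last updated" timestamp on the cloud side moves forward and the automation rules stop seeing stale state.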

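Are you using a clean session or a persistent session on the MQTT connection? For device shadow sync to work properly with offline devices, you need persistent sessions enabled. With a clean session, the broker discards the device's subscriptions and any queued messages when it disconnects, so updates published while the device is offline are never delivered. Also check whether your devices are subscribing to the shadow delta topic when they reconnect.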
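To make the difference concrete, here is a toy in-memory model of how a broker treats the two session types. It is not a real MQTT client or broker: the ToyBroker class, its methods, and the topic name are made up for illustration, and it only models messages published while the subscriber is offline (think QoS 1 subscriptions).

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (topic, payload)


class ToyBroker:
    """Illustrative model of MQTT session handling, not a real broker."""

    def __init__(self) -> None:
        # client_id -> session record with subscriptions and an offline queue
        self.sessions: Dict[str, dict] = {}

    def connect(self, client_id: str, clean_session: bool) -> List[Message]:
        """Connect a client and return whatever was queued while it was offline."""
        session = self.sessions.get(client_id)
        if clean_session or session is None:
            # A clean session always starts empty: no subscriptions, no backlog.
            session = {"subs": set(), "queue": []}
            self.sessions[client_id] = session
        session["online"] = True
        session["clean"] = clean_session
        backlog = session["queue"]
        session["queue"] = []
        return backlog

    def subscribe(self, client_id: str, topic: str) -> None:
        self.sessions[client_id]["subs"].add(topic)

    def disconnect(self, client_id: str) -> None:
        session = self.sessions[client_id]
        session["online"] = False
        if session["clean"]:
            # Clean session: the broker forgets the client entirely.
            del self.sessions[client_id]

    def publish(self, topic: str, payload: str) -> None:
        """Queue the message for every offline client with a stored subscription."""
        for session in self.sessions.values():
            if topic in session["subs"] and not session["online"]:
                session["queue"].append((topic, payload))
```

In this model, a device that connects with clean_session=True, subscribes, and drops off gets an empty backlog when it returns, while the same sequence with clean_session=False returns the updates published in between. Note that a persistent session only works if the device reuses the same client ID on every reconnect.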

We’re using clean session (cleanSession=true) because we thought that would prevent message queue buildup. Should we switch to persistent sessions? What about the message queue - won’t that grow unbounded if devices are offline for hours?