Device shadow synchronization fails after device reboot (MQTT)

We’re experiencing a critical issue with our edge perception sensors in our manufacturing facility. After any device reboot (scheduled or unplanned), the device shadow in Cumulocity fails to synchronize properly. The MQTT client reconnects successfully, but shadow state updates aren’t reflected in the platform for 10-15 minutes, causing real-time data loss.

I’ve verified the MQTT persistent session configuration looks correct:


mqtt.cleanSession=false
mqtt.qos=1
mqtt.keepAlive=60

The device shadow update topic subscription seems to re-establish, but shadow deltas aren’t being processed. Our edge devices handle reboots by attempting immediate reconnection, but something in the shadow sync workflow breaks. Has anyone encountered similar behavior with perception sensors after reboot cycles? This is impacting our production line monitoring significantly.

The 10-15 minute delay suggests the MQTT broker’s session handling might be involved. When a device reboots, even with persistent sessions, there’s a brief window where the broker needs to reconcile the old connection state. If your keepAlive is 60 seconds but the device doesn’t send a proper DISCONNECT before rebooting, the broker waits up to 1.5x keepAlive (90 seconds here) without traffic before it treats the old connection as dead. Try reducing keepAlive to 30 seconds and ensure your reboot handling sends a graceful disconnect. Also check whether you’re subscribing to shadow topics BEFORE publishing the device’s current state on reconnect.
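As a quick check on the timing, here is the one-and-a-half keep-alive rule from the MQTT specification worked out for both settings. The function name is only illustrative.

```python
def keepalive_grace_seconds(keep_alive: int) -> float:
    """Silence (in seconds) after which an MQTT broker closes a quiet connection."""
    # The MQTT spec has the server drop a client that sends nothing
    # for one and a half times the negotiated keep-alive interval.
    return 1.5 * keep_alive


for keep_alive in (60, 30):
    grace = keepalive_grace_seconds(keep_alive)
    print(f"keepAlive={keep_alive}s -> dropped after {grace:.0f}s of silence")
```

Both values (90 and 45 seconds) are well below 10-15 minutes, so keepAlive on its own probably does not account for the whole delay, though a shorter window should still help the broker notice the dead connection sooner.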

I experienced this exact issue three months ago with perception sensors on c8y-1020. The problem isn’t just MQTT configuration - it’s the interaction between edge device reboot handling and shadow state management. Here’s what you need to verify:

First, your subscription order matters critically. On reconnection, subscribe to shadow topics BEFORE publishing any state:


// Subscribe first
client.subscribe("shadow/update/delta", 1);
client.subscribe("shadow/get/accepted", 1);
// Then request current shadow
client.publish("shadow/get", "{}", 1, false);

Second, implement proper reboot handling with a reconnection strategy that accounts for shadow synchronization timing. The key is requesting the current shadow state immediately after subscription and waiting for the response before publishing device state updates.
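One way to enforce the "wait for the response before publishing" rule is a small gate that queues reported-state updates until the shadow/get/accepted reply has arrived. This is a minimal sketch, not tied to any particular MQTT library: the `publish` callable, the class name, and the `shadow/update` topic are assumptions made for illustration.

```python
from typing import Callable, List


class ReportedStateGate:
    """Holds reported-state updates until the shadow/get/accepted reply arrives."""

    def __init__(self, publish: Callable[[str, str], None]) -> None:
        self._publish = publish  # hypothetical publish(topic, payload) callable
        self._synced = False
        self._pending: List[str] = []

    def report(self, payload: str) -> None:
        """Publish right away once synced; otherwise queue the update."""
        if self._synced:
            self._publish("shadow/update", payload)  # assumed topic name
        else:
            self._pending.append(payload)

    def on_get_accepted(self) -> None:
        """Call from the shadow/get/accepted handler; flushes the queue in order."""
        self._synced = True
        for payload in self._pending:
            self._publish("shadow/update", payload)
        self._pending.clear()

    def on_disconnect(self) -> None:
        """Close the gate so the next reconnect repeats the handshake."""
        self._synced = False
```

Sensor code calls `report()` whenever it likes, and nothing reaches the broker until the handshake has completed.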

Third, adjust your MQTT persistent session parameters. While cleanSession=false is correct, you need to tune the session expiry interval. Add this to your configuration:


mqtt.sessionExpiryInterval=3600
mqtt.receiveMaximum=10

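The sessionExpiryInterval (in seconds) ensures the broker maintains your session for up to an hour during unexpected reboots. The receiveMaximum caps how many unacknowledged QoS 1 and QoS 2 messages the broker may have in flight to the device at once, which keeps the queued backlog from overwhelming it during reconnection. Both are MQTT 5.0 connection properties, so they only take effect if the client connects with MQTT 5.0.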
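To pick a value for the expiry interval, it can help to spell out when a stored session is still there on reconnect. The sketch below follows the MQTT 5.0 rules as I understand them; the function name and the idea of measuring the outage from the moment the broker closes the old connection are my framing, not something from the Cumulocity docs.

```python
NEVER_EXPIRES = 0xFFFFFFFF  # MQTT 5.0 value meaning the session never expires


def session_survives(outage_seconds: float, session_expiry_interval: int) -> bool:
    """True if the broker should still hold the session when the device returns.

    outage_seconds is counted from when the broker closes the old connection,
    which after a hard reboot may be up to 1.5x keepAlive later than the reboot.
    """
    if session_expiry_interval == NEVER_EXPIRES:
        return True
    # An interval of 0 means the session ends as soon as the connection closes.
    return outage_seconds < session_expiry_interval


print(session_survives(120, 3600))   # two-minute reboot, one-hour expiry
print(session_survives(4000, 3600))  # outage longer than the expiry interval
```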

Fourth, implement a shadow synchronization verification step in your edge device reboot sequence. After reconnection, your device should:

  1. Subscribe to shadow/update/delta and shadow/get/accepted topics
  2. Publish to shadow/get to request current shadow state
  3. Wait for the shadow/get/accepted response (with a 5-second timeout)
  4. Compare received shadow version with local state
  5. Only then publish device reported state updates

This ensures your device and platform are synchronized before attempting state updates. The 10-15 minute delay you’re seeing is Cumulocity’s built-in shadow reconciliation mechanism detecting the desync and forcing a full resync.
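The five-step sequence above can be sketched roughly as follows. It is written against a minimal client interface rather than a real MQTT library, and the `shadow/update` topic, the `version` field in the reply, and all class and method names are assumptions for illustration only.

```python
import json
import threading
from typing import Optional, Protocol


class MqttLike(Protocol):
    """The two calls the sketch needs; adapt to your actual client library."""
    def subscribe(self, topic: str, qos: int) -> None: ...
    def publish(self, topic: str, payload: str, qos: int, retain: bool) -> None: ...


class ShadowResync:
    def __init__(self, client: MqttLike, local_version: int, timeout: float = 5.0) -> None:
        self.client = client
        self.local_version = local_version
        self.timeout = timeout
        self._reply = threading.Event()
        self._shadow: Optional[dict] = None

    def on_message(self, topic: str, payload: str) -> None:
        """Wire this to the client's incoming-message callback."""
        if topic == "shadow/get/accepted":
            self._shadow = json.loads(payload)
            self._reply.set()

    def run(self, reported_state: dict) -> bool:
        """Return True if state was published, False if the shadow reply timed out."""
        self._reply.clear()
        # Steps 1-2: subscribe first, then request the current shadow.
        self.client.subscribe("shadow/update/delta", 1)
        self.client.subscribe("shadow/get/accepted", 1)
        self.client.publish("shadow/get", "{}", 1, False)
        # Step 3: wait for the reply, giving up after the timeout.
        if not self._reply.wait(self.timeout):
            return False
        # Step 4: compare versions ("version" is an assumed field name).
        remote_version = (self._shadow or {}).get("version", 0)
        if remote_version > self.local_version:
            self.local_version = remote_version
        # Step 5: only now publish the reported state (assumed topic name).
        self.client.publish("shadow/update", json.dumps(reported_state), 1, False)
        return True
```

If `run()` returns False, the device can retry with a backoff instead of publishing blind.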

Finally, verify that your edge device’s MQTT client library properly implements MQTT 5.0 session management, where the old cleanSession flag is split into a Clean Start flag and the session expiry interval. Some older clients don’t handle session expiry correctly, causing the broker to treat reconnections as new sessions despite cleanSession=false.

Implementing this complete reboot handling workflow eliminated our shadow sync issues entirely. The key insight is that MQTT persistent sessions alone aren’t sufficient - you need explicit shadow state verification as part of your reconnection logic. This addresses all three critical areas: MQTT persistent session configuration, device shadow update topic subscription timing, and proper edge device reboot handling with state reconciliation.

I’ve seen this before with MQTT persistent sessions. The cleanSession=false setting is correct, but you need to verify the client ID remains consistent across reboots. If your edge devices generate new client IDs on restart, the broker treats them as new connections and the persistent session context is lost. Check your device initialization code to ensure the client ID is stored persistently (not generated dynamically). Also, what QoS level are you using for the shadow update topic subscription itself?
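For anyone who needs to check this, a minimal sketch of keeping the client ID stable across reboots might look like the following. The file location, the `sensor-` prefix, and the function name are made up for the example. It prefers the device serial if one is passed and only falls back to a random ID the first time it runs.

```python
import uuid
from pathlib import Path
from typing import Optional


def load_client_id(path: Path, serial: Optional[str] = None) -> str:
    """Return the stored client ID, creating and persisting one on first boot."""
    if path.exists():
        stored = path.read_text().strip()
        if stored:
            return stored  # same ID as before the reboot
    # First boot: derive from the serial number if known, else generate once.
    client_id = f"sensor-{serial}" if serial else f"sensor-{uuid.uuid4().hex}"
    path.write_text(client_id)
    return client_id
```

The important property is that the second and every later call returns whatever was written the first time, so the broker can match the reconnect to the existing session.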

Good catch on the client ID! I checked and we are using a consistent ID based on the device serial number. The shadow update topic subscription uses QoS 1 as well. The strange part is that after 10-15 minutes, everything starts working normally again. It’s like there’s a timeout or retry mechanism kicking in. During that gap, we’re blind to sensor state changes, which is unacceptable for our production monitoring.