We’re experiencing critical shadow synchronization failures after pushing firmware v2.8.1 to our industrial sensor fleet. Post-update, devices disconnect from MQTT with “connection refused” errors and shadow state stops syncing.
The issue appears related to MQTT persistent session settings - devices reconnect but shadow topics aren’t being subscribed correctly. We’ve verified device registry credentials are valid in Cloud IoT Core console, but wondering if credential validation timing changed with the new firmware.
MQTT Error: Connection lost (5)
Topic: /devices/{device-id}/state
Shadow sync timeout after 45 seconds
Our shadow topic structure follows Google’s recommended format, but we’re concerned we might be missing compliance requirements for the updated firmware. The real-time monitoring dashboard shows 68% of updated devices affected. We need to understand the root cause before rolling out to the remaining fleet.
I’ve seen similar behavior with MQTT persistent sessions after firmware updates. The connection refused error typically indicates the device is trying to resume a session that the broker no longer recognizes. Check if your firmware update resets the client session state - Cloud IoT Core maintains sessions for 24 hours by default, but if the device reconnects with cleanSession=false and the session expired, you’ll get connection refused.
Thanks for the insights. I checked our firmware code and found the cleanSession flag was hardcoded to false in v2.8.1, but we’re not properly handling session resumption logic. The JWT tokens are being generated correctly with 1-hour expiry. The device registry shows all credentials as valid, but I’m wondering if there’s a timing issue between connection establishment and credential validation completing on the broker side.
Check your MQTT QoS settings for the shadow topics. If you’re using QoS 1 with persistent sessions (Cloud IoT Core does not support QoS 2), incomplete message acknowledgments from the old session can block new subscriptions. I’d recommend temporarily setting cleanSession=true in your firmware to force fresh sessions, then monitoring whether shadow sync resumes. Also, enable detailed MQTT logging in Cloud IoT Core to see exactly where the subscription handshake fails.
The shadow topic structure compliance is critical here. After firmware updates, I always verify the device is publishing to the correct state topic and subscribing to the config topic. Your error shows /devices/{device-id}/state - make sure the device ID matches exactly what’s registered in Cloud IoT Core, including case sensitivity. Also verify the JWT token generation in your new firmware hasn’t changed - credential validation happens on every connection attempt.
Had this exact issue last quarter. The problem was our device reconnection logic didn’t wait for the CONNACK before attempting to subscribe to shadow topics. After firmware update, the connection sequence timing changed slightly, causing subscription requests to arrive before the broker completed credential validation. We added a 500ms delay after CONNACK and before topic subscriptions - solved the issue completely. Also verify your device isn’t flooding reconnection attempts, as Cloud IoT Core has rate limits that can cause temporary credential validation failures.
I’ll provide a comprehensive solution addressing all three areas:
MQTT Persistent Session Settings:
The root cause is session state mismatch. Your firmware v2.8.1 needs proper session handling:
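import org.eclipse.paho.client.mqttv3.MqttConnectOptions; // assumes the Eclipse Paho Java client

// Force a clean session on the first connection after the update so the
// broker discards any stale session state.
MqttConnectOptions options = new MqttConnectOptions();
options.setCleanSession(true);
options.setConnectionTimeout(30); // seconds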
After successful connection, subsequent reconnections can use cleanSession=false. Implement session state tracking in device memory to know when to force clean sessions.
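One way to track this is to persist the firmware version after each successful connect and compare it on the next boot. The sketch below is only an illustration of that idea: `SessionPolicy`, `useCleanSession`, and the stored-version value are hypothetical names, and how the version is persisted depends on your device.

```java
/** Decides whether the next MQTT connect should request a clean session. */
final class SessionPolicy {
    private SessionPolicy() {}

    /**
     * @param lastConnectedFirmware version recorded after the last successful
     *        connect (hypothetical persisted value; null if nothing is stored)
     * @param runningFirmware version of the firmware currently running
     * @return true when the device should connect with cleanSession=true
     */
    static boolean useCleanSession(String lastConnectedFirmware, String runningFirmware) {
        // No record, or a different version: treat this as the first connect
        // after an update and discard any session the broker may still hold.
        return lastConnectedFirmware == null
                || !lastConnectedFirmware.equals(runningFirmware);
    }
}
```

The result can be passed straight to `options.setCleanSession(...)`. The stored version should only be updated once CONNACK return code 0 has been received, so a failed first attempt still forces a clean session on the retry.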
Device Registry Credential Validation:
The timing issue you’re experiencing is real. Cloud IoT Core validates JWT tokens asynchronously. Your firmware must:
- Wait for CONNACK with return code 0 (connection accepted)
- Add 200-500ms delay before any publish/subscribe operations
- Implement exponential backoff for reconnection attempts (start at 1s, max 60s)
- Regenerate JWT tokens if connection fails with code 5 (not authorized)
Verify that the JWT’s aud claim is your project ID and that the MQTT client ID carries the correct project ID, region, registry ID, and device ID. Token expiry should be 60 minutes maximum.
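The backoff and code-5 handling from the list above could look roughly like this. It is a minimal sketch with hypothetical names (`ReconnectPlan`, `delayMillis`, `shouldRegenerateJwt`); the 1s start and 60s cap come from the list, and doubling on each attempt is an assumption about what "exponential" means here.

```java
/** Reconnect helpers: capped exponential backoff and CONNACK code handling. */
final class ReconnectPlan {
    static final long BASE_DELAY_MS = 1_000L;     // first retry after 1 s
    static final long MAX_DELAY_MS = 60_000L;     // never wait longer than 60 s
    static final int CONNACK_NOT_AUTHORIZED = 5;  // MQTT 3.1.1 CONNACK return code 5

    private ReconnectPlan() {}

    /** Delay before retry number attempt (0-based): 1 s, 2 s, 4 s, ... capped at 60 s. */
    static long delayMillis(int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        int shift = Math.min(attempt, 16); // clamp so the shift cannot overflow
        return Math.min(BASE_DELAY_MS << shift, MAX_DELAY_MS);
    }

    /** True when the broker refused the connection as "not authorized". */
    static boolean shouldRegenerateJwt(int connackReturnCode) {
        return connackReturnCode == CONNACK_NOT_AUTHORIZED;
    }
}
```

The attempt counter should be reset to 0 after a successful connect. Adding a small random jitter to each delay is a common refinement when many devices reconnect at once, but it is left out here to keep the sketch deterministic.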
Shadow Topic Structure Compliance:
Ensure exact topic format compliance:
- State publishing: `/devices/{device-id}/state`
- Config subscription: `/devices/{device-id}/config`
- Commands subscription: `/devices/{device-id}/commands/#`
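To avoid hand-typed topic strings drifting between firmware versions, the three topics can be built from one helper. This is a sketch only: `ShadowTopics` and its methods are hypothetical names, and the ID check is a basic sanity check, not the registry's full naming rules.

```java
/** Builds the three device topics from a single device ID. */
final class ShadowTopics {
    private ShadowTopics() {}

    static String state(String deviceId) {
        return "/devices/" + requireId(deviceId) + "/state";
    }

    static String config(String deviceId) {
        return "/devices/" + requireId(deviceId) + "/config";
    }

    static String commands(String deviceId) {
        return "/devices/" + requireId(deviceId) + "/commands/#";
    }

    /** Exact, case-sensitive comparison between the firmware's ID and the registry's. */
    static boolean matchesRegistry(String firmwareDeviceId, String registryDeviceId) {
        return firmwareDeviceId != null && firmwareDeviceId.equals(registryDeviceId);
    }

    // Rejects IDs that would silently produce a malformed topic.
    private static String requireId(String deviceId) {
        if (deviceId == null || deviceId.isEmpty() || deviceId.indexOf('/') >= 0) {
            throw new IllegalArgumentException("invalid device ID: " + deviceId);
        }
        return deviceId;
    }
}
```

For example, `ShadowTopics.state("Sensor-01")` yields `/devices/Sensor-01/state`, while `matchesRegistry("sensor-01", "Sensor-01")` is false because the comparison is case-sensitive.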
Device ID must match registry exactly (case-sensitive). After firmware update, devices should:
- Connect with clean session
- Wait for CONNACK
- Subscribe to config topic with QoS 1
- Publish initial state message
- Wait for state acknowledgment before normal operation
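The steps above can be sketched as a single startup routine. `ShadowTransport` is a hypothetical interface standing in for whatever MQTT client the firmware uses, so nothing here is a real library API; the short pause after CONNACK is the delay suggested earlier in this thread, passed in as a parameter.

```java
/** Hypothetical transport abstraction; a real device would back this with its MQTT client. */
interface ShadowTransport {
    /** Connects and blocks until CONNACK arrives; returns the CONNACK return code. */
    int connect(boolean cleanSession);

    void subscribe(String topic, int qos);

    /** Publishes and blocks until the broker acknowledges; returns false on timeout. */
    boolean publishAndAwaitAck(String topic, byte[] payload, int qos);
}

final class PostUpdateStartup {
    private PostUpdateStartup() {}

    /** Runs the post-update sequence; returns true once the initial state is acknowledged. */
    static boolean run(ShadowTransport transport, String deviceId,
                       byte[] initialState, long settleDelayMs) {
        // Connect with a clean session and require CONNACK return code 0.
        if (transport.connect(true) != 0) {
            return false;
        }
        // Short pause before the first subscribe.
        try {
            Thread.sleep(settleDelayMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // Subscribe to config with QoS 1, then publish the initial state
        // and wait for its acknowledgment before normal operation.
        transport.subscribe("/devices/" + deviceId + "/config", 1);
        return transport.publishAndAwaitAck("/devices/" + deviceId + "/state", initialState, 1);
    }
}
```

Keeping the sequence behind an interface also makes it easy to test the ordering with a fake transport before the hotfix goes to real devices.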
Immediate Fix:
Push a hotfix firmware that sets cleanSession=true and adds the post-CONNACK delay. This should resolve the problem on about 95% of your affected devices. For the remaining 5%, check Cloud IoT Core device logs for specific error codes - likely rate limiting or malformed JWT tokens.
Monitor your device connection metrics in Cloud Console - you should see connection success rate improve within 2 hours of hotfix deployment. Enable debug logging temporarily to capture the full MQTT handshake sequence for any devices that still fail.