Device telemetry data stream delays in Pub/Sub delivery impact real-time dashboard

We’re experiencing significant delays in our telemetry data pipeline affecting real-time monitoring dashboards. Devices publish MQTT messages to IoT Core every 30 seconds, but our Pub/Sub subscribers are receiving messages with 2-5 minute delays during peak hours (8AM-6PM). This lag makes our operational dashboards nearly useless for real-time decision making.

Our setup has 3,500 active devices publishing to a single Pub/Sub topic with three subscribers processing different analytics workloads. The Pub/Sub ack deadline is currently set to the default (10 seconds), and we’re seeing subscriber throughput drop to about 200 messages/sec during peak times despite much higher publishing rates.

I’m particularly concerned about the MQTT message flow from IoT Core to Pub/Sub and whether our subscriber configuration is causing bottlenecks. Has anyone dealt with similar telemetry delays and found effective tuning strategies?

Thanks for the suggestions. We’re using pull subscriptions but haven’t implemented flow control properly. I checked our subscriber logs and found that processing time averages 8-12 seconds per message during analytics operations, which explains the ack deadline issues. I’ll increase the deadline and implement batching. What about the MQTT message flow from IoT Core - could that be a bottleneck too?

One more thing to check - are your subscribers running on adequately sized instances? We had similar delays that disappeared when we upgraded from n1-standard-2 to n1-standard-4 instances. The CPU overhead of deserializing and processing IoT telemetry can be significant, especially if you’re doing any transformation before storing data.

I’ve seen this pattern before. Your ack deadline of 10 seconds is likely too short for processing complex analytics. When subscribers can’t acknowledge within the deadline, Pub/Sub redelivers messages, creating a cascading backlog. Try increasing your ack deadline to 60-120 seconds based on your actual processing time. Also check if your subscribers are CPU-bound during peak hours.
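To see why the cascade forms, here’s a back-of-the-envelope model (the function and its ceil() approximation are my own sketch, not anything from the Pub/Sub client). With a 10-second deadline and 12-second processing, every message is delivered at least twice, roughly doubling the load your subscribers must absorb:

```python
import math

def effective_load(publish_rate, proc_time_s, ack_deadline_s):
    """Estimate the delivered-message rate once redeliveries are included.

    If processing outlasts the ack deadline, Pub/Sub redelivers the message;
    simplified model: deliveries per message = ceil(proc_time / ack_deadline).
    """
    deliveries_per_msg = math.ceil(proc_time_s / ack_deadline_s)
    return publish_rate * deliveries_per_msg

# ~117 msg/s published, 12 s processing, 10 s deadline:
# every message is delivered twice, so subscribers see ~234 msg/s.
print(effective_load(117, 12, 10))  # 234
# With a 90 s deadline the redeliveries disappear:
print(effective_load(117, 12, 90))  # 117
```

Real redelivery behavior is messier (modack extensions, jitter), but the model shows how a too-short deadline multiplies the load rather than just delaying it.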

IoT Core to Pub/Sub handoff is generally very fast (under 100ms) unless you’re hitting IoT Core quotas. Check your IoT Core metrics in Cloud Monitoring for any throttling. The real issue is usually subscriber-side. With 3,500 devices at 30-second intervals, you’re looking at roughly 117 messages/second baseline. Note that subscribers pull from subscriptions, not directly from the topic: if your three subscribers share a single subscription, they’re load-balancing (competing for) the same messages, and one slow workload drags down the others. Give each analytics workload its own subscription on the topic - each subscription receives its own full copy of every message - or use subscription filters so each workload only sees the messages it needs.
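For reference, the baseline arithmetic:

```python
devices = 3500
publish_interval_s = 30
baseline_rate = devices / publish_interval_s  # messages/second across the fleet
print(round(baseline_rate, 1))  # 116.7

# Note the observed 200 msg/s subscriber throughput already exceeds this,
# so the growing backlog is likely driven by redeliveries and competing
# subscribers, not by raw publish volume.
```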

I had almost identical symptoms last year. Here’s what worked:

First, increase your ack deadline to match actual processing time plus a buffer (60-90 seconds for your 8-12 second processing). Second, configure proper flow control on subscribers - set maxOutstandingMessages to limit concurrent processing based on your instance capacity. Third, enable Pub/Sub message ordering only if you actually need it; be aware it can reduce throughput.
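One way to size maxOutstandingMessages is to keep every worker busy while staying under what can be acked within the deadline. This helper and its safety factor are my own rule of thumb, not a Pub/Sub API - treat the numbers as a starting point:

```python
def max_outstanding_messages(workers, avg_proc_time_s, ack_deadline_s, safety=0.8):
    """Rough sizing rule: enough messages in flight to keep all workers busy,
    capped by how many one instance can process within the ack deadline."""
    busy = workers  # at minimum, one message per worker
    deadline_budget = int(workers * (ack_deadline_s / avg_proc_time_s) * safety)
    return max(busy, min(deadline_budget, 1000))  # hard cap to bound memory

# 8 worker threads, 10 s average processing, 90 s deadline:
print(max_outstanding_messages(8, 10, 90))  # 57
```

The hard cap of 1000 is arbitrary; what matters is that outstanding messages times average message size stays comfortably inside your instance’s memory.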

For the MQTT flow, verify your IoT Core registry isn’t hitting the 4000 messages/second per registry limit. If you’re close, consider sharding devices across multiple registries. Also check that your MQTT QoS settings align with your delivery requirements - QoS 1 provides at-least-once delivery, which is usually sufficient for telemetry.
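At the quoted limit, the sharding threshold is easy to work out (the 4,000 msg/s figure is taken from above - verify it against the current quota page):

```python
registry_limit_mps = 4000   # per-registry messages/second, as quoted above
publish_interval_s = 30     # each device publishes every 30 seconds

# Maximum devices one registry can sustain at this cadence:
max_devices_per_registry = registry_limit_mps * publish_interval_s
print(max_devices_per_registry)  # 120000

# 3,500 devices is ~3% of that, so registry sharding is unlikely to be
# the bottleneck at this fleet size.
```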

Most importantly, implement these Pub/Sub subscription settings:

  • Ack deadline: 90 seconds
  • Flow control: maxOutstandingMessages=1000, maxOutstandingBytes=100MB
  • Pull batch size: 500 messages
  • Number of concurrent pull streams: 4-8 depending on instance size
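Of those settings, only the ack deadline lives on the subscription itself; flow control, pull batch size, and stream count are configured in your subscriber client code. The server-side part can be applied with gcloud (the subscription name here is a placeholder):

```shell
# Ack deadline is a subscription property - set it server-side.
# "telemetry-analytics" is a placeholder for your subscription name.
gcloud pubsub subscriptions update telemetry-analytics --ack-deadline=90

# Verify the change took effect:
gcloud pubsub subscriptions describe telemetry-analytics
```

Keep in mind the client library can also extend deadlines per-message while a message is being processed, so the subscription-level value is a floor, not the whole story.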

Monitor your subscription backlog metrics in Cloud Monitoring (num_undelivered_messages and oldest_unacked_message_age) closely. If lag remains high after these changes, you need to scale horizontally by adding more subscriber instances. We went from 2 to 6 subscriber instances and our p99 latency dropped from 4 minutes to under 30 seconds.
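You can sanity-check the instance count with Little’s law (in-flight work = arrival rate × processing time). The per-instance concurrency figure below is a placeholder for whatever you measure on your own hardware:

```python
import math

def instances_needed(arrival_rate_mps, avg_proc_time_s, concurrency_per_instance):
    """Little's law: steady-state in-flight messages L = lambda * W.
    Divide by per-instance concurrency to get the instance count."""
    in_flight = arrival_rate_mps * avg_proc_time_s
    return math.ceil(in_flight / concurrency_per_instance)

# ~117 msg/s arriving, 10 s average processing, and (hypothetically)
# 200 concurrent messages per instance:
print(instances_needed(117, 10, 200))  # 6
```

That this lands on 6 matches our experience, but the concurrency-per-instance input is the number you have to measure, not assume.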

Finally, consider implementing a separate fast-path subscriber for real-time dashboard updates that does minimal processing, while heavy analytics run on a different subscription with its own throughput limits. This architecture pattern isolates critical real-time needs from batch processing workloads.
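A minimal sketch of that split, assuming a topic named device-telemetry (all names here are placeholders). Each subscription independently receives every message published to the topic, so the fast path is never blocked by the analytics backlog:

```shell
# Fast path: short deadline, minimal processing, feeds the dashboard.
gcloud pubsub subscriptions create dashboard-fast-path \
  --topic=device-telemetry \
  --ack-deadline=10

# Slow path: long deadline for heavy analytics, its own backlog.
gcloud pubsub subscriptions create analytics-heavy \
  --topic=device-telemetry \
  --ack-deadline=90
```

If the dashboard only needs a subset of messages, subscription filters can trim its copy further - but note filters can only be set at subscription creation time.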