Monitoring device health: SNMP vs REST API approaches for large-scale deployments

johnninja · September 23, 2025, 6:09am

We’re evaluating monitoring strategies for our expanding IoT deployment (currently 600 devices, growing to 2000+ next year). Trying to decide between SNMP polling for device health monitoring versus REST API-based health checks. SNMP has lower overhead and is proven for network device monitoring, but REST API provides richer device context and integrates better with Oracle IoT Cloud Platform. What are the scalability trade-offs? Has anyone successfully deployed either approach at scale (1000+ devices) and can share lessons learned on polling intervals, data collection overhead, and monitoring infrastructure requirements?

gregoryadmin · September 23, 2025, 6:24am

We use REST API health checks for 1500 devices. The main advantage is tight integration with Oracle IoT Cloud Platform’s device registry and shadow state. Health check responses include not just connectivity status but also firmware version, last telemetry timestamp, error counts, and custom health metrics. The overhead is manageable if you poll by criticality tier: we check critical devices every 2 minutes, standard devices every 10 minutes, and low-priority devices every 30 minutes.
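
For anyone who wants to try the tiering idea, here is a minimal sketch of how the three intervals could drive a scheduler. It is not our production code: the `Device` class, the tier names as dictionary keys, and both helper functions are made up for illustration, and the actual health-check call is left out.

```python
from dataclasses import dataclass

# Poll intervals in seconds, taken from the tiers described in this post.
POLL_INTERVALS = {"critical": 120, "standard": 600, "low": 1800}


@dataclass
class Device:
    """Hypothetical inventory record; field names are assumptions."""
    device_id: str
    tier: str            # one of the POLL_INTERVALS keys
    last_checked: float  # timestamp of the last health check, in seconds


def due_for_check(devices, now):
    """Return the devices whose tier interval has elapsed since the last check."""
    return [d for d in devices if now - d.last_checked >= POLL_INTERVALS[d.tier]]


def average_requests_per_minute(tier_counts):
    """Average health-check rate implied by a mix of tiers.

    tier_counts maps tier name -> number of devices in that tier.
    """
    return sum(count * 60 / POLL_INTERVALS[tier] for tier, count in tier_counts.items())
```

Calling `due_for_check` from a loop that wakes up every few seconds is enough to spread checks out, and `average_requests_per_minute` gives a quick estimate of the load a given tier mix would put on the API.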

dorothy_tech · September 27, 2025, 10:41pm

SNMP polling has served us well for 800 industrial devices. The key benefit is that SNMP is lightweight and doesn’t require application-layer processing on devices - most embedded systems support SNMP natively. We poll standard MIBs (sysUpTime, ifOperStatus) every 5 minutes and get reliable health indicators with minimal device CPU overhead. However, SNMP doesn’t integrate well with cloud-native monitoring tools, so we had to build custom bridges to feed SNMP data into our monitoring dashboard.
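
To illustrate what we do with those two MIB objects once they are polled, here is a rough sketch. It assumes the values have already been fetched by whatever SNMP library you use (the fetch itself is not shown), and the function names are hypothetical. The OIDs and the ifOperStatus codes are the standard ones from SNMPv2-MIB and IF-MIB.

```python
# Standard OIDs for the two objects mentioned above.
SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"         # sysUpTime.0 (SNMPv2-MIB)
IF_OPER_STATUS_OID = "1.3.6.1.2.1.2.2.1.8"   # ifOperStatus, append .<ifIndex> (IF-MIB)

# ifOperStatus enumeration as defined in IF-MIB.
IF_OPER_STATUS = {
    1: "up",
    2: "down",
    3: "testing",
    4: "unknown",
    5: "dormant",
    6: "notPresent",
    7: "lowerLayerDown",
}


def uptime_seconds(timeticks):
    """sysUpTime is reported in TimeTicks, i.e. hundredths of a second."""
    return timeticks / 100


def rebooted_since(previous_ticks, current_ticks):
    """A sysUpTime that went backwards suggests a restart between polls.

    Caveat: the 32-bit counter also wraps after roughly 497 days of uptime.
    """
    return current_ticks < previous_ticks


def interface_health(oper_status):
    """Map a raw ifOperStatus integer to a simple health record."""
    name = IF_OPER_STATUS.get(oper_status, "unknown")
    return {"status": name, "healthy": name == "up"}
```

Comparing sysUpTime between consecutive polls is a cheap way to catch unexpected reboots without any agent changes on the device.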

betty_expert · October 23, 2025, 6:07am

Great insights from everyone. Based on the discussion, I’m leaning toward the hybrid approach helenadmin suggested. For our deployment profile (a mix of industrial gateways and edge sensors), SNMP makes sense for infrastructure health while REST APIs provide application-layer visibility. The key challenge is orchestrating both monitoring streams into a unified view. Has anyone implemented tooling to correlate SNMP and REST API health data?

raymond_cloud · October 28, 2025, 8:01pm

We built a monitoring aggregator service that collects both SNMP traps and REST API health responses, normalizes them into a common schema, and publishes to our central monitoring platform. The aggregator runs on Kubernetes and scales horizontally based on device count. For correlation, we use device ID as the common key across both data sources. The normalized health events feed into our alerting rules engine, which triggers alerts based on combined SNMP and REST health indicators.
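
A stripped-down sketch of the normalize-and-correlate step might look like the following. This is an illustration under assumptions, not our actual service: the `HealthEvent` schema, the `deviceId` and `status` field names in the REST payload, the shape of the SNMP input, and the combined state labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class HealthEvent:
    """Hypothetical common schema shared by both data sources."""
    device_id: str
    source: str   # "snmp" or "rest"
    healthy: bool
    metrics: dict[str, Any] = field(default_factory=dict)


def from_snmp(device_id, values):
    """values: assumed dict of already-fetched SNMP objects, e.g. {"ifOperStatus": 1}."""
    return HealthEvent(device_id, "snmp", values.get("ifOperStatus") == 1, dict(values))


def from_rest(payload):
    """payload: assumed JSON body carrying "deviceId" and "status" fields."""
    metrics = {k: v for k, v in payload.items() if k not in ("deviceId", "status")}
    return HealthEvent(payload["deviceId"], "rest", payload.get("status") == "OK", metrics)


def correlate(events):
    """Group events by device ID and derive one combined state per device."""
    by_device = {}
    for event in events:
        by_device.setdefault(event.device_id, {})[event.source] = event

    states = {}
    for device_id, sources in by_device.items():
        snmp, rest = sources.get("snmp"), sources.get("rest")
        if snmp is not None and not snmp.healthy:
            states[device_id] = "infrastructure_fault"
        elif rest is not None and not rest.healthy:
            states[device_id] = "application_fault"
        elif snmp is not None and rest is not None:
            states[device_id] = "healthy"
        else:
            states[device_id] = "partial_data"  # only one source has reported
    return states
```

Checking the SNMP side first means an application-level error on a device whose network link is already down is reported as an infrastructure fault rather than as two separate alerts.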

Some practical considerations from our implementation:

SNMP Polling: We use 5-minute intervals for infrastructure metrics (CPU, memory, uptime). SNMP is great for detecting low-level issues like resource exhaustion or network connectivity problems. The overhead is minimal - our SNMP polling infrastructure handles 1500 devices with a single monitoring server consuming less than 2 CPU cores.

REST API Health Checks: We poll application health every 10 minutes for standard devices, 2 minutes for critical devices. The REST responses include custom health indicators like message queue depth, last successful telemetry upload timestamp, and device-specific error codes. This gives us application-layer visibility that SNMP can’t provide.

Scalability Trade-offs: The hybrid approach does add complexity, but it scales well. SNMP handles the high-frequency, low-overhead infrastructure monitoring, while REST APIs provide deep health insights at lower frequency. The key is using the right tool for each monitoring dimension rather than forcing one protocol to do everything.

For your 2000-device deployment, I’d recommend: SNMP for all devices (infrastructure health), REST API for critical devices and gateways (application health), and a monitoring aggregator to unify the data streams. This balances overhead, richness, and scalability effectively.
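
If it helps, that recommendation can be expressed as a small rule that generates the monitoring plan from a device inventory. This is only a sketch: the inventory fields (`device_id`, `kind`, `critical`) and the plan format are assumptions, and the intervals are the ones quoted above (5 minutes for SNMP, 2 or 10 minutes for REST).

```python
SNMP_INTERVAL_S = 300           # 5-minute infrastructure polling
REST_CRITICAL_INTERVAL_S = 120  # 2 minutes for critical devices
REST_STANDARD_INTERVAL_S = 600  # 10 minutes otherwise


def build_monitoring_plan(inventory):
    """Assign checks per device: SNMP for all, REST for critical devices and gateways.

    inventory: list of dicts with hypothetical keys "device_id", "kind", "critical".
    """
    plan = []
    for device in inventory:
        checks = [{"protocol": "snmp", "interval_s": SNMP_INTERVAL_S}]
        is_critical = device.get("critical", False)
        if is_critical or device.get("kind") == "gateway":
            interval = REST_CRITICAL_INTERVAL_S if is_critical else REST_STANDARD_INTERVAL_S
            checks.append({"protocol": "rest", "interval_s": interval})
        plan.append({"device_id": device["device_id"], "checks": checks})
    return plan
```

Generating the plan from inventory data rather than maintaining it by hand keeps the two monitoring streams consistent as devices are added.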

ryanninja · October 16, 2025, 9:13pm

Scalability-wise, REST API health checks can become a bottleneck as you approach 2000+ devices. Each health check is an HTTP request with a TLS handshake, authentication, and JSON parsing overhead. At 2000 devices with 5-minute polling, that’s a sustained load of 400 requests per minute. Make sure your API gateway and backend services can handle this. We had to implement connection pooling and request batching to avoid overwhelming the platform. SNMP scales better from a protocol perspective but lacks the semantic richness you need for IoT-specific health metrics.
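
The arithmetic behind that figure, plus the effect of batching, is easy to sketch. The helper names below are made up, and whether your platform accepts several device IDs in one health request is an assumption you would need to verify.

```python
import math


def sustained_rpm(device_count, interval_minutes):
    """One request per device per polling interval, averaged per minute."""
    return device_count / interval_minutes


def batched_rpm(device_count, interval_minutes, batch_size):
    """Request rate if each request covers up to batch_size devices."""
    return math.ceil(device_count / batch_size) / interval_minutes


def chunked(device_ids, batch_size):
    """Split a list of device IDs into consecutive batches."""
    for start in range(0, len(device_ids), batch_size):
        yield device_ids[start:start + batch_size]
```

With 2000 devices on a 5-minute interval, `sustained_rpm(2000, 5)` gives 400.0, while batching 50 devices per request brings it down to 8.0 requests per minute.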

helenadmin · October 10, 2025, 6:33am

Consider a hybrid approach. Use SNMP for basic connectivity and resource monitoring (CPU, memory, network stats) because it’s efficient and standardized. Layer REST API health checks on top for application-specific health metrics (queue depths, error rates, business logic status). This gives you the efficiency of SNMP for infrastructure monitoring plus the richness of REST APIs for application health. The trade-off is increased monitoring complexity, but it scales well if you automate the monitoring configuration.
