I’m designing an alerting architecture for a manufacturing facility with ThingWorx 9.7 and wanted to get the community’s perspective on edge vs cloud alerting strategies. We have Edge Gateways deployed on the factory floor connected to a cloud ThingWorx instance, and I’m trying to decide where alert evaluation and notification should happen.
Edge gateway alerting offers obvious latency advantages: alerts can be generated locally in milliseconds, without a round trip to the cloud. This is critical for safety-related alerts where every second counts. However, edge gateways have limited processing power and storage, and managing alert rules across dozens of distributed gateways becomes an operational challenge.
Cloud alerting centralizes all alert logic and makes management much simpler. We can correlate data from multiple facilities and apply sophisticated analytics. But there’s inherent latency in sending data to the cloud, and we lose resilience to connectivity loss: if the network goes down, cloud alerting stops working entirely while edge alerting continues independently.
What’s the industry consensus on this trade-off? Are hybrid approaches practical where critical alerts run on edge and analytical alerts run in cloud? How do others handle the alert rule synchronization challenge in distributed edge deployments?
The latency argument for edge alerting is often overstated in my experience. With modern cloud infrastructure and proper network design, you can get data from edge to cloud and trigger alerts in under 500ms consistently. Unless you’re in a true safety-critical scenario requiring <100ms response, cloud alerting is usually sufficient and far easier to manage. The operational overhead of maintaining alert rules on distributed edge gateways is significant: every rule update has to be deployed to multiple locations.
I’ve worked with both pure cloud and pure edge alerting deployments, and now advocate strongly for hybrid. The decision matrix we use: edge alerting for anything requiring <1 second response, anything that must work during network outages, or anything driving local automation; cloud alerting for cross-facility correlation, predictive analytics requiring ML models, or alerts requiring integration with enterprise systems. The synchronization challenge is manageable with proper DevOps practices: treat alert rules as code, keep them under version control, and roll them out through automated deployment pipelines.
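As a rough illustration of that decision matrix, here is a minimal Python sketch. The `AlertRequirements` fields and the `place_alert` function are hypothetical names made up for this example, not ThingWorx types or APIs. Two behaviours are assumptions rather than part of the matrix above: an alert that matches both columns is placed on both tiers, and an alert that matches neither defaults to cloud for easier management.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    EDGE = "edge"
    CLOUD = "cloud"


@dataclass(frozen=True)
class AlertRequirements:
    """Hypothetical description of what one alert needs (not a ThingWorx type)."""
    max_response_seconds: float
    must_survive_wan_outage: bool = False
    drives_local_automation: bool = False
    needs_cross_facility_data: bool = False
    needs_ml_model: bool = False
    needs_enterprise_integration: bool = False


def place_alert(req: AlertRequirements) -> set[Tier]:
    """Return the tier(s) where an alert with these requirements should be evaluated."""
    tiers: set[Tier] = set()

    # Edge column: sub-second response, outage tolerance, or local automation.
    if (
        req.max_response_seconds < 1.0
        or req.must_survive_wan_outage
        or req.drives_local_automation
    ):
        tiers.add(Tier.EDGE)

    # Cloud column: cross-facility correlation, ML models, enterprise integration.
    if (
        req.needs_cross_facility_data
        or req.needs_ml_model
        or req.needs_enterprise_integration
    ):
        tiers.add(Tier.CLOUD)

    # Assumption: with no hard requirement either way, prefer the tier
    # that is easier to manage centrally.
    if not tiers:
        tiers.add(Tier.CLOUD)
    return tiers
```

For example, an overtemperature interlock with `max_response_seconds=0.2` lands on edge only, while a bearing-wear prediction with `needs_ml_model=True` and a 60-second budget lands on cloud only. Keeping the matrix in one function like this also makes it something you can version and review alongside the rules themselves.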
The 500ms latency you mention might be acceptable for many use cases, but our concern is more about resilience to connectivity loss. If the WAN link to cloud goes down (which happens several times per year in our industrial environment), cloud alerting becomes completely unavailable. Edge gateways can continue monitoring and alerting locally even during extended network outages. How do others handle this connectivity reliability issue?
Connectivity resilience is the real differentiator in my view. We implemented edge alerting specifically because of unreliable WAN connections at remote sites. The edge gateways buffer alerts locally during outages and forward them to the cloud once connectivity is restored. For immediate response, edge handles everything. Cloud provides the historical analysis and cross-site correlation after the fact. The management overhead is real, but we’ve automated rule deployment using ThingWorx’s remote management capabilities to push updates to all edge gateways from the central cloud instance.
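The buffer-and-forward behaviour described here can be sketched generically in Python. This is not how ThingWorx edge components implement it; `AlertBuffer` and the `send` callback are hypothetical, and the sketch assumes the gateway has local disk for a SQLite file (`:memory:` is used as the default only so the example runs anywhere).

```python
import json
import sqlite3
import time
from typing import Callable


class AlertBuffer:
    """FIFO queue of alerts held locally until the cloud acknowledges them."""

    def __init__(self, path: str = ":memory:") -> None:
        # On a real gateway, pass a file path so alerts survive a restart.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, raised_at REAL, payload TEXT)"
        )
        self._db.commit()

    def record(self, alert: dict) -> None:
        """Store an alert locally; called whether or not the WAN link is up."""
        self._db.execute(
            "INSERT INTO pending (raised_at, payload) VALUES (?, ?)",
            (time.time(), json.dumps(alert)),
        )
        self._db.commit()

    def pending_count(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Forward buffered alerts oldest-first; stop at the first failed send.

        `send` is a hypothetical uplink function that returns True once the
        cloud has accepted the alert. Returns the number of alerts forwarded.
        """
        rows = self._db.execute(
            "SELECT id, payload FROM pending ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            if not send(json.loads(payload)):
                break  # link still down; keep this and all later alerts
            # Delete only after a confirmed send, so nothing is lost.
            self._db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            self._db.commit()
            sent += 1
        return sent
```

Because a row is deleted only after `send` succeeds, a crash between the two steps would resend that alert on the next flush. That makes delivery at-least-once, so the cloud side should tolerate duplicates. Local notification does not depend on the buffer at all: the gateway acts on the alert immediately and calls `record` only for later forwarding.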