Edge gateway management: comparing maintenance overhead for centralized vs local management strategies

We manage 50+ edge gateways across multiple factory locations with ThingWorx 9.7. We are evaluating whether to continue with centralized management (all configuration and updates pushed from the central ThingWorx instance) or move to more autonomous local management, where gateways handle their own configuration and updates.

Centralized management gives us control but creates bottlenecks: every configuration change requires VPN connectivity and a manual deployment. Local management could reduce that overhead but raises concerns about consistency and monitoring.

What are your experiences with edge gateway management strategies? Specifically interested in maintenance overhead comparison, upgrade automation approaches, and monitoring strategies that work at scale. How do you balance control versus operational efficiency?

Don’t underestimate the importance of local resilience. Our gateways cache configuration locally and can operate for 7+ days without central connectivity. They queue telemetry data and sync when the connection is restored. This autonomous capability was crucial during network outages: production continued uninterrupted, whereas purely centralized management would have failed completely.
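
To make the queue-and-sync idea concrete, here is a minimal Python sketch of a store-and-forward buffer. It is not ThingWorx code: `TelemetryBuffer`, the `send` callable, and the `max_items` limit are all hypothetical names for illustration, and a real gateway would persist the queue to disk so it survives a reboot.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict

Reading = Dict[str, Any]


class TelemetryBuffer:
    """Hypothetical store-and-forward buffer for gateway telemetry."""

    def __init__(self, send: Callable[[Reading], None], max_items: int = 100_000):
        # `send` is assumed to raise ConnectionError while the uplink is down.
        self._send = send
        # Bounded queue: once full, the oldest readings are dropped first.
        self._queue: Deque[Reading] = deque(maxlen=max_items)

    def record(self, reading: Reading) -> None:
        """Always queue locally, whether or not the uplink is available."""
        self._queue.append(reading)

    def pending(self) -> int:
        return len(self._queue)

    def flush(self) -> int:
        """Send queued readings oldest-first; stop at the first failure."""
        sent = 0
        while self._queue:
            try:
                self._send(self._queue[0])
            except ConnectionError:
                break  # still offline, keep the remaining readings queued
            self._queue.popleft()  # remove only after a successful send
            sent += 1
        return sent
```

Calling `flush()` on a timer means the gateway never needs to know why the link was down; it simply drains the backlog once sends start succeeding again.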

We started with centralized management, and it became unsustainable at around 30 gateways. We switched to a hybrid approach: centralized policy definition but local execution. Gateways pull configuration from the central ThingWorx instance on a schedule, apply it locally, and report status back. This eliminated the VPN dependency for routine operations while maintaining central visibility.
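
The pull, apply, report cycle can be sketched roughly as follows. This is an illustration under stated assumptions, not the ThingWorx API: `fetch`, `apply`, and `report` are hypothetical callables the gateway would supply, and the configuration is assumed to be a dict carrying a `"version"` key.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Config = Dict[str, Any]


@dataclass
class SyncResult:
    version: Optional[str]       # version active after this cycle
    changed: bool                # True if a new config was applied
    error: Optional[str] = None  # set when fetch or apply failed


def sync_once(
    current_version: Optional[str],
    fetch: Callable[[], Config],           # hypothetical: pull desired config
    apply: Callable[[Config], None],       # hypothetical: apply it locally
    report: Callable[[SyncResult], None],  # hypothetical: report status back
) -> SyncResult:
    """Run one pull/apply/report cycle; call this from a scheduler."""
    try:
        desired = fetch()
    except ConnectionError as exc:
        # Central is unreachable: keep running on the current config.
        return SyncResult(current_version, False, f"fetch failed: {exc}")

    if desired["version"] == current_version:
        result = SyncResult(current_version, False)
    else:
        try:
            apply(desired)
            result = SyncResult(desired["version"], True)
        except Exception as exc:
            # Apply failed: stay on the old version and surface the error.
            result = SyncResult(current_version, False, f"apply failed: {exc}")

    try:
        report(result)
    except ConnectionError:
        pass  # reporting is best-effort; the next cycle reports again
    return result
```

Because the gateway initiates every connection, nothing here needs inbound VPN access, and comparing versions keeps repeated cycles idempotent.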

For monitoring at scale, we implemented a health check framework in which each gateway reports metrics to ThingWorx every 5 minutes: CPU, memory, disk space, service status, last successful data sync, and configuration version. A central dashboard shows the health status of all gateways with drill-down capability. Alerts trigger when a gateway misses 2 consecutive health check reports or reports a degraded status. This gives visibility without requiring constant connectivity.
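
The alert rule on the central side could look something like the Python sketch below. The 5-minute interval and the 2-missed-reports threshold come from the description above; the 30-second grace period, the function names, and the `"degraded"` status string are assumptions made for illustration.

```python
from datetime import datetime, timedelta

REPORT_INTERVAL = timedelta(minutes=5)  # reporting period from the post
MISSED_THRESHOLD = 2                    # consecutive misses before alerting
GRACE = timedelta(seconds=30)           # assumed slack for network jitter


def missed_reports(last_seen: datetime, now: datetime) -> int:
    """Count whole reporting intervals that passed without a report."""
    overdue = now - last_seen - GRACE
    # timedelta // timedelta yields an int; clamp negatives to zero.
    return max(0, overdue // REPORT_INTERVAL)


def should_alert(last_seen: datetime, status: str, now: datetime) -> bool:
    """Alert on a degraded status or on too many missed reports."""
    if status == "degraded":
        return True
    return missed_reports(last_seen, now) >= MISSED_THRESHOLD
```

With these values, a healthy gateway that goes silent triggers an alert roughly 10.5 minutes after its last report, which tolerates one dropped report without paging anyone.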