We migrated our ERP system to Azure and users are reporting occasional slowness during peak hours. Initial investigation suggests network latency between application servers and SQL databases might be the culprit, but we need better visibility.
What tools and metrics do you use to monitor network latency impact on ERP performance? Looking for recommendations on Azure Monitor setup, Network Watcher capabilities, and any third-party tools that provide deep insights. Particularly interested in identifying latency sources - is it inter-VNet communication, ExpressRoute, or database connection overhead?
Azure Monitor is essential, but you need the right metrics. For VMs, track Network In Total and Network Out Total, plus network bytes/sec. For SQL Database, track connection latency, worker percentage, and DTU percentage. Create correlation queries in Log Analytics to identify when high network latency coincides with performance degradation. We use Kusto queries to join network metrics with application logs and spot patterns. One insight: our latency spikes correlated with SQL DTU exhaustion, not network issues.
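To illustrate the correlation idea outside of Log Analytics, here is a minimal Python sketch. It assumes you have already exported per-minute samples aligned on timestamp. The series names and values are made up for illustration and are not real Azure metric output.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)


# Hypothetical per-minute samples, already aligned on timestamp.
response_ms = [210, 220, 640, 700, 230, 215]   # ERP response time
network_ms = [9, 10, 11, 9, 10, 9]             # app-to-database network latency
dtu_percent = [35, 40, 96, 99, 42, 38]         # SQL DTU consumption

print("response vs network:", round(pearson(response_ms, network_ms), 2))
print("response vs DTU:    ", round(pearson(response_ms, dtu_percent), 2))
```

If the slow periods track DTU far more closely than network latency, as they do in this made-up data, that points toward the database rather than the network.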
Start with Network Watcher Connection Monitor. Create connection tests between your app servers and SQL databases. It measures latency, packet loss, and path topology every 60 seconds. You can set up alerts when latency exceeds thresholds. We discovered our ExpressRoute circuit was adding 15ms during peak hours due to bandwidth saturation - Connection Monitor graphs made it obvious.
Don’t overlook Application Insights if your ERP has custom components. We instrumented our ERP APIs with the App Insights SDK and track dependency calls to SQL. The dependency tracking shows the exact latency breakdown: network time vs. SQL execution time vs. application processing. This granularity helped us identify that 70% of our slowness came from inefficient SQL queries, not network latency. Network was only contributing 5-10ms, while queries took 500ms+.
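As a rough sketch of that breakdown arithmetic in Python: this assumes you can obtain three timings per call (end-to-end, network, and SQL execution). The function name and the 600 ms total are hypothetical, and only the 8 ms and 500 ms figures echo the numbers above.

```python
def latency_breakdown(total_ms, network_ms, sql_ms):
    """Split one dependency call into network, SQL, and application shares (percent)."""
    app_ms = total_ms - network_ms - sql_ms  # whatever is left is application processing
    if total_ms <= 0 or network_ms < 0 or sql_ms < 0 or app_ms < 0:
        raise ValueError("timings must be non-negative and sum to at most a positive total")
    parts = {"network": network_ms, "sql": sql_ms, "application": app_ms}
    return {name: round(100 * ms / total_ms, 1) for name, ms in parts.items()}


# Hypothetical call: 8 ms on the network, 500 ms in the query, 600 ms end to end.
shares = latency_breakdown(total_ms=600, network_ms=8, sql_ms=500)
for name, percent in shares.items():
    print(f"{name}: {percent}%")
```

With numbers like these, the network share is a rounding error next to query time, which matches what we saw.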
For ExpressRoute monitoring, enable circuit metrics in Azure Monitor. Track: BitsInPerSecond, BitsOutPerSecond, ArpAvailability, BgpAvailability. Set up alerts when circuit utilization exceeds 70% - that’s when you start seeing latency increases. We also use Network Performance Monitor (part of Azure Monitor) to measure hop-by-hop latency from on-premises to Azure. It revealed our ISP was adding unexpected latency during certain hours.
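The 70% rule can be sketched in Python like this. It assumes you already pull BitsInPerSecond and BitsOutPerSecond values from Azure Monitor and know the provisioned circuit bandwidth. The function names and the 1 Gbps circuit are illustrative assumptions.

```python
def circuit_utilization(bits_per_second, circuit_mbps):
    """Utilization of one direction as a percentage of provisioned bandwidth."""
    if circuit_mbps <= 0:
        raise ValueError("circuit bandwidth must be positive")
    return 100.0 * bits_per_second / (circuit_mbps * 1_000_000)


def should_alert(bits_in, bits_out, circuit_mbps, threshold_percent=70.0):
    """Alert when either direction exceeds the utilization threshold."""
    peak = max(
        circuit_utilization(bits_in, circuit_mbps),
        circuit_utilization(bits_out, circuit_mbps),
    )
    return peak > threshold_percent


# Hypothetical 1 Gbps circuit with 760 Mbps inbound and 310 Mbps outbound.
print(should_alert(bits_in=760_000_000, bits_out=310_000_000, circuit_mbps=1000))
```

Checking each direction separately matters because a circuit can be saturated inbound while outbound looks healthy.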
Good suggestions. We’ve enabled Connection Monitor and are seeing 8-12ms baseline latency between app and database VNets. During peak hours it jumps to 25-35ms. We need to dig deeper into what’s causing those spikes. Could it be NSG processing or firewall inspection overhead?
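One way to narrow this down is to timestamp the spikes first and then line them up against firewall or flow logs for the same minutes. A minimal Python sketch, assuming latency samples exported as (timestamp, milliseconds) pairs. The sample values and the 2x-median cutoff are illustrative choices, not Connection Monitor defaults.

```python
from statistics import median


def find_spikes(samples, factor=2.0):
    """Return (timestamp, latency_ms) pairs whose latency exceeds factor x the median."""
    baseline = median(latency for _, latency in samples)
    return [(ts, latency) for ts, latency in samples if latency > factor * baseline]


# Hypothetical samples in the ranges described above.
samples = [
    ("09:00", 9), ("09:01", 11), ("09:02", 28), ("09:03", 10),
    ("09:04", 33), ("09:05", 12), ("09:06", 8),
]

for ts, latency in find_spikes(samples):
    print(f"{ts}: {latency} ms")
```

The median is used as the baseline so that the spikes themselves do not drag the reference value upward the way a mean would.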