Azure Analytics data gateway times out when connecting to on-premises SQL Server for reporting

We’re experiencing timeout issues with our Azure Data Gateway when trying to connect to our on-premises SQL Server for reporting. The gateway was working fine until last week, but now connections consistently fail after 30 seconds.

Our setup: Gateway version 3000.123.4, SQL Server 2017 on local network, Azure Analytics workspace in East US region. The gateway configuration shows as “Online” in the portal, but queries never complete.

Error from gateway logs:


Connection timeout expired. The timeout period elapsed
during attempt to consume pre-login handshake response.
at System.Data.SqlClient.SqlConnection.OnError()

We’ve checked SQL Server connectivity from the gateway machine using SSMS and that works perfectly. Network latency tests show ~45ms average to our SQL Server. Is there a specific gateway configuration setting we’re missing for handling network latency?

Before going the ExpressRoute route (expensive!), try these diagnostics: Run continuous ping tests from the gateway machine to SQL Server during a failure window to capture exact latency patterns. Also check if your corporate network has any proxy or packet inspection that might be interfering. We had a similar case where DPI was adding 150ms+ to SQL connections specifically.
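
To make the continuous latency test concrete, here is a minimal Python sketch that times a bare TCP connect to the SQL Server port (a rough proxy for the pre-login network cost) and flags spikes. The host name and the 150 ms threshold are placeholders; substitute your own values:

```python
import socket
import time

# Placeholders: substitute your SQL Server host; 1433 is the default port.
HOST = "sqlserver.example.local"
PORT = 1433
SPIKE_MS = 150.0  # flag connects slower than this threshold


def connect_latency_ms(host, port, timeout=5.0):
    """Time a bare TCP connect, a rough proxy for pre-login network cost."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.monotonic() - start) * 1000.0


def monitor(samples=60, interval_s=5.0):
    """Sample connect latency repeatedly and print any spikes."""
    for _ in range(samples):
        try:
            ms = connect_latency_ms(HOST, PORT)
            flag = "  <-- SPIKE" if ms > SPIKE_MS else ""
            print(f"{time.strftime('%H:%M:%S')} connect={ms:.1f} ms{flag}")
        except OSError as exc:
            print(f"{time.strftime('%H:%M:%S')} FAILED: {exc}")
        time.sleep(interval_s)
```

Run `monitor()` from the gateway machine during a failure window; connect times that jump past the threshold should line up with the failures in the gateway logs.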

Thanks for the quick response. Checked firewall rules - port 1433 is open and the gateway service has explicit allow rules in Windows Firewall. We also added the gateway service account to SQL Server logins with appropriate permissions. Still getting the same timeout after exactly 30 seconds. Could this be related to the gateway’s connection pool settings or some timeout configuration in Azure Analytics itself?

The latency spikes are definitely a red flag. 200ms bursts during the pre-login handshake can cause failures even with generous timeouts because SQL Server’s TDS protocol is sensitive to latency during authentication. Have you considered setting up a site-to-site VPN or ExpressRoute to stabilize the connection? Also, check if you have any QoS policies that might be throttling the gateway traffic during peak hours.

Based on your symptoms, this is a multi-factor issue involving gateway configuration, network latency, and SQL Server connectivity. Here’s a comprehensive solution:

1. Gateway Configuration Optimization: Edit the gateway configuration file (Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config):

  • Set ConnectionTimeout to 120 seconds minimum
  • Add CommandTimeout property set to 180 seconds
  • Enable connection pooling: MinPoolSize=5, MaxPoolSize=100
  • Set ConnectRetryCount=3 and ConnectRetryInterval=10
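
The gateway configuration file is a standard .NET app.config, so entries of the following shape are what to look for. The key names below simply mirror the bullets above and are illustrative, since the exact keys can vary by gateway version; back up the file before editing:

```xml
<!-- Illustrative sketch only: key names follow the bullets above and
     may differ by gateway version. Back up the file before editing. -->
<configuration>
  <appSettings>
    <add key="ConnectionTimeout" value="120" />  <!-- seconds -->
    <add key="CommandTimeout" value="180" />     <!-- seconds -->
  </appSettings>
</configuration>
```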

2. Network Latency Mitigation: Your 200ms latency spikes are the root cause. Implement these fixes:

  • Configure QoS on your network to prioritize SQL traffic (port 1433) with DSCP marking
  • If possible, implement Azure ExpressRoute or site-to-site VPN for consistent low-latency connectivity
  • As a temporary measure, schedule heavy reporting queries during off-peak hours when latency is stable

3. SQL Server Connectivity Hardening: On the SQL Server side:


-- Increase remote query timeout
EXEC sp_configure 'remote query timeout', 300;
RECONFIGURE;

  • Enable TCP/IP protocol and disable named pipes to reduce handshake overhead
  • Set the SQL Server network packet size to 8192 bytes (larger packets often help on higher-latency links)
  • Configure connection resiliency in your connection strings: ConnectRetryCount=3
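
The pooling, retry, and packet-size values above all map onto standard SqlClient connection-string keywords; a combined example looks like this (server and database names are placeholders):

```text
Server=tcp:sqlserver.example.local,1433;Database=ReportingDB;
Integrated Security=true;Connect Timeout=90;ConnectRetryCount=3;
ConnectRetryInterval=10;Packet Size=8192;Min Pool Size=5;Max Pool Size=100
```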

4. Gateway Service Account: Verify the service account has these SQL permissions:


GRANT CONNECT SQL TO [DOMAIN\GatewayServiceAccount];
GRANT VIEW SERVER STATE TO [DOMAIN\GatewayServiceAccount];

5. Monitoring and Validation:

  • Capture TDS handshake packets on the gateway machine during failures with a packet-capture tool such as Network Monitor or Wireshark
  • Enable gateway verbose logging: Set TraceLevel to “Verbose” in config
  • Monitor Windows Performance Counters: “Network Interface\Bytes Total/sec” during query execution

The combination of increased timeouts, connection retry logic, and network QoS should resolve your timeouts. If latency spikes persist above 150ms regularly, ExpressRoute becomes necessary for production reliability. Test each change incrementally and monitor the gateway logs for improvements.
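
On the retry point specifically, the pattern that ConnectRetryCount/ConnectRetryInterval aim for can be sketched as a generic wrapper. This is a Python illustration with hypothetical names, not the gateway's internal code:

```python
import random
import time


def connect_with_retry(open_conn, attempts=3, base_delay_s=2.0):
    """Retry a connection factory with exponential backoff and jitter.

    open_conn: zero-arg callable that returns a connection or raises.
    Generic sketch of the retry pattern discussed above, not the
    gateway's actual implementation.
    """
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return open_conn()
        except OSError as exc:
            last_exc = exc
            if attempt < attempts:
                delay = base_delay_s * (2 ** (attempt - 1))
                delay += random.uniform(0, delay / 2)  # jitter
                time.sleep(delay)
    raise last_exc
```

With `attempts=3` and a 2-second base delay, a transient handshake failure gets two more tries before the error surfaces to the caller.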

I’ve seen similar issues before. The 30-second timeout is often the default connection timeout in the gateway configuration file. Look in the Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config file for ConnectionTimeout settings. You might need to increase it to 60 or 90 seconds, especially since even 45ms of latency adds up over the multiple round trips of the connection handshake.

Found the config file and saw ConnectionTimeout was set to 30. Increased it to 90 seconds and restarted the gateway service, but still timing out. Interestingly, I noticed our network monitoring shows periodic spikes in latency up to 200ms during business hours. Could these latency spikes be causing the handshake to fail even with the increased timeout? The gateway logs don’t show any retry attempts.