I’m evaluating different approaches for connecting Tableau to our 50GB+ data warehouse and would love to hear real-world experiences. We’re currently using native SQL Server connector with live connections, but dashboard load times are becoming problematic (15-30 seconds for complex views).
I’m considering three options: 1) Continue with live connections but optimize queries, 2) Switch to extracts with scheduled refreshes, or 3) Use Tableau’s data connector with incremental refresh strategies.
What performance benchmarking have others done comparing connector vs direct query approaches? Specifically interested in how different refresh strategies impact dashboard speed for large datasets. What’s worked best in your environments?
Historical data corrections are the Achilles heel of incremental refresh. You basically have two options: either accept that corrections won’t appear until the next full refresh, or implement a hybrid approach where you track modified records separately. We added a ‘last_modified’ timestamp to our warehouse tables, then configured incremental refresh to pull both new records (based on date) AND recently modified records (based on last_modified). This catches most corrections without requiring full refreshes.
Performance benchmarking really depends on your query patterns. Live connections work well if your queries hit indexed columns and return small result sets. But with large datasets, the network latency and database processing time kill performance. We found that aggregated extracts (pre-aggregating at source) combined with Tableau’s data engine gave us 10x faster dashboard rendering. The downside is complexity in maintaining the aggregation logic and refresh schedules.
Don’t overlook the database-side optimizations. Before switching to extracts, make sure your live queries are properly optimized. We implemented materialized views for common aggregations, added covering indexes for Tableau’s typical query patterns, and enabled query result caching on SQL Server. This brought our live connection performance from 15 seconds down to 4-5 seconds without the complexity of managing extracts. Sometimes the connector vs direct query debate misses the fact that the underlying database performance is the real bottleneck.