Your query performance issues require systematic optimization across all three areas:
Query Optimization: Avoid SELECT * and specify only required columns. Use the SDK’s query builder for automatic optimization:
var query = storageClient.CreateQuery()
    .Select("deviceId", "timestamp", "temperature", "humidity")
    .Where("deviceId", deviceId)
    .WhereBetween("timestamp", startDate, endDate)
    .OrderBy("timestamp")
    .Limit(5000);
Indexing: Create composite indexes optimized for your query patterns. For time-range queries with device filtering:
CREATE INDEX idx_device_time
    ON telemetry (deviceId, timestamp DESC)
    INCLUDE (temperature, humidity);
Pagination: Implement cursor-based pagination for efficient large result set handling:
string continuationToken = null;
do {
    var result = await query.ExecuteAsync(continuationToken);
    ProcessBatch(result.Items);
    continuationToken = result.ContinuationToken;
} while (continuationToken != null);
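To show why a continuation token stays fast on later pages, here is a minimal in-memory sketch of keyset pagination, the technique such tokens are commonly built on. It is not the SDK's implementation: `Reading`, `Page`, and `KeysetPager` are hypothetical names, and the token is assumed to be simply the last timestamp returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for a telemetry row and a result page (not SDK types).
public record Reading(string DeviceId, DateTime Timestamp, double Temperature);
public record Page(IReadOnlyList<Reading> Items, DateTime? ContinuationToken);

public static class KeysetPager
{
    // Returns up to pageSize rows whose Timestamp is strictly after the token.
    // Assumes rows are sorted by Timestamp and timestamps are unique;
    // with duplicates, a tie-breaker column would be needed in the token.
    public static Page GetPage(IEnumerable<Reading> sortedRows, DateTime? token, int pageSize)
    {
        // In a database this filter becomes "WHERE timestamp > @token", which an
        // index on (deviceId, timestamp) can answer with a seek. Here it is a
        // plain linear filter that only simulates the result.
        var items = sortedRows
            .Where(r => token == null || r.Timestamp > token.Value)
            .Take(pageSize)
            .ToList();

        // A full page means more rows may remain; a short page ends the loop.
        DateTime? next = items.Count == pageSize ? (DateTime?)items[^1].Timestamp : null;
        return new Page(items, next);
    }
}
```

The key difference from OFFSET is that each request starts from a known key value instead of counting past all earlier rows, so the cost of a page does not grow with its position in the result set.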
Detailed implementation strategy:
First, verify that the index exists using the storage SDK’s metadata API or the database’s EXPLAIN plan. The plan should show “Index Seek” on idx_device_time, not “Table Scan”. If the index is missing, create a composite index on (deviceId, timestamp) with the frequently accessed fields as included columns. Because the index then covers the query, the engine does not need a key lookup after each seek.
Second, paginate in chunks of 5000 records. This reduces memory pressure and lets you process results progressively. Use continuation tokens instead of OFFSET-based pagination: OFFSET has to read and discard every skipped row, so later pages get progressively slower.
Third, optimize column selection. If you need 10 columns out of 50, listing them explicitly cuts the number of columns transferred by 80%; the actual byte savings depend on column widths.
Fourth, enable query result caching in the SDK for repeated queries:
var options = new QueryOptions {
    EnableCache = true,
    CacheDuration = TimeSpan.FromMinutes(5)
};
For 50M record tables, consider time-based partitioning (monthly or weekly). When a query targets a recent time range, partition pruning lets the engine skip every partition outside that range, which can exclude 90% or more of the data from the scan.
Also implement timeout handling and retry logic for long-running queries. Monitor query execution metrics via the SDK’s telemetry: track execution time, rows scanned vs. rows returned, and cache hit rates.
For your specific query pattern, properly indexed and paginated queries should complete in under 5 seconds for the first page and 2-3 seconds for subsequent pages. If performance doesn’t improve after indexing, check whether the table statistics are up to date (run ANALYZE on the table, or your database’s equivalent) and verify that the query planner is actually choosing the index. Consider read replicas if heavy concurrent query load is impacting write performance.
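The timeout and retry handling mentioned above could look roughly like the sketch below. It uses only standard .NET types; `QueryRetry` and `ExecuteWithRetryAsync` are hypothetical names, not part of the storage SDK, and the sketch assumes every failure is transient, which a real implementation should narrow down to specific exception types.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class QueryRetry
{
    // Runs `operation` with a per-attempt timeout and retries failed attempts
    // with exponential backoff. The timeout only takes effect if the operation
    // observes the CancellationToken it is given.
    public static async Task<T> ExecuteWithRetryAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan perAttemptTimeout,
        int maxAttempts = 3,
        TimeSpan? initialDelay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        var delay = initialDelay ?? TimeSpan.FromMilliseconds(500);

        for (int attempt = 1; ; attempt++)
        {
            // The token is cancelled automatically once the timeout elapses.
            using var cts = new CancellationTokenSource(perAttemptTimeout);
            try
            {
                return await operation(cts.Token);
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Assumption: the failure is transient. On the final attempt the
                // filter is false, so the exception propagates to the caller.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2); // double the wait each time
            }
        }
    }
}
```

Whether the SDK's `ExecuteAsync` accepts a cancellation token is not stated here; if it does not, the per-attempt timeout would have to be enforced through whatever timeout setting the SDK provides.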