Azure Monitor storage metrics show gaps in observability for blob access patterns and latency tracking

Azure Monitor isn’t capturing detailed blob access metrics for our storage accounts. We’ve enabled diagnostic settings and configured a Log Analytics workspace, but there are significant gaps in the observability data. Specifically, we’re missing per-blob access patterns, request latencies for individual operations, and caller IP information.

The diagnostic logs show aggregated metrics but don’t provide the granularity needed to troubleshoot performance issues or identify access anomalies. We’ve tried querying with KQL but the StorageBlobLogs table is incomplete:


StorageBlobLogs
| where TimeGenerated > ago(1h)
| where OperationName == "GetBlob"
| summarize count() by CallerIpAddress

This query returns sparse results even though we know there are thousands of blob reads happening. Application Insights integration isn’t helping either. What diagnostic configuration am I missing for complete storage observability?

Here’s the complete picture of Azure Storage observability and how to achieve full visibility:

Understanding Diagnostic Settings Limitations: Azure Storage diagnostic logs are intentionally sampled to balance cost and performance. The sampling rate varies:

  • Low-volume operations (<100 req/min): 100% captured
  • Medium-volume (100-1000 req/min): ~50% sampled
  • High-volume (>1000 req/min): 5-10% sampled

This is by design and cannot be disabled. Microsoft doesn’t expose the sampling configuration because it adjusts dynamically based on backend load.

Diagnostic Settings Configuration: Ensure you’ve enabled ALL log categories in diagnostic settings. The configuration should include:

{
  "logs": [
    {"category": "StorageRead", "enabled": true},
    {"category": "StorageWrite", "enabled": true},
    {"category": "StorageDelete", "enabled": true}
  ],
  "metrics": [{"category": "Transaction", "enabled": true}]
}
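Once the setting is in place, a quick way to confirm that all three categories are actually flowing into the workspace (Category is a standard column on the resource log tables) is:

```kql
// Verify StorageRead / StorageWrite / StorageDelete entries are all arriving
StorageBlobLogs
| where TimeGenerated > ago(1h)
| summarize Entries = count() by Category
```

If one category is missing from the results, re-check that specific toggle in the diagnostic setting before assuming sampling is the cause.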

Log Analytics KQL Query Optimization: Your KQL query needs to account for sampling and use the correct tables. StorageBlobLogs is correct, but you should query across multiple time ranges and use aggregation functions that work with sampled data:

// Pseudocode - Key implementation steps:

  1. Query StorageBlobLogs with an extended time range (4h minimum)
  2. Join with AzureMetrics for volume validation
  3. Use summarize with percentile functions for latency analysis
  4. Filter by StatusCode to separate errors from successful operations
  5. Cross-reference CallerIpAddress with known client IPs
  6. Account for the 10-15 minute ingestion lag in time filters
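The steps above translate into a concrete query along these lines (DurationMs and StatusCode are standard StorageBlobLogs columns; the 15-minute exclusion at the end of the window allows for ingestion lag — adjust thresholds to your environment):

```kql
// Latency percentiles and per-caller volumes over an extended window
StorageBlobLogs
| where TimeGenerated between (ago(4h) .. ago(15m))  // skip the most recent, possibly un-ingested, data
| where OperationName == "GetBlob"
| summarize
    Requests = count(),
    p50 = percentile(DurationMs, 50),
    p95 = percentile(DurationMs, 95),
    p99 = percentile(DurationMs, 99),
    Errors = countif(StatusCode >= 400)
  by CallerIpAddress
| order by Requests desc
```

Remember the request counts here are still subject to sampling; use them for relative comparison between callers, not absolute volume.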

Application Insights Integration: For complete observability, implement client-side telemetry:

  1. Add Azure Storage SDK telemetry to your application
  2. Configure Application Insights to capture dependency calls
  3. Use custom events for critical storage operations
  4. Enable distributed tracing to correlate app requests with storage calls
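Once dependency tracking is wired up, client-perceived blob latency becomes queryable on the Application Insights side. Assuming a workspace-based Application Insights resource (dependency telemetry lands in the AppDependencies table; the exact Type value emitted for blob calls can vary by SDK, so verify it in your data first):

```kql
// Client-perceived blob latency from dependency telemetry
AppDependencies
| where TimeGenerated > ago(1h)
| where Type == "Azure blob"   // check the actual Type string your SDK emits
| summarize
    p50 = percentile(DurationMs, 50),
    p95 = percentile(DurationMs, 95),
    Failures = countif(Success == false)
  by Target
```

Because this is recorded by your application, it is not subject to the storage-side sampling described above.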

This gives you 100% visibility from the client perspective, which is often more valuable than server-side logs for troubleshooting.

Hybrid Monitoring Strategy:

  • Storage Metrics: Use for capacity planning, availability trends, and aggregate throughput (always 100% accurate)
  • Diagnostic Logs: Use for error analysis, security auditing, and sampling-acceptable use cases
  • Application Insights: Use for end-to-end request tracing, client-side latency, and critical path monitoring
  • Custom Logging: Implement for specific high-value blobs or containers where you need complete audit trails

Addressing Your Specific Gaps:

  1. Per-blob access patterns: Not available in diagnostic logs due to sampling. Solution: Implement custom Event Grid triggers on blob operations that write to a dedicated tracking table in Log Analytics or Cosmos DB.

  2. Request latencies: Available but sampled. For accurate latency percentiles, use Application Insights dependency tracking which captures client-perceived latency (more relevant than server-side anyway).

  3. Caller IP information: This is logged but heavily sampled for high-volume operations. For security monitoring, enable Storage Analytics hour metrics which provide IP-level aggregates without sampling, or use Azure Firewall/NSG flow logs if IP tracking is security-critical.
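The Event Grid approach from item 1 can be sketched as follows: subscribe a small function to blob events (Event Grid emits event types such as Microsoft.Storage.BlobCreated and Microsoft.Storage.BlobDeleted) and have it write one row per event to a custom Log Analytics table. The table and column names below (BlobAccessAudit_CL, BlobPath_s) are hypothetical — custom log tables use the _CL/_s suffix convention, but you choose the names:

```kql
// Unsampled per-blob activity from a hypothetical custom Event Grid audit table
BlobAccessAudit_CL
| where TimeGenerated > ago(24h)
| summarize EventCount = count(), LastEvent = max(TimeGenerated) by BlobPath_s
| top 20 by EventCount desc
```

Note that Event Grid covers mutation events; for read-heavy audit trails you would still pair this with application-side telemetry.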

Practical Implementation: For your immediate needs, enable Storage Analytics hour and minute metrics in addition to diagnostic logs. These provide different data granularity and are not sampled. Access them via the Azure portal or query the classic metrics tables (e.g. $MetricsHourPrimaryTransactionsBlob) directly in Table storage.

The reality is Azure Storage observability requires a multi-tool approach. No single configuration gives you complete visibility due to the scale and cost implications of logging billions of operations. Design your monitoring strategy based on what you actually need to troubleshoot versus nice-to-have data.

We faced this exact issue last quarter. The sampling rate for storage logs varies based on operation type and volume. High-frequency operations like GetBlob on popular blobs are heavily sampled (sometimes down to 1-5% of actual requests). The solution is multi-layered: use Storage Analytics metrics for volume trends, enable detailed logging only for specific containers you need to monitor closely, and implement custom telemetry in your application for critical paths.

For KQL queries, you also need to account for the sampling rate in your calculations. Microsoft doesn’t document the exact sampling algorithm but it’s definitely not uniform across all operations.

The RBAC permissions were correct, but that’s a good point to verify. I’m going to implement the hybrid approach with Application Insights for critical operations and accept the sampled storage logs for general trends.

Azure Storage diagnostic logs are sampled by default for high-volume operations to manage costs and log volume. There’s no way to disable this sampling completely. For complete observability, you need to implement client-side logging in your application code using Application Insights SDK. This captures every operation from the application perspective before it hits Azure Storage.

Also, ensure you’re querying the correct time range - there can be 5-10 minute ingestion delays for storage logs in Log Analytics.

Check if you have the Storage Blob Data Reader role properly assigned to your Log Analytics workspace managed identity. Without correct RBAC permissions, diagnostic logs can fail silently and you won’t see errors - just missing data. This caught us out for weeks because the diagnostic setting showed as enabled but logs weren’t flowing.

Storage diagnostic settings have multiple log categories. You need to enable all three: StorageRead, StorageWrite, and StorageDelete. By default, only high-level metrics are captured. Also check your Log Analytics workspace retention settings - logs might be getting dropped if you’re hitting quota limits.