Let me provide a comprehensive solution covering all three key areas: alert rule configuration, the correct Usage metric query, and proper action group setup.
Alert Rule Configuration:
You need to create a log query alert rule, not a metric alert. In the Azure portal, go to your Log Analytics workspace > Alerts > New alert rule. The critical difference is that workspace capacity monitoring requires querying the Usage table directly.
For the alert query, use this KQL:
Usage
| where TimeGenerated > ago(1h)
| where IsBillable == true
| summarize TotalGB = sum(Quantity) / 1024
| extend DailyCap = 5.0
| extend PercentUsed = (TotalGB / DailyCap) * 100
| where PercentUsed > 90
This query sums billable ingestion over the last hour and expresses it as a percentage of your daily cap, so it only fires on a significant spike (more than 4.5 GB in a single hour against a 5 GB cap). The key is using IsBillable == true to exclude free data types from the calculation; the cumulative query later in this answer tracks steady progress toward the cap across the whole day.
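As a quick sanity check before wiring up the alert, you can compare billable and free ingestion side by side. This is a sketch using only standard Usage columns:

Usage
| where TimeGenerated > ago(1h)
| summarize VolumeGB = sum(Quantity) / 1024 by IsBillable

If the IsBillable == false row is large, that volume is not counting toward your cap and should not trigger the alert.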
Alert Rule Settings:
- Measurement: Table rows (not metric measurement)
- Aggregation granularity (Period): 1 hour
- Frequency of evaluation: Every 15 minutes
- Threshold: Greater than 0 (because the query filters for >90% in the KQL itself)
- Severity: Sev 1 (Critical)
Understanding the Usage Metric:
The Usage table updates approximately every hour, which is why your alert rule needs appropriate timing. The table contains these key columns:
- DataType: The type of data ingested (SecurityEvent, Perf, Syslog, etc.)
- Quantity: Volume in MB
- IsBillable: Boolean indicating if this data counts toward your cap
- TimeGenerated: When the usage record was created
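When the workspace is approaching its cap, it helps to know which data types are driving the volume. Here is a breakdown sketch using the columns above (adjust the time window to suit your investigation):

Usage
| where TimeGenerated > ago(24h)
| where IsBillable == true
| summarize IngestedGB = sum(Quantity) / 1024 by DataType
| sort by IngestedGB desc

The top rows tell you which sources (SecurityEvent, Perf, Syslog, etc.) to tune if you need to stay under the cap.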
The workspace’s daily cap resets once per day at the workspace’s reset hour (shown on the Daily cap page in the portal; it is not necessarily midnight UTC). To track how close you are to the cap, your alert should check cumulative ingestion since the start of the day — startofday(now()) uses UTC midnight, which is a reasonable approximation:
Usage
| where TimeGenerated > startofday(now())
| where IsBillable == true
| summarize TotalGB = sum(Quantity) / 1024
| extend PercentOfDailyCap = (TotalGB / 5.0) * 100
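For the alert rule itself, you can combine this cumulative calculation with the same >90% filter used earlier, so the Table rows / greater-than-0 settings still apply unchanged:

Usage
| where TimeGenerated > startofday(now())
| where IsBillable == true
| summarize TotalGB = sum(Quantity) / 1024
| extend PercentOfDailyCap = (TotalGB / 5.0) * 100
| where PercentOfDailyCap > 90

This version fires as the day's total crosses 90% of the cap, rather than only on a single-hour spike.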
Action Group Setup:
Verify your action group configuration:
- Action group name should be descriptive (e.g., “LogAnalytics-Capacity-Alerts”)
- Add multiple notification channels: Email, SMS, and consider adding a webhook to a ticketing system
- Test the action group with the “Test action group” feature in the portal, which sends test notifications through every configured channel
- Check that email addresses are correct and SMS numbers include country codes
- Review the fired-alert history (Monitor > Alerts) to see whether notifications were attempted but failed
Common Issues:
- Alert Suppression: If ingestion briefly exceeds the threshold and then drops back, the alert can fire and immediately resolve. Review the rule’s “Automatically resolve alerts” setting and consider a “Mute actions” period so repeated fires don’t flood your notification channels.
- Workspace Daily Cap: Verify Settings > Usage and estimated costs > Daily cap is set to the correct value (5 GB in your case). Be aware that once the cap is reached, ingestion of billable data stops for the rest of the day (certain security data types may be exempt, depending on your plan) — which is exactly why the 90% early warning matters.
- Action Group Rate Limiting: Azure throttles SMS to no more than 1 message per 5 minutes per phone number. If alerts fire frequently, notifications get throttled; add email notifications as a backup channel.
- RBAC Permissions: Log query alert rules run with the permissions captured when the rule was created, so ensure the creating user has at least read access to the workspace (e.g., Log Analytics Reader) plus rights to create alert rules (e.g., Monitoring Contributor). Insufficient permissions can cause silent query failures.
Monitoring Alert Effectiveness:
After implementing these changes, monitor the alert rule’s fire history:
- Go to Alerts > Alert rules > [Your rule] > History
- Verify “Fired” events appear when workspace ingestion exceeds 90%
- Check action group execution logs to confirm notifications were sent
- Review the Alert Processing Rules section to ensure no suppression rules are interfering
Implementing these configurations with the correct KQL query, appropriate evaluation frequency, and verified action group setup should resolve your alerting issues and provide early warning before hitting the daily cap.