Azure Monitor alerts not triggering for ML anomaly detection on ERP transaction data

We have Azure ML anomaly detection running on our ERP transaction data to identify unusual patterns that might indicate data quality issues or processing errors. The ML model outputs anomaly scores to a Log Analytics workspace every hour, and we’ve configured Azure Monitor alert rules to notify us when anomalies are detected. However, the alerts are not triggering even though I can see anomaly events in the logs when I manually query the workspace. The ML model is definitely detecting anomalies - when I run a KQL query for scores above our threshold of 0.85, I see multiple events from the past week that should have triggered alerts. Our alert rule is configured to check every 5 minutes and trigger when the anomaly score exceeds 0.85 in the last hour. I’ve verified the action group is configured correctly because test notifications work fine. Something seems wrong with either how the alert rule queries the ML anomaly detection output, or the Log Analytics query syntax isn’t matching the data format. This is critical because we’re missing anomaly notifications that could indicate serious ERP data issues.

Your query has no time filter, which matters for alert rules. The rule applies its own evaluation window from the rule settings, but an explicit filter using the ‘ago’ function keeps that window visible in the query and lets you reproduce the evaluation by hand. Try: ERPAnomalyScores_CL | where TimeGenerated > ago(1h) | where AnomalyScore_d > 0.85. Also, make sure your alert rule evaluation frequency (5 minutes) and lookback period (1 hour) are appropriate for data that arrives hourly. You might be checking between data arrivals.

Also check the Log Analytics workspace ingestion latency. Custom logs can have 5-15 minute ingestion delays. If your ML job writes at the top of each hour but data takes 10 minutes to appear in Log Analytics, and your alert looks back 1 hour, you might be missing the most recent data point. Use a 75-minute lookback window instead of 60 minutes to account for ingestion lag.

Yes, evaluation frequency should align with data arrival patterns. If your ML job writes hourly, set evaluation frequency to 60 minutes, not 5 minutes. For aggregation, you need to specify how to aggregate results within the lookback window. If you want to alert on ANY anomaly score above 0.85 in the past hour, use ‘count()’ aggregation with threshold > 0. If you want to alert on the maximum score exceeding 0.85, use ‘max(AnomalyScore_d)’ with threshold > 0.85.
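To see how the two aggregation options relate, here is a minimal Python sketch (plain Python, not KQL, and not a model of how Azure Monitor evaluates rules internally). It applies both conditions to hypothetical lists of scores from one lookback window. The sample values and function names are made up for illustration.

```python
THRESHOLD = 0.85  # threshold from the original question


def count_condition(scores, threshold=THRESHOLD):
    """Mimics counting rows above the threshold and alerting when count > 0."""
    return sum(1 for s in scores if s > threshold) > 0


def max_condition(scores, threshold=THRESHOLD):
    """Mimics taking the maximum score and alerting when it exceeds the threshold."""
    return bool(scores) and max(scores) > threshold


# Hypothetical windows of scores; 0.85 itself is not strictly above the threshold.
quiet_hour = [0.12, 0.40, 0.85]
noisy_hour = [0.12, 0.91, 0.97]

for name, scores in [("quiet", quiet_hour), ("noisy", noisy_hour)]:
    print(name, count_condition(scores), max_condition(scores))
```

Both conditions fire on the same windows. The practical difference is the value the alert reports: the number of anomalous rows or the highest score.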

When Azure Monitor alert rules fail to trigger despite data being visible in manual queries, it’s usually a query syntax issue or time aggregation mismatch. Can you share your alert rule query? Also, what’s the table name where your ML anomaly scores are being written? And are you using a custom log table or a standard Application Insights table?

Your Azure Monitor alert configuration issues stem from three common pitfalls: improper query structure, mismatched timing parameters, and lack of understanding of how alert rules evaluate ML anomaly detection output. Here’s the complete solution:

Azure Monitor Alert Rule Query Structure: Alert rules for Log Analytics require specific query patterns that differ from ad-hoc queries. Your original query ‘ERPAnomalyScores_CL | where AnomalyScore_d > 0.85’ lacks essential components. The correct query structure for anomaly detection alerts is:

ERPAnomalyScores_CL | where TimeGenerated > ago(75m) | where AnomalyScore_d > 0.85 | summarize AnomalyCount = count(), MaxScore = max(AnomalyScore_d) by TransactionType_s | where AnomalyCount > 0

Key elements: 1) The TimeGenerated filter with a 75-minute lookback covers the hourly data arrival plus a 15-minute ingestion latency buffer. 2) The summarize clause aggregates results, so the rule evaluates one row per transaction type instead of raw rows. 3) count() provides the number of anomalies detected. 4) Grouping by TransactionType_s lets you identify which transaction types are anomalous. Without summarization, the rule has only the raw row count to compare against the threshold, and measures such as MaxScore are not available.

ML Anomaly Detection Output Handling: Your ML model writes anomaly scores hourly to the custom log table, and the alert rule should follow the same cadence. In the alert rule configuration, set ‘Aggregation granularity (Period)’ to 60 minutes to match your data frequency. Set ‘Frequency of evaluation’ to 60 minutes as well: checking every 5 minutes runs 12 evaluations per hour, and 11 of them see no new data. This timing mismatch may be contributing to the missed alerts, because many evaluations run before new data has arrived. Additionally, ensure your ML output schema is consistent. When data is sent to a ‘_CL’ table through the HTTP Data Collector API, Log Analytics appends a type suffix to each field: ‘_d’ for numeric fields, ‘_s’ for strings, ‘_t’ for timestamps. Your AnomalyScore field becomes AnomalyScore_d. Verify that the field names in your alert query match the actual schema in Log Analytics.

Log Query Configuration and Ingestion Latency: The 75-minute lookback window is critical. Custom logs in Log Analytics have variable ingestion latency, typically 5-10 minutes, but it can spike to 15 minutes during high load. If your ML job completes at 14:00:00 and writes to Log Analytics, that data might not be queryable until 14:12:00. An alert rule with a 60-minute lookback evaluating at 14:05:00 would look for data from 13:05-14:05 and miss the 14:00 data point, which is still in the ingestion pipeline. With hourly evaluation, the next run at 15:05:00 covers 14:05-15:05, so the 14:00 record falls outside every window and is never evaluated. Use the ‘ingestion_time()’ function to debug this: ERPAnomalyScores_CL | extend IngestionDelay = ingestion_time() - TimeGenerated | summarize avg(IngestionDelay), max(IngestionDelay) by bin(TimeGenerated, 1h). This shows the actual ingestion delays for your data. Set your lookback window to 1.25x your data interval (75 minutes for hourly data) so that you capture all recent anomalies despite ingestion variability.
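The timing argument can be checked with a small simulation. This is a rough Python sketch under simplifying assumptions: evaluations run exactly on schedule, a record becomes queryable exactly `ingestion_delay` after its timestamp, and the window is `TimeGenerated > evaluation time - lookback`. The function name and the dates are hypothetical; the 14:00 record, 12-minute delay, and 14:05 evaluation come from the example in this thread.

```python
from datetime import datetime, timedelta


def evaluations_that_see(record_time, ingestion_delay, first_eval,
                         frequency, lookback, last_eval):
    """Return the evaluation times at which the record is both inside the
    lookback window and already queryable in the workspace."""
    queryable_from = record_time + ingestion_delay
    hits = []
    t = first_eval
    while t <= last_eval:
        in_window = t - lookback < record_time <= t
        if in_window and queryable_from <= t:
            hits.append(t)
        t += frequency
    return hits


record = datetime(2024, 1, 1, 14, 0)       # ML job writes at 14:00
delay = timedelta(minutes=12)              # queryable at 14:12
first = datetime(2024, 1, 1, 14, 5)        # first evaluation at 14:05
last = datetime(2024, 1, 1, 16, 5)
hourly = timedelta(minutes=60)

with_60m = evaluations_that_see(record, delay, first, hourly,
                                timedelta(minutes=60), last)
with_75m = evaluations_that_see(record, delay, first, hourly,
                                timedelta(minutes=75), last)

print("60-minute lookback:", with_60m)
print("75-minute lookback:", with_75m)
```

Under these assumptions, no hourly evaluation sees the record with the 60-minute lookback, while the 15:05 evaluation picks it up with the 75-minute lookback.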

Alert Rule Threshold Configuration: In the alert rule ‘Alert logic’ section, configure the threshold based on your summarize clause. If using ‘AnomalyCount’, set operator to ‘Greater than’ and threshold to ‘0’ to alert when ANY anomaly is detected. If using ‘MaxScore’, set operator to ‘Greater than’ and threshold to ‘0.85’ to alert when the highest anomaly score exceeds your critical threshold. You can create multiple alert rules with different severity levels: MaxScore > 0.95 = Critical (Severity 0), MaxScore > 0.85 = Warning (Severity 2). This provides graduated alerting based on anomaly severity.
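As an illustration of the graduated setup, the following Python sketch models the two rules as independent threshold checks on the window's highest score. It is only a model of the logic described above, not Azure Monitor code, and the `RULES` table and `fired_rules` name are hypothetical.

```python
# (rule name, Azure Monitor severity number, MaxScore threshold)
RULES = [
    ("Critical", 0, 0.95),
    ("Warning", 2, 0.85),
]


def fired_rules(max_score):
    """Return every rule whose threshold the highest score strictly exceeds."""
    return [(name, severity) for name, severity, threshold in RULES
            if max_score > threshold]


for score in (0.80, 0.90, 0.97):
    print(score, fired_rules(score))
```

Because the rules are evaluated independently, a score of 0.97 satisfies both conditions in this model. If you want only one notification per anomaly, one option is to bound the Warning rule to scores at or below 0.95 in its query.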

Action Group and Notification Testing: You mentioned test notifications work, which confirms action group connectivity. However, test notifications bypass the query evaluation. To properly test, manually insert a test anomaly record into your Log Analytics workspace with current timestamp and anomaly score 0.95, then wait for the next evaluation cycle (60 minutes + processing time). Monitor the alert rule’s ‘Fired alerts’ section in Azure Portal to see if it triggers. If it doesn’t trigger even with test data, check the alert rule’s ‘Query results’ preview - this shows what data the query returns during evaluation. If preview shows no results despite data existing, your query has a logical error.
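For the test record, a minimal Python sketch of the payload might look like the following. The field names `TimeGenerated`, `TransactionType`, and `AnomalyScore` are assumptions based on the column names mentioned in this thread (before any type suffix is added), and the `TEST` transaction type is a hypothetical marker so the record is easy to filter out later. How you send the payload depends on the ingestion method your ML job already uses, so that step is left out.

```python
import json
from datetime import datetime, timezone


def build_test_record(score=0.95, transaction_type="TEST"):
    """Build one synthetic anomaly record with the current UTC timestamp."""
    return {
        "TimeGenerated": datetime.now(timezone.utc).isoformat(),
        "TransactionType": transaction_type,  # assumed field name
        "AnomalyScore": score,                # assumed field name
    }


# Serialize as a JSON array of records, ready to hand to your existing ingestion code.
payload = json.dumps([build_test_record()])
print(payload)
```

Reusing the ML job's own write path for the test record also exercises the same schema and ingestion delay as real data.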

Complete Alert Rule Configuration: Create a new alert rule with these exact settings: Resource = your Log Analytics workspace, Condition = Custom log search, Query = the corrected query above, Aggregation granularity = 60 minutes, Frequency of evaluation = 60 minutes, Threshold operator = Greater than, Threshold value = 0, Evaluated based on = AnomalyCount (or MaxScore depending on your preference). Under ‘Advanced options’, disable ‘Automatically resolve alerts’ since anomaly detection is event-based, not state-based. Save and wait 60-90 minutes for the first evaluation cycle. Check ‘Alert processing rules’ to ensure no suppression rules are preventing notifications during your testing window.

After implementing these corrections, your alert rule should trigger when ML anomaly detection identifies unusual ERP transaction patterns, giving you the monitoring coverage you need for data quality and processing error detection.

We’re writing to a custom log table called ‘ERPAnomalyScores_CL’. The alert rule query is: ERPAnomalyScores_CL | where AnomalyScore_d > 0.85. When I run this manually in Log Analytics, it returns results, but the alert never fires. The ML job writes a new row to this table every hour with timestamp, transaction type, and anomaly score.

I added the time filter but alerts still aren’t triggering. Could the issue be that the data arrives hourly but the alert checks every 5 minutes? Should I change the evaluation frequency to match the data arrival interval? Also, the alert rule configuration has options for aggregation granularity and I’m not sure if that’s set correctly.