Monitoring SDK alert rule not triggering on threshold breach in aziot-24 monitoring module

stevenwizard · April 27, 2025, 9:45am

Critical incident response delays are occurring because alert rules aren’t firing. We’ve configured threshold-based alerts via the Azure IoT SDK (aziot-24) to trigger when device temperature exceeds 85°C, but notifications aren’t being sent even when telemetry clearly shows breaches.

Alert rule configuration:

alertRule.condition = {
  metric: 'temperature',
  threshold: 85,
  operator: 'greaterThan'
};

Telemetry logs show multiple devices reporting 92-95°C for extended periods, yet no alerts were triggered. The telemetry mapping between our device schema and the monitoring system appears to be correct. We upgraded from aziot-23 to aziot-24 last month. Could the SDK version update have changed alert rule behavior?

matthew_pro · May 18, 2025, 12:37am

The aggregation setting lives in the condition object. Add aggregation: 'Maximum' as a property alongside metric and threshold. Also verify that your telemetry field name matches the rule's metric exactly, because aziot-24 is now case-sensitive. If your device sends ‘Temperature’ but your rule checks ‘temperature’, it won’t match.
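A quick way to spot a case mismatch is to compare the rule's metric name against the keys of a sample payload. The sketch below is a hypothetical helper, not part of the SDK, and it assumes telemetry arrives as a flat object:

```javascript
// Hypothetical helper (not an SDK call): reports whether a rule's metric name
// matches a telemetry field exactly, or only when letter case is ignored.
function checkMetricField(telemetry, metric) {
  const fields = Object.keys(telemetry);
  if (fields.includes(metric)) {
    return { match: true, field: metric };
  }
  // Look for a field that differs from the metric name only by letter case.
  const nearMiss = fields.find((f) => f.toLowerCase() === metric.toLowerCase());
  return { match: false, nearMiss: nearMiss ?? null };
}

console.log(checkMetricField({ Temperature: 92.5 }, 'temperature'));
// { match: false, nearMiss: 'Temperature' }
```

A non-null nearMiss means the field exists but its casing differs from the rule's metric.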

matthewguru · June 11, 2025, 9:24am

Also check if your alert rule is actually enabled. After SDK upgrades, sometimes rules get disabled by default and need to be explicitly re-enabled. Use the SDK’s listAlertRules() method to verify the enabled status of all your rules.
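Something like this would flag the disabled ones. It's only a sketch: it assumes listAlertRules() resolves to an array of objects with name and enabled fields, which I haven't confirmed against the SDK, so it runs on stand-in data here:

```javascript
// Returns the names of rules that are not explicitly enabled.
// Assumes each rule object exposes `name` and `enabled`.
function disabledRuleNames(rules) {
  return rules.filter((rule) => rule.enabled !== true).map((rule) => rule.name);
}

// Stand-in data; in practice pass the result of `await client.listAlertRules()`.
const sampleRules = [
  { name: 'HighTemperatureAlert', enabled: false },
  { name: 'LowBatteryAlert', enabled: true },
  { name: 'HumidityAlert' }, // `enabled` missing, so treated as disabled
];

console.log(disabledRuleNames(sampleRules));
// [ 'HighTemperatureAlert', 'HumidityAlert' ]
```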

stevenwizard · April 29, 2025, 5:47pm

There’s also a telemetry aggregation change in aziot-24. Alert rules now evaluate against the average value within the evaluation window, not the raw telemetry points. If a 95°C spike lasts only 30 seconds within a 5-minute window, the window average can still be below 85°C. For threshold alerts on spikes, switch the aggregation from ‘Average’ to ‘Maximum’.
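To make the arithmetic concrete, here is a small worked example. The sampling rate (one reading every 10 seconds) and the 70°C baseline are assumptions chosen for illustration, not values from our devices:

```javascript
// One reading every 10 seconds over a 5-minute (300 s) window gives 30 samples.
// Assumed scenario: 70°C baseline with a 30-second spike (3 samples) at 95°C.
const samples = Array.from({ length: 30 }, (_, i) => (i >= 12 && i < 15 ? 95 : 70));

const average = samples.reduce((sum, v) => sum + v, 0) / samples.length;
const maximum = Math.max(...samples);
const threshold = 85;

console.log(`average=${average.toFixed(1)} breaches: ${average > threshold}`);
// average=72.5 breaches: false
console.log(`maximum=${maximum} breaches: ${maximum > threshold}`);
// maximum=95 breaches: true
```

The same window breaches the threshold under Maximum but not under Average, which would explain missed alerts on short spikes.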

angelamaster · June 12, 2025, 12:45am

Your alert rule issues stem from multiple breaking changes in aziot-24. Here’s a complete solution covering rule configuration, telemetry mapping, and migration:

1. Alert Rule Configuration (Core Fix):

Update your alert rule with all required aziot-24 properties:

const alertRule = {
  name: 'HighTemperatureAlert',
  enabled: true,
  condition: {
    metric: 'temperature',
    threshold: 85,
    operator: 'greaterThan',
    aggregation: 'Maximum',
    unit: 'Celsius'
  },
  evaluationFrequency: 60,
  windowSize: 300,
  severity: 'Critical'
};

Key changes from aziot-23:

aggregation should now be set explicitly (the default changed from ‘Maximum’ to ‘Average’)
unit must be specified for numeric metrics
evaluationFrequency default increased from 60s to 300s
enabled must be explicitly set (no longer defaults to true)
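To catch rules that still use the old shape, a small pre-flight check can list the fields from the changes above that a rule object leaves unset. This is a hypothetical helper, not an SDK function, and it assumes rules are plain objects shaped like the example above:

```javascript
// Hypothetical pre-flight check: lists the fields a rule leaves unset that
// the changes above say should now be set explicitly.
function missingRuleFields(rule) {
  const missing = [];
  const condition = rule.condition ?? {};
  if (typeof rule.enabled !== 'boolean') missing.push('enabled');
  if (rule.evaluationFrequency === undefined) missing.push('evaluationFrequency');
  if (condition.aggregation === undefined) missing.push('condition.aggregation');
  if (condition.unit === undefined) missing.push('condition.unit');
  return missing;
}

// A rule written in the older style, with only metric, threshold, and operator.
const legacyRule = {
  condition: { metric: 'temperature', threshold: 85, operator: 'greaterThan' },
};

console.log(missingRuleFields(legacyRule));
// [ 'enabled', 'evaluationFrequency', 'condition.aggregation', 'condition.unit' ]
```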

2. Telemetry Mapping (Schema Alignment):

Aziot-24 introduced strict schema validation. Ensure your device telemetry matches the alert rule metric name exactly:

// Device telemetry must use exact field names
const telemetry = {
  temperature: 92.5,  // lowercase to match alert rule
  temperatureUnit: 'Celsius',
  timestamp: Date.now()
};

The SDK now performs case-sensitive matching. If your devices send ‘Temperature’ (capitalized) but your rule checks ‘temperature’, alerts won’t trigger. Update either your device code or alert rule to match.

Telemetry Mapping Validation: Query the metric metadata to verify correct mapping:

const metrics = await iotClient.getAvailableMetrics(deviceId);
console.log('Available metrics:', metrics);
// Verify 'temperature' appears in the list

3. SDK Version Update (Migration Steps):

Aziot-24 changed alert rule persistence. Existing rules from aziot-23 need migration:

Export existing rules before upgrade
After upgrade, rules are disabled by default
Re-create rules with new schema
Test each rule individually

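// Migration script: re-save each existing rule with the fields aziot-24 expects.
// inferUnit is a helper you supply that maps a metric name to its unit.
const oldRules = await client.listAlertRules();
for (const rule of oldRules) {
  const updated = {
    ...rule,
    condition: {
      ...rule.condition,
      aggregation: 'Maximum',
      unit: inferUnit(rule.condition.metric)
    },
    // Rules come back disabled after the upgrade, so re-enable explicitly
    enabled: true
  };
  await client.updateAlertRule(rule.id, updated);
}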
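The inferUnit helper in the script is not provided by the SDK. One possible sketch is a lookup table that fails loudly on unknown metrics, so the migration never writes a guessed unit. Only the temperature entry comes from this thread; the other entry is a placeholder to replace with your own schema:

```javascript
// Hypothetical lookup table for the inferUnit helper used during migration.
const METRIC_UNITS = new Map([
  ['temperature', 'Celsius'], // matches the example rule in this thread
  ['humidity', 'Percent'],    // placeholder entry; adjust to your schema
]);

// Maps a metric name to its unit, throwing when no mapping is defined.
function inferUnit(metric) {
  const unit = METRIC_UNITS.get(metric);
  if (unit === undefined) {
    throw new Error(`No unit mapping for metric "${metric}"`);
  }
  return unit;
}

console.log(inferUnit('temperature')); // Celsius
```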

Additional Configuration Best Practices:

Window size: Set to 5x evaluation frequency (300s window with 60s evaluation)
Consecutive breaches: Add consecutiveBreaches: 2 to reduce false positives
Alert actions: Configure notification channels explicitly (email/webhook)
Metric units: Standardize on Celsius/Fahrenheit across all devices

Testing and Validation:

Use SDK debug mode to see alert evaluation logs
Manually trigger test alerts with simulated telemetry
Verify alert history shows evaluation attempts
Monitor alert rule metrics dashboard for evaluation count

Performance Impact: Evaluation frequency of 60s with Maximum aggregation increases compute load by ~20% compared to 300s/Average. Monitor your IoT Hub throttling metrics. If you hit limits, consider:

Lengthening the evaluation interval to 120s for non-critical alerts
Using Average aggregation for gradual threshold breaches
Implementing device-side pre-filtering for extreme values
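As an illustration of the pre-filtering option, the device could forward every reading near or above the alert threshold and only a sample of the normal ones. The function name and parameters below are hypothetical, and the right floor and sampling rate depend on your devices:

```javascript
// Keeps every reading at or above `alertFloor`, plus every `keepEvery`-th
// reading below it so the normal baseline is still reported.
function preFilter(readings, alertFloor, keepEvery) {
  return readings.filter(
    (value, index) => value >= alertFloor || index % keepEvery === 0
  );
}

console.log(preFilter([70, 71, 92, 72, 73, 95, 71], 85, 3));
// [ 70, 92, 72, 95, 71 ]
```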

With these changes, your alert rules should trigger correctly on threshold breaches. Combining Maximum aggregation with a 60-second evaluation frequency lets you catch brief temperature spikes that the default settings would miss.

amanda_ninja · April 27, 2025, 10:04am

Check your alert rule evaluation frequency. In aziot-24, the default changed from 1 minute to 5 minutes. If your temperature spikes are brief, they might not be captured during evaluation windows. You need to explicitly set evaluationFrequency to 60 seconds in your rule configuration.
