Real-time anomaly detection for energy usage visualized in IoT Analytics dashboard

We’ve implemented a real-time anomaly detection system for our commercial building energy management that’s reduced fault detection time from days to minutes. Sharing our architecture in case it helps others.

We have 200+ buildings with smart meters publishing energy consumption data every 5 minutes to AWS IoT Core. The challenge was detecting anomalies (equipment failures, unusual consumption patterns) quickly enough to prevent energy waste.

Our solution uses IoT Analytics pipeline with an integrated SageMaker ML model for anomaly detection, feeding results into QuickSight dashboards for our facilities team. The system automatically flags anomalies and creates maintenance tickets through our ERP integration.

Key benefits: an 85% reduction in energy waste from equipment failures, automated alerting that replaced manual meter review, and a maintenance team that can prioritize issues by predicted impact. The dashboard shows real-time anomaly scores across all buildings with drill-down to individual equipment.

How did you integrate the ML model with the IoT Analytics pipeline? Did you use a Lambda activity in the pipeline, or is the model invoked separately? I’m particularly interested in the latency - you mentioned real-time detection, so I assume the inference happens inline with data ingestion?

What’s your false positive rate? Anomaly detection can be noisy, and I’m concerned about alert fatigue for the facilities team. Do you have any post-processing to filter out low-confidence anomalies before they hit the dashboard?

Let me provide the detailed implementation addressing all three focus areas:

IoT Analytics Pipeline Design: The pipeline has four activities: Channel (ingests MQTT messages from IoT Core), Lambda (enrichment with temporal features and building metadata), Lambda (ML inference via SageMaker), and Datastore (stores enriched data with anomaly scores). The first Lambda adds day_of_week, hour_of_day, is_holiday, and building_type fields. This enrichment is critical because the ML model needs temporal context to distinguish between normal variations and true anomalies. The pipeline processes data in micro-batches every minute, giving near-real-time detection while managing Lambda costs.
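For anyone curious, here's roughly what our first (enrichment) Lambda activity looks like. The field names (day_of_week, hour_of_day, is_holiday, building_type) are as described above; the inline holiday set and metadata dict are illustrative stand-ins - in production the metadata lookup comes from a real store such as DynamoDB:

```python
from datetime import datetime, timezone

# Illustrative stand-ins; the real lookups live outside the Lambda.
BUILDING_METADATA = {"bldg-042": {"building_type": "office"}}
HOLIDAYS = {"2024-01-01", "2024-07-04"}

def enrich(record):
    """Attach the temporal/context fields the anomaly model expects."""
    ts = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    meta = BUILDING_METADATA.get(record["building_id"], {})
    record["day_of_week"] = ts.weekday()   # 0 = Monday
    record["hour_of_day"] = ts.hour
    record["is_holiday"] = ts.strftime("%Y-%m-%d") in HOLIDAYS
    record["building_type"] = meta.get("building_type", "unknown")
    return record

def lambda_handler(event, context):
    # IoT Analytics passes the micro-batch of messages as a list
    # and expects the (modified) list back.
    return [enrich(r) for r in event]
```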

ML Model Integration: The SageMaker Random Cut Forest model is deployed as a real-time endpoint with auto-scaling. The inference Lambda batches up to 50 records per invocation to optimize throughput. Each record includes current consumption, historical 24-hour average, and the enrichment features. The model outputs an anomaly score (0-1) and a confidence level. We’ve tuned the threshold to 0.75 to balance detection sensitivity with false positives - anything above this triggers an alert. The model is versioned using SageMaker Model Registry, and we maintain two endpoints (production and canary) to test new model versions before full deployment.
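The batching and thresholding in the inference Lambda is simple to sketch. In this minimal version the actual endpoint call is abstracted behind a score_fn parameter (in the real Lambda that wraps sagemaker-runtime's InvokeEndpoint against the RCF endpoint); the 0.75 threshold and 50-record batch size are the values described above, everything else is illustrative:

```python
THRESHOLD = 0.75   # tuned alert threshold described above
BATCH_SIZE = 50    # max records per endpoint invocation

def flag_anomalies(records, score_fn, threshold=THRESHOLD, batch_size=BATCH_SIZE):
    """Score records in batches and attach anomaly_score / is_anomaly.

    score_fn takes a batch (list of dicts) and returns one float score
    per record; in production it wraps the SageMaker runtime call.
    """
    scored = []
    for i in range(0, len(records), batch_size):
        batch = records[i:i + batch_size]
        for rec, score in zip(batch, score_fn(batch)):
            scored.append(dict(rec, anomaly_score=score,
                               is_anomaly=score > threshold))
    return scored
```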

Dashboard Visualization: QuickSight connects directly to the IoT Analytics dataset using SPICE for fast queries. The main dashboard has three views: fleet overview (heatmap of all buildings colored by anomaly score), building detail (time series of consumption with anomaly flags), and alert queue (table of active anomalies sorted by predicted impact). We calculate predicted impact by multiplying the anomaly score by the building’s typical daily energy cost. The dashboard refreshes every 5 minutes via scheduled SPICE refresh. For critical anomalies (score > 0.9), we trigger SNS notifications to the facilities team via an EventBridge rule that monitors the datastore for high-score records.
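The predicted-impact sort behind the alert queue is just score times typical daily cost. A minimal sketch (the daily_cost_usd and id field names here are placeholders, not our actual schema):

```python
def predicted_impact(anomaly_score, typical_daily_cost):
    """Impact metric used to rank the alert queue."""
    return anomaly_score * typical_daily_cost

def alert_queue(anomalies):
    """Return active anomalies sorted by predicted impact, highest first."""
    return sorted(
        anomalies,
        key=lambda a: predicted_impact(a["anomaly_score"], a["daily_cost_usd"]),
        reverse=True,
    )
```

The effect is that a moderate anomaly in an expensive building outranks a severe anomaly in a cheap one, which matches how the facilities team actually triages.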

To address false positives, we implemented a 15-minute persistence threshold. An anomaly must appear in three consecutive data points before generating an alert. This filters transient spikes while catching sustained issues. We also maintain an exclusion list for buildings undergoing maintenance. The false positive rate is now under 5%, which the facilities team finds manageable.
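The persistence logic is easy to sketch. The class below is an illustrative in-memory version (name and structure are mine); since Lambda invocations are stateless, the real streak counts would need to live in an external store such as DynamoDB:

```python
from collections import defaultdict

PERSISTENCE = 3  # consecutive anomalous points = 15 min at 5-min intervals

class PersistenceFilter:
    """Suppress alerts until an anomaly persists for N consecutive points."""

    def __init__(self, persistence=PERSISTENCE, excluded=()):
        self.persistence = persistence
        self.excluded = set(excluded)       # buildings under maintenance
        self.streaks = defaultdict(int)     # building_id -> current streak

    def observe(self, building_id, is_anomaly):
        """Record one data point; return True when an alert should fire."""
        if building_id in self.excluded:
            return False
        if is_anomaly:
            self.streaks[building_id] += 1
        else:
            self.streaks[building_id] = 0   # any normal point resets the streak
        return self.streaks[building_id] >= self.persistence
```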

The ERP integration uses a Lambda function triggered by EventBridge that creates maintenance work orders in our system via REST API. The work order includes the building ID, equipment suspected (based on meter location), anomaly score, and estimated energy waste rate.
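The shape of that Lambda, roughly - the endpoint URL and payload field names below are placeholders (our ERP API is proprietary, and auth handling is omitted for brevity):

```python
import json
from urllib import request

ERP_URL = "https://erp.example.com/api/workorders"  # placeholder endpoint

def build_work_order(anomaly):
    """Map the EventBridge anomaly detail onto an ERP work-order payload."""
    return {
        "building_id": anomaly["building_id"],
        "suspected_equipment": anomaly.get("meter_location", "unknown"),
        "anomaly_score": anomaly["anomaly_score"],
        "estimated_waste_kw": anomaly["estimated_waste_kw"],
    }

def lambda_handler(event, context):
    # EventBridge delivers the anomaly record under event["detail"].
    order = build_work_order(event["detail"])
    req = request.Request(
        ERP_URL,
        data=json.dumps(order).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:  # real call needs auth + retry handling
        return {"status": resp.status}
```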

Total cost for the system is approximately $800/month for 200 buildings, covering IoT Core messages, Analytics pipeline, SageMaker endpoint, and QuickSight. The ROI is significant - we’re preventing about $15,000/month in energy waste from early detection of equipment failures.

This is impressive. How frequently does your ML model retrain? Anomaly detection for energy can be tricky because normal patterns change seasonally. Are you handling that in the model or through separate baseline adjustments?

Great question. We retrain weekly using the previous 90 days of data, which captures seasonal patterns. The Random Cut Forest model in SageMaker tracks gradual seasonal shifts through that rolling retraining window, and we also maintain separate baseline profiles for weekday vs weekend consumption patterns. The IoT Analytics pipeline enriches incoming data with day-of-week and holiday flags before feeding the model.

We use a Lambda activity in the IoT Analytics pipeline that invokes the SageMaker endpoint. Latency is around 200-300 ms per inference, which is acceptable for our 5-minute data intervals. The Lambda function batches records from the same building to reduce endpoint invocations and costs. Results are written back into the pipeline as an anomaly_score field before the data reaches the datastore.