Let me provide the detailed implementation addressing all three focus areas:
IoT Analytics Pipeline Design: The pipeline has four activities: Channel (ingests MQTT messages from IoT Core), Lambda (enrichment with temporal features and building metadata), Lambda (ML inference via SageMaker), and Datastore (stores enriched data with anomaly scores). The first Lambda adds day_of_week, hour_of_day, is_holiday, and building_type fields. This enrichment is critical because the ML model needs temporal context to distinguish between normal variations and true anomalies. The pipeline processes data in micro-batches every minute, giving near-real-time detection while managing Lambda costs.
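As a rough illustration of the enrichment step, here is a minimal Python sketch of a Lambda that adds the four fields. The record field names (`timestamp` in epoch seconds, `building_id`), the `BUILDING_TYPES` and `HOLIDAYS` lookup tables, and the use of UTC are all assumptions for the example, not details of the actual deployment.

```python
from datetime import datetime, timezone

# Hypothetical lookup tables; a real deployment would load these from a
# database or configuration store rather than hard-coding them.
BUILDING_TYPES = {"bldg-001": "office", "bldg-002": "warehouse"}
HOLIDAYS = {"2024-01-01", "2024-12-25"}


def enrich(record):
    """Return a copy of the record with temporal and building fields added."""
    # Assumes `timestamp` is epoch seconds and that UTC is the right zone.
    ts = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    enriched = dict(record)
    enriched["day_of_week"] = ts.weekday()  # Monday is 0, Sunday is 6
    enriched["hour_of_day"] = ts.hour
    enriched["is_holiday"] = ts.date().isoformat() in HOLIDAYS
    enriched["building_type"] = BUILDING_TYPES.get(record["building_id"], "unknown")
    return enriched


def lambda_handler(event, context):
    # Assumes the pipeline hands the function a list of message dicts
    # and expects a list of message dicts back.
    return [enrich(record) for record in event]
```

In practice the timestamp would probably be converted to each building's local time zone before deriving `hour_of_day`, since occupancy patterns follow local time.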
ML Model Integration: The SageMaker Random Cut Forest model is deployed as a real-time endpoint with auto-scaling. The inference Lambda batches up to 50 records per invocation to optimize throughput. Each record includes current consumption, the historical 24-hour average, and the enrichment features. The model outputs an anomaly score (0-1) and a confidence level. We’ve tuned the alert threshold to 0.75 to balance detection sensitivity against false positives; any score above this triggers an alert. The model is versioned using SageMaker Model Registry, and we maintain two endpoints (production and canary) so new model versions can be tested before full deployment.
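The batching and thresholding logic can be sketched roughly as follows. The `score_batch` callable is a hypothetical stand-in for whatever code calls the endpoint and parses its response, and the record layout is assumed; only the batch size of 50 and the 0.75 threshold come from the description above.

```python
BATCH_SIZE = 50          # records per endpoint invocation
ALERT_THRESHOLD = 0.75   # scores above this trigger an alert


def chunk(records, size=BATCH_SIZE):
    """Yield successive batches of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def score_records(records, score_batch):
    """Attach an anomaly score and an alert flag to every record.

    `score_batch` is a hypothetical callable that takes a list of records
    and returns one score per record, in the same order.
    """
    scored = []
    for batch in chunk(records):
        scores = score_batch(batch)
        for record, score in zip(batch, scores):
            scored.append({
                **record,
                "anomaly_score": score,
                "alert": score > ALERT_THRESHOLD,  # strictly above the threshold
            })
    return scored
```

Passing the scorer in as a parameter keeps the batching logic testable without a live endpoint; a stub that returns fixed scores is enough to exercise it.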
Dashboard Visualization: QuickSight connects directly to the IoT Analytics dataset using SPICE for fast queries. The main dashboard has three views: fleet overview (heatmap of all buildings colored by anomaly score), building detail (time series of consumption with anomaly flags), and alert queue (table of active anomalies sorted by predicted impact). We calculate predicted impact by multiplying the anomaly score by the building’s typical daily energy cost. The dashboard refreshes every 5 minutes via scheduled SPICE refresh. For critical anomalies (score > 0.9), we trigger SNS notifications to the facilities team via an EventBridge rule that monitors the datastore for high-score records.
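To make the alert-queue ordering concrete, here is a small sketch of the predicted-impact calculation and the critical-anomaly flag. The field names `anomaly_score` and `typical_daily_cost` are assumptions for illustration; the formula and the 0.9 cutoff are the ones described above.

```python
CRITICAL_THRESHOLD = 0.9  # scores above this also trigger a notification


def predicted_impact(anomaly_score, typical_daily_cost):
    """Predicted impact = anomaly score x typical daily energy cost."""
    return anomaly_score * typical_daily_cost


def build_alert_queue(anomalies):
    """Return anomalies sorted by predicted impact, highest first."""
    queue = []
    for anomaly in anomalies:
        queue.append({
            **anomaly,
            "predicted_impact": predicted_impact(
                anomaly["anomaly_score"], anomaly["typical_daily_cost"]
            ),
            "critical": anomaly["anomaly_score"] > CRITICAL_THRESHOLD,
        })
    return sorted(queue, key=lambda a: a["predicted_impact"], reverse=True)
```

Note that a moderate score at an expensive building can outrank a critical score at a cheap one, which is why the queue sorts on impact rather than on the raw score.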
To address false positives, we implemented a 15-minute persistence threshold: an anomaly must appear in three consecutive data points before it generates an alert. This filters out transient spikes while still catching sustained issues. We also maintain an exclusion list for buildings undergoing maintenance. The false positive rate is now under 5%, which the facilities team finds manageable.
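The persistence check is simple enough to sketch in a few lines. This version assumes the most recent scores for a building are available as a list ordered oldest to newest, and that a data point counts as anomalous when its score exceeds the 0.75 alert threshold; the function and parameter names are illustrative.

```python
REQUIRED_CONSECUTIVE = 3  # three consecutive anomalous data points
ALERT_THRESHOLD = 0.75


def should_alert(recent_scores, building_id, excluded_buildings=frozenset()):
    """Return True only for a sustained anomaly at a non-excluded building."""
    if building_id in excluded_buildings:
        return False  # building is under maintenance
    if len(recent_scores) < REQUIRED_CONSECUTIVE:
        return False  # not enough history yet
    latest = recent_scores[-REQUIRED_CONSECUTIVE:]
    return all(score > ALERT_THRESHOLD for score in latest)
```

For example, scores of `[0.2, 0.8, 0.85, 0.9]` would alert, while `[0.8, 0.3, 0.9]` would not, because the dip to 0.3 breaks the run.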
The ERP integration uses a Lambda function, triggered by EventBridge, that creates maintenance work orders in our ERP system via its REST API. Each work order includes the building ID, the suspected equipment (based on meter location), the anomaly score, and the estimated energy waste rate.
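A work-order Lambda along these lines might look like the sketch below. Everything specific here is hypothetical: the `ERP_URL` endpoint, the payload field names, the `METER_EQUIPMENT` lookup, and the shape of the event detail. A real ERP API would also need authentication, which is omitted.

```python
import json
import urllib.request

ERP_URL = "https://erp.example.com/api/work-orders"  # hypothetical endpoint

# Hypothetical mapping from meter location to the equipment it serves.
METER_EQUIPMENT = {"roof": "HVAC chiller", "basement": "boiler"}


def build_work_order(detail):
    """Translate an anomaly event into a work-order payload."""
    return {
        "building_id": detail["building_id"],
        "suspected_equipment": METER_EQUIPMENT.get(detail["meter_location"], "unknown"),
        "anomaly_score": detail["anomaly_score"],
        "estimated_waste_kwh_per_hour": detail["estimated_waste_kwh_per_hour"],
    }


def build_request(detail):
    """Build the POST request without sending it."""
    body = json.dumps(build_work_order(detail)).encode("utf-8")
    return urllib.request.Request(
        ERP_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def lambda_handler(event, context):
    # EventBridge delivers the event payload under the "detail" key.
    request = build_request(event["detail"])
    with urllib.request.urlopen(request, timeout=10) as response:
        return {"status": response.status}
```

Separating `build_request` from the network call makes the payload easy to unit-test without reaching the ERP system.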
Total cost for the system is approximately $800/month for 200 buildings (about $4 per building), covering IoT Core messages, the Analytics pipeline, the SageMaker endpoint, and QuickSight. The ROI is significant: we’re preventing about $15,000/month in energy waste through early detection of equipment failures.