ML model fails to predict anomalies in monitoring data despite high training accuracy

creatortech · December 1, 2024, 8:03am

We’ve trained an ML model in ThingWorx Analytics for anomaly detection on monitoring data with 94% training accuracy, but it’s failing to detect real anomalies in production. The model returns either no anomalies or floods us with false positives depending on threshold settings.

Our training dataset includes 6 months of sensor data from manufacturing equipment, and we’re seeing issues with data schema consistency - some sensors changed their output format 3 months ago. We’re also concerned about data drift since production patterns have shifted.

model.train(historical_data, epochs=100)
anomalies = model.predict(live_stream)
# Returns: [] or 500+ false positives

The false positive rate makes the system unusable. Has anyone dealt with similar prediction failures in ThingWorx Analytics? Should we focus on model retraining strategies or is this a data quality issue?

cynthiaace · January 10, 2025, 11:47pm

I had almost identical issues and solved them systematically. Here’s what worked:

Data Schema Consistency: First, create a schema validation layer that normalizes all sensor inputs. Map old sensor formats to new ones using transformation rules:

if sensor.version < 2.0:
    normalized_value = sensor.raw * conversion_factor
else:
    normalized_value = sensor.standardized_output
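
A minimal runnable sketch of that normalization rule, in case it helps. The `SensorReading` type, the field names, and the conversion factor of 0.1 are made up for illustration; substitute whatever your old and new sensor formats actually require.

```python
from dataclasses import dataclass

# Hypothetical factor converting legacy raw counts to standardized units.
LEGACY_CONVERSION_FACTOR = 0.1


@dataclass
class SensorReading:
    sensor_id: str
    version: float  # schema/firmware version reported with the reading
    value: float    # raw counts for legacy sensors, standardized units otherwise


def normalize(reading: SensorReading) -> float:
    """Return the reading in standardized units regardless of schema version."""
    if reading.version < 2.0:
        return reading.value * LEGACY_CONVERSION_FACTOR
    return reading.value


readings = [
    SensorReading("temp-01", 1.4, 725.0),  # legacy format
    SensorReading("temp-01", 2.1, 72.5),   # new format
]
normalized = [normalize(r) for r in readings]
```

Both readings end up on the same scale, so the model never sees the format change.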

Data Drift Detection: Implement continuous monitoring using ThingWorx Analytics’ statistical comparison services. Track key metrics like mean, standard deviation, and distribution shape for each feature. Set up alerts when drift exceeds 15% from baseline. This gives you early warning before prediction quality degrades.
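
To illustrate the idea outside ThingWorx, here is a plain-Python sketch of a per-feature drift check on mean and standard deviation. It does not use any ThingWorx Analytics API; the function names are hypothetical, and it only covers the first two metrics (distribution shape would need a proper statistical test).

```python
from statistics import mean, stdev

DRIFT_LIMIT = 0.15  # 15% relative change from baseline, as suggested above


def relative_change(baseline: float, current: float) -> float:
    """Relative change of current vs. baseline. Assumes baseline is non-zero."""
    return abs(current - baseline) / abs(baseline)


def drift_report(baseline: list[float], recent: list[float]) -> dict[str, float]:
    """Relative change in mean and standard deviation for one feature."""
    return {
        "mean": relative_change(mean(baseline), mean(recent)),
        "stdev": relative_change(stdev(baseline), stdev(recent)),
    }


def has_drifted(baseline: list[float], recent: list[float],
                limit: float = DRIFT_LIMIT) -> bool:
    """True when any tracked metric moved more than `limit` from baseline."""
    return any(change > limit for change in drift_report(baseline, recent).values())
```

You would run `has_drifted` per sensor feature, comparing a fixed baseline window against the most recent window.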

Model Retraining Strategy: This is critical. Don’t rely on a single static model. Implement automated retraining on a schedule that matches your data volatility:

Weekly retraining using rolling 60-day windows for stable processes
Daily retraining for highly variable processes
Triggered retraining when drift detection exceeds thresholds
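
A rough sketch of how the schedule and the drift trigger can be combined. The record layout (`(timestamp, value)` tuples) and the function names are assumptions for illustration, not part of any ThingWorx service.

```python
from datetime import datetime, timedelta


def rolling_window(records, now, days=60):
    """Keep only (timestamp, value) records from the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [r for r in records if r[0] >= cutoff]


def should_retrain(last_trained, now, interval_days, drift_detected):
    """Retrain when the schedule is due or drift monitoring has fired."""
    schedule_due = now - last_trained >= timedelta(days=interval_days)
    return schedule_due or drift_detected
```

Use `interval_days=7` for stable processes and `interval_days=1` for highly variable ones; the drift flag overrides the schedule either way.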

For your specific case, I’d recommend:

Separate your historical data into pre-schema-change and post-schema-change datasets
Retrain using only post-change data (last 3 months) to ensure consistency
Implement feature-level drift monitoring for each sensor
Set up ensemble models that combine predictions from models trained on different time windows
Use adaptive thresholds that adjust based on recent false positive rates
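
For the ensemble point, a toy sketch of the idea: each "model" here is just a z-score scorer fitted on a different time window, standing in for whatever models you actually train. The names and the simple averaging rule are assumptions; a real ensemble might weight recent windows more heavily.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

Scorer = Callable[[float], float]  # maps a reading to an anomaly score


def zscore_scorer(window: Sequence[float]) -> Scorer:
    """Stand-in model: distance from the window mean in standard deviations."""
    mu = mean(window)
    sigma = pstdev(window) or 1.0  # avoid dividing by zero on a constant window
    return lambda x: abs(x - mu) / sigma


def ensemble_score(scorers: Sequence[Scorer], x: float) -> float:
    """Average the anomaly scores from all window-specific scorers."""
    return mean(s(x) for s in scorers)


# Hypothetical post-schema-change data split into a short and a long window.
scorers = [
    zscore_scorer([10.0, 10.0, 12.0, 8.0]),   # e.g. last 30 days
    zscore_scorer([10.0, 14.0, 6.0, 10.0]),   # e.g. last 90 days
]
```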

Threshold Tuning: Instead of static thresholds, implement dynamic thresholding based on recent prediction confidence scores. Start conservative (higher threshold = fewer alerts) and gradually lower it while monitoring precision/recall metrics. Track business impact - how many real issues are caught vs. how many false alarms interrupt operations.
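
One simple way to sketch this kind of feedback loop, assuming operators mark each recent alert as real or false. The step size, target precision, and bounds are illustrative values, not recommendations.

```python
def adjust_threshold(threshold, recent_alerts, step=0.05,
                     target_precision=0.8, lower=0.5, upper=0.99):
    """Nudge the alert threshold based on operator feedback.

    recent_alerts: list of bools, True when an alert was confirmed as real.
    """
    if not recent_alerts:
        return threshold  # no feedback yet, leave the threshold alone
    precision = sum(recent_alerts) / len(recent_alerts)
    if precision < target_precision:
        threshold += step  # too many false alarms: be more conservative
    else:
        threshold -= step  # alerts are mostly real: try to catch more
    return min(upper, max(lower, threshold))
```

Running this once per review period keeps the threshold moving in small steps instead of swinging between no alerts and hundreds of them.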

The key insight is that anomaly detection in production requires continuous adaptation. Your 94% training accuracy means nothing if the model is trained on data that no longer represents current reality. Focus on data quality first, then model currency through regular retraining, then threshold optimization based on operational feedback.

After implementing this approach, our false positive rate dropped from unusable levels to under 5%, and we’re catching real anomalies that would have caused equipment failures.

meeradata · December 15, 2024, 7:01am

The schema consistency issue is probably killing your predictions. We had similar problems and found that even minor changes in sensor output ranges or units completely threw off our models. Your 94% training accuracy might be misleading if it was calculated on mixed schema data. I’d recommend standardizing all sensor inputs through a preprocessing layer that handles both old and new formats, then retraining from scratch with clean, consistent data.

raj_guru · December 28, 2024, 11:29am

Beyond the schema issues, you need a robust model retraining strategy. Shifting production patterns are normal in manufacturing: seasonal changes, new product lines, equipment wear. We retrain our anomaly detection models weekly using a sliding window of the most recent 30 days of data, which keeps the model current with production reality. Also, implement A/B testing where you run the old and new model versions in parallel and compare their performance before full deployment.
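
A small sketch of the parallel comparison, assuming you have a set of readings with confirmed labels and the alerts each model version raised on them. The function names and the promotion rule are hypothetical.

```python
def precision_recall(predicted, actual):
    """Precision and recall for boolean alert flags vs. confirmed anomalies."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def should_promote(old_pred, new_pred, actual):
    """Promote the new model only if it is no worse on both metrics."""
    old_p, old_r = precision_recall(old_pred, actual)
    new_p, new_r = precision_recall(new_pred, actual)
    return new_p >= old_p and new_r >= old_r
```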

francesca_22 · December 1, 2024, 1:47pm

I’ve seen this exact issue. The problem is that your model has no visibility into the schema changes that happened partway through the training period. You need to implement data drift detection before the prediction step. ThingWorx Analytics has built-in drift detection services - configure them to monitor feature distributions and alert when statistical properties shift beyond thresholds. Also, separate your training data into pre-change and post-change segments and train on each to see which one gives the better-performing model.

helen_tech · December 29, 2024, 7:00pm

Have you checked your feature selection? Sometimes the model trains well but fails in production because the features that worked historically aren’t predictive anymore. Review which sensors and features contribute most to your predictions. Your point about threshold settings also matters: anomaly detection thresholds need continuous tuning based on business impact. A static threshold won’t work with drifting data patterns.

sophie140 · December 1, 2024, 8:43am

This sounds like a classic data drift problem combined with schema inconsistency. When you mention sensors changed output format 3 months ago, that’s a red flag. Your model learned patterns from one data structure but is now seeing a different one in production. Check if your feature engineering pipeline is handling the schema changes correctly - are you normalizing the new sensor formats to match training data expectations?
