Data storage query performance degrades when retrieving measurements from 100K+ device time series

We’re experiencing severe query performance degradation in our Cumulocity IoT c8y-1020 deployment when retrieving measurement data from our time-series storage. Dashboard queries filtering measurements across 100K+ devices now take 30-45 seconds to return results.

Our current query pattern aggregates hourly averages:


GET /measurement/measurements
  ?dateFrom=2025-03-01T00:00:00Z
  &dateTo=2025-03-15T00:00:00Z
  &source={deviceId}
  &type=c8y_TemperatureMeasurement

We’ve noticed the aggregation pipeline is hitting memory limits, and connection pool exhaustion occurs during peak loads. The time-series partitioning strategy seems inefficient for our data volume. We need guidance on composite indexing approaches and query result caching mechanisms to improve response times to under 5 seconds.

Your connection pool sizing is definitely a bottleneck. For your load profile, increase to 300-500 connections. More importantly, implement a tiered caching layer. Cache aggregated hourly/daily metrics in Redis with 15-minute TTL. This eliminates repeated time-series scans. Also consider pre-computing common dashboard aggregations through scheduled Smart Rules that populate summary collections.

For cross-partition aggregations, use a map-reduce pattern or Cumulocity’s Streaming Analytics service to maintain materialized views. The cache invalidation concern is valid: implement event-driven invalidation with Cumulocity’s real-time notification API, evicting cache entries as new measurements arrive.
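A sketch of that event-driven invalidation, with a plain `Map` standing in for Redis and a hypothetical `deviceGroupOf` lookup; in production the trigger would be a real-time notification subscription, not a direct function call:

```javascript
// Minimal sketch of event-driven cache invalidation. A Map stands in for
// Redis; deviceGroupOf() is a hypothetical device -> dashboard group lookup.
const cache = new Map();

function deviceGroupOf(deviceId) {
  // Hypothetical mapping; real code would resolve the device's group asset.
  return deviceId.startsWith("plant-a") ? "plant-a" : "default";
}

// When a measurement arrives, evict every aggregate window that could
// contain it, so the next dashboard query recomputes from fresh data.
function onMeasurementArrived(deviceId, measurementType) {
  const group = deviceGroupOf(deviceId);
  for (const window of ["hourly", "daily"]) {
    cache.delete(`agg:${group}:${measurementType}:${window}`);
  }
}

// Example: a cached hourly aggregate is evicted when new data arrives.
cache.set("agg:plant-a:c8y_TemperatureMeasurement:hourly", {avg: 21.4});
onMeasurementArrived("plant-a-042", "c8y_TemperatureMeasurement");
```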

Time-series partitioning is critical here. Partition your measurement collections by device group and time bucket (monthly or weekly depending on volume). This allows queries to target specific partitions rather than scanning the entire dataset. Combined with proper composite indexes on (source, type, time), you should see 10x improvement. What’s your current collection sharding strategy?

Here’s a comprehensive optimization strategy addressing all your performance bottlenecks:

Composite Indexing Strategy

Implement compound indexes on your measurement collections:


db.measurements.createIndex(
  {"source.id": 1, "type": 1, "time": -1},
  {background: true}
)

This enables efficient filtering by device, measurement type, and time range in a single index lookup.
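Field order in that index matters: MongoDB can serve a query from a compound index only when the filtered fields form a leftmost prefix of the index definition. A quick illustration of the rule (my own sketch, not a MongoDB API):

```javascript
// The compound index fields, in declared order.
const indexFields = ["source.id", "type", "time"];

// Simplified prefix check: the query's filter fields must line up with a
// leftmost prefix of the index definition to benefit from it.
function usesIndexPrefix(queryFields) {
  return queryFields.every((f, i) => indexFields[i] === f);
}

usesIndexPrefix(["source.id", "type"]); // → true  (covered prefix)
usesIndexPrefix(["type", "time"]);      // → false (skips source.id)
```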

Time-Series Data Partitioning

Migrate to monthly partitioned collections using Cumulocity’s DataBroker. Configure retention policies:


POST /tenant/options
{
  "category": "configuration",
  "key": "data.measurements.retention",
  "value": "90"
}

Partition naming: measurements_2025_03, measurements_2025_04. Query router logic targets specific partitions based on date range.
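The query-router logic can be sketched as follows, using the `measurements_YYYY_MM` naming convention above (the routing itself is my own sketch, not a built-in Cumulocity feature):

```javascript
// Given a query's date range, return the monthly partition collections
// the router must target, so unrelated partitions are never scanned.
function partitionsForRange(dateFrom, dateTo) {
  const names = [];
  // Start at the first day of dateFrom's month and walk forward month by month.
  let cursor = new Date(Date.UTC(dateFrom.getUTCFullYear(), dateFrom.getUTCMonth(), 1));
  while (cursor <= dateTo) {
    const month = String(cursor.getUTCMonth() + 1).padStart(2, "0");
    names.push(`measurements_${cursor.getUTCFullYear()}_${month}`);
    cursor = new Date(Date.UTC(cursor.getUTCFullYear(), cursor.getUTCMonth() + 1, 1));
  }
  return names;
}

partitionsForRange(new Date("2025-03-01T00:00:00Z"), new Date("2025-04-15T00:00:00Z"));
// → ["measurements_2025_03", "measurements_2025_04"]
```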

Query Result Caching

Implement a Redis-backed caching layer:

  • Cache aggregated metrics with 10-15 minute TTL
  • Use cache keys: `agg:{deviceGroup}:{metricType}:{timeWindow}`
  • Invalidate on measurement arrival via Cumulocity real-time notifications
  • Serve 80%+ of dashboard queries from cache
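A sketch of the read path those bullets describe. A `Map` with expiry timestamps stands in for Redis, and `computeAggregate` is a placeholder for the actual database query:

```javascript
// Redis stand-in: Map of key -> {value, expiresAt}. TTL is 15 minutes,
// matching the 10-15 minute window suggested above.
const cache = new Map();
const TTL_MS = 15 * 60 * 1000;

function cacheKey(deviceGroup, metricType, timeWindow) {
  return `agg:${deviceGroup}:${metricType}:${timeWindow}`;
}

// Serve from cache while fresh; otherwise recompute (placeholder) and store.
function getAggregate(deviceGroup, metricType, timeWindow, computeAggregate, now = Date.now()) {
  const key = cacheKey(deviceGroup, metricType, timeWindow);
  const hit = cache.get(key);
  if (hit && hit.expiresAt > now) return hit.value; // cache hit
  const value = computeAggregate();                 // slow path: hit the DB
  cache.set(key, {value, expiresAt: now + TTL_MS});
  return value;
}
```

A second call with the same key inside the TTL returns the cached value without invoking `computeAggregate`, which is what pushes the dashboard cache hit rate up.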

Connection Pool Tuning

Increase the MongoDB connection pool in the microservice configuration:


C8Y_MONGODB_POOL_SIZE=400
C8Y_MONGODB_MAX_WAIT_TIME=120000

Monitor connection utilization; aim for 60-70% peak usage.

Aggregation Pipeline Optimization

Replace client-side aggregation with a server-side pipeline:


db.measurements.aggregate([
  {$match: {"source.id": {$in: deviceIds}, "time": {$gte: start, $lte: end}}},
  // Truncate to the calendar hour so each hour of the range gets its own
  // bucket; grouping on $hour alone would merge the same hour across days.
  // ($dateTrunc requires MongoDB 5.0+)
  {$group: {
    _id: {hour: {$dateTrunc: {date: "$time", unit: "hour"}}},
    avgTemp: {$avg: "$c8y_TemperatureMeasurement.T.value"}
  }},
  {$sort: {"_id.hour": 1}}
])

Add an early $project stage to drop unneeded fields and reduce the pipeline’s memory footprint.
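For example, a $project stage placed right after $match that keeps only the two fields the grouping stage reads (field names taken from the pipeline above):

```javascript
// Drop every field except the ones the $group stage needs, so documents
// entering the rest of the pipeline are as small as possible.
const projectStage = {
  $project: {
    time: 1,
    "c8y_TemperatureMeasurement.T.value": 1,
    _id: 0
  }
};
```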

Pre-Aggregation Strategy

Implement Smart Rules that compute hourly/daily aggregates:

  • Trigger on measurement creation events
  • Maintain summary collections: `measurements_hourly`, `measurements_daily`
  • Dashboard queries target pre-aggregated data for 90%+ of use cases
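The rollup a Smart Rule handler would maintain can be sketched as a pure function (my own sketch; the writes to the summary collection are omitted):

```javascript
// Roll raw measurements up into hourly buckets. Keeping count/sum per bucket
// means averages can be updated incrementally, one increment per event.
function rollupHourly(measurements) {
  const buckets = new Map();
  for (const m of measurements) {
    const hour = new Date(m.time);
    hour.setUTCMinutes(0, 0, 0); // truncate to the start of the hour
    const key = `${m.deviceId}|${hour.toISOString()}`;
    const b = buckets.get(key) ||
      {deviceId: m.deviceId, hour: hour.toISOString(), count: 0, sum: 0};
    b.count += 1;
    b.sum += m.value;
    buckets.set(key, b);
  }
  // The average is derived at read time from count/sum.
  return [...buckets.values()].map(b => ({...b, avg: b.sum / b.count}));
}
```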

Implementation Order

  1. Deploy compound indexes (immediate 3-5x improvement)
  2. Increase connection pool (resolves timeout issues)
  3. Implement Redis caching (reduces database load by 70%)
  4. Migrate to partitioned collections (enables horizontal scaling)
  5. Deploy pre-aggregation Smart Rules (long-term maintenance)

Expected Results

  • Query response time: 30-45s → 2-4s
  • Cache hit rate: 75-85%
  • Database CPU utilization: reduced by 60%
  • Support for 500K+ devices with same infrastructure

Monitor using Cumulocity’s built-in performance metrics and adjust TTLs based on your data freshness requirements.

Have you analyzed your current indexing strategy? With 100K+ devices, the default single-field indexes won’t scale. Check your MongoDB storage metrics: index usage stats and slow query logs. What’s your data retention policy? Time-series data older than 90 days should be archived or aggregated into summary collections.