After implementing device registry caching across multiple Cumulocity deployments, here’s what works well:
Tiered Caching Architecture
Implement three cache layers with different characteristics:
L1 - Application Memory Cache:
- Size: 10K most-accessed devices per application instance
- TTL: 30 seconds
- Hit rate: 40-50% of total queries
- Technology: Caffeine or Guava cache with size-based eviction (Guava approximates LRU; Caffeine uses Window TinyLFU)
- Benefit: Zero network latency, sub-millisecond response
L2 - Regional Redis Cache:
- Size: Full device registry (200K+ devices)
- TTL: 5 minutes for standard properties, 30 seconds for frequently-updated fields
- Hit rate: 35-40% of total queries (served here after an L1 miss)
- Technology: Redis Cluster with read replicas per region
- Benefit: Sub-10ms network latency within region
L3 - Cumulocity Database:
- Source of truth, no TTL
- 10-15% of queries reach this layer
- Response time: 50-200ms depending on query complexity
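The read path through the three layers can be sketched as a read-through lookup. This is a minimal in-process illustration, not the actual implementation: TTLCache stands in for both Caffeine/Guava and Redis, and the names TieredDeviceLookup, load_from_db and served_by are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny TTL cache; a stand-in for the L1 (in-memory) or L2 (Redis) layer."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        item = self._store.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._store[key]  # expired entries count as a miss
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._store[key] = (self.clock() + self.ttl, value)


class TieredDeviceLookup:
    """Checks L1, then L2, then the database, filling the faster layers on the way back."""

    def __init__(self, l1: TTLCache, l2: TTLCache, load_from_db: Callable[[str], Optional[Any]]):
        self.l1, self.l2, self.load_from_db = l1, l2, load_from_db
        self.served_by = {"l1": 0, "l2": 0, "db": 0}  # per-layer counters for monitoring

    def get(self, device_id: str) -> Optional[Any]:
        value = self.l1.get(device_id)
        if value is not None:
            self.served_by["l1"] += 1
            return value
        value = self.l2.get(device_id)
        if value is not None:
            self.served_by["l2"] += 1
            self.l1.put(device_id, value)  # promote to L1
            return value
        value = self.load_from_db(device_id)  # hypothetical loader for the source of truth
        self.served_by["db"] += 1
        if value is not None:
            self.l2.put(device_id, value)
            self.l1.put(device_id, value)
        return value
```

With TTLCache(30) as L1 and TTLCache(300) as L2, a device read twice within 30 seconds hits the database once and L1 once; after 30 seconds the next read is served by L2 and repopulates L1.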
TTL-Based Invalidation Strategy
Differentiate TTL by data volatility:
{
  "deviceId": {"ttl": 300},
  "name": {"ttl": 300},
  "type": {"ttl": 300},
  "lastUpdated": {"ttl": 30},
  "connectionStatus": {"ttl": 30},
  "firmware.version": {"ttl": 300},
  "config.*": {"ttl": 60}
}
Static properties (device type, hardware version) get longer TTL. Dynamic properties (connection status, last message time) get shorter TTL. This balances consistency with performance.
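One way to apply such a TTL map is to resolve each property name against it, with exact names taking precedence over wildcard patterns such as config.*. The sketch below assumes shell-style matching via Python's fnmatch and a fallback TTL of 300 seconds for unlisted properties; both are illustrative choices, as are the function names.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# TTLs in seconds, taken from the per-property map above.
PROPERTY_TTLS = {
    "deviceId": 300,
    "name": 300,
    "type": 300,
    "lastUpdated": 30,
    "connectionStatus": 30,
    "firmware.version": 300,
    "config.*": 60,
}
DEFAULT_TTL = 300  # assumption: unlisted properties are treated as static


def ttl_for(prop: str) -> int:
    """Return the TTL for one property; exact names win over wildcard patterns."""
    if prop in PROPERTY_TTLS:
        return PROPERTY_TTLS[prop]
    for pattern, ttl in PROPERTY_TTLS.items():
        if fnmatchcase(prop, pattern):
            return ttl
    return DEFAULT_TTL


def entry_ttl(props: Iterable[str]) -> int:
    """TTL for an entry that caches several properties together: the shortest one wins."""
    return min((ttl_for(p) for p in props), default=DEFAULT_TTL)
```

If a whole device document is cached as one entry, entry_ttl keeps it no longer than its most volatile field allows; caching volatile fields under separate keys avoids shortening the TTL of the static ones.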
Event-Driven Cache Updates
For critical device lifecycle events, implement immediate cache invalidation:
- Device commissioning/decommissioning
- Firmware updates
- Configuration changes
- Ownership transfers
Publish invalidation events through the Cumulocity notification API. Each region subscribes and invalidates its local cache. Use versioned cache entries to handle race conditions: apply an invalidation only if its version is newer than the cached version, so delayed or duplicate events are ignored.
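The fan-out from one lifecycle event to every regional cache might look like the sketch below. InvalidationBus is an in-process stand-in for the notification channel, not the Cumulocity API, and the event type names are hypothetical. The version check is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List

# Lifecycle events that trigger immediate invalidation (hypothetical names).
INVALIDATING_EVENTS = {
    "DEVICE_COMMISSIONED",
    "DEVICE_DECOMMISSIONED",
    "FIRMWARE_UPDATED",
    "CONFIG_CHANGED",
    "OWNERSHIP_TRANSFERRED",
}


@dataclass(frozen=True)
class LifecycleEvent:
    event_type: str
    device_id: str


class RegionalCache:
    """Local cache of one region; drops an entry when a critical event arrives."""

    def __init__(self, region: str):
        self.region = region
        self.entries: Dict[str, dict] = {}

    def on_event(self, event: LifecycleEvent) -> None:
        if event.event_type in INVALIDATING_EVENTS:
            self.entries.pop(event.device_id, None)
        # Any other event is ignored; the entry simply expires via its TTL.


class InvalidationBus:
    """In-process stand-in for the notification channel each region subscribes to."""

    def __init__(self) -> None:
        self.subscribers: List[RegionalCache] = []

    def subscribe(self, cache: RegionalCache) -> None:
        self.subscribers.append(cache)

    def publish(self, event: LifecycleEvent) -> None:
        for cache in self.subscribers:
            cache.on_event(event)
```

In a real deployment publish would be asynchronous and delivery could be delayed or repeated, which is why the handler is written to be idempotent: dropping an entry that is already gone is a no-op.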
Multi-Region Synchronization
Challenges with cross-region consistency:
- Network latency: 50-150ms between regions
- Event propagation delay: 2-5 seconds via Cumulocity notifications
- Clock skew: Can cause ordering issues
Solution - Attach a logical version to each cache entry (or a vector clock if several regions can write the same device concurrently):
CacheEntry {
  deviceId: "device_12345",
  data: {...},
  version: 47,
  timestamp: 1685612340123,
  region: "EU"
}
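On cache update, increment the version. On an invalidation event from another region, compare versions and invalidate only if the incoming version is greater than the cached version. This prevents late or out-of-order events from rolling an entry back to older state. Note that a single per-device counter orders changes to one device only; it does not order changes across different devices.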
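The version comparison can be sketched as follows. The class and method names are hypothetical, and the per-device "floor" (the highest version seen in an invalidation) is an addition of this sketch: without it, a stale write arriving after the entry was dropped would be accepted again.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class CacheEntry:
    device_id: str
    data: Dict[str, Any]
    version: int
    timestamp: int  # epoch milliseconds; informational only, ordering uses version
    region: str


class VersionedCache:
    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}
        self._floor: Dict[str, int] = {}  # highest invalidated version per device

    def get(self, device_id: str) -> Optional[CacheEntry]:
        return self._entries.get(device_id)

    def put(self, entry: CacheEntry) -> bool:
        """Store the entry unless the cache already knows about newer state."""
        current = self._entries.get(entry.device_id)
        if current is not None and entry.version <= current.version:
            return False
        if entry.version < self._floor.get(entry.device_id, 0):
            return False  # older than a version that was already invalidated
        self._entries[entry.device_id] = entry
        return True

    def invalidate(self, device_id: str, incoming_version: int) -> bool:
        """Drop the entry only if the event is newer than what is cached."""
        current = self._entries.get(device_id)
        if current is not None and incoming_version <= current.version:
            return False  # stale or duplicate event
        self._floor[device_id] = max(self._floor.get(device_id, 0), incoming_version)
        return self._entries.pop(device_id, None) is not None
```

This assumes versions are assigned by a single authority per device (for example the database write). If two regions could increment the same counter independently, equal version numbers would no longer mean equal state.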
Cache Hit Rate Monitoring
Track metrics per cache layer:
- L1 hit rate: Target 45-50%
- L2 hit rate: Target 35-40%
- Overall hit rate: Target 80-85%
- Stale data rate: Track queries serving data older than TTL
Alert if the overall hit rate drops below 75%; this usually indicates cache sizing or TTL issues.
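As a rough illustration of how these numbers relate, the sketch below derives the rates from per-layer counters and applies the 75% alert threshold. It assumes both L1 and L2 hit rates are expressed as a share of total queries, which is how the targets above add up to the overall figure. The names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

ALERT_THRESHOLD = 0.75  # overall hit rate below this triggers an alert


@dataclass
class LayerCounts:
    l1_hits: int
    l2_hits: int
    db_reads: int  # queries that missed both cache layers

    @property
    def total(self) -> int:
        return self.l1_hits + self.l2_hits + self.db_reads


def hit_rates(counts: LayerCounts) -> Dict[str, float]:
    """Hit rates as fractions of all queries in the measurement window."""
    if counts.total == 0:
        return {"l1": 0.0, "l2": 0.0, "overall": 0.0}
    return {
        "l1": counts.l1_hits / counts.total,
        "l2": counts.l2_hits / counts.total,
        "overall": (counts.l1_hits + counts.l2_hits) / counts.total,
    }


def should_alert(counts: LayerCounts) -> bool:
    """True when there was traffic and the overall hit rate fell below the threshold."""
    return counts.total > 0 and hit_rates(counts)["overall"] < ALERT_THRESHOLD
```

For example, 450 L1 hits, 380 L2 hits and 170 database reads out of 1,000 queries give an overall hit rate of 83%, inside the target band and above the alert threshold.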
Cold Start Optimization
When adding new regions:
- Pre-warm cache from existing region using Redis DUMP/RESTORE
- Identify hot keys (typically the top 20% of accessed devices account for roughly 80% of queries)
- Bulk transfer hot keys to new region’s Redis
- Gradually direct traffic as cache warms: 10% → 25% → 50% → 100%
- Monitor hit rate during ramp-up, pause if drops below 70%
This reduces the cold-start period from about 15 minutes to 2-3 minutes.
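Hot-key selection and the traffic ramp can be sketched without any Redis dependency. The version below picks the smallest set of devices covering 80% of observed queries and advances the traffic share only while the hit rate stays at or above 70%. The function names and the shape of the access counts are assumptions; the actual key transfer (DUMP/RESTORE) is not shown.

```python
from collections import Counter
from typing import List

RAMP_STEPS = [10, 25, 50, 100]  # percent of traffic sent to the new region
PAUSE_BELOW = 0.70              # hold the ramp while the hit rate is under this


def hot_keys(access_counts: Counter, coverage: float = 0.80) -> List[str]:
    """Smallest set of most-accessed device IDs that covers `coverage` of all queries."""
    total = sum(access_counts.values())
    selected: List[str] = []
    covered = 0
    for device_id, count in access_counts.most_common():
        if total == 0 or covered / total >= coverage:
            break
        selected.append(device_id)
        covered += count
    return selected


def next_traffic_share(current_percent: int, hit_rate: float) -> int:
    """Advance to the next ramp step, or hold if the cache is not warm enough."""
    if hit_rate < PAUSE_BELOW:
        return current_percent
    for step in RAMP_STEPS:
        if step > current_percent:
            return step
    return 100
```

The access counts would come from whatever query log or metrics store is available in the existing region; the selected keys are then the ones worth transferring before the first 10% of traffic is routed.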
Consistency Trade-offs
Different data types require different consistency guarantees:
Strong consistency (bypass cache):
- Device commissioning operations
- Security credential updates
- Billing-related property changes
Eventual consistency (serve from cache):
- Device telemetry metadata
- Descriptive properties (name, location)
- Aggregate statistics
Implement a cache bypass header for operations requiring strong consistency, for example a custom request header X-Cache-Control: no-cache that your caching layer checks before serving from L1 or L2.
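A minimal sketch of the bypass decision, assuming the caching layer sees the request headers and an operation label. The operation names are hypothetical labels for the strong-consistency categories listed above, and the header is matched case-insensitively because HTTP header names are case-insensitive.

```python
from typing import Mapping, Optional

# Operations that must always read from the source of truth (hypothetical labels).
STRONG_CONSISTENCY_OPERATIONS = {
    "device_commissioning",
    "security_credential_update",
    "billing_property_change",
}


def should_bypass_cache(headers: Mapping[str, str], operation: Optional[str] = None) -> bool:
    """True if the caller asked for fresh data or the operation demands it."""
    for name, value in headers.items():
        if name.lower() == "x-cache-control" and "no-cache" in value.lower():
            return True
    return operation in STRONG_CONSISTENCY_OPERATIONS
```

Checking the operation as well as the header means a client that forgets the header still cannot read stale credentials or billing data from the cache.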
Recommended Configuration
For 200K device deployment:
- Redis Cluster: 6 nodes per region (3 primary, 3 replica)
- Memory: 32GB per node (16GB for cache, 16GB overhead)
- L1 cache: 512MB per application instance
- Network: Dedicated Redis VPC for low latency
- Monitoring: Track cache hit rate, eviction rate, memory usage
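As a quick sanity check on these figures, the per-device memory budget implied by the configuration can be worked out directly. This assumes replicas hold copies rather than extra capacity and treats GB/MB as binary units; the results are budgets derived from the numbers above, not measured entry sizes.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def budget_per_device(total_bytes: int, devices: int) -> int:
    """Bytes of cache available per device if capacity is spread evenly."""
    return total_bytes // devices


# L2: 3 primaries x 16 GiB of cache space, shared by 200K devices.
l2_budget = budget_per_device(3 * 16 * GIB, 200_000)

# L1: 512 MiB per application instance, holding the 10K hottest devices.
l1_budget = budget_per_device(512 * MIB, 10_000)

print(f"L2 budget per device: {l2_budget / 1024:.0f} KiB")
print(f"L1 budget per device: {l1_budget / 1024:.0f} KiB")
```

That works out to roughly 250 KiB per device in Redis and about 52 KiB per device in L1, so the configuration leaves considerable headroom unless device documents are unusually large.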
This architecture delivers <100ms query response for 85% of requests while maintaining acceptable consistency across regions. The tiered approach and selective event-driven invalidation balance performance with data freshness requirements.