Data stream API rate limit throttles real-time event delivery with 429 Too Many Requests

Our real-time event delivery system is experiencing severe throttling due to API rate limits on the data-stream API. Devices are sending telemetry every 5 seconds, but we’re getting 429 Too Many Requests errors, causing event loss and delayed processing.

We have 150 devices reporting temperature, pressure, and vibration measurements. Each device sends 3 events per interval (one per measurement type), resulting in 450 events every 5 seconds (5400 events/minute). The API starts returning 429 after about 2-3 minutes of operation:

{"error": "Too Many Requests", "message": "Rate limit exceeded", "status": 429}

We’re using individual POST requests to /event/events for each measurement. Is there an API rate limit per tenant or per device? What’s the recommended approach for high-frequency telemetry - should we implement event batching, or adjust device reporting intervals? The current setup is causing data gaps in our real-time monitoring dashboard.

Comprehensive solution addressing all three focus areas:

API Rate Limits: Cumulocity enforces tenant-level rate limits:

  • Standard tier: ~100 requests/second
  • Enterprise tier: ~200 requests/second (configurable)
  • Limits are cumulative across ALL API operations (REST, MQTT processing, etc.)
  • 429 responses include Retry-After header indicating when to retry

Your current load:

  • 450 events / 5 seconds = 90 requests/second
  • About 90% of the ~100 req/sec standard-tier limit, leaving very little margin for other operations
  • Bursts or concurrent operations easily exceed limits
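The arithmetic above can be checked with a short sketch. The function names are illustrative, and the limit of 100 req/sec is the approximate standard-tier figure quoted in this answer, not a value read from the platform.

```python
def request_rate(devices: int, events_per_device: int, interval_s: float) -> float:
    """Requests per second when every event is sent as its own request."""
    return devices * events_per_device / interval_s


def headroom(rate: float, limit: float) -> float:
    """Fraction of the limit still unused (negative means over the limit)."""
    return (limit - rate) / limit


rate = request_rate(devices=150, events_per_device=3, interval_s=5)
print(rate)                        # 90.0
print(headroom(rate, limit=100))   # 0.1
```

A headroom of 0.1 means only about 10 req/sec remain for dashboards, queries, and any other API traffic on the tenant.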

Rate limits apply to:

  • REST API calls (POST, GET, PUT, DELETE)
  • MQTT messages processed by platform (each publish = 1 API operation)
  • Bulk operations (count as 1 request regardless of batch size)

Event Batching Strategy: Implement batching at multiple levels:

1. Device-Level Batching (MQTT SmartREST):


// SmartREST template for multi-measurement
211,temperature,25.5,pressure,101.3,vibration,0.05

Single MQTT publish sends 3 measurements = 1 API operation (vs 3 separate operations)
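As a rough sketch of how a device might assemble that single line, the helper below joins a template ID and name/value pairs into one comma-separated message. The function name is hypothetical, and the template ID and field order simply reproduce the example above; they are not verified against an actual template definition, so check them against the templates registered for your tenant.

```python
def smartrest_line(template_id: str, readings: dict) -> str:
    """Join a template ID and name/value pairs into one comma-separated line."""
    fields = [template_id]
    for name, value in readings.items():
        fields.append(name)
        fields.append(str(value))
    return ",".join(fields)


line = smartrest_line(
    "211", {"temperature": 25.5, "pressure": 101.3, "vibration": 0.05}
)
print(line)  # 211,temperature,25.5,pressure,101.3,vibration,0.05
```

The resulting string would be the payload of one MQTT publish per device per interval, instead of three.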

2. Gateway-Level Batching: If using gateway architecture:

  • Gateway collects events from multiple devices
  • Batches into groups of 100-500 events
  • Posts via bulk events API every 5-10 seconds

POST /event/events/bulk
[
  {"type": "c8y_Temperature", "source": {"id": "device1"}, "text": "25.5°C"},
  {"type": "c8y_Pressure", "source": {"id": "device1"}, "text": "101.3 kPa"},
  ... (up to 1000 events)
]
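If a gateway ever collects more events than fit in one bulk request, they need to be split. The sketch below is one way to do that; `chunk_events` is a hypothetical helper, and the default of 1000 is the per-request maximum quoted in this thread.

```python
from typing import Iterator, List


def chunk_events(events: List[dict], max_batch: int = 1000) -> Iterator[List[dict]]:
    """Yield consecutive slices of `events`, each at most `max_batch` long."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    for start in range(0, len(events), max_batch):
        yield events[start:start + max_batch]


# 2500 buffered events split into three request bodies
pending = [{"type": "c8y_Temperature", "source": {"id": "device1"}, "text": "25.5°C"}] * 2500
print([len(batch) for batch in chunk_events(pending)])  # [1000, 1000, 500]
```

With 450 events per interval, everything fits in a single batch, so the chunking only matters after an outage or when more devices are added.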

3. Measurement API Alternative: For telemetry, use measurements API instead of events:


POST /measurement/measurements/bulk
[
  {"source": {"id": "device1"}, "type": "c8y_MultiSensor",
   "c8y_Temperature": {"T": {"value": 25.5, "unit": "C"}},
   "c8y_Pressure": {"P": {"value": 101.3, "unit": "kPa"}}}
]

Measurements are optimized for time-series data and have better performance characteristics.
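A minimal sketch of building one such combined measurement body in Python follows. The helper name is hypothetical, the structure mirrors the example above, and the optional `time` argument is an assumption on my part: the example omits a timestamp, but the platform may expect one per measurement.

```python
import json
from typing import Dict, Optional, Tuple


def multi_sensor_measurement(
    device_id: str,
    readings: Dict[str, Tuple[str, float, str]],
    time: Optional[str] = None,
) -> dict:
    """Build one measurement body; readings maps fragment -> (series, value, unit)."""
    body = {"source": {"id": device_id}, "type": "c8y_MultiSensor"}
    if time is not None:
        body["time"] = time  # assumption: ISO 8601 timestamp string
    for fragment, (series, value, unit) in readings.items():
        body[fragment] = {series: {"value": value, "unit": unit}}
    return body


body = multi_sensor_measurement(
    "device1",
    {
        "c8y_Temperature": ("T", 25.5, "C"),
        "c8y_Pressure": ("P", 101.3, "kPa"),
    },
)
print(json.dumps(body, indent=2))
```

One body per device per interval replaces three separate events, and several bodies can then be collected into one bulk request.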

Device Reporting Interval Optimization: Optimize intervals based on measurement criticality:

Tiered Reporting Strategy:


1. Critical measurements (temperature): 5s interval
2. Important measurements (pressure): 10s interval
3. Monitoring measurements (vibration): 30s interval
4. Status/diagnostic data: 5min interval

This reduces from 450 events/5s to:

  • Temperature: 150 events/5s (30 req/sec)
  • Pressure: 150 events/10s (15 req/sec)
  • Vibration: 150 events/30s (5 req/sec)
  • Total: 50 req/sec (44% reduction)
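The totals above can be reproduced with a few lines of Python. The function name is illustrative, and the baseline of 90 req/sec is the unbatched rate computed earlier.

```python
def tiered_rate(devices: int, intervals_s: dict) -> float:
    """Average requests per second when each measurement type has its own interval."""
    return sum(devices / interval for interval in intervals_s.values())


tiers = {"temperature": 5, "pressure": 10, "vibration": 30}
total = tiered_rate(150, tiers)
baseline = 90.0
reduction = (baseline - total) / baseline

print(total)               # 50.0
print(f"{reduction:.0%}")  # 44%
```

Changing a single interval in `tiers` shows quickly how much each measurement type contributes to the overall rate.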

Dynamic Interval Adjustment:

# Adaptive reporting sketch (Python); how the two flags are derived is device-specific
def reporting_interval_s(in_alert: bool, near_threshold: bool) -> int:
    if in_alert:
        return 5    # Real-time reporting
    if near_threshold:
        return 10   # Increased monitoring
    return 30       # Slow reporting in the normal range

Edge Aggregation Pattern: Implement edge aggregator for fleet management:


1. Devices → Edge Gateway (local network, high frequency)
2. Edge Gateway batches events (every 5-10s)
3. Gateway → Cloud (bulk API, low request count)
4. Cloud processes batch as single transaction

Benefits:

  • Reduces cloud API calls by 100-500x
  • Maintains local real-time monitoring
  • Resilient to network interruptions (local buffering)
  • Stays well under rate limits
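The pattern above could be sketched as a small buffer on the gateway that flushes when it is full or when its oldest event is too old. Everything here is illustrative: `EdgeBatcher`, its parameters, and the `send` callback are hypothetical names, and `send` stands in for whatever actually posts a batch to the cloud.

```python
import time
from typing import Callable, List, Optional


class EdgeBatcher:
    """Buffers events and hands them to `send` in batches."""

    def __init__(
        self,
        send: Callable[[List[dict]], None],
        max_batch: int = 500,
        max_age_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._max_batch = max_batch
        self._max_age_s = max_age_s
        self._clock = clock
        self._buffer: List[dict] = []
        self._oldest: Optional[float] = None  # arrival time of the first buffered event

    def add(self, event: dict) -> None:
        """Buffer one event; flush if the batch is full or too old."""
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(event)
        if len(self._buffer) >= self._max_batch or self._expired():
            self.flush()

    def _expired(self) -> bool:
        return (
            self._oldest is not None
            and self._clock() - self._oldest >= self._max_age_s
        )

    def flush(self) -> None:
        """Send whatever is buffered as one batch; no-op when empty."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._oldest = None
        self._send(batch)


sent: List[List[dict]] = []
batcher = EdgeBatcher(send=sent.append, max_batch=100)
for i in range(450):
    batcher.add({"type": "c8y_Temperature", "source": {"id": f"device{i % 150}"}})
batcher.flush()
print([len(batch) for batch in sent])  # [100, 100, 100, 100, 50]
```

In this sketch the age check only runs when a new event arrives, so a real gateway would also call `flush()` from a timer, and would re-queue the batch if `send` fails rather than dropping it.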

Implementation Recommendations:

Immediate Fix (reduce request rate by over 99%):

  • Switch from individual POST to bulk events API
  • Batch 450 events into single POST every 5s
  • Request rate: 90 req/sec → 0.2 req/sec

Short-term Optimization (reduce by 67% before batching):

  • Implement SmartREST templates for MQTT devices
  • Combine 3 measurements per device into 1 message
  • Request rate: 90 req/sec → 30 req/sec
  • Then apply bulk API: 30 req/sec → 0.1-0.2 req/sec (one bulk POST every 5-10s)

Long-term Architecture:

  • Deploy edge gateway/aggregator
  • Implement tiered reporting intervals
  • Use measurements API for telemetry (not events API)
  • Reserve events API for state changes and alerts
  • Monitor rate limit headers in responses

Rate Limit Handling Code:

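# Sketch for handling 429 responses (Python with the requests library).
# base_url and the authenticated session are assumptions supplied by the caller.
import logging
import time

import requests

def send_events_with_retry(session: requests.Session, base_url: str,
                           events_batch: list, max_retries: int = 3) -> bool:
    for attempt in range(max_retries):
        response = session.post(f"{base_url}/event/events/bulk", json=events_batch)
        if response.status_code == 429:
            # Retry-After arrives as a string; fall back to 60s if missing or non-numeric
            try:
                retry_after = int(response.headers.get("Retry-After", 60))
            except ValueError:
                retry_after = 60
            time.sleep(retry_after)
            continue
        if response.ok:  # status code below 400
            return True
        logging.error("Bulk POST failed: %s %s", response.status_code, response.text)
        break
    return False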

Your 429 errors are caused by excessive individual API calls (90 req/sec) approaching tenant rate limits. Implement event batching using bulk APIs to reduce request rate by 99% while maintaining data throughput. Combine with optimized device reporting intervals for maximum efficiency.

The 429 error indicates you’re hitting tenant-level rate limits, not device-level. Cumulocity has default limits around 100-200 requests per second per tenant, depending on your plan. With 450 events every 5 seconds (90 requests/second), you’re close to the threshold. Add in other API activity and you’ll exceed limits quickly.

Event batching is the answer. Instead of 450 individual POST requests every 5 seconds, batch them into a single request using the bulk events API. You can send up to 1000 events in one POST to /event/events/bulk. This reduces your request rate from 90 req/sec to less than 1 req/sec while maintaining the same data throughput. Much more efficient and stays well under rate limits.

Good points on both batching and interval optimization. For batching, do devices send batches directly, or should we implement an edge aggregator that collects events from multiple devices and batches them? We’re using MQTT for device communication, so I’m not sure how bulk events API integrates with MQTT publish.

MQTT has built-in batching support via SmartREST templates. You can define templates that send multiple measurements in a single MQTT message. For example, template 211 sends multiple measurements: “211,temperature,25.5,pressure,101.3,vibration,0.05”. This gets processed as a single API call on the platform side. Much more efficient than individual MQTT publishes per measurement.

Batching is definitely the right approach, but also consider adjusting device reporting intervals. Do you really need 5-second granularity for all three measurement types? You could stagger reporting - temperature every 5s, pressure every 10s, vibration every 15s - to spread the load. Or use different intervals for normal vs alert conditions. This reduces total event volume while maintaining responsiveness for critical measurements.
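To see what that staggering looks like on a device, here is a small sketch. The function names are illustrative, and the intervals are the ones suggested in this reply (5s, 10s, 15s).

```python
def due_measurements(tick_s: int, intervals_s: dict) -> list:
    """Measurement types whose interval divides the current tick (in seconds)."""
    return [name for name, interval in intervals_s.items() if tick_s % interval == 0]


def average_rate(devices: int, intervals_s: dict) -> float:
    """Average messages per second across the fleet for the given intervals."""
    return sum(devices / interval for interval in intervals_s.values())


intervals = {"temperature": 5, "pressure": 10, "vibration": 15}
for tick in range(0, 35, 5):
    print(tick, due_measurements(tick, intervals))

print(average_rate(150, intervals))  # 55.0
```

All three types still coincide at ticks 0 and 30, so the average drops to 55 messages/sec but the peaks remain. Giving each device a different start offset would spread those peaks as well.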