Data storage bottleneck: High ingest latency in SAP IoT with HANA backend during peak sensor data loads

We’re facing significant ingestion delays with our SAP IoT implementation on HANA. Processing 50k+ sensor readings per minute results in 8-12 second latency, which breaks our real-time analytics requirements.

Our current setup uses single-threaded ingestion via REST API with default HANA table configurations. I’ve noticed CPU spikes to 85% during peak loads and disk I/O bottlenecks. We haven’t implemented any partitioning strategy or batch processing yet.


POST /iot/core/api/v1/tenant/{tenantId}/measures
Payload: Single sensor reading per call
Response time: 180-250ms per request

The monitoring dashboard shows memory usage is stable at 60%, but historical data from 6+ months ago seems to impact query performance. Has anyone optimized HANA for high-volume IoT ingestion with parallel pipelines and proper indexing strategies?

Here’s a comprehensive solution addressing all the optimization areas:

1. HANA Partitioning and Indexing: Implement range partitioning on your measurement table by timestamp with daily granularity. This isolates recent data and enables efficient partition pruning:

ALTER TABLE IOT_MEASUREMENTS
PARTITION BY RANGE (MEASUREMENT_TIMESTAMP)
(PARTITION '2025-03-23' <= VALUES < '2025-03-24',
 PARTITION OTHERS);

Note that HANA range partitions use lower <= VALUES < upper bounds rather than the Oracle-style VALUES LESS THAN clause. Add a new daily range with ALTER TABLE ... ADD PARTITION, or let dynamic range partitioning create them automatically.

Create a composite index on (DEVICE_ID, MEASUREMENT_TIMESTAMP) and ensure column store delta merge is scheduled appropriately.

2. Batch Ingestion API Optimization: Switch from single-record POST to bulk ingestion endpoint. Optimal batch size is 800-1200 records based on your volume:


POST /iot/core/api/v1/tenant/{tenantId}/measures/bulk
Payload: Array of 1000 sensor readings
Expected response: <500ms for entire batch

Configure 10-12 parallel ingestion threads to handle 50k records/minute efficiently. This distributes load across HANA cores and prevents single-thread bottlenecks.
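As a sketch of points 2 and 5 together, the client side could batch and parallelize like this (the endpoint path and batch/thread counts come from this thread; the helper names, URL host, and auth handling are illustrative, not the official SDK):

```python
# Batched, parallel bulk ingestion instead of one POST per reading.
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical host; the path mirrors the bulk endpoint discussed above.
BULK_URL = "https://iot.example.com/iot/core/api/v1/tenant/T1/measures/bulk"
BATCH_SIZE = 1000   # within the suggested 800-1200 range
MAX_WORKERS = 12    # parallel ingestion threads

def chunk(readings, size=BATCH_SIZE):
    """Split the reading stream into fixed-size batches for the bulk endpoint."""
    return [readings[i:i + size] for i in range(0, len(readings), size)]

def post_batch(batch):
    """POST one batch of readings as a JSON array; returns the HTTP status."""
    req = urllib.request.Request(
        BULK_URL,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status

def ingest(readings):
    """Fan batches out across a thread pool instead of serial single calls."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(post_batch, chunk(readings)))
```

At 50k readings/minute this turns ~50,000 requests into ~50 bulk calls spread over 12 workers.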

3. Resource Monitoring Implementation: Set up comprehensive monitoring with thresholds:

  • CPU: Alert at 75%, critical at 85%
  • Memory: Alert at 80%, critical at 90%
  • Disk I/O: Monitor wait times, alert if >50ms average
  • Ingestion queue depth: Alert if backlog exceeds 5000 records

Use SAP HANA Cockpit or Cloud ALM for real-time dashboards and automated alerting.
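If you also feed metrics into your own tooling, the thresholds above can be encoded in a small check (the metric keys and function name are illustrative; the numbers are the ones from the list):

```python
# Classify a metric sample against the alert/critical thresholds above.
# None means the metric has an alert level but no separate critical level.
THRESHOLDS = {
    "cpu_pct":      (75, 85),
    "memory_pct":   (80, 90),
    "disk_wait_ms": (50, None),
    "queue_depth":  (5000, None),
}

def classify(metric, value):
    """Return 'ok', 'alert', or 'critical' for one metric sample."""
    alert, critical = THRESHOLDS[metric]
    if critical is not None and value >= critical:
        return "critical"
    if value >= alert:
        return "alert"
    return "ok"
```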

4. Cold Data Offloading Strategy: Implement automatic offloading of data older than 90 days to SAP HANA Native Storage Extension (NSE) or nearline storage. Configure lifecycle management:


ALTER TABLE IOT_MEASUREMENTS
ALTER PARTITION 2 PAGE LOADABLE;

Partition IDs can be looked up in the M_TABLE_PARTITIONS system view. Marking a partition PAGE LOADABLE places it under NSE's buffer cache instead of keeping it fully in memory.

This moves cold partitions to extended storage while keeping them queryable, freeing up main memory for hot data.

5. Parallel Pipeline Configuration: Optimize your IoT service configuration:

  • Increase max concurrent connections to 12
  • Set connection pool size to 15 (allows headroom)
  • Configure batch timeout to 30 seconds
  • Enable connection keep-alive to reduce overhead
  • Implement exponential backoff retry logic for failed batches
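The retry point in the list above could be sketched like this (a generic pattern, not SAP-specific; `send_batch` stands in for whatever callable performs the bulk POST):

```python
import random
import time

def send_with_retry(send_batch, batch, max_retries=5, base_delay=0.5):
    """Retry a failed batch with exponential backoff plus jitter.

    send_batch should raise on failure. Delays grow roughly
    0.5s, 1s, 2s, 4s, 8s before the final error is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return send_batch(batch)
        except Exception:
            if attempt == max_retries:
                raise
            delay = base_delay * (2 ** attempt)
            time.sleep(delay + random.uniform(0, delay / 2))  # jitter
```

Jitter matters here: without it, all 10-12 parallel threads retry in lockstep and hammer HANA at the same instant.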

Additional Optimizations:

  • Enable HANA compression for historical partitions (2-3x space savings)
  • Schedule delta merge during low-traffic windows
  • Use prepared statements for insert operations
  • Implement client-side buffering to smooth out traffic bursts
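The client-side buffering bullet can be sketched as a small accumulator that flushes on size or age (the limits and class name are illustrative starting points; `flush_fn` is whatever ships a batch, e.g. a bulk POST):

```python
import time

class ReadingBuffer:
    """Buffer readings client-side and flush in batches to smooth bursts."""

    def __init__(self, flush_fn, max_size=1000, max_age_s=1.0):
        self.flush_fn = flush_fn
        self.max_size = max_size      # flush when this many readings queue up
        self.max_age_s = max_age_s    # ...or when the oldest reading is this old
        self._buf = []
        self._first_ts = None

    def add(self, reading):
        if self._first_ts is None:
            self._first_ts = time.monotonic()
        self._buf.append(reading)
        if (len(self._buf) >= self.max_size
                or time.monotonic() - self._first_ts >= self.max_age_s):
            self.flush()

    def flush(self):
        """Ship the current batch and reset; safe to call on an empty buffer."""
        if self._buf:
            self.flush_fn(self._buf)
            self._buf = []
            self._first_ts = None
```

A burst of sensor readings then leaves the client as a few evenly sized batches instead of a spike of individual calls.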

Expected Results: With these changes, you should achieve:

  • Ingestion latency: <2 seconds for 99th percentile
  • CPU utilization: 45-65% during peak loads
  • Memory usage: Stable at 55-70%
  • Query performance: 40-60% improvement on recent data

Monitor for 72 hours after implementation and adjust parallel thread count if needed. The key is balancing parallelism with HANA resource capacity.

Single-threaded REST calls are definitely your bottleneck here. You’re making 50k individual API calls per minute when you should be batching. Switch to bulk ingestion endpoints that accept arrays of measurements. Also, your HANA table likely needs column store optimization and time-based partitioning. Check your table type first - row store will kill performance at this scale.

Thanks for the quick response. I confirmed we’re using column store tables, but no partitioning is configured. What’s the recommended partition strategy for time-series IoT data? Should I partition by day, week, or month? Also, how large should the batch sizes be for the bulk API - I’ve seen recommendations ranging from 100 to 5000 records per batch.

I’ve started implementing daily partitioning and switched to batch ingestion with 750 records per batch. Initial testing shows latency dropped to 3-4 seconds, which is better but still not meeting our sub-2-second target. The CPU usage is now more distributed, but I’m still seeing occasional spikes. Should I increase the number of parallel ingestion threads? Currently running with 4 concurrent connections.

For IoT time-series data, daily or weekly partitioning works best depending on your retention policy. If you’re keeping 6+ months of hot data, definitely implement cold data offloading to nearline storage after 60-90 days. This will dramatically improve query performance on recent data.

Regarding batch sizes, start with 500-1000 records per batch and monitor HANA memory consumption; batches that are too large can cause memory pressure. Also verify your parallel pipeline configuration - you should have multiple ingestion threads processing batches concurrently. Check the IoT service configuration for max concurrent connections and adjust based on your HANA resource capacity.

Don’t forget indexing strategy. Create composite indexes on device_id + timestamp columns for your most common query patterns. Also, check if you have any unnecessary indexes that slow down inserts. One often-overlooked aspect is the HANA statistics - make sure auto-statistics are enabled and running regularly. Stale statistics can cause the optimizer to choose terrible execution plans.