Here’s a complete solution covering all three critical areas:
API Gateway Payload Limits:
The Oracle IoT Cloud Platform API gateway enforces a 2MB payload limit per request by default, with a maximum configurable limit of 10MB. However, increasing the limit isn’t recommended for bulk ingestion scenarios due to timeout and memory constraints. Instead, work within the 2MB limit using proper chunking strategies.
To optimize within this constraint, enable gzip compression on your HTTP requests. Add the header Content-Encoding: gzip and compress your JSON payload before transmission. Example implementation:
import gzip
import json

# Serialize the event batch, then gzip it before transmission
payload = json.dumps(events).encode('utf-8')
compressed = gzip.compress(payload)

# Tell the gateway the body is gzip-encoded
headers = {'Content-Encoding': 'gzip'}
Compression typically achieves 3-5x reduction for JSON event data, allowing you to send 6000-10000 events in a single 2MB request. Monitor the actual compressed payload size and adjust batch sizes dynamically based on compression ratios observed in your data.
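Here's a minimal sketch of that feedback loop; the compressed-size target and the starting slice are illustrative values of my own, separate from the uncompressed 1.5MB chunk threshold described in the next section:

import gzip
import json

TARGET_COMPRESSED = int(1.5 * 1024 * 1024)  # leave headroom under the 2MB cap

def next_batch_size(events_in_batch, compressed_bytes):
    # Estimate how many events fit per request from the observed ratio
    per_event = compressed_bytes / max(events_in_batch, 1)
    return max(1, int(TARGET_COMPRESSED / per_event))

# After each upload, feed the measured compressed size back in:
batch = events[:5000]
compressed = gzip.compress(json.dumps(batch).encode('utf-8'))
batch_size = next_batch_size(len(batch), len(compressed))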
Chunked Upload Strategy:
Implement intelligent chunking based on payload size, not event count. Calculate chunk boundaries dynamically: serialize events to JSON incrementally, tracking cumulative size, and create a new chunk when size approaches 1.5MB (leaving 500KB headroom for HTTP overhead and compression variability).
Here’s the chunking algorithm:
chunks = []
current_chunk = []
current_size = 0
for event in events:
    event_json = json.dumps(event)
    event_size = len(event_json.encode('utf-8'))
    # Start a new chunk once the 1.5MB soft limit would be exceeded
    if current_size + event_size > 1.5 * 1024 * 1024:
        chunks.append(current_chunk)
        current_chunk = [event]
        current_size = event_size
    else:
        current_chunk.append(event)
        current_size += event_size
# Don't lose the final, partially filled chunk
if current_chunk:
    chunks.append(current_chunk)
Use the bulk ingestion transaction API for atomicity. Before uploading chunks, create a transaction: POST /iot/api/v2/apps/events/bulk/transaction which returns a transaction ID. Upload each chunk with the transaction ID in the header: X-Transaction-ID: {txnId}. After all chunks upload successfully, commit the transaction: POST /iot/api/v2/apps/events/bulk/transaction/{txnId}/commit. If any chunk fails, rollback: POST /iot/api/v2/apps/events/bulk/transaction/{txnId}/rollback.
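A sketch of that lifecycle with the requests library; BASE_URL, the session's auth setup, the per-chunk endpoint, and the 'id' response field are assumptions about your instance, not confirmed API details:

import requests

BASE_URL = 'https://your-instance.example.com'  # placeholder for your instance
session = requests.Session()  # assumed pre-configured with your auth

# 1. Open the transaction and capture its ID
resp = session.post(f'{BASE_URL}/iot/api/v2/apps/events/bulk/transaction')
resp.raise_for_status()
txn_id = resp.json()['id']  # assumed response field name

try:
    # 2. Upload every chunk under the same transaction ID
    for chunk in chunks:
        r = session.post(f'{BASE_URL}/iot/api/v2/apps/events/bulk',  # assumed endpoint
                         json=chunk,
                         headers={'X-Transaction-ID': txn_id})
        r.raise_for_status()
    # 3. All chunks succeeded: commit so the batch becomes visible atomically
    session.post(f'{BASE_URL}/iot/api/v2/apps/events/bulk/transaction/{txn_id}/commit')
except requests.RequestException:
    # 4. Any failure: roll the whole transaction back
    session.post(f'{BASE_URL}/iot/api/v2/apps/events/bulk/transaction/{txn_id}/rollback')
    raise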
Implement parallel chunk uploads with concurrency limits. Upload 3-5 chunks concurrently to improve throughput while avoiding API rate limits. Use a semaphore or worker pool pattern to control concurrency. Monitor response times and adjust concurrency based on observed latency.
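For example, with a thread pool (upload_one is a stand-in for the single-chunk POST shown above):

from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_CONCURRENCY = 5  # 3-5 concurrent uploads, per the guidance above

with ThreadPoolExecutor(max_workers=MAX_CONCURRENCY) as pool:
    futures = {pool.submit(upload_one, chunk): i for i, chunk in enumerate(chunks)}
    for future in as_completed(futures):
        idx = futures[future]
        future.result()  # re-raises if chunk idx failed, so you can roll back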
Bulk Data Ingestion Best Practices:
Implement comprehensive retry logic with exponential backoff. For transient failures (503, 429, 502), retry with delays: 1s, 2s, 4s, 8s, up to 60s max. For 413 errors, reduce chunk size by 25% and retry. For other 4xx errors (except 429), don’t retry as they indicate client errors.
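A sketch of that policy as a hypothetical helper; note that on 413 the caller must re-queue the events trimmed off the chunk:

import time

def post_with_retry(session, url, chunk, headers, max_attempts=6):
    delay = 1
    for attempt in range(max_attempts):
        resp = session.post(url, json=chunk, headers=headers)
        if resp.ok:
            return resp
        if resp.status_code == 413:
            # Too large: shrink by 25% and retry; re-queue the trimmed tail
            chunk = chunk[:max(1, int(len(chunk) * 0.75))]
        elif resp.status_code in {429, 502, 503}:
            time.sleep(delay)            # 1s, 2s, 4s, 8s, ...
            delay = min(delay * 2, 60)   # capped at 60s
        else:
            resp.raise_for_status()      # other 4xx: client error, don't retry
    raise RuntimeError('chunk upload failed after %d attempts' % max_attempts)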
Add rate limit awareness: parse the X-RateLimit-Remaining and X-RateLimit-Reset response headers. If the remaining quota is lower than the number of chunks still to upload, pace the requests rather than burning the quota: sleep roughly (seconds until reset) / (remaining quota) between uploads, which spreads the remaining budget evenly across the rate limit window.
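For instance (header semantics vary by gateway; an epoch-seconds reset value is an assumption here):

import time

def pace_uploads(resp, chunks_remaining):
    remaining = int(resp.headers.get('X-RateLimit-Remaining', 1))
    reset_at = int(resp.headers.get('X-RateLimit-Reset', 0))  # assumed epoch seconds
    window = max(reset_at - time.time(), 0)
    if 0 < remaining <= chunks_remaining:
        # Spread the remaining quota evenly across the rest of the window
        time.sleep(window / remaining)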
Implement idempotency for safe retries. Include a unique event ID in each event and set an X-Idempotency-Key header on each chunk. The API gateway deduplicates requests with the same idempotency key within a 24-hour window, preventing duplicate ingestion on retry.
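For example, generate one key per chunk up front and reuse the same key on every retry of that chunk:

import uuid

idempotency_keys = [str(uuid.uuid4()) for _ in chunks]

def chunk_headers(txn_id, chunk_index):
    return {
        'X-Transaction-ID': txn_id,
        # Reusing the same key on retry lets the gateway deduplicate
        # the chunk within its 24-hour window
        'X-Idempotency-Key': idempotency_keys[chunk_index],
    }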
Add comprehensive logging and monitoring. Log chunk upload attempts, sizes, durations, and outcomes. Track metrics: total events uploaded, chunks sent, retry counts, transaction commit/rollback rates. Set up alerts for high failure rates or transaction timeouts.
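A minimal sketch of the per-chunk logging and counters (the field names are illustrative):

import logging
import time

log = logging.getLogger('bulk_ingest')
metrics = {'events': 0, 'chunks': 0, 'retries': 0, 'rollbacks': 0}

def log_chunk(idx, chunk, size_bytes, started, outcome):
    metrics['chunks'] += 1
    metrics['events'] += len(chunk)
    log.info('chunk=%d events=%d bytes=%d duration=%.2fs outcome=%s',
             idx, len(chunk), size_bytes, time.time() - started, outcome)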
For your specific scenario with 10k-15k events per batch: expect 8-12 chunks of ~1.5MB of uncompressed JSON each (roughly 300-500KB on the wire after gzip, given the 3-5x ratio above). With parallel uploads (5 concurrent), total upload time should be 30-60 seconds. Implement this in your nightly batch job and you'll eliminate the data loss while staying within API limits.