Your issue is multifaceted: it is not one setting but the interaction between Streaming, the connector configuration, and Autonomous Database capacity. Here are the key integration points:
OCI Streaming to Autonomous Database Integration:
The connector architecture needs proper error isolation. Implement a dead-letter queue pattern where malformed messages are routed separately rather than blocking the entire batch. This prevents one bad sensor reading from stalling your pipeline.
Batch Size and Commit Interval Tuning:
Your original settings were creating back-pressure. A good starting configuration for edge IoT workloads (interval and timeout values in milliseconds):
streaming.batch.size=200
streaming.commit.interval=45000
streaming.max.poll.records=500
streaming.session.timeout=90000
The key is balancing batch size against commit frequency. Smaller batches (200 records) with a moderate commit interval (45 s) limit how much work has to be replayed after a failure, which gives better fault tolerance. The session timeout (90 s) must exceed the commit interval; otherwise the consumer group can rebalance during heavy processing.
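To catch a bad combination before deployment, the relationship between session timeout and commit interval can be checked mechanically. This is a minimal sketch in Python that assumes the settings live in a plain key=value file; `parse_properties` and `check_settings` are hypothetical helper names, not part of any OCI tooling.

```python
CONFIG = """
streaming.batch.size=200
streaming.commit.interval=45000
streaming.max.poll.records=500
streaming.session.timeout=90000
"""


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines into a dict of integer settings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = int(value.strip())
    return settings


def check_settings(settings: dict) -> list:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    timeout = settings["streaming.session.timeout"]
    interval = settings["streaming.commit.interval"]
    if timeout <= interval:
        problems.append(
            f"session timeout ({timeout} ms) must exceed commit interval ({interval} ms)"
        )
    return problems


if __name__ == "__main__":
    print(check_settings(parse_properties(CONFIG)) or "settings look consistent")
```

Running a check like this in CI or at connector startup makes a misconfigured timeout fail fast instead of surfacing later as unexplained rebalances.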
Connector Serialization and Retry Logic:
Implement exponential backoff for retries and add explicit JSON validation:
// Pseudocode - Enhanced error handling:
1. Validate JSON schema before deserialization
2. Catch SerializationException and log to DLQ
3. Implement exponential backoff: 1s, 2s, 4s, 8s
4. After 4 retries, route to error topic
5. Continue processing next batch without blocking
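The steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the connector's actual API: the `write`, `dead_letter`, and `error_topic` callables are hypothetical hooks you would wire to your database writer and streams, the required field names are made up, and a failed write is assumed to raise `ConnectionError`.

```python
import json
import time

BACKOFF_SECONDS = (1, 2, 4, 8)  # delays before retries 1..4
REQUIRED_FIELDS = ("device_id", "timestamp", "value")  # hypothetical schema


def validate(raw: str) -> dict:
    """Parse a message and check required fields; raise ValueError if invalid."""
    record = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(record, dict):
        raise ValueError("payload is not a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record


def process_batch(messages, write, dead_letter, error_topic, sleep=time.sleep):
    """Handle each message independently so one failure never blocks the batch."""
    written = 0
    for raw in messages:
        try:
            record = validate(raw)
        except ValueError as exc:
            dead_letter(raw, str(exc))  # malformed input: retrying cannot help
            continue
        # One initial attempt plus one retry per backoff delay.
        for attempt in range(len(BACKOFF_SECONDS) + 1):
            try:
                write(record)
                written += 1
                break
            except ConnectionError as exc:
                if attempt == len(BACKOFF_SECONDS):
                    error_topic(raw, str(exc))  # retries exhausted
                else:
                    sleep(BACKOFF_SECONDS[attempt])
    return written
```

Malformed messages skip the retry loop entirely because retrying cannot fix bad input; only transient write failures go through the backoff schedule.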
For nested JSON payloads, pre-flatten at the edge device or use a transformation layer before Streaming ingestion. ADB performs better with normalized data structures.
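As an illustration of pre-flattening, here is a small Python sketch that collapses nested objects into a single level by joining key names with an underscore. The separator and the sample field names are assumptions; lists are deliberately left untouched, since how to handle them depends on your table design.

```python
def flatten(payload: dict, parent: str = "", sep: str = "_") -> dict:
    """Collapse nested dicts into one level, joining key names with `sep`."""
    flat = {}
    for key, value in payload.items():
        name = f"{parent}{sep}{key}" if parent else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))  # recurse into nested objects
        else:
            flat[name] = value  # scalars and lists are kept as-is
    return flat


if __name__ == "__main__":
    reading = {"device": {"id": "s1", "loc": {"lat": 1.5}}, "temp": 21}
    print(flatten(reading))
```

Each flattened key can then map directly to a table column, which keeps the transformation out of the ingestion path.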
Also critical: Monitor your Autonomous Database OCPU utilization. If you’re hitting 80%+ during ingestion peaks, the database itself is the bottleneck, not the connector. Scale up ADB or implement time-based throttling at the edge to smooth traffic patterns.
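One way to implement throttling at the edge is a token bucket, which caps the sustained send rate while still allowing short bursts. The sketch below is a generic Python version and is not tied to any OCI SDK; the class name, the rate, and the burst size are placeholders to tune against your observed ingestion peaks.

```python
import time


class TokenBucket:
    """Allow up to `burst` sends at once, refilling at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_send(self) -> bool:
        """Return True if a message may be sent now, consuming one token."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

When `try_send` returns False, the device can buffer the reading locally and try again later, which spreads a burst over time instead of pushing it straight at the database.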
Finally, enable connector metrics in OCI Monitoring to track batch processing latency, retry rates, and DLQ message counts. This visibility is essential for tuning edge-to-cloud pipelines at scale.