Here’s a combined approach that brings these practices together for robust timecard synchronization:
Retry Logic Implementation:
Implement backoff with jitter on a roughly exponential schedule: first retry after 2 minutes, then after 5, 10, and 20 minutes. Allow a maximum of 4 retry attempts before marking the record as failed and triggering manual review. Add a circuit breaker as well: if 3 consecutive batches fail, pause all sync operations for 15 minutes to prevent cascading failures.
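The retry schedule and the circuit breaker can be sketched roughly like this. It is a minimal illustration, not a drop-in implementation: the class name SyncRetryPolicy, the method names, and the +/-20% jitter spread are my assumptions. Only the delays, the 4-attempt limit, the 3-failure threshold, and the 15-minute pause come from the description above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: retry schedule plus a batch-level circuit breaker. */
final class SyncRetryPolicy {
    // Base delays before retries 1..4: 2, 5, 10 and 20 minutes.
    private static final List<Duration> BASE_DELAYS = List.of(
            Duration.ofMinutes(2), Duration.ofMinutes(5),
            Duration.ofMinutes(10), Duration.ofMinutes(20));
    private static final double JITTER = 0.2;        // assumed spread: +/-20%
    private static final int FAILURE_THRESHOLD = 3;  // consecutive failed batches
    private static final Duration PAUSE = Duration.ofMinutes(15);

    private int consecutiveFailures = 0;
    private Instant pausedUntil = Instant.MIN;

    /** Jittered delay before retry number attempt (1-based); empty once retries are exhausted. */
    static Optional<Duration> delayFor(int attempt) {
        if (attempt < 1 || attempt > BASE_DELAYS.size()) {
            return Optional.empty(); // caller marks the record failed and flags it for review
        }
        long baseMillis = BASE_DELAYS.get(attempt - 1).toMillis();
        double factor = 1.0 + ThreadLocalRandom.current().nextDouble(-JITTER, JITTER);
        return Optional.of(Duration.ofMillis(Math.round(baseMillis * factor)));
    }

    /** Record one batch outcome; the third consecutive failure starts the pause window. */
    synchronized void recordBatchResult(boolean success, Instant now) {
        if (success) {
            consecutiveFailures = 0;
            return;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= FAILURE_THRESHOLD) {
            pausedUntil = now.plus(PAUSE);
            consecutiveFailures = 0;
        }
    }

    /** True when sync may run, i.e. the breaker is not inside its pause window. */
    synchronized boolean allowsSync(Instant now) {
        return !now.isBefore(pausedUntil);
    }
}
```

The sync loop would call allowsSync before picking up a batch and recordBatchResult after each one. Passing the clock in as a parameter keeps the breaker easy to unit test.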
Connection Pooling Configuration:
Set up a dedicated connection pool for the payroll API: minimum 5 connections, maximum 15, with a 120-second timeout. Enable TCP keepalive and validate connections before use. Configure connection reuse and run connection health checks every 60 seconds.
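import java.net.http.HttpClient;
import java.time.Duration;
import java.util.concurrent.Executors;
// The executor sizes the client's worker threads; it does not set connection pool limits.
HttpClient client = HttpClient.newBuilder()
    .connectTimeout(Duration.ofSeconds(30))      // time allowed to establish a connection
    .executor(Executors.newFixedThreadPool(10))
    .build();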
Idempotency Keys Workaround:
Since your payroll system doesn’t support native idempotency, implement local duplicate prevention with a staging queue. Create a timecard_sync_staging table:
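CREATE TABLE timecard_sync_staging (
    sync_id       UUID PRIMARY KEY,        -- deterministic key derived from the timecard
    employee_id   INT NOT NULL,
    timecard_date DATE NOT NULL,
    hours         DECIMAL(5,2) NOT NULL,   -- explicit scale; a bare DECIMAL often defaults to scale 0
    status        VARCHAR(20) NOT NULL DEFAULT 'pending',
    attempt_count INT NOT NULL DEFAULT 0,
    last_attempt  TIMESTAMP
);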
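Generate sync_id deterministically from an MD5 hash of employee_id, timecard_date, and hours, joined with a delimiter so that different field values cannot produce the same concatenated string. The 128-bit MD5 digest fits the UUID column. Before each API call, check whether this sync_id already exists with status 'completed', and skip the submission if it does. Because the key is derived from the data and stored in the database, this prevents duplicates even across system restarts.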
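One way to derive the key in Java is UUID.nameUUIDFromBytes, which builds a version 3 (MD5-based) UUID, so the result can be stored directly in the sync_id column. This is only a sketch: the class name SyncIds, the "|" delimiter, and the normalization of hours are my assumptions, not something your payroll system requires.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.UUID;

/** Hypothetical helper that derives a stable sync_id from the timecard fields. */
final class SyncIds {
    static UUID syncIdFor(int employeeId, LocalDate date, BigDecimal hours) {
        // Normalize hours so that 8, 8.0 and 8.00 all map to the same key.
        String normalizedHours = hours.stripTrailingZeros().toPlainString();
        // The delimiter keeps field boundaries unambiguous in the hashed string.
        String name = employeeId + "|" + date + "|" + normalizedHours;
        // nameUUIDFromBytes hashes the bytes with MD5 and returns a version 3 UUID.
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }
}
```

Keep in mind that hours is part of the key, so a corrected timecard with different hours gets a different sync_id and is treated as a new submission.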
Staging Queue Pattern:
Implement three-phase processing: (1) insert timecards into staging with status='pending'; (2) select pending records, update them to 'in_progress', and attempt the API call; (3) on success, update to 'completed'; on timeout, leave the record as 'in_progress' and increment attempt_count. A timeout does not tell you whether the payroll system received the batch, so the record must not be resubmitted blindly. Instead, a separate cleanup job runs hourly and reconciles 'in_progress' records older than 30 minutes by querying the payroll system’s API for confirmation.
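The status transitions can be modeled as a small state machine. The sketch below works on an in-memory stand-in for a staging row rather than on the database, and all names (StagingTransitions, Row, CallOutcome) are hypothetical. It needs Java 16 or later because it uses a record.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical model of the staging status transitions. */
final class StagingTransitions {
    enum Status { PENDING, IN_PROGRESS, COMPLETED }
    enum CallOutcome { SUCCESS, TIMEOUT }

    /** In-memory stand-in for one row of timecard_sync_staging. */
    record Row(Status status, int attemptCount, Instant lastAttempt) {}

    static final Duration STALE_AFTER = Duration.ofMinutes(30);

    /** Phase 2: claim a pending row before calling the payroll API. */
    static Row claim(Row row, Instant now) {
        if (row.status() != Status.PENDING) {
            throw new IllegalStateException("only pending rows can be claimed: " + row.status());
        }
        return new Row(Status.IN_PROGRESS, row.attemptCount(), now);
    }

    /** Phase 3: record the result of the API call. */
    static Row afterCall(Row row, CallOutcome outcome) {
        return switch (outcome) {
            case SUCCESS -> new Row(Status.COMPLETED, row.attemptCount(), row.lastAttempt());
            // Outcome unknown: stay in_progress so the cleanup job can ask payroll.
            case TIMEOUT -> new Row(Status.IN_PROGRESS, row.attemptCount() + 1, row.lastAttempt());
        };
    }

    /** Cleanup job: is this row in_progress and older than 30 minutes? */
    static boolean needsReconciliation(Row row, Instant now) {
        return row.status() == Status.IN_PROGRESS
                && Duration.between(row.lastAttempt(), now).compareTo(STALE_AFTER) > 0;
    }
}
```

In the real table, the claim step should be a single atomic UPDATE (or a locked SELECT) so that two workers cannot pick up the same pending row.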
Batch Size Optimization:
Reduce the batch size from 500 to 75-100 timecards per API call. Smaller batches complete faster, which lowers the chance of a timeout. Process batches in parallel with 3-5 concurrent threads, each handling a separate date range or employee group.
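A rough sketch of the batching and the bounded parallelism is shown here. BatchSubmitter and the sender callback are hypothetical placeholders for whatever actually posts a batch to the payroll API. For simplicity the sketch splits the list in order rather than by date range or employee group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Hypothetical helper: split timecards into small batches and send them concurrently. */
final class BatchSubmitter {
    /** Split items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(List.copyOf(items.subList(i, Math.min(i + batchSize, items.size()))));
        }
        return batches;
    }

    /** Send every batch on a fixed-size pool and wait until all of them have finished. */
    static <T> void submitAll(List<T> items, int batchSize, int threads, Consumer<List<T>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (List<T> batch : partition(items, batchSize)) {
                pending.add(CompletableFuture.runAsync(() -> sender.accept(batch), pool));
            }
            // join() rethrows a sender failure wrapped in CompletionException.
            pending.forEach(CompletableFuture::join);
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, submitAll(timecards, 100, 4, batch -> payrollClient.send(batch)) would send batches of up to 100 on 4 threads, where payrollClient stands in for your own API wrapper.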
Reconciliation Process:
Schedule an hourly reconciliation job during business hours: query the payroll system for all timecards submitted in the last 2 hours, compare them against the staging table, and update any mismatched statuses. Flag remaining discrepancies for manual review. This catches edge cases where the response was lost but the submission succeeded.
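The comparison step of the reconciliation job might look like the sketch below. It assumes payroll entries can be mapped back to a sync_id, which depends on what your payroll API returns. The Reconciler class and its Action values are hypothetical, and treating every unexplained mismatch as a manual-review case is my choice for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical comparison of staging statuses against what payroll reports. */
final class Reconciler {
    enum Action { MARK_COMPLETED, FLAG_FOR_REVIEW }

    /**
     * @param localStatus        staging status by sync_id for the last 2 hours
     * @param confirmedByPayroll sync_ids that payroll reports as received (assumed mappable)
     * @return the action to take for each sync_id that is out of step
     */
    static Map<UUID, Action> reconcile(Map<UUID, String> localStatus, Set<UUID> confirmedByPayroll) {
        Map<UUID, Action> actions = new HashMap<>();
        for (Map.Entry<UUID, String> entry : localStatus.entrySet()) {
            boolean confirmed = confirmedByPayroll.contains(entry.getKey());
            boolean completedLocally = "completed".equals(entry.getValue());
            if (confirmed && !completedLocally) {
                // Response was lost but the submission succeeded.
                actions.put(entry.getKey(), Action.MARK_COMPLETED);
            } else if (!confirmed && completedLocally) {
                // Staging says done, payroll has no record of it.
                actions.put(entry.getKey(), Action.FLAG_FOR_REVIEW);
            }
        }
        for (UUID id : confirmedByPayroll) {
            if (!localStatus.containsKey(id)) {
                actions.put(id, Action.FLAG_FOR_REVIEW); // payroll entry unknown to staging
            }
        }
        return actions;
    }
}
```

Rows that are still pending or in_progress locally and not confirmed by payroll produce no action here, so they stay in the normal retry flow.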
This combination prevents duplicate submissions while staying reliable even with an uncooperative external API. The staging queue provides local control, and reconciliation ensures eventual consistency. With these changes in place, your failure rate should drop well below the current 15-20%; under 1%, with no duplicate payroll entries, is a reasonable target.