Here’s the complete solution for handling duplicates in vendor invoice batch imports:
1. API Idempotency Pattern Implementation:
Since D365 doesn’t natively support idempotency keys, implement a two-tier duplicate detection strategy:
- Client-side cache: Maintain a Redis or in-memory cache of invoice keys (vendor + invoice number) for the last 48 hours. Check against this cache before API submission.
- Pre-flight validation: Use OData $batch with GET requests to check existence in D365:
GET /api/data/v9.2/VendorInvoices?$filter=VendorAccount eq 'V001' and InvoiceNumber eq 'INV-2024-001'&$select=InvoiceId
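The two tiers above could be sketched roughly as follows. This is a minimal in-memory stand-in for the Redis cache plus a builder for the pre-flight URL; the class and function names are hypothetical, and the entity and field names are simply copied from the GET example above, so adjust them to your environment.

```python
import time
from urllib.parse import quote


class RecentInvoiceCache:
    """In-memory stand-in for the Redis cache: remembers keys for a fixed TTL."""

    def __init__(self, ttl_seconds=48 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at = {}  # (vendor, invoice_number) -> expiry timestamp

    def add(self, vendor, invoice_number):
        self._expires_at[(vendor, invoice_number)] = self._clock() + self._ttl

    def seen(self, vendor, invoice_number):
        key = (vendor, invoice_number)
        expiry = self._expires_at.get(key)
        if expiry is None:
            return False
        if expiry <= self._clock():
            del self._expires_at[key]  # entry aged out of the 48-hour window
            return False
        return True


def odata_literal(value):
    """Quote a string for an OData $filter; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def existence_check_url(vendor, invoice_number):
    """Build the pre-flight GET path (entity and field names assumed from the example)."""
    flt = (f"VendorAccount eq {odata_literal(vendor)} "
           f"and InvoiceNumber eq {odata_literal(invoice_number)}")
    return ("/api/data/v9.2/VendorInvoices?$filter="
            + quote(flt, safe="'") + "&$select=InvoiceId")
```

Check `cache.seen(...)` first and only issue the GET when the cache misses; call `cache.add(...)` after a confirmed import.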
2. Batch Processing Logic - Independent Operations:
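Structure your batch request to allow partial success. Critical: do NOT wrap duplicate-prone operations in a change set (multipart format) or an atomicityGroup (JSON batch format), because a single 409 would then roll back every operation in that group.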
POST /api/data/v9.2/$batch
Content-Type: multipart/mixed; boundary=batch_boundary

--batch_boundary
Content-Type: application/http
Content-Transfer-Encoding: binary

POST /api/data/v9.2/VendorInvoices HTTP/1.1
Content-Type: application/json

{invoice1_data}
--batch_boundary
Content-Type: application/http
Content-Transfer-Encoding: binary

POST /api/data/v9.2/VendorInvoices HTTP/1.1
Content-Type: application/json

{invoice2_data}
--batch_boundary--
Each operation succeeds or fails independently without affecting others.
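If you build the batch body in code, something along these lines may help. It is a sketch only: `build_batch_body` is a hypothetical helper, and the entity path is the one used in the sample request, not a confirmed endpoint. Each invoice becomes its own top-level part, with no change set around it.

```python
import json


def build_batch_body(invoices, boundary="batch_boundary",
                     entity_path="/api/data/v9.2/VendorInvoices"):
    """Serialize each invoice as an independent top-level batch part."""
    lines = []
    for invoice in invoices:
        lines += [
            f"--{boundary}",
            "Content-Type: application/http",
            "Content-Transfer-Encoding: binary",
            "",  # blank line ends the part headers
            f"POST {entity_path} HTTP/1.1",
            "Content-Type: application/json",
            "",  # blank line ends the inner request headers
            json.dumps(invoice),
        ]
    lines.append(f"--{boundary}--")  # closing boundary terminates the batch
    return "\r\n".join(lines) + "\r\n"
```

Send the result with `Content-Type: multipart/mixed; boundary=batch_boundary`, matching the boundary argument.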
3. Error Handling for 409 Conflict:
Parse the batch response one operation at a time so that mixed success/failure is handled correctly:
- Parse multipart batch response
- Log 409 responses as warnings (not errors) with invoice identifier
- Track success count versus duplicate count versus actual errors
- Retry only genuine failures (500, 503), not duplicates
- Update client-side cache with successfully imported invoice keys
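A rough sketch of that categorization is below. It assumes the multipart response lists one part per request in the same order as the requests were sent, and it uses a deliberately simple regex rather than a full MIME parser; the function names are hypothetical.

```python
import re

# Matches the status line of each embedded HTTP response, e.g. "HTTP/1.1 409 Conflict".
STATUS_LINE = re.compile(r"^HTTP/1\.1 (\d{3})", re.MULTILINE)


def classify(status):
    """Map a status code onto the buckets described above."""
    if 200 <= status < 300:
        return "success"
    if status == 409:
        return "duplicate"  # log as a warning, never retry
    if status in (500, 503):
        return "retry"      # genuine transient failure
    return "error"


def categorize_batch_response(body, boundary, invoice_keys):
    """Pair each response part with the invoice key sent in the same position."""
    statuses = [int(m.group(1))
                for part in body.split("--" + boundary)
                for m in [STATUS_LINE.search(part)] if m]
    if len(statuses) != len(invoice_keys):
        raise ValueError("response part count does not match request count")
    results = {"success": [], "duplicate": [], "retry": [], "error": []}
    for key, status in zip(invoice_keys, statuses):
        results[classify(status)].append(key)
    return results
```

The keys in `results["success"]` are the ones to add to the client-side cache; `results["retry"]` feeds the retry queue.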
4. Optimized Duplicate Detection:
For high-volume processing, implement bulk duplicate checking:
- Batch GET requests: Group up to 20 filters using ‘or’ operators
- Cache optimization: Use bloom filters for fast negative lookups (99% of invoices aren’t duplicates)
- Database-side: If you have direct database access, query staging tables before API submission
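To illustrate the first two bullets, here is a sketch that groups keys into combined `$filter` expressions and a small Bloom filter built on `hashlib`. The helper names, bit-array size, and hash count are assumptions chosen for illustration; the field names come from the earlier GET example.

```python
import hashlib


def _quote(value):
    """OData string literal with embedded single quotes doubled."""
    return "'" + value.replace("'", "''") + "'"


def bulk_filters(keys, max_terms=20):
    """Yield one $filter expression per group of up to max_terms (vendor, number) pairs."""
    for start in range(0, len(keys), max_terms):
        group = keys[start:start + max_terms]
        yield " or ".join(
            f"(VendorAccount eq {_quote(v)} and InvoiceNumber eq {_quote(n)})"
            for v, n in group)


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self._size = size_bits
        self._num_hashes = num_hashes
        self._bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self._num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self._size

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self._bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A `False` from `might_contain` means the invoice is definitely new and can skip the GET; a `True` still needs the real existence check.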
5. Batch Size Optimization:
Balance throughput versus failure impact:
- Optimal batch size: 15-20 invoices per $batch request
- Smaller batches = more resilient to duplicates but more API calls
- Larger batches = fewer calls but duplicate impact increases
- Monitor your duplicate rate and adjust accordingly
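One way to act on this is sketched below. The size-selection rule is purely an illustrative heuristic built from the numbers in this answer (15 to 20 invoices, 5% duplicate rate), not a D365 recommendation, and both function names are hypothetical.

```python
def pick_batch_size(duplicate_rate, low=15, high=20, rate_threshold=0.05):
    """Illustrative heuristic only: use the smaller size when duplicates are frequent."""
    return low if duplicate_rate >= rate_threshold else high


def chunk_invoices(invoices, batch_size):
    """Split the invoice list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [invoices[i:i + batch_size]
            for i in range(0, len(invoices), batch_size)]
```

Each returned chunk then becomes one $batch request.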
6. Retry Logic Best Practices:
Implement exponential backoff for genuine failures, but handle duplicates differently:
// Pseudocode for retry logic:
1. Submit batch via API
2. Parse response and categorize results:
- 201 Created: Success, add to cache
- 409 Conflict: Log as duplicate, skip retry
- 500/503: Add to retry queue with backoff
3. For retry queue: exponential backoff (1s, 2s, 4s)
4. After 3 failed retries: move to manual review queue
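In Python, that flow might look like the sketch below, using three retries with delays of 1s, 2s, and 4s. The `submit` and `cache_add` callables are hypothetical placeholders: `submit(keys)` is assumed to send one batch and return a dict mapping each key to its HTTP status code.

```python
import time


def backoff_delays(max_retries=3, base_seconds=1.0):
    """Exponential delays: base, 2*base, 4*base, ..."""
    return [base_seconds * 2 ** attempt for attempt in range(max_retries)]


def submit_with_retry(keys, submit, cache_add, max_retries=3, sleep=time.sleep):
    """Retry only 500/503 results; duplicates (409) are recorded and skipped."""
    duplicates, manual_review = [], []
    pending = list(keys)
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):  # first attempt plus the retries
        if not pending:
            break
        if attempt > 0:
            sleep(delays[attempt - 1])
        statuses = submit(pending)
        retry = []
        for key in pending:
            status = statuses[key]
            if status == 201:
                cache_add(key)             # success: remember the key
            elif status == 409:
                duplicates.append(key)     # duplicate: log, never retry
            elif status in (500, 503):
                retry.append(key)          # transient failure: retry later
            else:
                manual_review.append(key)  # other errors will not fix themselves
        pending = retry
    manual_review.extend(pending)          # still failing after all retries
    return {"duplicates": duplicates, "manual_review": manual_review}
```

Passing `sleep` as a parameter keeps the function testable without real waiting.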
7. Monitoring and Alerting:
Track these metrics:
- Duplicate rate (should be <5% in healthy systems)
- Batch success rate
- Average processing time per batch
- 409 versus genuine error ratios
High duplicate rates indicate upstream system issues that need addressing.
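A small counter object is usually enough to get started with these metrics. The sketch below is an assumption about how you might structure it (the class name and alert method are hypothetical); the 5% threshold is the figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class ImportMetrics:
    """Running counts of per-invoice outcomes across batches."""
    created: int = 0
    duplicates: int = 0
    errors: int = 0

    def record(self, status):
        """Count one operation result by its HTTP status code."""
        if 200 <= status < 300:
            self.created += 1
        elif status == 409:
            self.duplicates += 1
        else:
            self.errors += 1

    @property
    def total(self):
        return self.created + self.duplicates + self.errors

    @property
    def duplicate_rate(self):
        return self.duplicates / self.total if self.total else 0.0

    def duplicate_alert(self, threshold=0.05):
        """True when the duplicate rate reaches the alerting threshold."""
        return self.duplicate_rate >= threshold
```

Call `record` for every operation in each parsed batch response and check `duplicate_alert()` on whatever schedule your monitoring uses.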
8. Alternative Approach - Upsert Pattern:
If your D365 version supports it, use PATCH with upsert semantics:
- Query for existing invoice first
- If exists: PATCH to update (if allowed)
- If not exists: POST to create
- This requires alternate keys configured in D365 (vendor + invoice number)
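The query-then-write flow above can be expressed as a small function. The three callables are hypothetical stand-ins for your own API client (`find_id` returning the existing record ID or `None`), so this shows only the control flow, not real D365 calls.

```python
def upsert_invoice(invoice, find_id, patch, post):
    """Update the invoice if it already exists, otherwise create it."""
    existing_id = find_id(invoice["VendorAccount"], invoice["InvoiceNumber"])
    if existing_id is not None:
        patch(existing_id, invoice)  # existing record: update in place
        return "updated"
    post(invoice)                    # no match: create a new record
    return "created"
```

Note that the lookup and the write are two separate calls, so two concurrent importers can still race; the 409 handling from step 3 remains necessary.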
Root Cause Prevention:
Address the upstream retry logic causing duplicates:
- Implement proper idempotency in your source system
- Use message queuing with deduplication (Azure Service Bus, RabbitMQ)
- Add unique constraint checks before queuing for D365 import
The combination of client-side caching, independent batch operations, and proper 409 handling will give you resilient duplicate management without sacrificing throughput.