We’re importing 50K+ contacts daily via the REST API, and duplicate detection is taking 4-6 hours. Our batch process calls the /contacts endpoint with 500 records per request, and duplicate matching runs synchronously, which blocks the entire import pipeline.
We’ve tried batch sizes from 200 to 1,000 records, but performance barely changes. The API responses indicate fuzzy matching runs on name/email/phone for every contact. Is there a way to optimize duplicate detection or run it asynchronously?
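For context, the client-side pattern we’ve been considering looks roughly like the sketch below: chunk the daily file into 500-record batches and submit several batches concurrently so one slow duplicate-matching pass doesn’t serialize everything. The URL is a placeholder and `post_batch` is a stub standing in for the real HTTP call — this is our idea, not something we’ve confirmed is safe against the API’s concurrency limits.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Placeholder pod URL; the real host comes from our environment config.
API_URL = "https://example.fa.ocs.oraclecloud.com/crmRestApi/resources/11.13.18.05/contacts"

def chunk(records, size=500):
    """Split the full contact list into fixed-size batches."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def post_batch(batch):
    """Stub for the real HTTP POST (auth, headers, retries omitted).
    Serializes the payload and returns the batch size it would send."""
    payload = json.dumps(batch)
    # The real request (with the DuplicateDetection header from our
    # current setup) would be sent here.
    return len(batch)

def import_contacts(records, workers=4):
    """Submit batches from a small worker pool instead of one at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(post_batch, chunk(records)))
```

We haven’t verified whether parallel POSTs against the contacts resource interfere with server-side duplicate matching, which is part of the question.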
Current approach:
POST /crmRestApi/resources/11.13.18.05/contacts
Payload: [{"FirstName":"John","LastName":"Smith","EmailAddress":"john@example.com"},...]
Header: DuplicateDetection: enabled
Our indexes on the Contact object look standard. Any recommendations for batch-processing optimization or for tuning the duplicate detection rules?
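One thing we could do entirely client-side (our own idea, not an Oracle feature) is an exact-match pre-dedupe on normalized email/phone before calling the API, so the server’s fuzzy matcher sees fewer candidate records. The `PhoneNumber` field name and the normalization rules below are assumptions for illustration; the server’s fuzzy rules are obviously broader than this:

```python
import re

def normalize(contact):
    """Key a contact by lower-cased email plus digits-only phone.
    Hypothetical normalization, far simpler than server-side fuzzy rules."""
    email = contact.get("EmailAddress", "").strip().lower()
    phone = re.sub(r"\D", "", contact.get("PhoneNumber", ""))
    return (email, phone)

def pre_dedupe(contacts):
    """Drop exact duplicates before the API call, keeping the first
    occurrence of each normalized key."""
    seen = set()
    unique = []
    for c in contacts:
        key = normalize(c)
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique
```

Would trimming the input this way meaningfully reduce the synchronous matching time, or does the matcher’s cost depend mostly on the size of the existing contact base rather than the incoming batch?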