We’re building a bidirectional sync between Salesforce and our legacy CRM system for account data. About 15K accounts need to sync daily with updates happening on both sides. I’m looking for best practices on field mapping strategies, external ID upsert logic, and error handling for sync operations.
My main concerns are:
- How to handle field mapping when field names and data types don’t match exactly between systems
- What’s the best approach for external ID management to prevent duplicates
- How to handle sync conflicts when the same account is updated in both systems simultaneously
Would appreciate hearing from others who’ve implemented similar Account API integrations. What patterns worked well for maintaining data quality during bidirectional sync?
For external ID management, create a custom field on Account like Legacy_CRM_ID__c and mark it as External ID. Use upsert operations with this field - Salesforce will automatically match existing records or create new ones. This prevents duplicate account creation.
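To make the upsert pattern concrete, here is a minimal sketch of the call from middleware using the Salesforce REST API's external-ID endpoint. The instance URL, API version, and access token are placeholders, and `Legacy_CRM_ID__c` is the custom external ID field described above:

```python
import json
import urllib.request

# Hypothetical org details -- substitute your instance URL,
# API version, and a real OAuth access token.
INSTANCE_URL = "https://example.my.salesforce.com"
API_VERSION = "v59.0"
ACCESS_TOKEN = "REPLACE_ME"

def upsert_url(legacy_id: str) -> str:
    """REST path that addresses an Account by its external ID field."""
    return (f"{INSTANCE_URL}/services/data/{API_VERSION}"
            f"/sobjects/Account/Legacy_CRM_ID__c/{legacy_id}")

def upsert_account(legacy_id: str, fields: dict) -> int:
    """PATCH on the external-ID path performs the upsert: Salesforce
    updates the matching record, or creates one if none exists."""
    req = urllib.request.Request(
        upsert_url(legacy_id),
        data=json.dumps(fields).encode("utf-8"),
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}",
                 "Content-Type": "application/json"},
        method="PATCH",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 201 = created, 204 = existing record updated
```

Because the match happens server-side on the external ID, there is no separate query step for the client to get wrong.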
Don’t forget about error handling for sync failures. Network issues, API limits, and validation errors will happen. Build a retry queue for failed updates with exponential backoff. Also implement a reconciliation process that runs weekly to compare record counts and checksums between systems. We discovered thousands of missed updates because our error handling wasn’t robust enough initially.
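The reconciliation pass can be sketched roughly as follows, assuming each system can export its records keyed by external ID; the field names in the checksum are illustrative:

```python
import hashlib

def record_checksum(record: dict,
                    fields: tuple = ("name", "industry", "annual_revenue")) -> str:
    """Stable hash of the key fields of one record."""
    canonical = "|".join(str(record.get(f, "")) for f in fields)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def reconcile(sf_rows: dict, legacy_rows: dict) -> dict:
    """Compare both systems keyed by external ID.

    Returns IDs missing on either side, plus IDs whose key-field
    checksums disagree (silent data drift).
    """
    sf_ids, legacy_ids = set(sf_rows), set(legacy_rows)
    drift = [i for i in sf_ids & legacy_ids
             if record_checksum(sf_rows[i]) != record_checksum(legacy_rows[i])]
    return {
        "missing_in_salesforce": sorted(legacy_ids - sf_ids),
        "missing_in_legacy": sorted(sf_ids - legacy_ids),
        "checksum_mismatch": sorted(drift),
    }
```

Comparing checksums rather than full payloads keeps the weekly job cheap even at 15K accounts.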
Excellent question about Account API integration patterns. Here’s a comprehensive approach based on implementations I’ve led for several enterprise clients:
Field Mapping Strategies:
The key is treating field mapping as configuration, not code. Create three components:
1. Field Mapping Registry: A custom metadata type or external configuration file that defines:
- Source field → Target field mappings
- Data type transformations (string to picklist, number formatting, etc.)
- Default values when source is null
- Validation rules specific to each field
2. Transformation Layer: Build reusable transformation functions for common patterns:
- Picklist value mapping (legacy codes → Salesforce values)
- Phone number formatting (various formats → Salesforce standard)
- Address standardization (street/city/state/zip normalization)
- Currency conversion if systems use different currencies
3. Field-Level Conflict Resolution: Different fields may need different strategies:
- System of Record Fields: Some fields are authoritative in one system (e.g., Account Owner always from Salesforce, Billing Terms always from legacy CRM)
- Timestamp-Based Fields: For fields that change frequently (Annual Revenue, Employee Count), use last-modified timestamp
- Manual Review Fields: Critical fields like Account Name trigger alerts for manual resolution rather than auto-updating
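The registry-plus-transformations idea can be sketched as a small lookup table in middleware. Field names, picklist codes, and formats below are examples, not a real schema:

```python
# Illustrative mapping registry: source field -> (target field, transform, default).
INDUSTRY_CODES = {"MFG": "Manufacturing", "FIN": "Financial Services"}

def map_picklist(value):
    """Legacy industry codes -> Salesforce picklist values."""
    return INDUSTRY_CODES.get(value, "Other")

def format_phone(value):
    """Normalize 10-digit US numbers; leave anything else for manual cleanup."""
    digits = "".join(ch for ch in str(value) if ch.isdigit())
    if len(digits) == 10:
        return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
    return value

FIELD_MAP = {
    "acct_name":  ("Name",     str,          None),
    "ind_code":   ("Industry", map_picklist, "Other"),
    "phone_main": ("Phone",    format_phone, None),
}

def transform(source: dict) -> dict:
    """Apply the registry to one legacy record, producing Salesforce fields."""
    target = {}
    for src_field, (tgt_field, fn, default) in FIELD_MAP.items():
        value = source.get(src_field)
        target[tgt_field] = fn(value) if value is not None else default
    return target
```

Because the mappings live in data rather than code, adding a field is a configuration change, not a deployment.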
External ID Upsert Logic:
For preventing duplicates and maintaining referential integrity:
1. Composite External ID: If your legacy CRM has compound keys, create a formula field that concatenates them:
`Legacy_System__c + '_' + Legacy_Account_ID__c`
2. Upsert Operation Pattern: Always use upsert with the external ID field, never query-then-insert/update. Upsert is atomic per record and prevents race conditions.
3. Orphan Detection: Run periodic reconciliation jobs to find accounts in either system that don’t have matching external IDs. These indicate sync failures or manual record creation that bypassed the integration.
4. External ID Immutability: Once set, never change an external ID. If you need to merge accounts, update the external ID mapping in your middleware, not in Salesforce.
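If the composite key is built in middleware as well as in the formula field, the two must produce byte-identical values. A hedged sketch, with the normalization shown here purely illustrative (whatever you do here, the formula field must do the same):

```python
def composite_external_id(system_code: str, account_id: str) -> str:
    """Middleware twin of the Legacy_System__c + '_' + Legacy_Account_ID__c
    formula field.

    Treat the result as immutable once issued: merges are handled by
    remapping in middleware, never by rewriting the key in Salesforce.
    """
    system_code, account_id = system_code.strip(), account_id.strip()
    if not system_code or not account_id:
        raise ValueError("both components of the composite key are required")
    return f"{system_code}_{account_id}"
```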
Error Handling for Sync Operations:
Robust error handling is what separates reliable integrations from fragile ones:
1. Categorize Errors:
- Transient: Network timeouts, API limits → Retry with exponential backoff
- Data Quality: Validation errors, required fields missing → Send to error queue for data cleansing
- Conflict: Simultaneous updates → Flag for manual review
- System: Salesforce maintenance, authentication failures → Pause sync, alert operations team
2. Sync State Management: Track sync status for each account:
- Last_Successful_Sync__c (datetime)
- Sync_Status__c (picklist: 'Synced', 'Pending', 'Error', 'Conflict')
- Sync_Error_Message__c (long text for debugging)
3. Retry Logic: Failed updates go into a retry queue:
- Attempt 1: Immediate retry (catches transient network issues)
- Attempt 2: 5-minute delay
- Attempt 3: 30-minute delay
- After 3 failures: Move to manual review queue
4. Reconciliation Process: Daily job that compares:
- Record counts between systems
- Hash of key fields to detect data drift
- Accounts modified in one system but not synced to the other
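The retry schedule and error taxonomy above can be combined into one small driver. This is a sketch under simplifying assumptions: transient errors are modeled as Python exception types, whereas a real implementation would inspect Salesforce status codes and error payloads:

```python
import time

# Transient failures worth retrying; anything else should be
# categorized separately (data quality, conflict, system).
TRANSIENT = (TimeoutError, ConnectionError)

# Delays mirror the schedule above: immediate, 5 minutes, 30 minutes.
RETRY_DELAYS = [0, 300, 1800]

def sync_with_retry(operation, manual_review_queue: list, sleep=time.sleep):
    """Run one sync operation, retrying transient failures.

    `operation` is any zero-argument callable that performs the API call.
    After three failed attempts, the last error goes to manual review.
    """
    last_error = None
    for delay in RETRY_DELAYS:
        sleep(delay)
        try:
            return operation()
        except TRANSIENT as exc:
            last_error = exc
    manual_review_queue.append(last_error)
    return None
```

Injecting `sleep` keeps the backoff schedule testable without waiting out real delays.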
Bidirectional Sync Conflict Resolution:
For handling simultaneous updates:
1. Timestamp Comparison: Before any update, check whether the target record is newer than the source. If it is, skip the update and log a conflict.
2. Field-Level Mastering: Configure which system is authoritative for each field. Example:
- Salesforce masters: Owner, Stage, Opportunity data
- Legacy CRM masters: Billing information, payment terms, credit limit
- Shared fields: Use timestamp-based resolution
3. Conflict Queue: When conflicts occur, create a record in a custom Sync_Conflict__c object with:
- Account ID
- Field name
- Salesforce value
- External system value
- Both timestamps
- Recommended resolution (based on business rules)
4. Business Rules Engine: Implement rules like:
- If Annual Revenue differs by less than 10%, accept Salesforce value (likely just rounding)
- If Account Name differs, always require manual review
- If Industry changes, accept the change from whichever system updated most recently
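Field-level mastering plus timestamp fallback fits in one small resolver. The mastering table below is an illustrative configuration, not a recommended field assignment:

```python
from datetime import datetime

# Illustrative mastering rules: which system wins for each field.
FIELD_MASTER = {
    "Owner": "salesforce",
    "Billing_Terms": "legacy",
    "Industry": "latest",   # timestamp-based resolution
    "Name": "manual",       # always route to the conflict queue
}

def resolve(field, sf_value, legacy_value,
            sf_modified: datetime, legacy_modified: datetime,
            conflicts: list):
    """Pick the winning value for one field per the mastering rules."""
    rule = FIELD_MASTER.get(field, "latest")
    if rule == "salesforce":
        return sf_value
    if rule == "legacy":
        return legacy_value
    if rule == "latest":
        return sf_value if sf_modified >= legacy_modified else legacy_value
    # "manual": record the conflict; keep the Salesforce value until reviewed
    conflicts.append({"field": field,
                      "salesforce": sf_value, "legacy": legacy_value})
    return sf_value
```

Unlisted fields default to timestamp resolution, so adding a new field never silently drops updates.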
Data Quality Maintenance:
1. Pre-Sync Validation: Before sending data to Salesforce, validate:
- Required fields are populated
- Picklist values exist in target system
- Lookup relationships can be resolved
- Data formats match (phone, email, date)
2. Post-Sync Verification: After successful sync:
- Query the record back from Salesforce to confirm values persisted
- Compare critical fields to ensure no unexpected transformations
- Update sync status and timestamp
3. Data Quality Metrics Dashboard: Track:
- Sync success rate (target: >99%)
- Average sync latency (target: <5 minutes)
- Conflict rate by field
- Error categories and trends
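The pre-sync checks can be a plain function that returns a list of errors, so anything non-empty routes the record to the data-cleansing queue instead of the API. Rules here are illustrative; real picklist values should come from the target org's describe metadata, not a hard-coded set:

```python
import re

REQUIRED_FIELDS = ("Name", "Legacy_CRM_ID__c")
VALID_INDUSTRIES = {"Manufacturing", "Financial Services", "Other"}
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(record: dict) -> list:
    """Return a list of validation errors; an empty list means safe to send."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required field: {field}")
    industry = record.get("Industry")
    if industry and industry not in VALID_INDUSTRIES:
        errors.append(f"unknown picklist value: Industry={industry!r}")
    email = record.get("Email__c")
    if email and not EMAIL_RE.match(email):
        errors.append(f"malformed email: {email!r}")
    return errors
```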
This architecture has successfully handled bidirectional syncs for organizations with 100K+ accounts. The key is treating data quality and error handling as first-class concerns, not afterthoughts.
The mapping configuration table approach sounds good. How do you handle field-level conflicts though? If Account Name is updated in both systems between sync cycles, which value should win?