I’ve completed several large-scale recruiting data migrations, and the duplicate email issue is always the hardest part. Here’s a comprehensive solution that addresses all three focus areas:
Understanding Duplicate Email Validation
SuccessFactors Recruiting enforces email uniqueness at the candidate profile level because it’s designed around a candidate-centric model. Each person (email) has ONE candidate profile, with multiple applications linked to it. This is actually better for recruiting analytics and candidate experience, but requires careful migration planning.
Using the Recruiting Import Template Correctly
The standard template has multiple tabs that must be imported in sequence:
- Candidate Import Tab: Import deduplicated candidates first
  - Extract unique email addresses from your legacy data
  - Include: email, firstName, lastName, phone (minimum required)
  - Optional: source, referrer, tags for better candidate tracking
- Application Import Tab: Import applications with candidate references
  - After candidate import, export to get system-generated candidateIds
  - Create mapping: legacy_candidate_id → sf_candidate_id
  - Import applications with candidateId, jobReqId, applicationDate, status
OData API Workaround for Complex Scenarios
For your specific case with 15,000 records and potential data quality issues, the OData API provides more control:
// First, check whether the candidate already exists
GET /odata/v2/Candidate?$filter=email eq 'john.smith@email.com'

// If the query returns no results, create the candidate
POST /odata/v2/Candidate
{
  "email": "john.smith@email.com",
  "firstName": "John",
  "lastName": "Smith"
}

// Then create the application, using the candidateId returned above
POST /odata/v2/JobApplication
{
  "candidateId": "12345",
  "jobReqId": "67890",
  "applicationDate": "/Date(1577836800000)/"
}
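When scripting these calls, two details are easy to get wrong: quoting the email inside the $filter expression and producing the /Date(...)/ timestamp. Here is a minimal Python sketch of both, assuming the entity and field names used in the requests above (check them against your instance's $metadata before relying on them). The base URL and the helper names are hypothetical, and the sketch only builds the request pieces; it does not send anything.

```python
from datetime import datetime, timezone
from urllib.parse import quote

# Hypothetical base URL; substitute your own API server.
BASE_URL = "https://api.example.com/odata/v2"


def candidate_lookup_url(email: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL that checks whether a candidate already exists."""
    # OData v2 string literals escape a single quote by doubling it.
    literal = email.strip().replace("'", "''")
    filter_expr = f"email eq '{literal}'"
    # Percent-encode the whole filter expression (spaces, quotes, '@').
    return f"{base_url}/Candidate?$filter={quote(filter_expr, safe='')}"


def to_odata_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an OData v2 /Date(ms)/ literal."""
    millis = round(moment.timestamp() * 1000)
    return f"/Date({millis})/"


def application_payload(candidate_id: str, job_req_id: str, applied: datetime) -> dict:
    """Body for the JobApplication POST, using the field names assumed above."""
    return {
        "candidateId": candidate_id,
        "jobReqId": job_req_id,
        "applicationDate": to_odata_date(applied),
    }
```

For example, to_odata_date(datetime(2020, 1, 1, tzinfo=timezone.utc)) produces the /Date(1577836800000)/ value used in the request above.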
Recommended Migration Approach
Phase 1 - Data Preparation:
- Deduplicate candidates by email in your source data
- For each unique email, consolidate candidate information (use most recent or most complete record)
- Create a mapping table: legacy_email → all_legacy_application_ids
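The preparation steps above can be sketched in a few lines of Python. This is only an illustration: the column names (email, legacy_application_id and so on) are hypothetical stand-ins for whatever your legacy export uses, and treating emails as case-insensitive is an assumption you should confirm for your data.

```python
from collections import defaultdict

# Fields compared when judging which duplicate record is "most complete".
PROFILE_FIELDS = ("firstName", "lastName", "phone")


def normalize_email(email: str) -> str:
    # Assumption: emails differing only in case or whitespace are the same person.
    return email.strip().lower()


def completeness(record: dict) -> int:
    """Count how many profile fields are filled in."""
    return sum(1 for field in PROFILE_FIELDS if record.get(field))


def deduplicate(legacy_rows: list[dict]) -> tuple[dict, dict]:
    """Return (candidate_by_email, application_ids_by_email)."""
    candidates: dict[str, dict] = {}
    applications: dict[str, list[str]] = defaultdict(list)
    for row in legacy_rows:
        email = normalize_email(row["email"])
        applications[email].append(row["legacy_application_id"])
        current = candidates.get(email)
        # Keep the most complete record. On a tie the later row wins, so
        # sort rows oldest-first if "most recent" should break ties.
        if current is None or completeness(row) >= completeness(current):
            profile = {field: row.get(field, "") for field in PROFILE_FIELDS}
            candidates[email] = {"email": email, **profile}
    return candidates, dict(applications)
```

The first dictionary feeds the candidate import; the second is the legacy_email → all_legacy_application_ids mapping table.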
Phase 2 - Candidate Import:
- Import unique candidates via CSV template (Candidate tab)
- Export imported candidates to retrieve SuccessFactors candidateIds
- Update your mapping table: legacy_email → sf_candidate_id → legacy_application_ids
Phase 3 - Application Import:
- Transform application data to reference sf_candidate_id instead of legacy candidate data
- Validate that all applications have corresponding candidates in SuccessFactors
- Import applications via CSV template (Application tab) or OData API
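The transform-and-validate part of Phase 3 could look roughly like the sketch below. It assumes you already have a dictionary from normalized legacy email to SuccessFactors candidateId (built from the Phase 2 export); the legacy column names and the function name are hypothetical.

```python
def map_applications(applications: list[dict], candidate_id_by_email: dict) -> tuple[list, list]:
    """Split legacy applications into import-ready rows and unmatched rows."""
    ready, unmatched = [], []
    for app in applications:
        email = app["email"].strip().lower()
        candidate_id = candidate_id_by_email.get(email)
        if candidate_id is None:
            # No candidate was imported for this email: hold the row back
            # instead of letting the application import fail.
            unmatched.append(app)
            continue
        ready.append({
            "candidateId": candidate_id,
            "jobReqId": app["jobReqId"],
            "applicationDate": app["applicationDate"],
            "status": app["status"],
        })
    return ready, unmatched
```

Import only the ready rows; anything in unmatched points to a candidate that was missed in Phase 2 and should be resolved before the application import runs.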
Phase 4 - Validation:
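- Run a candidate count report: the total should match the number of unique emails in the legacy data, plus any placeholder emails you generated for applications with missing emails
- Run an application count report: the total should match the total number of applications in the legacy data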
- Spot check 50-100 candidates with multiple applications to verify data integrity
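If you want to script these checks, a minimal sketch might look like this. It assumes you have pulled the two SuccessFactors totals from your own reports; the function and parameter names are hypothetical, and placeholder_count covers any candidates created with generated placeholder emails.

```python
import random


def count_report(legacy_emails, legacy_application_count,
                 sf_candidate_count, sf_application_count,
                 placeholder_count=0):
    """Compare legacy totals with the totals reported by SuccessFactors."""
    unique_emails = {e.strip().lower() for e in legacy_emails if e and e.strip()}
    expected_candidates = len(unique_emails) + placeholder_count
    return {
        "expected_candidates": expected_candidates,
        "candidates_match": sf_candidate_count == expected_candidates,
        "applications_match": sf_application_count == legacy_application_count,
    }


def spot_check_sample(application_ids_by_email, sample_size=50, seed=0):
    """Pick candidates with more than one application for manual review."""
    multi = sorted(email for email, ids in application_ids_by_email.items()
                   if len(ids) > 1)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(multi, min(sample_size, len(multi)))
```

A fixed seed keeps the sample stable between runs, which helps when several people review the same candidates.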
Handling Edge Cases
For orphaned applications (no candidate data), create minimal candidate records:
- Use application email as candidate email
- Parse firstName/lastName from application if available
- If no name data, use “Legacy” as firstName and application ID as lastName
- Tag these with “data-migration-minimal” for future cleanup
For applications with missing emails:
- Generate placeholder emails: legacy_app_id@migration.placeholder
- Tag with “email-missing” for post-migration cleanup
- Follow up with hiring managers to get actual candidate emails
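Both edge-case rules can be combined into one small helper. The sketch below is illustrative only: the input column names are hypothetical, and applying the "Legacy" name fallback only when both names are missing is an assumption about how you want to treat partial name data.

```python
def minimal_candidate(app: dict) -> dict:
    """Build a minimal candidate record for an application without candidate data."""
    app_id = str(app["legacy_application_id"])
    tags = ["data-migration-minimal"]

    email = (app.get("email") or "").strip()
    if not email:
        # No email on the application: generate a placeholder and flag it.
        email = f"{app_id}@migration.placeholder"
        tags.append("email-missing")

    first = (app.get("firstName") or "").strip()
    last = (app.get("lastName") or "").strip()
    if not first and not last:
        # No name data at all: fall back to "Legacy" plus the application ID.
        first, last = "Legacy", app_id

    return {"email": email, "firstName": first, "lastName": last, "tags": tags}
```

Because every generated record carries at least one tag, the post-migration cleanup can find them with a simple tag filter.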
This approach has successfully migrated 50K+ candidate records across multiple implementations while maintaining data integrity and application history.