The most reliable solution involves implementing comprehensive duplicate detection before import, robust batch error handling during import, and systematic IP record reconciliation after import. Here’s the complete approach:
1. Duplicate Record Detection (Pre-Import Phase)
Create a validation script that checks for duplicates in three dimensions:
-- Check for duplicates in source file
SELECT ip_number, COUNT(*) as dup_count
FROM import_staging
GROUP BY ip_number HAVING COUNT(*) > 1;
-- Check against existing Agile records
SELECT s.ip_number, a.id as existing_id
FROM import_staging s
INNER JOIN agile.ip a ON s.ip_number = a.number;
This identifies both internal duplicates (within your import file) and external conflicts (against existing Agile data). Export the results to a reconciliation worksheet for review.
2. Batch Import Error Handling
Implement a fault-tolerant import process:
- Split your 2,500 records into batches of 250-500 records
- Use SQL Import’s error logging to capture failed records
- Create an error handling table that stores: record_id, ip_number, error_type, error_message, timestamp
- After each batch, query the error table and move failed records to a separate review queue
- Continue processing remaining batches even if some fail
For the import script, wrap each batch in error handling:
try {
importBatch(records, batchNumber);
logSuccess(batchNumber, records.size());
} catch (DuplicateKeyException e) {
logFailure(batchNumber, e.getConflictingRecord());
moveToReviewQueue(e.getConflictingRecord());
}
3. IP Record Reconciliation
For confirmed duplicates, implement a reconciliation strategy:
- Merge Strategy: If the legacy IP has additional data not in Agile, update the existing Agile record with the new attributes
- Skip Strategy: If the Agile record is more current, skip the import for that IP and log it
- Version Strategy: For IPs that truly need both versions, append a suffix to the legacy IP number (e.g., IP-2024-0456-LEG)
Create a reconciliation decision matrix:
- Same IP number + different creation date = Use newer record
- Same IP number + different owner = Manual review required
- Same IP number + different status = Preserve Agile status, merge attributes
4. Post-Migration Validation
After import completion, run validation queries:
-- Verify all non-duplicate records imported
SELECT COUNT(*) FROM import_staging s
WHERE NOT EXISTS (
SELECT 1 FROM agile.ip a
WHERE a.number = s.ip_number
)
AND s.ip_number NOT IN (SELECT ip_number FROM duplicate_review);
This approach successfully handled our 8,000+ IP record migration with a 98.5% automatic success rate. The remaining 1.5% required manual reconciliation but were clearly identified and queued for review. The key is treating duplicate detection as a separate validation phase rather than relying on import-time error handling.