Labor management user sync fails for bulk Azure AD import in SOC 4.1

We’re trying to import 450 users from Azure AD into the Opcenter labor management module (SOC 4.1), but the bulk sync job fails after processing about 180-200 users. The Azure AD connector shows the error “Duplicate user principal name detected”, yet we’ve verified that all UPNs in Azure AD are unique.

The sync job doesn’t report which specific users are causing the conflicts. We’re using the standard Azure AD connector configuration with the user principal name as the unique identifier.


AzureADSync Job: SYNC-2024-1208-001
Status: FAILED
Processed: 187/450 users
Error: Duplicate UPN constraint violation
Table: LaborUser

Shift scheduling is incomplete because supervisors and operators aren’t fully imported. Has anyone successfully completed large Azure AD bulk imports? We need to understand if this is a database cleanup issue or connector configuration problem.

Also check your Azure AD connector configuration for the batch size setting. The default batch size of 200 users might be causing transaction timeouts on large imports. Try reducing it to 100 users per batch.

In the connector config file, set `azure.ad.sync.batch.size=100`. This gives each batch more time to complete and commit before moving to the next set of users. Combined with cleaning up the duplicate records, this should resolve your bulk import issues.

Excellent guidance from everyone. We successfully completed the bulk import after following these steps:

1. Azure AD Connector Configuration: First, we reviewed and corrected our connector configuration. The key settings that needed adjustment:

Connector Config File (azure-ad-connector.properties):


azure.ad.sync.batch.size=100
azure.ad.sync.unique.identifier=userPrincipalName
azure.ad.sync.retry.on.conflict=false
azure.ad.sync.transaction.timeout=300

The critical change was reducing batch size from 200 to 100 users. This prevents transaction timeouts and makes it easier to identify problematic users if a batch fails. We also disabled automatic retry on conflicts to avoid creating more duplicate records.

2. User Principal Name Uniqueness Check: We ran the SQL query Mike suggested and found 23 duplicate UPNs:


SELECT UserPrincipalName, COUNT(*) as DuplicateCount,
       STRING_AGG(CAST(UserId AS VARCHAR), ',') as UserIds
FROM LaborUser
GROUP BY UserPrincipalName
HAVING COUNT(*) > 1;

This showed us exactly which users had duplicates and their internal IDs. Most duplicates had null values for EmployeeNumber, Department, or ShiftGroup - clear indicators they were incomplete records from failed sync attempts.

3. Database Cleanup Process: We used the Labor Management API to safely remove duplicate records rather than direct database deletes:

Step 1 - Identify incomplete records: For each duplicate UPN, we called the API to get full user details:


GET /api/v1/labor/users?upn={userPrincipalName}

This returned all records with that UPN. Incomplete records were missing required fields like EmployeeNumber or had null Department values.

Step 2 - Deactivate duplicates: For each incomplete user ID, we deactivated first:


PATCH /api/v1/labor/users/{userId}
{
  "status": "INACTIVE",
  "reason": "Duplicate record cleanup"
}

Step 3 - Remove duplicates: After deactivation, we deleted the incomplete records:


DELETE /api/v1/labor/users/{userId}?force=false

Using force=false ensures the API validates there are no dependent records (shift assignments, attendance logs) before deletion. If dependencies exist, the API returns a 409 conflict and you need to handle those manually.

We cleaned up all 23 duplicate UPNs using this process, which took about 2 hours with proper validation at each step.

4. Pre-Import Validation: Before running the bulk import again, we added a validation step:

Check Azure AD users against existing LaborUser table:


SELECT azad.UserPrincipalName
FROM AzureADSyncStaging azad
INNER JOIN LaborUser lu ON azad.UserPrincipalName = lu.UserPrincipalName
WHERE lu.Status = 'ACTIVE';

This query (using the staging table the connector populates before sync) identified 12 users who already existed as active records. We excluded these from the bulk import to avoid conflicts.
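If you can’t query the staging table directly, the same pre-flight check can be done in code against UPN exports from each side. A minimal sketch, assuming you have the Azure AD UPNs and the active LaborUser UPNs as plain lists (UPN comparison is case-insensitive):

```python
def upns_to_import(azure_upns, existing_active_upns):
    """Return the Azure AD UPNs not already present as active LaborUser records."""
    existing = {u.strip().lower() for u in existing_active_upns}
    # preserve the original order while skipping users that would collide
    return [u for u in azure_upns if u.strip().lower() not in existing]
```

With the numbers from this thread, 450 Azure AD UPNs minus the 12 already-active ones leaves the 438 users to import.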

5. Successful Bulk Import: With clean database and optimized connector config, the bulk import completed successfully:

  • Total users imported: 438 (450 minus 12 existing active users)
  • Batches processed: 4 batches of 100 users, 1 batch of 38 users
  • Duration: 18 minutes
  • Errors: 0

Post-Import Verification: We validated the import by checking:

  1. All users have unique UPNs in LaborUser table
  2. EmployeeNumber, Department, and ShiftGroup populated for all users
  3. User status is ACTIVE for all imported records
  4. Shift scheduling can now assign imported supervisors and operators

Key Lessons:

  • Always clean up failed import attempts before retrying
  • Use the Labor Management API for cleanup, never direct database deletes
  • Reduce batch size for large imports to prevent transaction timeouts
  • Validate Azure AD users against existing records before bulk import
  • The Azure AD connector doesn’t automatically handle cleanup of partial imports - you must do this manually

Our shift scheduling is now fully operational with all 450 users properly imported and assigned to their respective departments and shift groups.

Don’t delete directly from the database - you’ll break referential integrity. Use the Labor Management API to deactivate and then remove the duplicate users. The API handles cascade deletes for related records like shift assignments and attendance logs.

First identify which duplicate entries are incomplete (usually the ones with null EmployeeNumber or missing Department values). Those are safe to remove through the API. Keep the complete records.

This typically happens when previous failed sync attempts left partial user records in the labor management database. The Azure AD connector checks for UPN uniqueness but doesn’t automatically clean up orphaned records from failed imports.

Query the LaborUser table to find duplicate or incomplete entries. You’ll likely see users with matching UPNs but different internal IDs from the failed sync attempts.

I agree with the orphaned records theory. We had this exact issue during our migration. Run a SQL query to identify duplicate UPNs in the LaborUser table:


SELECT UserPrincipalName, COUNT(*)
FROM LaborUser
GROUP BY UserPrincipalName
HAVING COUNT(*) > 1;

This will show which UPNs have multiple entries. Before bulk import, you need to clean these up or the connector will hit constraint violations.
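If you’d rather run this check from an exported user list than from SQL, the same grouping is a few lines of Python. A sketch only; the list-of-dicts shape with `UserPrincipalName` and `UserId` keys is an assumption:

```python
from collections import defaultdict

def find_duplicate_upns(users):
    """Group user records by UPN (case-insensitive) and return only the UPNs
    that have more than one record, mapped to their internal user IDs."""
    by_upn = defaultdict(list)
    for u in users:
        by_upn[u["UserPrincipalName"].lower()].append(u["UserId"])
    return {upn: ids for upn, ids in by_upn.items() if len(ids) > 1}
```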