Bulk data import job intermittently drops records in workflow automation

We’re experiencing intermittent record loss during scheduled bulk imports via Experience Platform workflow automation. The workflow processes customer data files (typically 50-80K records) every 6 hours, but we’re seeing random gaps: sometimes 200-300 records simply vanish with no errors in the workflow logs.

The workflow shows successful completion status, but downstream validation reveals missing customer records. We’ve checked file size limits in the workflow configuration, but can’t find documentation on hard limits for bulk imports. The logging level is set to INFO, which might not capture silent data truncation issues.

This creates significant customer data gaps affecting our marketing campaigns. Has anyone dealt with similar silent failures in AEC 2021 bulk import workflows? Need guidance on proper logging configuration and validation checkpoints.

For proper visibility, you need to enable DEBUG level logging in your workflow configuration and implement pre-ingestion row count validation. The workflow should compare source file row counts against ingested record counts. Experience Platform’s Data Ingestion API also provides batch status endpoints that show detailed failure reasons; integrate those checks into your workflow error handling. Without explicit validation logic, silent failures will continue, because the workflow treats partial ingestion as success.
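A minimal sketch of that row-count reconciliation check (function names are illustrative; the ingested count itself would come from the batch status endpoint):

```python
import csv

def source_row_count(path: str) -> int:
    """Count data rows in the source CSV, excluding the header row."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f)) - 1

def ingestion_complete(source_count: int, ingested_count: int) -> bool:
    """Only a full match counts as success; partial ingestion is a failure."""
    return source_count == ingested_count
```

Run this comparison before the workflow reports success, not after the fact.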

Check your workflow’s batch processing configuration. In AEC 2021, the default batch size is 10,000 records, and if your workflow isn’t properly chunking larger files, you’ll get partial ingestion without clear error messages. Also verify that your dataset schema doesn’t have strict validation rules that could silently reject records without logging failures at INFO level.
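If you end up chunking files yourself rather than relying on the default, the logic is straightforward (a generic sketch, not AEC-specific code; 10,000 matches the default batch size mentioned above):

```python
def chunk_records(records: list, batch_size: int = 10_000):
    """Yield successive slices of records, each at most batch_size long."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]
```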

Thanks for the insights. I checked the batch processing settings - we’re using default chunking. The schema does have some required fields, but I’d expect validation errors to show up somewhere. How do I enable more detailed logging to catch these silent truncations? And what’s the recommended approach for downstream validation checkpoints?

We had nearly identical issues in our AEC 2021 implementation. The problem stems from multiple factors working together, and you need to address all four focus areas systematically.

Bulk Import File Size Limits: Experience Platform has a 100MB per file recommendation, but the real constraint is memory allocation per workflow execution. Files with complex nested schemas can hit memory limits well before 100MB. Split your 50-80K record files into smaller batches of 20-25K records maximum. This prevents memory-related silent truncation.
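One way to plan that split (a sketch; the 25K ceiling is the figure suggested above, not a documented platform limit):

```python
import math

def plan_batches(total_records: int, max_batch: int = 25_000) -> list[int]:
    """Split total_records into the fewest near-equal batches, each <= max_batch."""
    n = math.ceil(total_records / max_batch)
    base, extra = divmod(total_records, n)
    # distribute the remainder so batch sizes differ by at most one record
    return [base + (1 if i < extra else 0) for i in range(n)]
```

Near-equal batches keep memory use predictable, instead of one undersized trailing batch and several maxed-out ones.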

Workflow Logging Configuration: Change your logging level from INFO to DEBUG in the workflow settings. Add explicit logging statements at key checkpoints: pre-ingestion row count, post-validation count, and failed record count. Enable Data Ingestion API audit logs through Platform’s monitoring interface; this captures rejection reasons that workflow logs miss.
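The checkpoint logging can be as simple as this (a sketch; the logger name and checkpoint labels are ours):

```python
import logging

logging.basicConfig(level=logging.DEBUG)  # DEBUG instead of the default INFO
log = logging.getLogger("bulk_import")

def log_checkpoints(source_count: int, ingested_count: int, rejected_count: int) -> int:
    """Log the three checkpoint counts; return how many records are unaccounted for."""
    log.debug("pre-ingestion row count: %d", source_count)
    log.debug("post-validation ingested count: %d", ingested_count)
    log.debug("failed record count: %d", rejected_count)
    missing = source_count - ingested_count - rejected_count
    if missing:
        log.error("%d records unaccounted for", missing)
    return missing
```

A non-zero return value is exactly the silent truncation you’re hunting: records that are neither ingested nor explicitly rejected.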

Silent Data Truncation: This happens when schema validation fails without throwing exceptions. Add a pre-ingestion validation step using Platform’s Schema Registry API to validate each record against your dataset schema before batch submission. Records failing validation should be logged to a separate error dataset with rejection reasons. Implement a row count reconciliation check: source_count == ingested_count + rejected_count.
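Sketching that pre-ingestion step (REQUIRED_FIELDS here is a hypothetical stand-in; a real implementation would pull the required fields from the Schema Registry):

```python
REQUIRED_FIELDS = ("customerId", "email")  # hypothetical; read from your schema in practice

def validate_record(record: dict) -> list[str]:
    """Return rejection reasons for a record; an empty list means it is valid."""
    return [f"missing required field: {field}"
            for field in REQUIRED_FIELDS
            if not record.get(field)]

def partition_records(records: list[dict]):
    """Split records into valid/rejected so source == ingested + rejected holds."""
    valid, rejected = [], []
    for record in records:
        reasons = validate_record(record)
        if reasons:
            rejected.append({"record": record, "reasons": reasons})
        else:
            valid.append(record)
    # reconciliation invariant: nothing is silently dropped
    assert len(records) == len(valid) + len(rejected)
    return valid, rejected
```

The rejected list, with its reasons, is what you write to the separate error dataset.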

Downstream Data Validation: Implement a post-ingestion validation workflow that runs 15 minutes after each bulk import. Query the target dataset for the batch ID and compare record counts. Set up alerts when discrepancies exceed 1%. Create a reconciliation report showing: source file name, expected count, ingested count, missing count, and sample missing record IDs. This provides an audit trail for data governance.
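The report and the 1% alert threshold might look like this (field names and the sample file name are illustrative):

```python
def reconciliation_report(source_file: str, expected: int, ingested: int,
                          missing_ids: list[str], threshold: float = 0.01) -> dict:
    """Build the per-batch reconciliation report; flag discrepancies over threshold."""
    missing = expected - ingested
    return {
        "source_file": source_file,
        "expected_count": expected,
        "ingested_count": ingested,
        "missing_count": missing,
        "sample_missing_ids": missing_ids[:10],  # cap the sample for readability
        "alert": expected > 0 and missing / expected > threshold,
    }
```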

Also configure your workflow’s error handling to treat partial success as failure. Use the batch ingestion status API endpoint to verify complete ingestion before marking the workflow as successful. We reduced our data loss from 2-3% to under 0.01% after implementing these changes.
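A sketch of that decision logic (the batch payload shape here is assumed, not taken from the API reference; check the Data Ingestion API docs for the real status and metrics field names):

```python
def workflow_outcome(batch: dict, source_count: int) -> str:
    """Map a batch-status payload to a workflow outcome; partial ingestion fails."""
    ingested = batch.get("metrics", {}).get("recordsWritten", 0)  # field name assumed
    if batch.get("status") == "success" and ingested == source_count:
        return "success"
    return "failure"
```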

I’ve seen this behavior before. Experience Platform has default file size thresholds that aren’t always obvious. For batch ingestion, there’s a 100MB per file soft limit, and records can get silently dropped if individual batches exceed memory allocation during processing. Your 50-80K records might be hitting edge cases depending on record complexity and schema size.

One thing that caught us was network timeout settings during large batch uploads. If your workflow uses REST API calls for ingestion and the connection times out mid-upload, some platforms will accept partial data without flagging it as an error. Check your API timeout configurations and consider implementing resume-on-failure logic with batch identifiers to track partial uploads.
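A minimal sketch of the resume-on-failure bookkeeping with batch identifiers (all names here are illustrative):

```python
def pending_batches(all_batch_ids: list[str], completed_batch_ids: list[str]) -> list[str]:
    """Return batch IDs still needing upload (or re-upload), preserving order."""
    done = set(completed_batch_ids)
    return [batch_id for batch_id in all_batch_ids if batch_id not in done]
```

Persist the completed IDs after each confirmed batch, so a timed-out run restarts from the last confirmed batch instead of re-sending everything or, worse, assuming the partial upload went through.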