We’re running into persistent UNABLE_TO_LOCK_ROW errors when bulk updating service cases through the REST API. Our integration processes about 500-800 case updates per hour during peak times, and we’re seeing a roughly 15-20% failure rate on bulk API calls.
The error happens inconsistently - the same batch might succeed on retry - but we’re concerned about retry overhead and batch size tuning. Here’s a typical error response:
[
  {
    "success": false,
    "errors": [
      {
        "statusCode": "UNABLE_TO_LOCK_ROW",
        "message": "unable to obtain exclusive access to this record"
      }
    ]
  }
]
We’re using batch sizes of 200 records per API call. The cases being updated often have related records (contacts, accounts) that might be getting locked by other processes. Has anyone dealt with bulk API row locking issues at scale? What batch size and retry strategies work best for high-volume case updates?
The UNABLE_TO_LOCK_ROW error typically occurs when Salesforce can’t acquire an exclusive lock on a record within the timeout period. With service cases, this often happens because:
- Multiple automation processes (workflows, process builders, flows) are firing simultaneously
- The case is being updated by an end user at the exact same time
- Related records (Account, Contact) are locked by another transaction
For bulk operations, I recommend implementing exponential backoff in your retry logic. Start with a 2-second delay, then 4, 8, up to a maximum of 32 seconds. Also consider adding jitter to prevent thundering herd problems when multiple failed batches retry simultaneously.
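A minimal sketch of that delay schedule (the function name and the cap parameter are illustrative):

```javascript
// Exponential backoff with full jitter for lock-error retries.
// Attempt 0 -> 2-4s, attempt 1 -> 4-8s, attempt 2 -> 8-16s, capped at a 32s base.
function computeDelayMs(attempt, maxAttempt = 4) {
  const capped = Math.min(attempt, maxAttempt);
  const baseMs = Math.pow(2, capped + 1) * 1000; // 2s, 4s, 8s, 16s, 32s
  return baseMs + Math.random() * baseMs;        // jitter spreads retries out
}
```

The jitter term means two batches that fail at the same instant almost never retry at the same instant.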
I’ve seen this before with bulk updates. The 200 record batch size might be too aggressive for cases with complex relationships. Try reducing to 50-100 records per batch - it helps reduce lock contention when multiple records share parent objects.
Immediate retry is actually counterproductive with lock errors. The lock might still be held when you retry. Here’s what works well for handling bulk API row locking with proper retry logic and batch size optimization:
Batch Size Strategy:
Reduce your batch size to 50 records maximum for case updates. Cases with relationships to frequently-updated parent objects need smaller batches to minimize lock windows. Monitor your success rate and adjust - if you’re consistently below 95% success, go smaller.
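One way to operationalize the “monitor your success rate and adjust” advice - the function name, bounds, and thresholds below are illustrative, not a Salesforce API:

```javascript
// Shrinks the batch size quickly when the rolling success rate dips below
// the 95% target, and grows it back slowly once things stabilize.
function nextBatchSize(current, successRate, { min = 25, max = 50 } = {}) {
  if (successRate < 0.95) {
    return Math.max(min, Math.floor(current / 2)); // contention: back off fast
  }
  if (successRate > 0.99) {
    return Math.min(max, current + 5);             // healthy: recover slowly
  }
  return current;                                  // hold steady in between
}
```

Halving on trouble and adding a small constant on success mirrors TCP-style congestion control, which behaves well under bursty contention.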
Retry Logic Implementation:
Implement exponential backoff with jitter:
- First retry: Wait 2-4 seconds (random)
- Second retry: Wait 4-8 seconds
- Third retry: Wait 8-16 seconds
- Maximum 3 retries before logging as permanent failure
The jitter prevents synchronized retry storms when multiple batches fail simultaneously.
Code approach:
function retryWithBackoff(batch, attempt = 0) {
  if (attempt >= 3) return logFailure(batch); // 3 retries, then permanent failure
  // Base doubles per attempt (2s, 4s, 8s); full jitter adds up to 1x the base,
  // giving the 2-4s / 4-8s / 8-16s windows described above.
  const baseMs = Math.pow(2, attempt + 1) * 1000;
  const delay = baseMs + Math.random() * baseMs;
  setTimeout(() => executeBatch(batch, attempt + 1), delay);
}
Additional optimizations:
- Sorting Strategy: Sort cases by AccountId before batching. This ensures related cases are processed sequentially, not concurrently.
- Peak Time Handling: During high-activity periods (8-11 AM typically), reduce batch size further to 25-30 records. You can detect this by monitoring your failure rate in real-time.
- Lock Detection: Check whether a case has been recently updated before including it in a batch. For example, skip anything modified in the last 5 minutes - note that SOQL doesn’t support inline datetime arithmetic like NOW() - 5 minutes, so compute the cutoff timestamp in your integration code and bind it into the WHERE clause.
- Parallel Processing Limits: Don’t run more than 3-4 parallel batch processes. Too many concurrent updates increase lock contention exponentially.
- Failed Record Isolation: When a batch fails, break it into individual records and retry each separately after a longer delay (30-60 seconds). This prevents one problematic record from blocking an entire batch.
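A sketch of the failed-record isolation step, written against the error shape from the original question (the function name and the `hardFailures` bucket are illustrative):

```javascript
// After a batch completes, walk the per-record results and separate records
// that hit a row lock (retryable after 30-60s) from genuine data errors.
function splitFailedBatch(records, results) {
  const lockErrors = [];
  const hardFailures = [];
  results.forEach((result, i) => {
    if (result.success) return;
    const locked = result.errors.some(
      (e) => e.statusCode === 'UNABLE_TO_LOCK_ROW'
    );
    (locked ? lockErrors : hardFailures).push(records[i]);
  });
  // lockErrors -> retry individually after a longer delay;
  // hardFailures -> log and alert, retrying won't help.
  return { lockErrors, hardFailures };
}
```

Separating the two buckets matters because retrying a validation failure wastes API calls, while retrying a lock error usually succeeds once the lock holder commits.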
With these strategies, you should see your failure rate drop to under 2%. The key is accepting that some lock contention is unavoidable in a multi-user environment - your retry logic needs to be patient and intelligent rather than aggressive.
Another thing to consider is the order of your updates. If you’re updating cases in random order, you might be hitting the same parent records from different batches simultaneously. Try sorting your cases by AccountId before batching - this way related cases get processed together and you reduce the chance of lock conflicts across batches. We saw our failure rate drop from 18% to about 3% after implementing this sorting strategy.
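The sort-then-chunk step looks roughly like this (the function name and batch size are illustrative):

```javascript
// Sort cases by AccountId before slicing into batches, so cases sharing a
// parent Account land in the same batch instead of competing across batches.
function sortAndChunk(cases, batchSize = 50) {
  const sorted = [...cases].sort((a, b) =>
    (a.AccountId || '').localeCompare(b.AccountId || '')
  );
  const batches = [];
  for (let i = 0; i < sorted.length; i += batchSize) {
    batches.push(sorted.slice(i, i + batchSize));
  }
  return batches;
}
```

An Account can still straddle a chunk boundary, but sorting keeps that to at most one Account per boundary instead of arbitrary overlap.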
Thanks for the suggestions. We implemented sorting by AccountId and that definitely helped. Still seeing some failures though, especially during our morning sync window when users are most active. How aggressive should our retry logic be? Currently we retry immediately once, then give up.
Have you checked whether any long-running processes or reports might be locking the parent Account or Contact records? We had a similar issue where a scheduled report was causing cascading locks. Check the Bulk Data Load Jobs page and the Apex Jobs queue in Setup for any bulk operations or data exports running during your peak update times (the Setup Audit Trail only tracks configuration changes, not data jobs).