Our ERP system provisions new users by making batch API calls to various GCP services (Cloud Storage, BigQuery, Pub/Sub) using a service account. The batch job runs nightly and processes 200-500 new user accounts. Recently, we’ve seen intermittent failures where API calls fail with authentication errors halfway through the batch.
We’re using service account JSON key authentication. The token appears to expire during long-running batch operations, causing subsequent API calls to fail with 401 Unauthorized errors. This creates incomplete user provisioning and delays employee onboarding by 24+ hours until the next batch run.
Error: Request had invalid authentication
HTTP 401: Invalid authentication credentials
Occurs after ~45-60 minutes into batch job
The IAM token lifetime seems shorter than our batch processing time. How should we handle API authentication for reliable batch operation execution?