Cloud SQL instance access denied after rotating service account keys for database connections

Following our security policy, I rotated service account keys for our production application that connects to Cloud SQL. Immediately after the rotation, all database connections started failing with access denied errors.

Connection error:


ERROR: Access denied for user 'app-sa'@'cloudsqlproxy'
Connection refused: authentication failed
Error code: 28000

The service account key rotation was done through the IAM console, and I updated the key file in our application’s configuration. I’ve verified in IAM that the service account still has the Cloud SQL Client role. The application credential update seems correct, but something is clearly wrong. Did I miss a step in the rotation process?

Here’s the complete zero-downtime rotation procedure, covering the three areas involved: the key rotation sequence, application credential updates, and IAM role verification:

Service Account Key Rotation (Proper Sequence):

Step 1 - Create new key WITHOUT deleting the old one:


gcloud iam service-accounts keys create new-key.json --iam-account=SERVICE_ACCOUNT_EMAIL

Step 2 - Verify the new key is valid:


gcloud auth activate-service-account --key-file=new-key.json
gcloud sql instances list

Step 3 - Keep both keys active during the transition period.
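Before pushing the new key to any secret store, it can help to sanity-check the downloaded file. A minimal sketch, assuming the file is named new-key.json as in Step 1 (`check_key_file` is an illustrative helper, not a gcloud command):

```shell
# check_key_file: confirms a service account key file is valid JSON and prints
# its key ID and account, so you can record them before rolling the key out.
# Returns non-zero if the file cannot be parsed.
check_key_file() {
  python3 -m json.tool "$1" > /dev/null 2>&1 || return 1
  python3 -c '
import json, sys
k = json.load(open(sys.argv[1]))
print("key_id=%s account=%s" % (k.get("private_key_id", ""), k.get("client_email", "")))
' "$1"
}

# Usage (after Step 1 has created the file):
#   check_key_file new-key.json
```

Recording the key ID here also gives you the exact ID to pass to `keys delete` at the end of the rotation.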

Application Credential Update (Kubernetes Example):

Update the Kubernetes secret:


kubectl create secret generic cloudsql-credentials --from-file=key.json=new-key.json --dry-run=client -o yaml | kubectl apply -f -

Perform rolling restart to pick up new credentials:


kubectl rollout restart deployment/your-app-deployment
kubectl rollout status deployment/your-app-deployment

Verify which identity the pods see (note: this queries the metadata server, which reflects Workload Identity or the node’s service account; it does not verify a key file mounted from a secret):


kubectl exec -it POD_NAME -- curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email

IAM Role Verification:

Confirm the service account has necessary roles:


gcloud projects get-iam-policy PROJECT_ID --flatten="bindings[].members" --filter="bindings.members:serviceAccount:SERVICE_ACCOUNT_EMAIL"
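If you keep a saved policy dump, you can also filter it offline. A small sketch (`roles_for_member` is an illustrative helper, not a gcloud feature):

```shell
# roles_for_member: reads an IAM policy JSON file (as produced by
# `gcloud projects get-iam-policy PROJECT_ID --format=json`) and prints
# every role bound to the given member.
roles_for_member() {
  python3 -c '
import json, sys
policy = json.load(open(sys.argv[1]))
member = sys.argv[2]
for binding in policy.get("bindings", []):
    if member in binding.get("members", []):
        print(binding["role"])
' "$1" "$2"
}

# Usage:
#   roles_for_member policy.json "serviceAccount:SERVICE_ACCOUNT_EMAIL"
```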

Required roles for Cloud SQL:

  • roles/cloudsql.client (minimum for Cloud SQL Proxy)
  • roles/cloudsql.instanceUser (if using IAM database authentication)

Verify at database level if using IAM authentication:


GRANT ALL PRIVILEGES ON DATABASE dbname TO "SERVICE_ACCOUNT_EMAIL";
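One detail that trips people up with IAM database authentication: for Cloud SQL for PostgreSQL, the database user name is the service account email with the `.gserviceaccount.com` suffix stripped. A tiny illustrative helper to derive it:

```shell
# iam_db_user: derive the Cloud SQL for PostgreSQL IAM user name from a
# service account email by stripping the ".gserviceaccount.com" suffix.
iam_db_user() {
  printf '%s\n' "${1%.gserviceaccount.com}"
}

# Example:
#   iam_db_user app-sa@my-project.iam.gserviceaccount.com
#   prints: app-sa@my-project.iam
```

So the GRANT above should target that truncated name, quoted, rather than the full email.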

Complete Zero-Downtime Rotation Process:

  1. Pre-rotation: Verify current connectivity and document current key ID
  2. Create new key: Generate new key via console or gcloud, keep old key active
  3. Update credentials: Update all credential stores (secrets, config files, environment variables)
  4. Rolling restart: Restart applications/services in controlled manner:
    • For Kubernetes: `kubectl rollout restart`
    • For Compute Engine: Update instance metadata and restart service
    • For Cloud Run: Deploy new revision with updated secret
  5. Verify connectivity: Test database connections from all services
  6. Monitor: Check application logs for any authentication errors (wait 15-30 minutes)
  7. Delete old key: Only after confirming new key works everywhere:
    
    gcloud iam service-accounts keys delete KEY_ID --iam-account=SERVICE_ACCOUNT_EMAIL
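The steps above can be sketched as one script. This is a sketch only: the names (SA_EMAIL, DEPLOYMENT, SECRET_NAME) are placeholders, and it defaults to a dry-run mode that prints each command instead of executing it, so the plan can be reviewed before a real run.

```shell
# Zero-downtime rotation sketch. DRY_RUN=1 (the default) prints each command
# instead of executing it; set DRY_RUN=0 only after reviewing the plan.
SA_EMAIL="${SA_EMAIL:-app-sa@PROJECT_ID.iam.gserviceaccount.com}"
DEPLOYMENT="${DEPLOYMENT:-your-app-deployment}"
SECRET_NAME="${SECRET_NAME:-cloudsql-credentials}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "DRY-RUN: $*"; else "$@"; fi
}

rotate_key() {
  # Step 2: create the new key; the old key stays active.
  run gcloud iam service-accounts keys create new-key.json --iam-account="$SA_EMAIL"

  # Step 3: update the Kubernetes secret in place.
  if [ "$DRY_RUN" = "1" ]; then
    echo "DRY-RUN: update secret $SECRET_NAME from new-key.json"
  else
    kubectl create secret generic "$SECRET_NAME" \
      --from-file=key.json=new-key.json --dry-run=client -o yaml | kubectl apply -f -
  fi

  # Step 4: rolling restart so pods remount the new secret.
  run kubectl rollout restart "deployment/$DEPLOYMENT"
  run kubectl rollout status "deployment/$DEPLOYMENT"

  # Steps 5-7 (verify, monitor, delete the old key) are deliberately left
  # manual: never automate deleting the old key in the same run.
}
```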
    

Troubleshooting Your Specific Issue:

Your immediate problem is that the pods are still using the deleted key. To fix it:

  1. Create a new service account key (the old one is gone)
  2. Update your Kubernetes secret with the new key
  3. Force pod restart: `kubectl rollout restart deployment/your-deployment`
  4. Watch pod logs: `kubectl logs -f deployment/your-deployment`
  5. Verify Cloud SQL Proxy logs if using proxy sidecar
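To confirm the secret in the cluster and the file the pods actually mounted agree, comparing the `private_key_id` fields works well. A sketch (the secret name and mount path below are assumptions; adjust them to your setup):

```shell
# key_id_of: reads a service account key JSON on stdin and prints its
# private_key_id, so two copies of a credential can be compared quickly.
key_id_of() {
  python3 -c 'import json, sys; print(json.load(sys.stdin).get("private_key_id", ""))'
}

# Key ID according to the cluster secret:
#   kubectl get secret cloudsql-credentials -o jsonpath='{.data.key\.json}' | base64 -d | key_id_of
# Key ID the running pod actually sees (path depends on your volumeMount):
#   kubectl exec POD_NAME -- cat /secrets/key.json | key_id_of
# If the two IDs differ, the pod has not remounted the updated secret yet.
```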

Best Practices to Prevent Future Issues:

  • Implement automated key rotation with tools like Berglas or External Secrets Operator
  • Use Workload Identity instead of service account keys when possible (eliminates key rotation entirely)
  • Set up monitoring alerts for authentication failures
  • Document rotation procedures in runbooks
  • Test rotation process in non-production environments first
  • Never delete old keys until new keys are verified working in all environments
  • Use Secret Manager for centralized credential management instead of Kubernetes secrets

The root cause was deleting the old key before ensuring all applications successfully transitioned to the new key, combined with not performing a rolling restart of Kubernetes pods to reload the updated secret. Always follow the create-update-verify-delete sequence for zero-downtime rotations.

Another thing to verify - when you say you updated the key file, did you update the actual JSON key file that the application is reading? Sometimes there are multiple copies of credential files in different locations (local config, mounted volumes, secrets managers). Make sure you updated the one the application is actually using at runtime.

Ah, I think that’s it. I updated the secret but didn’t do a rolling restart of the deployment. The pods are still using the old mounted secret with the deleted key. How do I properly do this rotation to avoid downtime?

That’s likely your issue. Best practice for key rotation is to create the new key first, update all applications to use it, verify connectivity, and only then delete the old key. If you deleted the old key before confirming the new one was working everywhere, you might have locked yourself out. Check if the new key is actually valid and properly formatted.

Did you restart your application after updating the key file? Service accounts and credential files are typically loaded at application startup, not dynamically. Also, check if your application is using the Cloud SQL Proxy - if so, the proxy process also needs to be restarted to pick up the new credentials.

Also check your Kubernetes secret update. Did you update the secret and then restart the pods? Or did you restart expecting the pods to automatically pick up the new secret? Most Kubernetes deployments don’t automatically reload secrets - you need to do a rolling restart of the deployment after updating secrets.

I did restart the application and I’m pretty sure I updated the right file - it’s mounted from a Kubernetes secret. But now that you mention it, when I created the new key I think I deleted the old one immediately. Could there be a timing issue?