GKE node pool autoscaling not triggering when pods exceed CPU requests

We’re running a GKE cluster (v1.14) with autoscaling enabled on our primary node pool. The node pool is configured with a minimum of 3 nodes and a maximum of 15. Over the past week, we’ve noticed that when we scale out our application during peak hours, the cluster autoscaler isn’t adding new nodes as expected.

Our pods have CPU requests set to 500m and limits set to 1000m. When we deploy additional replicas (scaling from 20 to 35 pods), many pods stay in Pending state for 10-15 minutes before nodes are finally added. The node pool shows autoscaling as enabled, but scale-up is far slower than we’d expect.
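For anyone debugging the same thing, a quick way to confirm what the autoscaler sees is to look at the pending pods directly (assuming your kubectl context points at the cluster; `<pending-pod-name>` is a placeholder for one of your own pods):

```shell
# List pods stuck in Pending -- the state the cluster autoscaler reacts to
kubectl get pods --field-selector=status.phase=Pending

# Check the scheduler's reason for one pending pod: a scale-up is only
# triggered if the pod is unschedulable for capacity reasons
# (e.g. "Insufficient cpu"), not if it is blocked by a node
# selector, taint, or affinity rule
kubectl describe pod <pending-pod-name> | grep -A5 Events
```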

Has anyone experienced delayed autoscaling in GKE? We need to understand if this is a configuration issue with our pod resource requests, node pool settings, or the autoscaler itself.

I’ve seen similar behavior. First thing to check: are your pod resource requests accurately reflecting actual usage? The cluster autoscaler only reacts to pods that are unschedulable — i.e. Pending because no node has enough unreserved capacity. If requests are too low, the scheduler keeps packing pods onto existing nodes, nothing goes Pending, and no scale-up is triggered until the nodes are genuinely full.
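To compare requests against real consumption, you can put the two side by side (this needs the Metrics API, which GKE provides by default):

```shell
# Actual CPU/memory consumption per pod
kubectl top pods

# Requested CPU per pod, for comparison against the numbers above
kubectl get pods -o custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu'
```

If `top` shows pods routinely using more than they request, the scheduler is overcommitting nodes and the autoscaler never sees a capacity problem.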

Thanks both. I checked the cluster-autoscaler-status ConfigMap and found some interesting entries about “scale up not needed” even though we had pending pods. Our actual CPU usage averages around 350m per pod, so the 500m request seems reasonable. Could there be an issue with how the autoscaler calculates available capacity?
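For others following along, this is the status ConfigMap I mean — the autoscaler publishes its most recent decisions there, including per-node-group scale-up state and reasons it declined to act:

```shell
# Dump the autoscaler's self-reported status (lives in kube-system)
kubectl describe configmap cluster-autoscaler-status -n kube-system
```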

This might be related to the autoscaler’s scan interval combined with the time it takes to provision new nodes. GCP typically takes 3-5 minutes to spin up a new node, and if your traffic spike ramps faster than that, you’ll see pending pods regardless of configuration. Consider scaling earlier with the Horizontal Pod Autoscaler (a lower CPU target gives you headroom before pods start pending), or overprovisioning with low-priority placeholder pods that get evicted when real workloads need the space. If you can upgrade, node auto-provisioning on GKE 1.15+ can also create appropriately sized node pools on demand.
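A minimal sketch of both suggestions — the deployment name `web`, cluster name `my-cluster`, and the specific limits are placeholders you’d adapt to your setup (check the gcloud reference for the current autoprovisioning flags):

```shell
# HPA with a conservative 60% CPU target, so replicas are added
# well before nodes are saturated and pods start pending
kubectl autoscale deployment web --cpu-percent=60 --min=20 --max=50

# Enable node auto-provisioning with cluster-wide resource limits
# (requires a GKE version that supports it)
gcloud container clusters update my-cluster \
  --enable-autoprovisioning \
  --min-cpu 1 --max-cpu 64 \
  --min-memory 1 --max-memory 256
```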