BigQuery API query jobs stuck in pending state for long-running analytics workloads

Our BigQuery API query jobs are getting stuck in pending state for 15-20 minutes before executing. These are complex analytical queries scanning multiple tables, and the delays are making our analytics pipeline unusable.

I suspect we’re hitting job quotas, but the quota monitoring doesn’t show any limits reached. Query optimization might help, but we’re already using partitioned tables where possible. The jobs eventually run, but the pending time is unacceptable.


POST /bigquery/v2/projects/{project}/jobs
query: SELECT * FROM analytics.user_events
       WHERE event_date >= '2024-01-01'
jobReference.jobId: job_xyz123
Status: PENDING (stuck for 18 minutes)

Anyone experienced similar pending state delays with BigQuery API jobs?

Your pending state delays stem from three interrelated issues that need coordinated fixes. Let me walk through each systematically.

Job quotas and slot contention: You’re hitting soft limits, not hard quota errors. With 20-30 concurrent queries, you’re competing for available slots in the shared on-demand pool. Each complex analytical query can consume 2000+ slots, and BigQuery’s scheduler queues jobs when slot demand exceeds availability. Check actual slot usage:


SELECT
  job_id,
  total_slot_ms,
  total_bytes_processed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
ORDER BY total_slot_ms DESC

If your top queries show 5M+ total_slot_ms, they’re monopolizing slots. Solution: either purchase slot reservations (guaranteed capacity) or implement application-level throttling to limit concurrent jobs to 10-12 for complex queries.
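That throttling can live entirely in the submitting application. Here's a minimal sketch using only the standard library; the run_query stub stands in for a real google-cloud-bigquery client.query(sql).result() call, and the 10-job cap is the limit suggested above:

```python
# Application-level throttling sketch: a semaphore caps in-flight jobs.
# Assumption: run_query is a stub for the real BigQuery client call.
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_JOBS = 10
_slots = threading.BoundedSemaphore(MAX_CONCURRENT_JOBS)
_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0  # high-water mark, to verify the cap holds


def run_query(sql: str) -> str:
    """Stub for the real BigQuery call; sleeps to simulate query latency."""
    time.sleep(0.01)
    return f"done: {sql}"


def submit_throttled(sql: str) -> str:
    """Blocks until one of the MAX_CONCURRENT_JOBS slots frees up."""
    global _in_flight, peak_in_flight
    with _slots:
        with _lock:
            _in_flight += 1
            peak_in_flight = max(peak_in_flight, _in_flight)
        try:
            return run_query(sql)
        finally:
            with _lock:
                _in_flight -= 1


# 30 submitters (matching the 20-30 concurrent jobs described above),
# but at most 10 queries actually in flight at any moment.
with ThreadPoolExecutor(max_workers=30) as pool:
    results = list(pool.map(submit_throttled, [f"SELECT {i}" for i in range(30)]))
```

Queued submitters simply block on the semaphore inside your process instead of piling up as PENDING jobs on BigQuery's scheduler.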

Query optimization: Your query has multiple inefficiencies. First, SELECT * is scanning all columns unnecessarily. Second, the WHERE clause isn’t leveraging partitioning properly. Optimize:


SELECT event_id, user_id, event_type, event_timestamp
FROM analytics.user_events
WHERE DATE(_PARTITIONTIME) >= '2024-01-01'
  AND event_date >= '2024-01-01'
LIMIT 1000000

Filtering on _PARTITIONTIME enables partition pruning, so only the relevant partitions are scanned; note that _PARTITIONTIME exists only on ingestion-time partitioned tables, and if user_events is partitioned on the event_date column instead, the event_date filter alone does the pruning. Selecting specific columns cuts the bytes scanned and billed. LIMIT caps the result set, but be aware it does not reduce bytes scanned under on-demand pricing - treat it as a guardrail against runaway result sizes, not a cost control.
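To confirm the rewrite actually shrinks the scan, compare the before/after runs in the same INFORMATION_SCHEMA view used above (the first job ID comes from this thread; the second is a hypothetical ID for your optimized re-run):

```sql
-- Lower total_bytes_processed on the re-run confirms pruning is working.
SELECT job_id, total_bytes_processed, total_slot_ms
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE job_id IN ('job_xyz123', 'job_xyz123_optimized')  -- second ID is hypothetical
ORDER BY creation_time
```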

Partitioned tables configuration: Verify your table is actually partitioned correctly. Many tables claim to be partitioned but aren’t configured properly:


SELECT
  table_name,
  option_name,
  option_value
FROM analytics.INFORMATION_SCHEMA.TABLE_OPTIONS
WHERE table_name = 'user_events'
  AND option_name IN ('require_partition_filter', 'partition_expiration_days')

These options live in INFORMATION_SCHEMA.TABLE_OPTIONS; the PARTITIONS view lists the partitions themselves, not their settings. If require_partition_filter is false or missing, enable it - this forces all queries to include partition filters, preventing accidental full scans. If INFORMATION_SCHEMA.PARTITIONS returns no partition_id values for the table, it isn't actually partitioned. Recreate it with proper partitioning:


CREATE TABLE analytics.user_events_partitioned
PARTITION BY event_date
CLUSTER BY user_id, event_type
AS SELECT * FROM analytics.user_events

Note that CREATE TABLE ... AS SELECT cannot create an ingestion-time partitioned table, so partition on the event_date column; the existing WHERE event_date >= '2024-01-01' filter then prunes partitions directly. Clustering by frequently filtered columns (user_id, event_type) further reduces data scanned. After implementing these changes, your pending times should drop dramatically from the current 15-20 minutes. For immediate relief, cap concurrent job submissions at 10-12 while you optimize queries, and consider purchasing 500-1000 slot reservations if your workload consistently needs high concurrency.
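One more note on require_partition_filter: enabling it on the existing table doesn't require a rebuild; a single DDL statement does it (table name taken from this thread):

```sql
-- Reject any future query on this table that lacks a partition filter.
ALTER TABLE analytics.user_events
SET OPTIONS (require_partition_filter = TRUE);
```

This makes the missing-filter mistake fail fast at query time instead of silently triggering a full scan.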

Yes, 20-30 concurrent jobs will definitely cause queueing. The default concurrent query limit is 100 for on-demand pricing, but the actual scheduling depends on slot availability and query complexity. Your long-running analytical queries consume more slots, so fewer can run concurrently. Consider using reservations for predictable performance, or implement query throttling in your application to limit concurrent submissions.

Your query is doing a full table scan without partition pruning. If the table is ingestion-time partitioned, a WHERE clause on event_date never touches the partition column - you have to filter on the _PARTITIONTIME pseudo-column for pruning to kick in. (If it's partitioned on the event_date column itself, the existing filter would prune, so check which kind of partitioning you actually have.) Without pruning, BigQuery plans a full scan, which takes longer to schedule.