VPC peering blocks cross-region backup replication traffic for Cloud Storage

Our disaster recovery strategy involves replicating Cloud Storage backups from us-central1 to europe-west1 using a custom application running on Compute Engine instances. We recently implemented VPC peering between our production VPC and a shared services VPC, and now our cross-region backup replication is failing.

The error we’re seeing suggests routing issues:


Error: Connection timeout to storage.googleapis.com
Failed to replicate backup: gs://prod-backups-us/data.tar.gz

Our setup uses Private Google Access to reach Cloud Storage APIs without going through the public internet. The VPC peering was added to allow our production workloads to access services in the shared VPC (monitoring tools, centralized logging). I suspect the peering configuration is interfering with the Private Google Access routes, but I’m not sure how to diagnose or fix this without breaking the peering connection that other teams depend on.

Has anyone encountered routing conflicts between VPC peering and Private Google Access for backup traffic?

Check whether your shared services VPC has custom routes that are being exported to your production VPC. You can see this in the VPC peering configuration under “Import custom routes” and “Export custom routes”. Private Google Access only works when traffic to the Google API addresses leaves through a route whose next hop is the default internet gateway. If the shared services VPC exports a default route or other broad custom routes and your production VPC imports them, an imported route that matches those addresses can pull the traffic into the peered VPC instead.

We solved a similar issue by creating explicit routes for the Private Google Access IP ranges in the production VPC. These are custom static routes that point to the default internet gateway specifically for the Google API ranges. Because Google Cloud picks the most specific matching route before it compares priorities, a narrow route for those ranges wins over a broad imported route, so your backup traffic keeps using Private Google Access regardless of what is imported from the peered VPC. Priority only breaks ties between routes with the same destination range.

Let me provide a complete solution addressing VPC peering route configuration, Private Google Access setup, and firewall rules for your backup traffic.

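Understanding the Root Cause: Private Google Access only works when packets for the Google API addresses leave through a route whose next hop is the default internet gateway (normally the 0.0.0.0/0 default route, priority 1000). Google Cloud selects the route with the most specific destination first and compares priorities only among routes with the same destination. When your shared services VPC exports custom routes and your production VPC imports them, an imported route that covers the Google API addresses more specifically than your default route carries the backup traffic into the peered VPC instead of toward storage.googleapis.com.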

Solution Part 1: VPC Peering Route Configuration

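First, list the routes in your production VPC. Depending on the route type, routes imported over the peering may not all appear in this listing; gcloud compute networks peerings list-routes with --direction=INCOMING shows what the peering actually imports: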


gcloud compute routes list --filter="network:PRODUCTION_VPC"

Check your VPC peering configuration:


gcloud compute networks peerings list --network=PRODUCTION_VPC
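To see exactly which routes arrive over the peering, the list-routes subcommand can be wrapped in a small helper. This is a sketch: the function name and its argument order are made up for illustration, and the peering name, network, and region are whatever your environment uses.

```shell
# Sketch: list the routes this network imports over one peering, per region.
# list_imported_routes is a hypothetical helper name, not a gcloud feature.
list_imported_routes() {
  peering=$1   # peering name as shown by "peerings list"
  network=$2   # e.g. PRODUCTION_VPC
  region=$3    # e.g. us-central1
  gcloud compute networks peerings list-routes "$peering" \
    --network="$network" \
    --region="$region" \
    --direction=INCOMING
}

# Example call (placeholders, not real resource names):
#   list_imported_routes PEERING_NAME PRODUCTION_VPC us-central1
```

Look for imported destination ranges that cover the addresses your instances use to reach storage.googleapis.com.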

If the shared services VPC is exporting custom routes that conflict with Private Google Access, you have two options:

Option A - Disable custom route import (if other teams don’t need it):


gcloud compute networks peerings update PEERING_NAME \
  --network=PRODUCTION_VPC \
  --no-import-custom-routes

Option B - Create explicit, more specific routes for the Google API ranges:


gcloud compute routes create private-google-access-route \
  --network=PRODUCTION_VPC \
  --destination-range=199.36.153.8/30 \
  --next-hop-gateway=default-internet-gateway \
  --priority=100

gcloud compute routes create private-google-access-route-2 \
  --network=PRODUCTION_VPC \
  --destination-range=199.36.153.4/30 \
  --next-hop-gateway=default-internet-gateway \
  --priority=100

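These /30 routes take precedence over broader imported routes because they are more specific, while your VPC peering stays in place. The priority of 100 only matters against another route for the same /30. Note that 199.36.153.8/30 is the private.googleapis.com range and 199.36.153.4/30 is the restricted.googleapis.com range, so these routes only carry your backup traffic if DNS in the VPC resolves storage.googleapis.com to one of them; with default DNS the hostname resolves to public addresses.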
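As a rough illustration of why a narrow route wins, here is a small longest-prefix-match sketch in plain shell. It only models the "most specific destination first" step, not Google Cloud's full route selection, and the function names and the 199.36.0.0/16 sample range are invented for the example.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 is inside CIDR block $2.
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  base=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( 0xFFFFFFFF ^ ((1 << (32 - bits)) - 1) ))
  [ $(( addr & mask )) -eq $(( base & mask )) ]
}

# Print the longest-prefix match for address $1 among the CIDR blocks that follow.
most_specific_route() {
  target=$1
  shift
  best=""
  best_bits=-1
  for cidr in "$@"; do
    cidr_bits=${cidr#*/}
    if ip_in_cidr "$target" "$cidr" && [ "$cidr_bits" -gt "$best_bits" ]; then
      best=$cidr
      best_bits=$cidr_bits
    fi
  done
  echo "$best"
}

# Example: a default route, a hypothetical broad imported route, and the explicit /30.
#   most_specific_route 199.36.153.8 0.0.0.0/0 199.36.0.0/16 199.36.153.8/30
#   prints 199.36.153.8/30
```

Without the /30 in the list, the same call returns the broader 199.36.0.0/16, which is the situation the explicit routes are meant to prevent.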

Solution Part 2: Private Google Access Configuration

Make sure Private Google Access is enabled on your production VPC subnet (this command turns it on if it is not already):


gcloud compute networks subnets update SUBNET_NAME \
  --region=us-central1 \
  --enable-private-ip-google-access

Do the same for your disaster recovery region:


gcloud compute networks subnets update SUBNET_NAME \
  --region=europe-west1 \
  --enable-private-ip-google-access
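If you would rather check the current state than change it, a read-only helper along these lines could work. It is a sketch: pga_enabled is a hypothetical name, and it assumes gcloud prints the privateIpGoogleAccess field as True or False.

```shell
# Sketch: succeed if Private Google Access is enabled on the given subnet.
# pga_enabled is a hypothetical helper; it only reads subnet state.
pga_enabled() {
  subnet=$1   # e.g. SUBNET_NAME
  region=$2   # e.g. us-central1 or europe-west1
  state=$(gcloud compute networks subnets describe "$subnet" \
    --region="$region" \
    --format='value(privateIpGoogleAccess)') || return 2
  # Assumption: the boolean is rendered as True/False; accept either case.
  case "$state" in
    [Tt]rue) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example call (placeholders):
#   pga_enabled SUBNET_NAME us-central1 && echo "enabled" || echo "not enabled"
```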

For cross-region replication, storage.googleapis.com is a global endpoint, so instances in both regions use the same hostname. You can verify connectivity from an instance in each region:


curl -I https://storage.googleapis.com
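One thing to keep in mind is that storage.googleapis.com only resolves to the 199.36.153.8/30 addresses if DNS in the VPC is set up that way. The sketch below shows one way that could look with a Cloud DNS private zone. The helper name and zone name are hypothetical, and a wildcard record for googleapis.com affects every Google API hostname in that VPC, so treat this as a starting point to review rather than a drop-in change.

```shell
# Sketch: point *.googleapis.com at the private.googleapis.com VIP (199.36.153.8/30).
# setup_private_googleapis_dns is a hypothetical helper name.
setup_private_googleapis_dns() {
  zone=$1      # e.g. googleapis-private (hypothetical zone name)
  network=$2   # e.g. PRODUCTION_VPC

  # Private zone for googleapis.com, visible only to the given VPC.
  gcloud dns managed-zones create "$zone" \
    --dns-name="googleapis.com." \
    --visibility=private \
    --networks="$network" \
    --description="Private Google Access via private.googleapis.com" || return 1

  # A record covering the four addresses in 199.36.153.8/30.
  gcloud dns record-sets create "private.googleapis.com." \
    --zone="$zone" --type=A --ttl=300 \
    --rrdatas="199.36.153.8,199.36.153.9,199.36.153.10,199.36.153.11" || return 1

  # Wildcard CNAME so storage.googleapis.com resolves to the VIP.
  gcloud dns record-sets create "*.googleapis.com." \
    --zone="$zone" --type=CNAME --ttl=300 \
    --rrdatas="private.googleapis.com."
}

# Example call (placeholders):
#   setup_private_googleapis_dns googleapis-private PRODUCTION_VPC
```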

Solution Part 3: Firewall Rules for Backup Traffic

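Even with an allow-all egress rule, a specific rule for Private Google Access makes the backup traffic easier to identify when troubleshooting. Firewall Rules Logging is off by default, so add --enable-logging to the command if you want this rule's connections logged: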


gcloud compute firewall-rules create allow-private-google-access \
  --network=PRODUCTION_VPC \
  --action=ALLOW \
  --rules=tcp:443 \
  --destination-ranges=199.36.153.8/30,199.36.153.4/30 \
  --priority=1000 \
  --direction=EGRESS \
  --target-tags=backup-replication

Apply the target tag to your backup replication instances:


gcloud compute instances add-tags INSTANCE_NAME \
  --tags=backup-replication \
  --zone=us-central1-a

Validation and Testing:

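  1. Test connectivity from your Compute Engine instance. Rely on the TCP check on port 443: the private.googleapis.com and restricted.googleapis.com VIPs only support HTTP-based protocols over TCP, so ping is not a reliable test when DNS points at them:

ping -c 3 storage.googleapis.com
telnet storage.googleapis.com 443

  2. Verify the route being used:

ip route get 199.36.153.8

This only shows the guest OS routing table, which normally just points at the subnet gateway. VPC routes, including imported peering routes, are applied outside the instance, so confirm the effective route with the route listings above or with Connectivity Tests in Network Intelligence Center.

  3. Test your backup replication: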

gsutil -D cp gs://prod-backups-us/test.txt gs://prod-backups-europe/test.txt

The -D flag prints detailed debug output for each request, which helps show where the connection stalls. The output can include credentials, so redact it before sharing.
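To check which address the hostname resolves to on the instance, a small classifier like the one below can tell you whether you are hitting one of the Private Google Access VIP ranges. The function name is hypothetical, and the getent call in the comment is just one possible way to get the resolved address.

```shell
# Sketch: classify an IPv4 address against the two Private Google Access VIP ranges.
# classify_google_api_ip is a hypothetical helper name.
classify_google_api_ip() {
  case "$1" in
    199.36.153.8|199.36.153.9|199.36.153.10|199.36.153.11)
      echo "private.googleapis.com" ;;     # 199.36.153.8/30
    199.36.153.4|199.36.153.5|199.36.153.6|199.36.153.7)
      echo "restricted.googleapis.com" ;;  # 199.36.153.4/30
    *)
      echo "other" ;;                      # default domain or anything else
  esac
}

# Example on the instance (assumes getent is available):
#   classify_google_api_ip "$(getent ahostsv4 storage.googleapis.com | awk '{print $1; exit}')"
```

If the result is "other", the explicit /30 routes are not on the path for this hostname and the traffic still depends on whatever route covers the resolved address.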

Additional Recommendations:

  1. Use Cloud Storage Transfer Service: For production backup replication between regions, consider using Cloud Storage Transfer Service instead of custom applications. It handles networking automatically and provides better reliability.

  2. Monitor Route Changes: Route creation and deletion are recorded in Cloud Audit Logs (Admin Activity), so you can set up a log-based alert in Cloud Monitoring on those entries:

Resource: GCE Route
Log: Admin Activity audit log
Alert when: Any route modification occurs

  3. Document Network Architecture: Create a network diagram showing VPC peering relationships and Private Google Access paths to help troubleshoot future issues.
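For the route monitoring recommendation, one way to review recent route changes by hand is to query the Admin Activity audit log. This is a sketch: the helper name is hypothetical, and the filter assumes route changes are logged under the gce_route resource type in your project.

```shell
# Sketch: show recent route changes recorded in the Admin Activity audit log.
# recent_route_changes is a hypothetical helper name.
recent_route_changes() {
  window=${1:-7d}   # how far back to look, e.g. 1d or 7d
  gcloud logging read \
    'resource.type="gce_route" AND logName:"cloudaudit.googleapis.com%2Factivity"' \
    --freshness="$window" \
    --limit=20 \
    --format='table(timestamp, protoPayload.methodName, protoPayload.authenticationInfo.principalEmail, protoPayload.resourceName)'
}

# Example call:
#   recent_route_changes 3d
```

The same filter could serve as the basis for a log-based alert, which avoids having to poll.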

This solution maintains your VPC peering functionality while ensuring backup replication traffic correctly uses Private Google Access paths to reach Cloud Storage APIs across regions.

Thanks for the pointers. I verified that Private Google Access is enabled in both VPCs. I checked the firewall rules and we do have an allow-all egress rule, so that shouldn’t be blocking traffic. The issue seems to be specifically with routing - when I trace the route from a Compute Engine instance, it’s trying to send traffic through the peered VPC instead of directly to the Private Google Access endpoint. How do I force traffic to use the Private Google Access route instead of the peering route?