Docker image deployment fails for compliance module in CI/CD

We’re experiencing persistent failures when deploying our Arena QMS compliance module through our Jenkins CI/CD pipeline. The Docker image push to our private registry consistently times out during the final stage.

Our Jenkins pipeline uses credential binding for registry authentication, but the push operation fails after approximately 5 minutes. We’ve verified the credentials work manually using docker login. The timeout occurs specifically during the compliance module image push (2.3GB size).


docker push registry.company.com/arena-compliance:latest
The push refers to repository [registry.company.com/arena-compliance]
timeout: network connection lost after 300s

This blocks our automated compliance deployments completely. We need to validate registry endpoint connectivity and implement proper timeout configurations. Has anyone configured retry logic with exponential backoff for large Arena QMS images in Jenkins?

For credential binding issues in Jenkins, make sure you’re using the correct credential type. We switched from Username/Password to Docker Registry credential type in Jenkins and that resolved authentication timeouts. Also verify your registry endpoint is reachable from the Jenkins agent - run a curl test to the registry API endpoint before attempting the push. Network ACLs or firewall rules between Jenkins and your private registry could be causing intermittent connectivity issues that manifest as timeouts.

I recommend implementing a comprehensive retry strategy with exponential backoff in your Jenkinsfile. Configure network timeout settings to match your actual network conditions, and add proper validation steps.

For Docker registry credential binding in Jenkins, use the withDockerRegistry step with the proper credential ID:

withDockerRegistry([credentialsId: 'registry-creds', url: 'https://registry.company.com']) {
  sh 'docker push registry.company.com/arena-compliance:latest'
}

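For network timeout configuration, set these environment variables in your Jenkins agent configuration or pipeline. Note that they are client-side settings read by Docker Compose and the Docker SDK for Python, not Docker daemon options, and a plain docker push from the CLI may ignore them, so confirm that they actually change the behavior in your setup: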

environment {
  DOCKER_CLIENT_TIMEOUT = '600'
  COMPOSE_HTTP_TIMEOUT = '600'
}

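Implement retry logic with exponential backoff using the Jenkins retry step. Declare the delay outside the retry block, otherwise it resets to 30 seconds on every attempt and never doubles:

script {
  // Declared outside retry so the doubled delay carries over to the next attempt
  def backoffSeconds = 30
  retry(3) {
    try {
      timeout(time: 15, unit: 'MINUTES') {
        sh 'docker push registry.company.com/arena-compliance:latest'
      }
    } catch (Exception e) {
      // Wait, double the delay, then rethrow so retry starts the next attempt
      sleep(backoffSeconds)
      backoffSeconds *= 2
      throw e
    }
  }
}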
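
If you would rather keep the backoff out of the Jenkinsfile, the same retry-with-doubling-delay idea can live in a small shell helper called from a single sh step. This is only a sketch: retry_with_backoff is a hypothetical function name, not a Docker or Jenkins built-in, and the attempt count and initial delay in the usage comment are the values suggested in this thread.

```shell
#!/usr/bin/env bash
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the delay after each failed attempt.
retry_with_backoff() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while true; do
    # Success on any attempt ends the loop immediately
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage (image name from the original post):
# retry_with_backoff 3 30 docker push registry.company.com/arena-compliance:latest
```

Unlike the Jenkinsfile version, this does not sleep after the final failed attempt, so a hard failure surfaces sooner.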

For registry endpoint connectivity validation, add a pre-flight check:

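# A registry that requires auth answers /v2/ with 401, which still proves it is reachable
status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 https://registry.company.com/v2/)
[ "$status" = "200" ] || [ "$status" = "401" ] || exit 1
docker pull registry.company.com/hello-world:latest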

Key configuration points:

  1. Credential Binding: Use withDockerRegistry instead of manual docker login to ensure proper credential lifecycle management

  2. Timeout Configuration: Set both DOCKER_CLIENT_TIMEOUT and COMPOSE_HTTP_TIMEOUT to at least 600 seconds (10 minutes) for large images. Adjust based on your network speed: estimate the transfer time in seconds as (image_size_GB * 8000) / network_speed_Mbps, then multiply by 1.5 for a safety margin. The factor of 8000 converts gigabytes to megabits

  3. Retry Strategy: Implement 3 retry attempts with exponential backoff starting at 30 seconds. This handles transient network issues without overwhelming the registry

  4. Endpoint Validation: Always validate registry connectivity before attempting push operations. Test with both HTTP API endpoint and a small image pull
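
To make the timeout estimate in point 2 concrete, here is a small sketch of the arithmetic with the unit conversion written out (1 GB is treated as 8000 megabits). The function name estimate_push_seconds is made up for illustration, and the 100 Mbps link speed in the usage comment is an assumed figure, not something measured in this thread.

```shell
#!/usr/bin/env bash
# estimate_push_seconds IMAGE_SIZE_GB LINK_SPEED_MBPS [SAFETY_FACTOR]
# Hypothetical helper: prints the estimated push time in whole seconds,
# rounded up. SAFETY_FACTOR defaults to 1.5.
estimate_push_seconds() {
  awk -v size_gb="$1" -v mbps="$2" -v margin="${3:-1.5}" 'BEGIN {
    # GB -> megabits (x 8000), divide by link speed, apply the safety margin
    seconds = size_gb * 8000 / mbps * margin
    rounded = int(seconds)
    if (rounded < seconds) rounded++   # round up to the next whole second
    print rounded
  }'
}

# Usage: the 2.3GB image over an assumed 100 Mbps link
# estimate_push_seconds 2.3 100
```

For the 2.3GB image on a 100 Mbps link this comes out to roughly 276 seconds, uncomfortably close to the 300s at which the push currently dies, so a slower or shared link would plausibly explain the failure.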

Additional recommendations: Enable Docker BuildKit for better layer caching, and monitor Jenkins agent disk I/O during push operations. Note that --compress is a docker build flag that compresses the build context sent to the daemon; docker push has no such flag, and image layers are already compressed when they are uploaded. If issues persist, check your registry logs for rate limiting or resource constraints on the registry side.

For Arena QMS deployments specifically, we’ve found that splitting the compliance module into smaller microservices helps with image sizes. If that’s not feasible, implement proper health checks in your pipeline. Add a pre-push validation step that tests registry connectivity with a small test image first. This quickly shows whether the problem is authentication or network capacity. We use a 100MB test image push as a canary before attempting the full compliance module push.
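
A minimal sketch of that canary step, assuming both images are already built and tagged on the agent. The function name push_with_canary, the exit codes 2 and 3, and the canary image name in the usage comment are all hypothetical; only the arena-compliance image name comes from the original post.

```shell
#!/usr/bin/env bash
# push_with_canary CANARY_IMAGE FULL_IMAGE
# Hypothetical helper: pushes a small canary image first and only attempts
# the full image if the canary succeeds.
# Returns 2 if the canary push fails, 3 if only the full push fails.
push_with_canary() {
  local canary_image=$1 full_image=$2
  if ! docker push "$canary_image"; then
    echo "canary push failed: check authentication and registry reachability" >&2
    return 2
  fi
  if ! docker push "$full_image"; then
    echo "canary succeeded but full push failed: suspect size, bandwidth, or timeouts" >&2
    return 3
  fi
}

# Usage (canary image name is a placeholder):
# push_with_canary registry.company.com/canary-test:latest \
#   registry.company.com/arena-compliance:latest
```

The distinct exit codes let a later pipeline stage tell an authentication or reachability problem apart from a capacity problem without parsing the log.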