Description of problem:
A build created on the free-int cluster failed unexpectedly to push to the registry. Subsequent builds worked fine.

Version-Release number of selected component (if applicable):
oc v3.7.0-0.104.0
kubernetes v1.7.0+695f48a16f
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
Just once so far. Several builds, following identical steps have not reproduced this failure.

Steps to Reproduce:
1. Create new ruby application through web console using default github URL

Actual results:
Build was pending for a long time (15 min?) and push to registry failed.

Expected results:
Expected build push to succeed quickly.

Additional info:
[root@free-int-master-3c664 ~]# oc project jmp-test-3
Now using project "jmp-test-3" on server "https://internal.api.free-int.openshift.com:443".
[root@free-int-master-3c664 ~]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
jmp-test-3-1-build   0/1       Error     0          2h
[root@free-int-master-3c664 ~]# oc logs jmp-test-3-1-build
Pulling image "registry.access.redhat.com/rhscl/ruby-23-rhel7@sha256:9c9adb4882400df6f050960bdd029eb1a5fc26ab5df483742b3edee9b76ccf82" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Fetching gem metadata from https://rubygems.org/...............
Fetching version metadata from https://rubygems.org/..
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Installing puma 3.4.0 with native extensions
Installing rack 1.6.4
Using bundler 1.10.6
Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into ./bundle.
---> Cleaning up unused ruby gems ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Pushing image 172.30.215.46:5000/jmp-test-3/jmp-test-3:latest ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: serviceaccount
Registry server Password: <<non-empty>>
error: build error: Failed to push image: After retrying 6 times, Push image still failed

** At this point, I try to reproduce with another build **

[root@free-int-master-3c664 ~]# oc get bc
NAME         TYPE      FROM         LATEST
jmp-test-3   Source    Git@master   1
[root@free-int-master-3c664 ~]# oc start-build jmp-test-3
build "jmp-test-3-2" started
[root@free-int-master-3c664 ~]# oc get builds
NAME           TYPE      FROM          STATUS                               STARTED          DURATION
jmp-test-3-1   Source    Git@1f3d5b7   Failed (PushImageToRegistryFailed)   3 hours ago      7m57s
jmp-test-3-2   Source    Git@1f3d5b7   Running                              22 seconds ago
[root@free-int-master-3c664 ~]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
jmp-test-3-1-build   0/1       Error     0          2h
jmp-test-3-2-build   1/1       Running   0          32s
[root@free-int-master-3c664 ~]# oc logs -f jmp-test-3-2-build
Pulling image "registry.access.redhat.com/rhscl/ruby-23-rhel7@sha256:9c9adb4882400df6f050960bdd029eb1a5fc26ab5df483742b3edee9b76ccf82" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Fetching gem metadata from https://rubygems.org/...............
Fetching version metadata from https://rubygems.org/..
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Installing puma 3.4.0 with native extensions
Installing rack 1.6.4
Using bundler 1.10.6
Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into ./bundle.
---> Cleaning up unused ruby gems ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Pushing image 172.30.215.46:5000/jmp-test-3/jmp-test-3:latest ...
Pushed 0/6 layers, 2% complete
Pushed 1/6 layers, 33% complete
Pushed 2/6 layers, 33% complete
Pushed 3/6 layers, 59% complete
Pushed 4/6 layers, 79% complete
Pushed 5/6 layers, 100% complete
Pushed 6/6 layers, 100% complete
Push successful
Registry logs: http://file.rdu.redhat.com/~jupierce/share/free-int-failed-docker-push.tgz

[root@free-int-master-3c664 ~]# oc project default
Now using project "default" on server "https://internal.api.free-int.openshift.com:443".
[root@free-int-master-3c664 ~]# oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-21-cb9hd   1/1       Running   0          4d
docker-registry-21-frppl   1/1       Running   0          4d
Alexey, have you had a chance to take a look at the logs Justin provided?
Isn't that comment 4?
Oops. Yes, I missed them.
I do not see any errors on the registry side. Is this error still reproducible?
I have not seen it recently, no. However, if there is not enough logging to determine the problem, can we use this BZ to add what is necessary to diagnose it next time?
Doesn't seem to be recurring, so setting low severity.
I need to check what we're logging in the build logs on a failed push. It might be the case that we weren't even able to reach the registry (hence nothing in the registry logs) and that we didn't log a useful error in the build to indicate why the push failed.
We now log the error that led to the retries/failure, so if we hit this again we'll have more information:

    return fmt.Errorf("After retrying %d times, %s image still failed due to error: %v", DefaultPushOrPullRetryCount, actionName, err)

Nothing else we can do with it for now. I'm going to close it, and it can be reopened if further push issues are seen.