Bug 1490500

Summary: [free-int] Spurious registry push failure
Product: OpenShift Online
Reporter: Justin Pierce <jupierce>
Component: Image Registry
Assignee: Ben Parees <bparees>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Dongbo Yan <dyan>
Severity: low
Docs Contact:
Priority: unspecified
Version: 3.x
CC: agladkov, aos-bugs, bparees, jupierce, mfojtik, xtian
Target Milestone: ---
Keywords: OnlineStarter
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-01-19 17:00:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Justin Pierce 2017-09-11 19:06:49 UTC
Description of problem:
A build created on the free-int cluster unexpectedly failed to push to the registry. Subsequent builds worked fine.

Version-Release number of selected component (if applicable):
oc v3.7.0-0.104.0
kubernetes v1.7.0+695f48a16f
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
Just once so far. Several builds following identical steps have not reproduced this failure.

Steps to Reproduce:
1. Create a new Ruby application through the web console using the default GitHub URL

Actual results:
The build was pending for a long time (15 min?) and the push to the registry failed.

Expected results:
Expected build push to succeed quickly. 


Additional info:

[root@free-int-master-3c664 ~]# oc project jmp-test-3
Now using project "jmp-test-3" on server "https://internal.api.free-int.openshift.com:443".

[root@free-int-master-3c664 ~]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
jmp-test-3-1-build   0/1       Error     0          2h

[root@free-int-master-3c664 ~]# oc logs jmp-test-3-1-build
Pulling image "registry.access.redhat.com/rhscl/ruby-23-rhel7@sha256:9c9adb4882400df6f050960bdd029eb1a5fc26ab5df483742b3edee9b76ccf82" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Fetching gem metadata from https://rubygems.org/...............
Fetching version metadata from https://rubygems.org/..
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Installing puma 3.4.0 with native extensions
Installing rack 1.6.4
Using bundler 1.10.6
Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into ./bundle.
---> Cleaning up unused ruby gems ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.

Pushing image 172.30.215.46:5000/jmp-test-3/jmp-test-3:latest ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Registry server Address: 
Registry server User Name: serviceaccount
Registry server Email: serviceaccount
Registry server Password: <<non-empty>>
error: build error: Failed to push image: After retrying 6 times, Push image still failed


** At this point, I tried to reproduce with another build **
 

[root@free-int-master-3c664 ~]# oc get bc
NAME         TYPE      FROM         LATEST
jmp-test-3   Source    Git@master   1

[root@free-int-master-3c664 ~]# oc start-build jmp-test-3
build "jmp-test-3-2" started

[root@free-int-master-3c664 ~]# oc get builds
NAME           TYPE      FROM          STATUS                               STARTED          DURATION
jmp-test-3-1   Source    Git@1f3d5b7   Failed (PushImageToRegistryFailed)   3 hours ago      7m57s
jmp-test-3-2   Source    Git@1f3d5b7   Running                              22 seconds ago   

[root@free-int-master-3c664 ~]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
jmp-test-3-1-build   0/1       Error     0          2h
jmp-test-3-2-build   1/1       Running   0          32s

[root@free-int-master-3c664 ~]# oc logs -f jmp-test-3-2-build
Pulling image "registry.access.redhat.com/rhscl/ruby-23-rhel7@sha256:9c9adb4882400df6f050960bdd029eb1a5fc26ab5df483742b3edee9b76ccf82" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Fetching gem metadata from https://rubygems.org/...............
Fetching version metadata from https://rubygems.org/..
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Installing puma 3.4.0 with native extensions
Installing rack 1.6.4
Using bundler 1.10.6
Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into ./bundle.
---> Cleaning up unused ruby gems ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.

Pushing image 172.30.215.46:5000/jmp-test-3/jmp-test-3:latest ...
Pushed 0/6 layers, 2% complete
Pushed 1/6 layers, 33% complete
Pushed 2/6 layers, 33% complete
Pushed 3/6 layers, 59% complete
Pushed 4/6 layers, 79% complete
Pushed 5/6 layers, 100% complete
Pushed 6/6 layers, 100% complete
Push successful

Comment 4 Justin Pierce 2017-09-11 19:32:22 UTC
Registry logs:
http://file.rdu.redhat.com/~jupierce/share/free-int-failed-docker-push.tgz

[root@free-int-master-3c664 ~]# oc project default
Now using project "default" on server "https://internal.api.free-int.openshift.com:443".
[root@free-int-master-3c664 ~]# oc get pods
NAME                           READY     STATUS    RESTARTS   AGE
docker-registry-21-cb9hd       1/1       Running   0          4d
docker-registry-21-frppl       1/1       Running   0          4d

Comment 5 Ben Parees 2017-10-05 03:59:07 UTC
Alexey, have you had a chance to take a look at the logs Justin provided?

Comment 7 Ben Parees 2017-10-10 12:48:17 UTC
Isn't that comment 4?

Comment 8 Alexey Gladkov 2017-10-10 13:16:47 UTC
Oops. Yes, I missed them.

Comment 9 Alexey Gladkov 2017-10-10 15:04:29 UTC
I do not see any errors on the registry side.

Is this error still reproducible?

Comment 10 Justin Pierce 2017-10-13 17:55:06 UTC
I have not seen it recently, no. However, if there is not enough logging to determine the problem, can we use this BZ to add what is necessary to diagnose it next time?

Comment 11 Ben Parees 2017-10-23 11:43:01 UTC
It doesn't seem to be recurring, so low severity.

Comment 12 Ben Parees 2018-01-19 13:16:53 UTC
I need to check what we're logging in the build logs on a failed push. It might be the case that we weren't even able to reach the registry (hence nothing in the registry logs) and that we didn't log a useful error in the build to indicate why the push failed.

Comment 13 Ben Parees 2018-01-19 17:00:12 UTC
We now log the error that led to the retries/failure, so if we hit this again we'll have more information:

return fmt.Errorf("After retrying %d times, %s image still failed due to error: %v", DefaultPushOrPullRetryCount, actionName, err)

Nothing else we can do with it for now. I'm going to close it, and it can be reopened if further push issues are seen.