Bug 1490500 - [free-int] Spurious registry push failure
Summary: [free-int] Spurious registry push failure
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Image Registry
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Ben Parees
QA Contact: Dongbo Yan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-11 19:06 UTC by Justin Pierce
Modified: 2018-01-19 17:00 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-19 17:00:12 UTC
Target Upstream Version:
Embargoed:


Description Justin Pierce 2017-09-11 19:06:49 UTC
Description of problem:
A build created on the free-int cluster failed unexpectedly to push to the registry. Subsequent builds worked fine.

Version-Release number of selected component (if applicable):
oc v3.7.0-0.104.0
kubernetes v1.7.0+695f48a16f
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
Just once so far. Several builds following identical steps have not reproduced this failure.

Steps to Reproduce:
1. Create a new Ruby application through the web console using the default GitHub URL

Actual results:
Build was pending for a long time (15 min?) and push to registry failed.

Expected results:
Expected build push to succeed quickly. 


Additional info:

[root@free-int-master-3c664 ~]# oc project jmp-test-3
Now using project "jmp-test-3" on server "https://internal.api.free-int.openshift.com:443".

[root@free-int-master-3c664 ~]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
jmp-test-3-1-build   0/1       Error     0          2h

[root@free-int-master-3c664 ~]# oc logs jmp-test-3-1-build
Pulling image "registry.access.redhat.com/rhscl/ruby-23-rhel7@sha256:9c9adb4882400df6f050960bdd029eb1a5fc26ab5df483742b3edee9b76ccf82" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Fetching gem metadata from https://rubygems.org/...............
Fetching version metadata from https://rubygems.org/..
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Installing puma 3.4.0 with native extensions
Installing rack 1.6.4
Using bundler 1.10.6
Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into ./bundle.
---> Cleaning up unused ruby gems ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.

Pushing image 172.30.215.46:5000/jmp-test-3/jmp-test-3:latest ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Warning: Push failed, retrying in 5s ...
Registry server Address: 
Registry server User Name: serviceaccount
Registry server Email: serviceaccount
Registry server Password: <<non-empty>>
error: build error: Failed to push image: After retrying 6 times, Push image still failed


** At this point, I try to reproduce with another build **
 

[root@free-int-master-3c664 ~]# oc get bc
NAME         TYPE      FROM         LATEST
jmp-test-3   Source    Git@master   1

[root@free-int-master-3c664 ~]# oc start-build jmp-test-3
build "jmp-test-3-2" started

[root@free-int-master-3c664 ~]# oc get builds
NAME           TYPE      FROM          STATUS                               STARTED          DURATION
jmp-test-3-1   Source    Git@1f3d5b7   Failed (PushImageToRegistryFailed)   3 hours ago      7m57s
jmp-test-3-2   Source    Git@1f3d5b7   Running                              22 seconds ago   

[root@free-int-master-3c664 ~]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
jmp-test-3-1-build   0/1       Error     0          2h
jmp-test-3-2-build   1/1       Running   0          32s

[root@free-int-master-3c664 ~]# oc logs -f jmp-test-3-2-build
Pulling image "registry.access.redhat.com/rhscl/ruby-23-rhel7@sha256:9c9adb4882400df6f050960bdd029eb1a5fc26ab5df483742b3edee9b76ccf82" ...
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Fetching gem metadata from https://rubygems.org/...............
Fetching version metadata from https://rubygems.org/..
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
Installing puma 3.4.0 with native extensions
Installing rack 1.6.4
Using bundler 1.10.6
Bundle complete! 2 Gemfile dependencies, 3 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into ./bundle.
---> Cleaning up unused ruby gems ...
Warning: the running version of Bundler is older than the version that created the lockfile. We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.

Pushing image 172.30.215.46:5000/jmp-test-3/jmp-test-3:latest ...
Pushed 0/6 layers, 2% complete
Pushed 1/6 layers, 33% complete
Pushed 2/6 layers, 33% complete
Pushed 3/6 layers, 59% complete
Pushed 4/6 layers, 79% complete
Pushed 5/6 layers, 100% complete
Pushed 6/6 layers, 100% complete
Push successful

Comment 4 Justin Pierce 2017-09-11 19:32:22 UTC
Registry logs:
http://file.rdu.redhat.com/~jupierce/share/free-int-failed-docker-push.tgz

[root@free-int-master-3c664 ~]# oc project default
Now using project "default" on server "https://internal.api.free-int.openshift.com:443".
[root@free-int-master-3c664 ~]# oc get pods
NAME                           READY     STATUS    RESTARTS   AGE
docker-registry-21-cb9hd       1/1       Running   0          4d
docker-registry-21-frppl       1/1       Running   0          4d

Comment 5 Ben Parees 2017-10-05 03:59:07 UTC
Alexey, have you had a chance to take a look at the logs Justin provided?

Comment 7 Ben Parees 2017-10-10 12:48:17 UTC
Isn't that comment 4?

Comment 8 Alexey Gladkov 2017-10-10 13:16:47 UTC
Oops. Yes, I missed them.

Comment 9 Alexey Gladkov 2017-10-10 15:04:29 UTC
I do not see any errors on the registry side.

Does this error still reproduce?

Comment 10 Justin Pierce 2017-10-13 17:55:06 UTC
I have not seen it recently, no. However, if there is not enough logging to determine the problem, can we use this BZ to add what is necessary to diagnose it next time?

Comment 11 Ben Parees 2017-10-23 11:43:01 UTC
Doesn't seem to be recurring, so low severity.

Comment 12 Ben Parees 2018-01-19 13:16:53 UTC
I need to check what we're logging in the build logs on a failed push. It might be the case that we weren't even able to reach the registry (hence nothing in the registry logs) and we didn't log a useful error in the build to indicate why the push failed.

Comment 13 Ben Parees 2018-01-19 17:00:12 UTC
We now log the error that led to the retries/failure, so if we hit this again we'll have more information:

return fmt.Errorf("After retrying %d times, %s image still failed due to error: %v", DefaultPushOrPullRetryCount, actionName, err)

Nothing else we can do with it for now. I'm going to close it, and it can be reopened if further push issues are seen.


