Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1455991

Summary: Builds are stuck in Running at the push stage
Product: OpenShift Container Platform
Reporter: Vikas Laad <vlaad>
Component: Image Registry
Assignee: Oleg Bulatov <obulatov>
Status: CLOSED ERRATA
QA Contact: Hongkai Liu <hongkliu>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.6.0
CC: aos-bugs, jminter, mfojtik, mifiedle, obulatov, smunilla, vlaad
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-10 05:25:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vikas Laad 2017-05-26 15:13:07 UTC
Description of problem:
I am running concurrent builds in a scale environment and saw this bug with 250 concurrent builds. The build log stops during the image push:

Cloning "https://github.com/redhat-performance/cakephp-ex.git" ...
        Commit: 0014ddebb91bc7dff3a1dabfbd7b51da762a6677 (made changes to enable database example)
        Author: ofthecure <robdean.smith>
        Date:   Mon Apr 25 14:33:06 2016 -0400
DEPRECATED: Use .s2i/bin instead of .sti/bin
---> Installing application source...
Pushing image 10.202.162.54:5000/proj565/cakephp-mysql-example:latest ...
Pushed 3/5 layers, 61% complete
Pushed 4/5 layers, 82% complete
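
A minimal sketch for spotting this pattern across many build logs, assuming the log uses the "Pushed N/M layers, P% complete" progress format shown above. The helper names (last_push_progress, push_incomplete) are hypothetical and not part of OpenShift or s2i. An incomplete last progress line is only a hint; it indicates a stuck build only if the build also stays in Running.

```python
import re

# Matches progress lines such as "Pushed 3/5 layers, 61% complete".
PUSH_PROGRESS = re.compile(r"Pushed (\d+)/(\d+) layers, (\d+)% complete")


def last_push_progress(log_text):
    """Return (pushed, total, percent) from the last progress line, or None."""
    matches = PUSH_PROGRESS.findall(log_text)
    if not matches:
        return None
    pushed, total, percent = matches[-1]
    return int(pushed), int(total), int(percent)


def push_incomplete(log_text):
    """True if a push reported progress but never reported all layers pushed."""
    progress = last_push_progress(log_text)
    if progress is None:
        return False  # no push progress in this log at all
    pushed, total, _ = progress
    return pushed < total
```

Applied to the excerpt above, last_push_progress yields the 4/5 layers, 82% entry, and push_incomplete flags the log as a candidate.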

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.6.74
kubernetes v1.6.1+5115d708d7
etcd 3.1.0
# docker version
Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-common-1.12.6-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      3a094bd/1.12.6
 Built:           Thu Mar 16 14:27:53 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-common-1.12.6-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      3a094bd/1.12.6
 Built:           Thu Mar 16 14:27:53 2017
 OS/Arch:         linux/amd64


How reproducible:
Start concurrent builds of the cakephp app; the problem occurs with 250 concurrent builds.

Actual results:
Builds are stuck in the Running state.

Expected results:
Builds should finish successfully.

Additional info:
--- Comment #12 from Jim Minter <jminter> ---
Different bug.  Looking at the environment in question, all the stuck builds
are stuck on the final image push.  In the sample in c10, s2i is pushing to the
Docker daemon and is waiting for the Docker daemon to report completed.  I
think this is most likely to be an OpenShift registry bug or a Docker daemon
bug - I'm not sure which at this point.  Please open a new bz, and I suggest
capturing:

- registry pod goroutines (SIGABRT)
- registry pod log
- docker daemon goroutines on a node hosting a failed build (SIGABRT)
- docker daemon log on same

I will provide a link to all of these logs.
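
The goroutine captures requested in comment #12 rely on the Go runtime's default signal handling: a Go program that receives SIGABRT exits with a stack dump of its goroutines on stderr. A minimal sketch, assuming the target (registry or Docker daemon) has not overridden that default and that its stderr is being collected. The function name request_goroutine_dump is hypothetical. Note that this terminates the target process.

```python
import os
import signal


def request_goroutine_dump(pid):
    """Send SIGABRT to the process with the given PID.

    Assumption: the target is a Go binary with default signal handling,
    so it writes its goroutine stacks to stderr and then exits.
    """
    os.kill(pid, signal.SIGABRT)
```

The PID has to be looked up on the node hosting the process, and the dump appears wherever the process's stderr is routed.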

Comment 6 Oleg Bulatov 2017-06-02 13:43:45 UTC
https://github.com/docker/distribution/pull/2299

Comment 7 Michal Fojtik 2017-06-12 07:38:09 UTC
(In reply to Oleg Bulatov from comment #6)
> https://github.com/docker/distribution/pull/2299

Oleg, can we pick this fix for registry to close this bug?

Comment 8 Oleg Bulatov 2017-06-12 08:13:50 UTC
Yes, we can. I expected it to be merged upstream a little faster, but they didn't care.

Comment 9 Oleg Bulatov 2017-06-12 09:49:04 UTC
https://github.com/openshift/origin/pull/14581

Comment 10 ge liu 2017-06-14 03:11:57 UTC
According to the PR in comment 9, the image (devenv-rhel7_6350) is not ready in AWS yet; we will test it once it is ready.

Comment 11 Vikas Laad 2017-06-14 12:28:20 UTC
Ge Liu, I will test it in the scale environment and will assign it to myself.

Comment 13 Hongkai Liu 2017-07-07 19:14:31 UTC
Reran the test with 50 concurrent builds; all builds succeeded.

Comment 14 Hongkai Liu 2017-07-07 19:29:14 UTC
(In reply to Hongkai Liu from comment #13)
> Reran the test with 50 concurrent builds; all builds succeeded.

Verified on 3.6.133

Comment 16 errata-xmlrpc 2017-08-10 05:25:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716