Bug 1455991 - Builds are stuck in Running at the push stage
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Oleg Bulatov
QA Contact: Hongkai Liu
Reported: 2017-05-26 11:13 EDT by Vikas Laad
Modified: 2017-08-16 15 EDT
Last Closed: 2017-08-10 01:25:32 EDT
Type: Bug
External Trackers:
Red Hat Product Errata RHEA-2017:1716 (priority: normal, status: SHIPPED_LIVE), Red Hat OpenShift Container Platform 3.6 RPM Release Advisory, last updated 2017-08-10 05:02:50 EDT

Description Vikas Laad 2017-05-26 11:13:07 EDT
Description of problem:
I am running concurrent builds in a scale environment and saw this bug with 250 concurrent builds. The build log stops partway through the image push:

Cloning "https://github.com/redhat-performance/cakephp-ex.git" ...
        Commit: 0014ddebb91bc7dff3a1dabfbd7b51da762a6677 (made changes to enable database example)
        Author: ofthecure <robdean.smith@gmail.com>
        Date:   Mon Apr 25 14:33:06 2016 -0400
DEPRECATED: Use .s2i/bin instead of .sti/bin
---> Installing application source...
Pushing image 10.202.162.54:5000/proj565/cakephp-mysql-example:latest ...
Pushed 3/5 layers, 61% complete
Pushed 4/5 layers, 82% complete

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.6.74
kubernetes v1.6.1+5115d708d7
etcd 3.1.0
# docker version
Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-common-1.12.6-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      3a094bd/1.12.6
 Built:           Thu Mar 16 14:27:53 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-common-1.12.6-14.el7.x86_64
 Go version:      go1.7.4
 Git commit:      3a094bd/1.12.6
 Built:           Thu Mar 16 14:27:53 2017
 OS/Arch:         linux/amd64


How reproducible:
Start concurrent builds for the cakephp app; the problem happens with 250 concurrent builds.
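
A minimal sketch of how such a run could be started, not the actual test harness. It assumes one project per build named proj<N> (the push line in the log shows proj565) and a build config named cakephp-mysql-example in each project. The numbering from 0, the helper names, and the thread pool are assumptions for illustration. The only CLI call used is `oc start-build <buildconfig> -n <project>`.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumed naming, taken from the push line in the build log
# (proj565/cakephp-mysql-example); adjust to the real environment.
BUILD_CONFIG = "cakephp-mysql-example"


def start_build_cmd(project_index, build_config=BUILD_CONFIG):
    """Return the oc command that starts one build in project proj<N>."""
    return ["oc", "start-build", build_config, "-n", f"proj{project_index}"]


def start_concurrent_builds(count, runner=subprocess.run):
    """Start `count` builds at once, one per project, and return the results."""
    cmds = [start_build_cmd(i) for i in range(count)]
    # One worker per build so that all start requests are issued together.
    with ThreadPoolExecutor(max_workers=max(1, count)) as pool:
        return list(pool.map(lambda cmd: runner(cmd, check=False), cmds))


if __name__ == "__main__":
    start_concurrent_builds(250)
```

The `runner` parameter exists only so the command construction can be checked without a cluster.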

Actual results:
Builds are stuck in the Running state.

Expected results:
Builds should finish successfully.
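
One way to spot the stuck builds, as a hedged sketch: list all builds as JSON and report those still in phase Running well after they started. The 30-minute threshold and the function name are assumptions, and the timestamp format is assumed to be the usual UTC form (for example 2017-05-26T15:13:07Z).

```python
import json
import subprocess
from datetime import datetime, timedelta, timezone


def stuck_builds(build_list, now, max_age=timedelta(minutes=30)):
    """Return names of builds in phase Running that started more than max_age ago."""
    stuck = []
    for build in build_list.get("items", []):
        status = build.get("status", {})
        started = status.get("startTimestamp")
        if status.get("phase") != "Running" or not started:
            continue
        # Assumed timestamp format: UTC with a trailing Z.
        started_at = datetime.strptime(started, "%Y-%m-%dT%H:%M:%SZ")
        started_at = started_at.replace(tzinfo=timezone.utc)
        if now - started_at > max_age:
            stuck.append(build["metadata"]["name"])
    return stuck


if __name__ == "__main__":
    out = subprocess.run(
        ["oc", "get", "builds", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    print(stuck_builds(json.loads(out), datetime.now(timezone.utc)))
```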

Additional info:
--- Comment #12 from Jim Minter <jminter@redhat.com> ---
Different bug.  Looking at the environment in question, all the stuck builds
are stuck on the final image push.  In the sample in c10, s2i is pushing to the
Docker daemon and is waiting for the Docker daemon to report completed.  I
think this is most likely to be an OpenShift registry bug or a Docker daemon
bug - I'm not sure which at this point.  Please open a new bz, and I suggest
capturing:

- registry pod goroutines (SIGABRT)
- registry pod log
- docker daemon goroutines on a node hosting a failed build (SIGABRT)
- docker daemon log on same

I am going to provide a link for all these logs.
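
For reference, a rough sketch of commands that could collect the four items requested above. Everything specific here is an assumption: that the registry pod runs in the default project, that the registry process is PID 1 in a container that ships a kill binary, and that the Docker daemon runs as the systemd unit docker on the node. SIGABRT makes a Go program exit with a stack dump, so these commands terminate the registry container and the Docker daemon; run them only on an environment that is already stuck.

```python
import subprocess


def registry_commands(pod, namespace="default"):
    """Commands for the registry pod: log, goroutine dump, log with the dump."""
    return [
        # Save the current log before the process is killed.
        ["oc", "logs", pod, "-n", namespace],
        # Assumes the registry is PID 1 and `kill` exists in the image.
        ["oc", "exec", pod, "-n", namespace, "--", "kill", "-ABRT", "1"],
        # The stack dump goes to stderr of the terminated container.
        ["oc", "logs", pod, "-n", namespace, "--previous"],
    ]


def node_commands(unit="docker"):
    """Commands to run on a node hosting a stuck build (unit name is assumed)."""
    return [
        ["systemctl", "kill", "--signal=SIGABRT", unit],
        # The daemon log, including the stack dump, ends up in the journal.
        ["journalctl", "-u", unit, "--no-pager"],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order and return the captured results."""
    return [runner(cmd, capture_output=True, text=True) for cmd in commands]
```

The pod name is deliberately a parameter, since it differs per deployment.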
Comment 6 Oleg Bulatov 2017-06-02 09:43:45 EDT
https://github.com/docker/distribution/pull/2299
Comment 7 Michal Fojtik 2017-06-12 03:38:09 EDT
(In reply to Oleg Bulatov from comment #6)
> https://github.com/docker/distribution/pull/2299

Oleg, can we pick this fix for registry to close this bug?
Comment 8 Oleg Bulatov 2017-06-12 04:13:50 EDT
Yes, we can. I expected it would be merged into upstream a little bit faster, but they didn't care.
Comment 9 Oleg Bulatov 2017-06-12 05:49:04 EDT
https://github.com/openshift/origin/pull/14581
Comment 10 ge liu 2017-06-13 23:11:57 EDT
The image (devenv-rhel7_6350) for the PR in comment 9 is not ready in AWS yet; we will test it once it is ready.
Comment 11 Vikas Laad 2017-06-14 08:28:20 EDT
Ge Liu, I will test it in the scale environment. I will assign it to myself.
Comment 13 Hongkai Liu 2017-07-07 15:14:31 EDT
Reran the test with 50 concurrent builds; all builds succeeded.
Comment 14 Hongkai Liu 2017-07-07 15:29:14 EDT
(In reply to Hongkai Liu from comment #13)
> Reran the test with 50 concurrent builds; all builds succeeded.

Verified on 3.6.133
Comment 16 errata-xmlrpc 2017-08-10 01:25:32 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
