Bug 1406319 - S2I build randomly got stuck at "Waiting for container to stop ..."
Keywords:
Status: CLOSED DUPLICATE of bug 1405306
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Jim Minter
QA Contact: Wang Haoran
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-20 09:20 UTC by Gan Huang
Modified: 2017-03-08 18:43 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-21 14:19:16 UTC
Target Upstream Version:
Embargoed:


Attachments
Stack trace from hung build. (25.44 KB, text/plain)
2016-12-20 13:42 UTC, Mike Fiedler
Stack trace with loglevel=9 (249.41 KB, text/plain)
2016-12-20 13:51 UTC, Mike Fiedler

Description Gan Huang 2016-12-20 09:20:00 UTC
Description of problem:
When an S2I build is triggered, it randomly (about 20% of the time in my test environment) gets stuck at: Waiting for container "8b581bda6b719653757edac4f111fb6f89ac396912964f390688ddad87347bd3" to stop ...

Version-Release number of selected component (if applicable):
openshift v3.4.0.38
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

docker-1.12.4-3.el7.x86_64

How reproducible:
20%

Steps to Reproduce:
1. Install OCP
2. oc new-app nodejs-mongodb-example
3. If step 2 succeeds, cancel the build and start a new one:
# oc cancel-build nodejs-mongodb-example-1
# oc start-build nodejs-mongodb-example --build-loglevel=5
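
Because the hang is intermittent (about 20% of builds in this report), repeating the build under a time limit makes it easier to count how often it occurs. The following is only a sketch: `count_hangs` is a hypothetical helper, not part of `oc`, and it assumes GNU coreutils `timeout`, which exits with status 124 when the time limit expires.

```shell
# count_hangs TRIES SECONDS CMD...
# Hypothetical helper: runs CMD up to TRIES times, each under a SECONDS
# time limit, and prints how many runs were killed by the time limit.
count_hangs() {
    tries=$1
    limit=$2
    shift 2
    hangs=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        status=0
        # GNU timeout exits with 124 when it had to kill the command.
        timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "$hangs"
}

# Example (assumes `oc start-build --follow` blocks until the build ends;
# the 600-second limit is an arbitrary choice):
#   count_hangs 5 600 oc start-build nodejs-mongodb-example --follow --build-loglevel=5
```

A build that hits the time limit is left running on the cluster, so it still has to be cancelled with `oc cancel-build` afterwards.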

Actual results:

The S2I build got stuck within 5 retries.

# oc build-logs nodejs-mongodb-example-6
I1220 08:41:12.617232       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/pipeline/README.md as src/openshift/pipeline/README.md
I1220 08:41:12.617295       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/templates/nodejs-mongodb-persistent.json as src/openshift/templates/nodejs-mongodb-persistent.json
I1220 08:41:12.617374       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/templates/nodejs-mongodb.json as src/openshift/templates/nodejs-mongodb.json
I1220 08:41:12.717973       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/templates/nodejs.json as src/openshift/templates/nodejs.json
I1220 08:41:12.719450       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/package.json as src/package.json
I1220 08:41:12.719505       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/server.js as src/server.js
I1220 08:41:12.719600       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/tests/app_test.js as src/tests/app_test.js
I1220 08:41:12.719665       1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/views/index.html as src/views/index.html
I1220 08:41:12.981665       1 docker.go:1033] Waiting for container "8b581bda6b719653757edac4f111fb6f89ac396912964f390688ddad87347bd3" to stop ...

On the node where container 8b581 is running:
# docker ps |grep 8b581
8b581bda6b71        registry.ops.openshift.com/rhscl/nodejs-4-rhel7@sha256:196bc359072ae634a37feef918ac9a4d43690b43edf9ef58a4f7703cedabb3de          "container-entrypoint"   10 minutes ago      Up 10 minutes                           s2i_registry_ops_openshift_com_rhscl_nodejs_4_rhel7_sha256_196bc359072ae634a37feef918ac9a4d43690b43edf9ef58a4f7703cedabb3de_824fb89e

Stopping the container manually succeeds:
# docker stop 8b581
8b581
# docker ps |grep 8b581


Expected results:
S2I builds should succeed every time.

Additional info:
*Note*: We disabled SELinux in our test environment as a workaround for BZ#1405306. Not sure whether it is related.

Comment 4 Mike Fiedler 2016-12-20 13:42:54 UTC
Created attachment 1233873
Stack trace from hung build.

This was taken at loglevel 0; let me know if you want another dump at a higher loglevel.

Comment 5 Mike Fiedler 2016-12-20 13:51:13 UTC
Created attachment 1233889
Stack trace with loglevel=9

Comment 6 Cesar Wong 2016-12-20 14:35:27 UTC
The launched assemble container was still waiting for the stdin stream that feeds tar to be closed. This looks like it was addressed more recently by:
https://github.com/openshift/source-to-image/commit/eb59ecae86beedb5547c7da87bc184f78e9b8271

Jim, assigning to you, since it looks like you fixed it. Your change would need to be backported to the 3.4 stream.
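
The symptom described in this comment can be reproduced in miniature without Docker: a process that consumes its stdin, as tar does in the assemble container, exits only once the writing side closes the stream. The sketch below uses `cat` on a named pipe as a stand-in for tar. `demo_stdin_eof` and the FIFO are illustrative assumptions and do not reflect how source-to-image is actually implemented.

```shell
# Hypothetical demonstration: a stdin reader keeps running until the
# writer closes the stream, then exits normally.
demo_stdin_eof() {
    dir=$(mktemp -d)
    mkfifo "$dir/stdin"

    # Stand-in for tar inside the assemble container: read until EOF.
    cat "$dir/stdin" >/dev/null &
    reader=$!

    # Open the write end and send some data, but do not close it yet.
    exec 3>"$dir/stdin"
    echo "payload" >&3
    sleep 1

    # The reader has consumed the data but is still blocked waiting for EOF.
    if kill -0 "$reader" 2>/dev/null; then
        echo "reader still running"
    fi

    # Closing the write end delivers EOF, so the reader can finish.
    exec 3>&-
    wait "$reader"
    echo "reader exited with status $?"

    rm -rf "$dir"
}
```

If the write end is never closed, the reader never exits, which matches a build that waits indefinitely for its container to stop.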

Comment 7 Jim Minter 2016-12-20 15:12:23 UTC
Hi Mike, could you confirm your statement "Seeing this in SVT environments starting with 3.4.0.37".  Do you mean 3.4.0.36 was OK?

I'm concerned this might be a Docker issue - has any Docker RPM version changed?

Comment 8 Mike Fiedler 2016-12-20 15:28:08 UTC
Docker versions seem to be changing quickly in the 1.12.x stream.   This environment is currently 1.12.4-3 which was new late last week.

The docker 1.12.4-3/OCP 3.4.0.37 combo is the first test bed I hit the issue in.   I have not tried to bisect.

Comment 10 Jim Minter 2016-12-20 15:36:27 UTC
I think I've seen a docker container stdin hang elsewhere with source-to-image, also with docker-1.12.4-3.el7.x86_64.  This does not seem to occur for me (for instance) on docker-1.12.2-5.el7.x86_64.  What was the previous docker version you were using?  Do you get the same behaviour that you've reported if you use the same OCP version but with docker-1.12.2-5.el7.x86_64?

Comment 11 Mike Fiedler 2016-12-20 15:39:08 UTC
I'm testing docker-1.12.5-4 right now.   It does not seem to have the issue referenced in the description (https://bugzilla.redhat.com/show_bug.cgi?id=1405306) which was forcing us to disable selinux to run builds.   Will update with results.

Comment 12 Jim Minter 2016-12-20 16:06:12 UTC
Ditto - the issue I've seen, which I think is likely to be the same as this, has not occurred for me on docker-1.12.5-4.el7.x86_64.

Comment 13 Mike Fiedler 2016-12-20 16:09:39 UTC
I am also unable to reproduce on 1.12.5-4.el7.x86_64

Comment 14 Jim Minter 2016-12-20 16:18:36 UTC
This was probably https://github.com/docker/docker/issues/29421, patched by https://github.com/projectatomic/docker/commit/e9e3ab6b6a718118a5928e726ab1297f0b8ef5cd. docker >= 1.12.5-1.el7.x86_64 contains the required patch.
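
A node can be checked against this minimum version with a small comparison helper. This is a rough sketch: `version_ge` is a hypothetical function, and it relies on GNU `sort -V`, which approximates but is not identical to RPM's own version comparison.

```shell
# version_ge A B
# Hypothetical helper: succeeds when version string A sorts at or after B
# under GNU `sort -V` (the smaller of the two must then be B).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example on a node (assumes the package is named "docker"):
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' docker)
#   if version_ge "$installed" "1.12.5-1.el7"; then
#       echo "docker $installed should contain the patch"
#   else
#       echo "docker $installed predates the patch"
#   fi
```

With the versions mentioned in this bug, 1.12.5-4.el7 and 1.12.5-7.el7 pass the check, while 1.12.4-3.el7 and 1.12.2-5.el7 do not.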

Comment 15 Gan Huang 2016-12-21 09:32:34 UTC
Tested against docker-1.12.5-7.el7.x86_64.

Could not reproduce it in 10 S2I builds.

Comment 16 Scott Dodson 2016-12-21 14:19:16 UTC

*** This bug has been marked as a duplicate of bug 1405306 ***

