Description of problem:
Triggering an S2I build randomly (about 20% of the time in my test env) gets stuck at:
Waiting for container "8b581bda6b719653757edac4f111fb6f89ac396912964f390688ddad87347bd3" to stop ...

Version-Release number of selected component (if applicable):
openshift v3.4.0.38
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0
docker-1.12.4-3.el7.x86_64

How reproducible:
20%

Steps to Reproduce:
1. Install OCP
2. oc new-app nodejs-mongodb-example
3. Recreate the build if step 2 succeeds
# oc cancel-build nodejs-mongodb-example-1
# oc start-build nodejs-mongodb-example --build-loglevel=5

Actual results:
The S2I build got stuck within 5 retries.
# oc build-logs nodejs-mongodb-example-6
I1220 08:41:12.617232 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/pipeline/README.md as src/openshift/pipeline/README.md
I1220 08:41:12.617295 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/templates/nodejs-mongodb-persistent.json as src/openshift/templates/nodejs-mongodb-persistent.json
I1220 08:41:12.617374 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/templates/nodejs-mongodb.json as src/openshift/templates/nodejs-mongodb.json
I1220 08:41:12.717973 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/openshift/templates/nodejs.json as src/openshift/templates/nodejs.json
I1220 08:41:12.719450 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/package.json as src/package.json
I1220 08:41:12.719505 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/server.js as src/server.js
I1220 08:41:12.719600 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/tests/app_test.js as src/tests/app_test.js
I1220 08:41:12.719665 1 tar.go:320] Adding to tar: /tmp/s2i-build644522434/upload/src/views/index.html as src/views/index.html
I1220 08:41:12.981665 1 docker.go:1033] Waiting for container "8b581bda6b719653757edac4f111fb6f89ac396912964f390688ddad87347bd3" to stop ...
Go to the node where container 8b581 lives:
# docker ps |grep 8b581
8b581bda6b71 registry.ops.openshift.com/rhscl/nodejs-4-rhel7@sha256:196bc359072ae634a37feef918ac9a4d43690b43edf9ef58a4f7703cedabb3de "container-entrypoint" 10 minutes ago Up 10 minutes s2i_registry_ops_openshift_com_rhscl_nodejs_4_rhel7_sha256_196bc359072ae634a37feef918ac9a4d43690b43edf9ef58a4f7703cedabb3de_824fb89e

Stopping the container manually succeeds:
# docker stop 8b581
8b581
# docker ps |grep 8b581

Expected results:
S2I builds should succeed 100% of the time.

Additional info:
*Note*: We disabled SELinux in our test env as a workaround for BZ#1405306. Not sure whether it's related.
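When triaging a hung build, the container ID can be pulled out of the build log instead of being copied by hand. A minimal Python sketch, assuming the log line keeps the wording of the `docker.go` line quoted in the description; the function name `stuck_container_id` is hypothetical and not part of `oc` or source-to-image:

```python
import re

# Pattern modeled on the log line quoted in this report (an assumption:
# other source-to-image versions may word the message differently).
WAIT_RE = re.compile(r'Waiting for container "([0-9a-f]{64})" to stop')


def stuck_container_id(log_text):
    """Return the container ID if the last non-empty log line is the
    'Waiting for container ... to stop' message, otherwise None."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    if not lines:
        return None
    match = WAIT_RE.search(lines[-1])
    return match.group(1) if match else None
```

The first 12 characters of the returned ID (`8b581bda6b71` for the log above) are what `docker ps` shows on the node.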
Created attachment 1233873: Stack trace from hung build. The build ran at loglevel 0; let me know if you want another dump with a higher loglevel.
Created attachment 1233889: Stack trace with loglevel=9
The launched assemble container was still waiting for the stdin feeding tar to close. This looks like it has been addressed more recently by: https://github.com/openshift/source-to-image/commit/eb59ecae86beedb5547c7da87bc184f78e9b8271 Jim, assigning to you, since it looks like you fixed it. Your change would need to be backported to the 3.4 stream.
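To illustrate the kind of hang described here: a process that reads a tar stream from stdin until EOF keeps blocking until every writer of that stream has closed its end. This is a minimal Python sketch of that general pipe behaviour only, not the source-to-image or Docker code; the function name `stream_tar_through_pipe` and the thread standing in for the assemble container are assumptions made for illustration.

```python
import io
import os
import tarfile
import threading


def stream_tar_through_pipe(files, close_writer=True, timeout=2.0):
    """Write a tar stream into a pipe while a reader thread drains it to EOF.

    Returns True if the reader saw EOF within `timeout` seconds.
    """
    read_fd, write_fd = os.pipe()
    done = threading.Event()

    def consumer():
        # Stand-in for the assemble container: read "stdin" until EOF.
        with os.fdopen(read_fd, "rb") as stdin:
            while stdin.read(4096):
                pass
        done.set()

    reader = threading.Thread(target=consumer, daemon=True)
    reader.start()

    writer = os.fdopen(write_fd, "wb")
    # Stream mode ("w|") writes the archive sequentially and leaves `writer` open.
    with tarfile.open(fileobj=writer, mode="w|") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    writer.flush()

    if close_writer:
        writer.close()  # EOF reaches the reader, which can now finish
    finished = done.wait(timeout)
    if not close_writer:
        writer.close()  # clean up so the reader thread can exit
        reader.join()
    return finished
```

With `close_writer=False` the whole archive has already been written, yet the reader is still blocked when the timeout expires, which is the same symptom as a build stuck at "Waiting for container ... to stop".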
Hi Mike, could you confirm your statement "Seeing this in SVT environments starting with 3.4.0.37"? Do you mean 3.4.0.36 was OK? I'm concerned this might be a Docker issue - has the Docker RPM version changed?
Docker versions seem to be changing quickly in the 1.12.x stream. This environment is currently 1.12.4-3 which was new late last week. The docker 1.12.4-3/OCP 3.4.0.37 combo is the first test bed I hit the issue in. I have not tried to bisect.
I think I've seen a docker container stdin hang elsewhere with source-to-image, also with docker-1.12.4-3.el7.x86_64. This does not seem to occur for me (for instance) on docker-1.12.2-5.el7.x86_64. What was the previous docker version you were using? Do you get the same behaviour that you've reported if you use the same OCP version but with docker-1.12.2-5.el7.x86_64?
I'm testing docker-1.12.5-4 right now. It does not seem to have the issue referenced in the description (https://bugzilla.redhat.com/show_bug.cgi?id=1405306) which was forcing us to disable selinux to run builds. Will update with results.
Ditto - the issue I've seen, which I think is likely to be the same as this, has not occurred for me on docker-1.12.5-4.el7.x86_64.
I am also unable to reproduce on docker-1.12.5-4.el7.x86_64.
This was probably https://github.com/docker/docker/issues/29421, fixed by https://github.com/projectatomic/docker/commit/e9e3ab6b6a718118a5928e726ab1297f0b8ef5cd. Docker builds >= 1.12.5-1.el7.x86_64 contain the required patch.
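A quick way to check a node's docker package against that threshold is sketched below in Python. It assumes package strings shaped like the ones in this bug (`docker-<version>-<release>.el7...`) and compares only the numeric version and the leading release number. It is a simplification, not the full RPM version comparison algorithm, and the names `PATCHED`, `parse_docker_nvr` and `has_required_patch` are hypothetical.

```python
import re

# Threshold taken from the comment above: >= 1.12.5-1.el7 contains the patch.
PATCHED = ((1, 12, 5), 1)


def parse_docker_nvr(nvr):
    """Split e.g. 'docker-1.12.4-3.el7.x86_64' into ((1, 12, 4), 3)."""
    match = re.match(r"^docker-(\d+(?:\.\d+)*)-(\d+)\.el7", nvr)
    if not match:
        raise ValueError("unrecognized docker package string: %r" % nvr)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2))


def has_required_patch(nvr):
    """True if the package is at or above the patched build (simplified check)."""
    return parse_docker_nvr(nvr) >= PATCHED
```

Applied to the builds mentioned in this bug, `docker-1.12.2-5.el7.x86_64` and `docker-1.12.4-3.el7.x86_64` fall below the threshold, while `docker-1.12.5-4.el7.x86_64` and `docker-1.12.5-7.el7.x86_64` are at or above it.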
Tested against docker-1.12.5-7.el7.x86_64. Could not reproduce it in 10 S2I builds.
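For a sense of how much 10 clean builds say about an intermittent hang, here is a rough calculation in Python. It assumes the 20% hang rate from the description and that builds fail independently of each other. Both are assumptions, and the helper names are hypothetical.

```python
import math


def miss_probability(p_hang, runs):
    """Chance that `runs` builds all succeed even though each one
    still hangs with probability `p_hang` (independence assumed)."""
    return (1.0 - p_hang) ** runs


def runs_needed(p_hang, confidence):
    """Smallest number of clean runs so that the chance of missing a
    still-present hang drops to 1 - confidence or below."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_hang))
```

Under those assumptions, 10 clean builds would still occur about 11% of the time on an unfixed node (`miss_probability(0.2, 10)`), and about 14 clean builds would be needed to bring that below 5% (`runs_needed(0.2, 0.95)`).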
*** This bug has been marked as a duplicate of bug 1405306 ***