Bug 1436731
| Summary: | Few builds failing due to ContainerCannotRun Error | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Vikas Laad <vlaad> | ||||||
| Component: | docker | Assignee: | Mrunal Patel <mpatel> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 7.0 | CC: | accounts+fedora, amurdaca, aos-bugs, jhonce, jokerman, lsm5, lsu, mifiedle, mmccomas, smahajan, vgoyal, vlaad | ||||||
| Target Milestone: | rc | Keywords: | Extras | ||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2017-08-02 00:11:21 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
Created attachment 1266965 [details]
desc pod details
How was your storage configured? [docker|container]-storage-setup or openshift-ansible? Please attach: - /etc/sysconfig/*-storage-setup - docker log Please attach results from: # docker info # lvs # vgs # dmsetup table # dmsetup status # systemctl status docker.service Thanks Storage was configured using openshift-ansible
/etc/sysconfig/docker-storage-setup has following line
VG=docker_vg
docker logs attached
docker info attached
root@ip-172-31-34-20: ~ # lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
docker-pool docker_vg twi-aot--- 29.70g 85.39 46.64
root@ip-172-31-34-20: ~ # vgs
VG #PV #LV #SN Attr VSize VFree
docker_vg 1 1 0 wz--n- 30.00g 236.00m
root@ip-172-31-34-20: ~ # dmsetup table
docker_vg-docker--pool_tdata: 0 25026560 linear 202:16 67584
docker_vg-docker--pool_tdata: 25026560 37265408 linear 202:16 25159680
docker-202:2-14037750-a94dc897c6727356a0369871ae12903c0ae6254c8fa47aab6f237bcc660f7acb: 0 20971520 thin 253:2 31
docker_vg-docker--pool_tmeta: 0 65536 linear 202:16 25094144
docker-202:2-14037750-6e62818c646dc993c4178c4b0a5ad3afb0ed60c9c418b3bc3f0723266e4dda2b: 0 20971520 thin 253:2 34
docker_vg-docker--pool: 0 62291968 thin-pool 253:0 253:1 1024 24332 1 skip_block_zeroing
root@ip-172-31-34-20: ~ # dmsetup status
docker_vg-docker--pool_tdata: 0 25026560 linear
docker_vg-docker--pool_tdata: 25026560 37265408 linear
docker-202:2-14037750-a94dc897c6727356a0369871ae12903c0ae6254c8fa47aab6f237bcc660f7acb: 0 20971520 thin 496640 20971519
docker_vg-docker--pool_tmeta: 0 65536 linear
docker-202:2-14037750-6e62818c646dc993c4178c4b0a5ad3afb0ed60c9c418b3bc3f0723266e4dda2b: 0 20971520 thin 946176 20971519
docker_vg-docker--pool: 0 62291968 thin-pool 126045 3821/8192 51947/60832 - rw no_discard_passdown queue_if_no_space -
root@ip-172-31-34-20: ~ # systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2017-03-27 12:06:00 EDT; 1 day 22h ago
Docs: http://docs.docker.com
Main PID: 14089 (dockerd-current)
Memory: 379.3M
CGroup: /system.slice/docker.service
├─14089 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libe...
├─14096 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime ...
├─16588 /usr/bin/docker-containerd-shim-current e7db9efaaf3b6a2ba1b945bd2dd3e448c0a64f03ce6fc558cabf1a85be0b3b32 /var/run/docker/libcontainerd/e7db9efaaf3b6a2ba1b945bd2dd3e448c0a64f03ce6fc558cabf1a85be0b3b32 /usr/libexec/docker/docker-runc...
└─17092 /usr/bin/docker-containerd-shim-current d3c1bade7012e8634b47e6e7e3eac4544215c5d9222ea716b2ecdaf94b3b2330 /var/run/docker/libcontainerd/d3c1bade7012e8634b47e6e7e3eac4544215c5d9222ea716b2ecdaf94b3b2330 /usr/libexec/docker/docker-runc...
Mar 29 09:54:31 ip-172-31-34-20.us-west-2.compute.internal oci-systemd-hook[17776]: systemdhook <debug>: Skipping as container command is /usr/bin/openshift-sti-build, not init or systemd
Mar 29 09:54:33 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:54:33.024339234-04:00" level=info msg="{Action=stop, LoginUID=4294967295, PID=62198}"
Mar 29 09:54:33 ip-172-31-34-20.us-west-2.compute.internal oci-register-machine[17839]: 2017/03/29 09:54:33 Register machine: poststop af81974cfba51e7727a582c6b10a923f7a208f8e1c81d2366d5594e59acb0936 0 /var/lib/docker/devicemapper/mnt/5...eb89f72/rootfs
Mar 29 09:54:33 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:54:33.430639015-04:00" level=info msg="{Action=stop, LoginUID=4294967295, PID=62198}"
Mar 29 09:54:33 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:54:33.434910013-04:00" level=error msg="Handler for POST /v1.24/containers/af81974cfba51e7727a582c6b10a923f7a208f8e1c81d2366d5594e59a...ready stopped"
Mar 29 09:54:33 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:54:33.435028249-04:00" level=error msg="Handler for POST /v1.24/containers/af81974cfba51e7727a582c6b10a923f7a208f8e1c81d2366d5594e59a...ready stopped"
Mar 29 09:55:07 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:55:07.432215546-04:00" level=info msg="{Action=remove, LoginUID=4294967295, PID=62198}"
Mar 29 09:55:07 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:55:07.475223495-04:00" level=info msg="{Action=remove, LoginUID=4294967295, PID=62198}"
Mar 29 09:58:02 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:58:02.518478998-04:00" level=info msg="{Action=remove, LoginUID=4294967295, PID=62198}"
Mar 29 09:58:02 ip-172-31-34-20.us-west-2.compute.internal dockerd-current[14089]: time="2017-03-29T09:58:02.548314189-04:00" level=info msg="{Action=remove, LoginUID=4294967295, PID=62198}"
Hint: Some lines were ellipsized, use -l to show in full.
Created attachment 1267313 [details]
docker info
This race condition was fixed upstream with this commit: https://github.com/opencontainers/runc/commit/8f55948aa5c97472c4a8ef6664b10ed662ce1df5 Can this patch be included in the docker package? (It would need to patch the vendored runc dependency) Vivek, If you think lstat error is fixed this bug needs to be updated. Yes, lstat error is fixed in docker version docker-2:1.12.6-18.git29d6f69 Completed 100 cycle of concurrent builds on following version of docker, did not see this error. root@ip-172-31-31-51: ~ # openshift version openshift v3.6.106 kubernetes v1.6.1+5115d708d7 etcd 3.2.0 root@ip-172-31-31-51: ~ # docker version Client: Version: 1.12.6 API version: 1.24 Package version: docker-1.12.6-28.git1398f24.el7.x86_64 Go version: go1.7.4 Git commit: 1398f24/1.12.6 Built: Wed May 17 01:16:44 2017 OS/Arch: linux/amd64 Server: Version: 1.12.6 API version: 1.24 Package version: docker-1.12.6-28.git1398f24.el7.x86_64 Go version: go1.7.4 Git commit: 1398f24/1.12.6 Built: Wed May 17 01:16:44 2017 OS/Arch: linux/amd64 Per comment#10 , move to verified. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2344 |
Description of problem: While running concurrent builds test I am seeing few failures with following error. root@ip-172-31-25-76: ~/svt/openshift_performance/ci/scripts # oc get pods --all-namespaces | grep ContainerCannotRun proj23 cakephp-mysql-example-2-build 0/1 ContainerCannotRun 0 8m proj25 cakephp-mysql-example-2-build 0/1 ContainerCannotRun 0 7m proj27 cakephp-mysql-example-3-build 0/1 ContainerCannotRun 0 1m proj5 cakephp-mysql-example-1-build 0/1 ContainerCannotRun 0 11m oc describe pod has following error in events 5m 5m 1 {kubelet ip-172-31-34-20.us-west-2.compute.internal} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "sti-build" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"linux runtime spec devices: lstat /dev/dm-22: no such file or directory\"}" Version-Release number of selected component (if applicable): openshift v3.5.3 kubernetes v1.5.2+43a9be4 etcd 3.1.0 Steps to Reproduce: 1. Create cluster with 1 master 1 infra and 2 worker nodes 2. Run 24 concurrent cakephp builds So far I have seen 5 failure out of 165 builds. Actual results: ContainerCannotRun Expected results: Build should complete. Additional info: See full describe pod details attached.