Bug 1263394
Summary: | Docker does not start up on the node | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Veer Muchandi <veer> | |
Component: | docker | Assignee: | Lokesh Mandvekar <lsm5> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | atomic-bugs <atomic-bugs> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 7.1 | CC: | aos-bugs, dwalsh, jlothian, jokerman, lsm5, mmccomas, sdodson, vgoyal | |
Target Milestone: | rc | Keywords: | Extras, Reopened | |
Target Release: | 7.1 | |||
Hardware: | Unspecified | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | docker-1.9.1-2.el7_2 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1264193 | Environment: | ||
Last Closed: | 2016-01-07 21:59:24 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1264193 |
Description
Veer Muchandi
2015-09-15 18:09:06 UTC
Could you be out of space?

There is space

# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/vda1   20G   5.5G  15G    28%   /
devtmpfs    3.9G  0     3.9G   0%    /dev
tmpfs       3.9G  0     3.9G   0%    /dev/shm
tmpfs       3.9G  17M   3.9G   1%    /run
tmpfs       3.9G  0     3.9G   0%    /sys/fs/cgroup

What happens if you run docker -d?

# docker -d
INFO[0000] +job serveapi(unix:///var/run/docker.sock)
INFO[0000] Listening for HTTP on unix (/var/run/docker.sock)
ERRO[0000] WARNING: No --storage-opt dm.thinpooldev specified, using loopback; this configuration is strongly discouraged for production use

OK, so docker is working without all of the other flags. Vivek, what are the standard docker flags on RHEL?

docker -d --selinux-enabled --storage-opt ...

Could this be running out of the 10 gig of space in the devicemapper?

Try --storage-opt dm.basesize=10G and see if the same problem happens. People have complained that mkfs.xfs takes a long time with 100G thin devices when loopback devices are used on top of cloud storage, and the system times out. In my testing it was taking 30 seconds. Primarily it is a slow-storage problem.

For dm.basesize=10G to be effective, you will have to set up a fresh instance of docker. If you restart an already set up instance with the new parameter, it will not help.

# docker -d --storage-opt dm.basesize=10G
ERRO[0000] WARNING: No --storage-opt dm.thinpooldev specified, using loopback; this configuration is strongly discouraged for production use
ERRO[0000] Unable to delete device: Error running DeleteDevice dm_task_run failed
INFO[0000] +job serveapi(unix:///var/run/docker.sock)
INFO[0000] Listening for HTTP on unix (/var/run/docker.sock)
INFO[0008] +job init_networkdriver()
INFO[0008] -job init_networkdriver() = OK (0)
INFO[0008] Loading containers: start.
..........................................................................................................................................................................................................................................
INFO[0008] Loading containers: done.
INFO[0008] docker daemon: 1.6.2 ba1f6c3/1.6.2; execdriver: native-0.2; graphdriver: devicemapper
INFO[0008] +job acceptconnections()
INFO[0008] -job acceptconnections() = OK (0)
INFO[0008] Daemon has completed initialization

After running docker -d, I tried starting docker again with systemctl start docker and it starts. Not sure what the difference is, but it at least starts up.

Delays happen only the first time, when mkfs.xfs is being done. If you have run docker with "-d" successfully, that means mkfs is done, and when you run it the next time using systemd it succeeds because it does not go through the mkfs step.

If you are running on a loopback device, you should really consider moving to a physical device. We are heavily recommending that people do not run loopback devmapper in production.

In spite of the docker service coming up after I run docker -d, it seems to be causing issues. I am not able to start any pods on OpenShift after this. They are all in pending status.

Does something like

docker run --rm fedora echo hello

work?

Dan, last night I ended up scrapping these boxes as I need OpenShift. I am reinstalling the whole thing again.

OK, reopen if it happens again.

docker-1.8.2-7.el7.x86_64 seems to have reduced the timeout to 60 seconds, which is less than the systemd default of 90 seconds. When this happens I'm unable to start docker after cleaning up /var/lib/docker. I believe the intention was to increase the timeout rather than decrease it, so I think this should be increased to 2 minutes or more. On my laptop I can reliably create the 100G loopback device in 90 seconds but not in 60 seconds. Cloud storage may be considerably slower than the SSD in my laptop, however. I'll try to run a few rounds of tests in AWS and an OpenStack environment, but I may not have time to do so.

Lokesh, let's set the timeout to 5 minutes.

*** Bug 1230389 has been marked as a duplicate of this bug. ***
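The timeout reasoning in the comments above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `start_would_time_out` and `render_dropin` are hypothetical, the 75-second first-start duration is a made-up value chosen to fall between the 60 and 90 seconds mentioned in the thread, and the thread does not say how the packaged fix was actually applied. `TimeoutStartSec` is a real systemd `[Service]` directive, and a drop-in under `/etc/systemd/system/docker.service.d/` is the usual systemd way to override it locally.

```python
# Values taken from the thread; names are hypothetical.
PACKAGED_TIMEOUT_SEC = 60        # timeout reported for docker-1.8.2-7.el7
SYSTEMD_DEFAULT_TIMEOUT_SEC = 90  # systemd default cited in the thread
PROPOSED_TIMEOUT_SEC = 5 * 60    # "set the timeout to 5 minutes"


def start_would_time_out(first_start_sec, timeout_sec):
    """Return True if the first start (which includes mkfs.xfs) outlasts the timeout."""
    return first_start_sec > timeout_sec


def render_dropin(timeout_sec):
    """Render the text of a systemd drop-in that overrides the start timeout."""
    return "[Service]\nTimeoutStartSec={}\n".format(timeout_sec)


if __name__ == "__main__":
    # Assumption: an illustrative first-start duration between 60 and 90 seconds.
    first_start_sec = 75
    for timeout in (PACKAGED_TIMEOUT_SEC, SYSTEMD_DEFAULT_TIMEOUT_SEC, PROPOSED_TIMEOUT_SEC):
        print(timeout, start_would_time_out(first_start_sec, timeout))
    # Text one could place in a drop-in file such as
    # /etc/systemd/system/docker.service.d/timeout.conf (file name is hypothetical).
    print(render_dropin(PROPOSED_TIMEOUT_SEC), end="")
```

After writing such a drop-in, `systemctl daemon-reload` is needed before systemd picks up the new value. Moving off loopback storage, as recommended earlier in the thread, addresses the slow mkfs.xfs itself rather than just tolerating it.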