Bug 1230321
| Summary: | Upgrading atomic 7.1.3 initial docker runs always time-out then eventually pass | ||
|---|---|---|---|
| Product: | [Retired] Atomic | Reporter: | Timothy St. Clair <tstclair> |
| Component: | docker-io | Assignee: | Lokesh Mandvekar <lsm5> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | unspecified | CC: | dwalsh, eparis, lsu, miabbott, mjenner, tstclair, vgoyal |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-07-28 19:46:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Timothy St. Clair
2015-06-10 16:03:11 UTC
Anything in the logs? Could this have something to do with devicemapper?

Very little detail in the logs.

    Jun 10 10:17:15 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Starting Docker Application Container Engine...
    Jun 10 09:58:43 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Unit docker.service entered failed state.
    Jun 10 09:58:43 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Failed to start Docker Application Container Engine.
    Jun 10 09:58:43 host06-rack10.scale.openstack.engineering.redhat.com docker[12239]: time="2015-06-10T09:58:43-04:00" level=info msg="Received signal 'terminated', starting shutdown of docker..."
    Jun 10 09:58:43 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: docker.service operation timed out. Terminating.
    Jun 10 09:57:13 host06-rack10.scale.openstack.engineering.redhat.com docker[12239]: time="2015-06-10T09:57:13-04:00" level=info msg="Listening for HTTP on unix (/var/run/docker.sock)"
    Jun 10 09:57:13 host06-rack10.scale.openstack.engineering.redhat.com docker[12239]: time="2015-06-10T09:57:13-04:00" level=info msg="+job serveapi(unix:///var/run/docker.sock)"
    Jun 10 09:57:13 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Starting Docker Application Container Engine...
    Jun 10 09:56:46 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Unit docker.service entered failed state.
    Jun 10 09:56:46 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Failed to start Docker Application Container Engine.
    Jun 10 09:56:46 host06-rack10.scale.openstack.engineering.redhat.com docker[11877]: time="2015-06-10T09:56:46-04:00" level=info msg="Received signal 'terminated', starting shutdown of docker..."
    Jun 10 09:56:46 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: docker.service operation timed out. Terminating.
    Jun 10 09:55:16 host06-rack10.scale.openstack.engineering.redhat.com docker[11877]: time="2015-06-10T09:55:16-04:00" level=info msg="Listening for HTTP on unix (/var/run/docker.sock)"
    Jun 10 09:55:16 host06-rack10.scale.openstack.engineering.redhat.com docker[11877]: time="2015-06-10T09:55:16-04:00" level=info msg="+job serveapi(unix:///var/run/docker.sock)"
    Jun 10 09:55:16 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Starting Docker Application Container Engine...
    Jun 10 09:54:14 host06-rack10.scale.openstack.engineering.redhat.com systemd[1]: Dependency failed for Docker Application Container Engine.

This does not look related to devicemapper. It looks like systemd does not think that docker started properly and terminates it. Maybe docker has started but did not communicate back to systemd properly? If docker did not start properly, something should have been in the logs.

Is docker set up to do sd_notify? Might be timing out?

In upgrading our cluster, docker always starts successfully on the 4th try on every machine, which is slightly odd. Martin, can you confirm whether you are seeing something similar?

Is this RHEL or Fedora?

I believe he was going from RHEL 7.1.2 to RHEL 7.1.3.

This is atomic 7.1.2 & 7.1.3.

Update: This appears to be an ordering issue with flannel on startup, which I thought was all working. If I `ip link delete docker0` + `systemctl start flannel` + `systemctl start docker`, all is well.

Do you have flannel 'enabled'?

flannel is not enabled. But I would not expect docker start to lag if it's not enabled.

Scratch comment #9, I'm still seeing it on other 7.1.3 machines.

Is this even after the recent compose (should have the docker 1.6.2-14 build)?

Yes, latest 7.1.3 release.

Any change with 7.1.4?

Should be fixed in the docker-1.9 7.2.1 release. Closing this one.
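The "might be timing out?" question in the thread can be checked against the timestamps already quoted in the journal. The sketch below is illustrative only: it takes the docker daemon's own `time=` stamps from the excerpt and measures how long each start attempt ran before the daemon received SIGTERM. The names `JOURNAL`, `ENTRY`, and `startup_windows` are hypothetical helpers, not part of docker or systemd. Both attempts come out at 90 seconds, which equals systemd's default start timeout (`DefaultTimeoutStartSec`). That is consistent with systemd waiting for a readiness signal that never arrived, assuming the unit did not set its own `TimeoutStartSec`, which the report does not show.

```python
import re
from datetime import datetime

# docker daemon lines taken from the journal excerpt in this report
# (syslog prefix dropped; only the time=, level= and msg= fields are kept).
JOURNAL = '''
time="2015-06-10T09:55:16-04:00" level=info msg="+job serveapi(unix:///var/run/docker.sock)"
time="2015-06-10T09:56:46-04:00" level=info msg="Received signal 'terminated', starting shutdown of docker..."
time="2015-06-10T09:57:13-04:00" level=info msg="+job serveapi(unix:///var/run/docker.sock)"
time="2015-06-10T09:58:43-04:00" level=info msg="Received signal 'terminated', starting shutdown of docker..."
'''

# Matches the key="value" layout of the docker log lines above.
ENTRY = re.compile(r'time="([^"]+)" level=\w+ msg="(.*)"$')


def startup_windows(journal):
    """Return seconds from each '+job serveapi' line to the next SIGTERM line."""
    windows = []
    started = None
    for line in journal.strip().splitlines():
        match = ENTRY.search(line)
        if match is None:
            continue  # not a docker daemon line
        stamp = datetime.fromisoformat(match.group(1))
        message = match.group(2)
        if message.startswith("+job serveapi"):
            started = stamp  # daemon began serving its API socket
        elif message.startswith("Received signal 'terminated'") and started is not None:
            windows.append((stamp - started).total_seconds())
            started = None
    return windows


if __name__ == "__main__":
    print(startup_windows(JOURNAL))  # [90.0, 90.0]
```

A fixed 90-second window on every attempt points at a timeout rather than a crash, which matches the observation that nothing else shows up in the docker logs.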