Bug 1140405

Summary:

systemctl start docker fails because systemd continuously restarts the daemon

Product:

[Fedora] Fedora

Reporter:

Toshio Ernie Kuratomi <a.badger>

Component:

docker-io

Assignee:

Lokesh Mandvekar <lsm5>

Status:

CLOSED EOL

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

unspecified

Docs Contact:

Priority:

medium

Version:

CC:

a.badger, admiller, dwalsh, golang-updates, hushan.jia, jperrin, mattdm, mgoldman, s, vbatts

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2015-06-30 01:08:40 UTC

Type:

Bug

Attachments:

Description	Flags
Output of journalctl -u docker --no-pager -l	none
Some systemctl status -l output	none

Description Toshio Ernie Kuratomi 2014-09-10 21:40:14 UTC

Description of problem:

I've installed docker for the first time and tried to start it with "systemctl start docker". The systemctl command returned successfully, but docker client commands run against the daemon then timed out. After some poking around I discovered that systemd was starting docker, and docker was taking quite a while to do various initialization tasks, including invoking mkfs. systemd decided that docker was unresponsive, terminated it, and then restarted it. Because mkfs hadn't finished, docker had to run mkfs again. This cycle kept repeating and would probably have prevented docker from ever fully starting up.

I worked around the problem by telling systemd not to start docker, running the docker daemon manually from a shell, waiting until the mkfs had completed, then shutting down my daemon and rerunning systemctl start docker. After that, the docker service runs fine.

Version-Release number of selected component (if applicable):

docker-io-1.2.0-2.fc20.x86_64

How reproducible:
Every time for me, until I ran docker as a daemon manually. I don't know how to reproduce it once docker has initialized (probably by removing some file or volume, but I don't know which one).

Steps to Reproduce:
1. On a system that hasn't had docker running before
2. yum install docker-io
3. systemctl start docker
4. watch the output of systemctl status docker -l

Actual results:

systemctl status docker -l will report that docker is in state Activating for several minutes, then show that systemd decided docker wasn't responding, terminated it, and restarted it. The -l output will also show that docker is running mkfs for most of that time and is still running it when docker is terminated.

Expected results:

systemctl status docker -l will show that the state has gone to active (running)

Additional info:

* My filesystem is ext4. The docker initialization is running mkfs.ext4
* I'm using a 4-5 year old laptop with platter HDs. A faster machine or SSD drives might run mkfs quickly enough to not see this issue.
* This might be "fixed" by adding some documentation that says to perform certain steps to initialize docker before running systemctl start docker rather than changing docker code to finish initialization sooner.

Comment 1 Toshio Ernie Kuratomi 2014-09-10 21:43:15 UTC

Created attachment 936326 [details]
Output of journalctl -u docker --no-pager -l

Here's output from journalctl. You can see that at first systemd is starting docker, deciding that it timed out, terminating it, and then restarting it.

The eventual successful start by systemd at the bottom of the log comes after I manually ran the daemon so that the mkfs would complete.

Comment 2 Toshio Ernie Kuratomi 2014-09-10 21:49:44 UTC

Created attachment 936327 [details]
Some systemctl status -l output

Here's a copy and paste of some runs of systemctl status -l from while I was still debugging this. You can see that docker starts up and by 59s it has invoked mkfs.ext4 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 /dev/mapper/docker-253:2-7757935-base

At 1min 26s, the same mkfs is still running. Sometime after that, systemd terminated that docker daemon and tried to start a new one.

Comment 3 Daniel Walsh 2014-09-12 17:54:19 UTC

Is it possible to tell systemd to not restart docker?  Not sure why we would want this autorestarted.
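
For reference, a service's restart policy can be overridden with a systemd drop-in file. The following is a minimal sketch rather than the packaged fix: the function name write_restart_dropin and the file name restart.conf are made up for illustration, and it assumes the restart loop comes from a Restart= setting in docker.service. Note that Restart=no would only stop the restart loop; systemd would still terminate the daemon once the start timeout expires.

```shell
# Hypothetical helper: write a drop-in that disables automatic restarts.
# Usage: write_restart_dropin DROPIN_DIR
write_restart_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    # Restart=no is systemd's default policy; setting it in a drop-in
    # overrides whatever Restart= value the packaged unit file carries.
    printf '[Service]\nRestart=no\n' > "$dropin_dir/restart.conf"
}
```

On a real system this might be called as root with write_restart_dropin /etc/systemd/system/docker.service.d, followed by systemctl daemon-reload so that systemd picks up the change.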

Comment 4 Toshio Ernie Kuratomi 2014-09-18 16:43:00 UTC

Just to note: it's okay for systemd to try starting docker if it's not running; we just don't want it to assume docker is hung and kill it (at least during this initialization step).

Comment 5 Toshio Ernie Kuratomi 2014-09-22 19:53:27 UTC

Confirmed that running systemctl start docker for the first time on an SSD machine was fine. So it seems to be related to how quickly the mkfs runs on the specific hardware.

Comment 6 Lokesh Mandvekar 2015-01-15 21:43:41 UTC

Hi Toshio, sorry to get back so late on this. Could you please retry this with docker-io-1.4.1-5?

Comment 7 Toshio Ernie Kuratomi 2015-01-15 23:55:28 UTC

Still happening. docker-io-1.4.1-5.fc21.x86_64

I'm guessing there's no way to solve this unless you can do one of the following:

* speed up the mkfs that docker is using in its initial run
* push that initialization into something besides service startup
* tell systemd that starting docker should have a longer-than-normal timeout
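
The third option could be sketched with a systemd drop-in that raises TimeoutStartSec for docker.service (systemd's default start timeout is 90 seconds). This is only an illustration under stated assumptions: the function name write_timeout_dropin and the file name timeout.conf are hypothetical, and the 600 second value used in the usage note below is an arbitrary guess, not a measured figure for how long the initial mkfs takes.

```shell
# Hypothetical helper: write a drop-in that lengthens the start timeout.
# Usage: write_timeout_dropin DROPIN_DIR SECONDS
write_timeout_dropin() {
    dropin_dir="$1"
    timeout="$2"
    mkdir -p "$dropin_dir" || return 1
    # TimeoutStartSec= is how long systemd waits for the service to
    # finish starting before it considers the start to have failed.
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" > "$dropin_dir/timeout.conf"
}
```

On a real system this might be called as root with write_timeout_dropin /etc/systemd/system/docker.service.d 600, followed by systemctl daemon-reload and systemctl start docker.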

Comment 8 Daniel Walsh 2015-03-10 00:35:26 UTC

Lokesh, can you see about extending the systemd timeout?

Comment 9 Fedora Admin XMLRPC Client 2015-03-24 03:37:45 UTC

This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 10 Fedora End Of Life 2015-05-29 12:50:57 UTC

This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 20 reaches end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Daniel Walsh 2015-05-29 13:07:41 UTC

Toshio, are you still seeing this problem?

Comment 12 Fedora End Of Life 2015-06-30 01:08:40 UTC

Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 13 Red Hat Bugzilla 2023-09-14 02:47:24 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days