Bug 1746355 - Error starting daemon: Devices cgroup isn't mounted
Summary: Error starting daemon: Devices cgroup isn't mounted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: moby-engine
Version: 33
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Olivier Lemasle
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker
Duplicates (3): 1751636 1757078 1885433
Depends On:
Blocks:
Reported: 2019-08-28 09:17 UTC by Lukas Slebodnik
Modified: 2021-03-19 20:15 UTC
CC List: 46 users

Fixed In Version: moby-engine-20.10.5-1.fc34
Clone Of:
Environment:
Last Closed: 2021-03-19 20:15:47 UTC
Type: Bug
Embargoed:



Links
System: Github
ID: containerd/containerd issues 3726
Private: 0
Priority: None
Status: closed
Summary: cgroups v2 Support
Last Updated: 2021-01-10 22:07:44 UTC

Description Lukas Slebodnik 2019-08-28 09:17:55 UTC
Description of problem:
The default cgroup hierarchy on f31 is now unified (cgroups v2) (#1732114):
https://fedoraproject.org/wiki/Changes/CGroupsV2
As a result, moby-engine (docker.service) does not work on f31 by default.


Version-Release number of selected component (if applicable):
sh$ rpm -q moby-engine systemd
moby-engine-18.09.8-2.ce.git0dd43dd.fc31.x86_64
systemd-243~rc2-1.fc31.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. boot minimal machine with >= systemd-243~rc2-1.fc31.x86_64
2. dnf install -y moby-engine
3. systemctl start docker.service

Actual results:

sh# systemctl start docker.service
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.

sh# systemctl status docker.service | cat
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2019-08-28 11:16:10 CEST; 4s ago
     Docs: https://docs.docker.com
  Process: 21555 ExecStart=/usr/bin/dockerd --host=fd:// --exec-opt native.cgroupdriver=systemd $OPTIONS (code=exited, status=1/FAILURE)
 Main PID: 21555 (code=exited, status=1/FAILURE)
      CPU: 192ms

Aug 28 11:16:10 kvm-01-guest06.lab.eng.brq.redhat.com systemd[1]: docker.service: Service RestartSec=100ms expired, scheduling restart.
Aug 28 11:16:10 kvm-01-guest06.lab.eng.brq.redhat.com systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Aug 28 11:16:10 kvm-01-guest06.lab.eng.brq.redhat.com systemd[1]: Stopped Docker Application Container Engine.
Aug 28 11:16:10 kvm-01-guest06.lab.eng.brq.redhat.com systemd[1]: docker.service: Start request repeated too quickly.
Aug 28 11:16:10 kvm-01-guest06.lab.eng.brq.redhat.com systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 28 11:16:10 kvm-01-guest06.lab.eng.brq.redhat.com systemd[1]: Failed to start Docker Application Container Engine.

Expected results:
The service docker.service is running without any problem


Additional info:
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.246674880+02:00" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.246765661+02:00" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.246850456+02:00" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.246908728+02:00" level=info msg="containerd successfully booted in 0.005902s"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.250414430+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc00090edb0, READY" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.257215489+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.257326949+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.257423104+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.257490048+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.259201027+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.268368337+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/containerd.sock 0  <nil>}]" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.268525439+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.268608193+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000652c70, CONNECTING" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.268975701+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000652c70, READY" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.269076346+02:00" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/containerd.sock 0  <nil>}]" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.269121236+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.269184880+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000652f40, CONNECTING" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.269537140+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000652f40, READY" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.342301674+02:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.342516945+02:00" level=warning msg="Your kernel does not support cgroup memory limit"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.342566592+02:00" level=warning msg="Unable to find cpu cgroup in mounts"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.342612586+02:00" level=warning msg="Unable to find blkio cgroup in mounts"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.342657229+02:00" level=warning msg="Unable to find cpuset cgroup in mounts"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.342712706+02:00" level=warning msg="mountpoint for pids not found"
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.343018181+02:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.343162904+02:00" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.345078327+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000652f40, TRANSIENT_FAILURE" module=grpc
Aug 28 11:16:09 host.example.com dockerd[21555]: time="2019-08-28T11:16:09.345181054+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc000652f40, CONNECTING" module=grpc
Aug 28 11:16:10 host.example.com dockerd[21555]: Error starting daemon: Devices cgroup isn't mounted
Aug 28 11:16:10 host.example.com audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=docker comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Aug 28 11:16:10 host.example.com systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Aug 28 11:16:10 host.example.com systemd[1]: docker.service: Failed with result 'exit-code'.
Aug 28 11:16:10 host.example.com systemd[1]: Failed to start Docker Application Container Engine.
Aug 28 11:16:10 host.example.com systemd[1]: docker.service: Service RestartSec=100ms expired, scheduling restart.
Aug 28 11:16:10 host.example.com systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Aug 28 11:16:10 host.example.com systemd[1]: Stopped Docker Application Container Engine.
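
The warnings and the final error above are what one would expect when the host runs the unified hierarchy, where there are no per-controller mounts such as devices. A minimal sketch for checking which hierarchy is mounted follows; the function name cgroup_mode is made up for illustration, and the mapping assumes the usual layout (cgroup2fs for v2, tmpfs for the v1 or hybrid layout).

```shell
#!/bin/sh
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup mode.
# cgroup_mode is a hypothetical helper, not part of any package.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;       # unified hierarchy
        tmpfs)     echo "v1" ;;       # legacy or hybrid layout
        *)         echo "unknown" ;;
    esac
}

# stat -f queries the filesystem; %T prints its type in human-readable form.
# Fall back to a placeholder if /sys/fs/cgroup is not available.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo "unavailable")
echo "cgroup filesystem: ${fstype} -> $(cgroup_mode "$fstype")"
```

On a host that reports v2, a moby-engine older than 20.10 would be expected to fail as shown in the log.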

Comment 1 Lukas Slebodnik 2019-08-28 09:20:02 UTC
The workaround is to add the kernel command-line option systemd.unified_cgroup_hierarchy=0

Comment 2 nicolasoliver03 2019-09-13 22:21:35 UTC
I am having the same problem in Fedora IoT 31.
The workaround posted by Lukas also works for me (rpm-ostree kargs --editor, add systemd.unified_cgroup_hierarchy=0, and systemctl reboot)
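
After the reboot, it may be worth confirming that the option actually reached the running kernel. The sketch below is only an illustration; has_karg is a hypothetical helper that looks for the argument as a whole word in a command-line string.

```shell
#!/bin/sh
# has_karg CMDLINE ARG: succeed if ARG appears as a whole word in CMDLINE.
# Hypothetical helper for illustration.
has_karg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

karg="systemd.unified_cgroup_hierarchy=0"
# /proc/cmdline holds the arguments the running kernel was booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)
if has_karg "$cmdline" "$karg"; then
    echo "${karg} is active"
else
    echo "${karg} is not set on this boot"
fi
```

For a non-interactive alternative to the editor, `rpm-ostree kargs --append=systemd.unified_cgroup_hierarchy=0` should work on rpm-ostree systems, and `grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"` on a regular Fedora install; both need root and a reboot.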

Comment 3 Lukas Slebodnik 2019-09-16 15:23:45 UTC
*** Bug 1751636 has been marked as a duplicate of this bug. ***

Comment 4 Fedora Blocker Bugs Application 2019-10-14 21:30:33 UTC
Proposed as a Blocker for 31-final by Fedora user leonid224 using the blocker tracking app because:

 Docker (moby-engine) is a major component with a lot of users. The workaround in the bug proposes switching systemd to use cgroups1 instead of cgroups2. I suspect that cgroups1, while well-tested by the virtue of being used in Fedora for many releases, isn't well-tested specifically with Fedora 31, where cgroups2 is the default and many things might implicitly rely on cgroups2.

Comment 5 Lukas Slebodnik 2019-10-14 22:34:16 UTC
(In reply to Fedora Blocker Bugs Application from comment #4)
> Proposed as a Blocker for 31-final by Fedora user leonid224 using the
> blocker tracking app because:
> 
>  Docker (moby-engine) is a major component with a lot of users. The
> workaround in the bug proposes switching systemd to use cgroups1 instead of
> cgroups2. I suspect that cgroups1, while well-tested by the virtue of being
> used in Fedora for many releases, isn't well-tested specifically with Fedora
> 31, where cgroups2 is the default and many things might implicitly rely on
> cgroups2.

I test cgroups v1 daily on Fedora 31, not just with moby-engine but also with podman.
And life is much more stable with cgroups v1.

Comment 6 Zbigniew Jędrzejewski-Szmek 2019-10-15 06:38:10 UTC
-1 for blocker.

This is unfortunate, but docker is a package with troubled upstream. This issue could have
been handled on the docker side any time during the last ... 5 years (I think that as of
kernel 3.16 from August 2014 the cgroupsv2 api was more or less finalized). We cannot block
or delay Fedora based on the hope that this will happen next week.

The number of users who need docker is a small fraction of Fedora users.

Comment 7 Sam 2019-10-15 11:20:05 UTC
Without Docker how can Fedora serve as a development platform for Kubernetes deployments?

Not supporting Docker seems to remove Fedora from much of the cloud work being done by OpenShift and will set back its adoption significantly in a space that is still gathering interest and momentum.

Comment 8 Pablo Iranzo Gómez 2019-10-15 11:24:44 UTC
(In reply to Sam from comment #7)
> Without Docker how can Fedora serve as a development platform for Kubernetes
> deployments?
> 
> Not supporting Docker seems to remove Fedora from much of the cloud work
> being done by OpenShift and will set back its adoption significantly in a
> space that is still gathering interest and momentum.

In the meantime, you can use podman to run the containers as a workaround (as I do)

Comment 9 Zbigniew Jędrzejewski-Szmek 2019-10-15 12:05:53 UTC
> Without Docker how can Fedora serve as a development platform for Kubernetes deployments?

Let's not get overly dramatic. You can either a) use one of the other implementations, podman,
etc, or b) simply set the kernel option. Having to set a kernel option is not the end of the world.
Running with cgroups v1 is still supported, just not the default.

Comment 10 Sam 2019-10-15 12:40:00 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #9)
> > Without Docker how can Fedora serve as a development platform for Kubernetes deployments?
> 
> a) use one of the other implementations, podman,

The problem is not that Fedora cannot run containers; the problem is that a development environment for many Kubernetes installations needs to run `docker build` with a Dockerfile. If a developer can't build their deployment artifact on their machine, they will be very unlikely to use Fedora. For good or bad, Kubernetes most often means Docker.

> Running with cgroups v1 is still supported, just not the default.

I appreciate this. The engineers I work with will not use Fedora if they must touch kernel arguments.

Comment 11 Sam 2019-10-15 12:42:54 UTC
For others reading, `podman build` does support Dockerfiles. I wasn't clear in my previous post. You _can_ build images, just not with the technology that may finally end up running them.

Comment 12 Lukas Slebodnik 2019-10-15 21:28:38 UTC
BTW moby-engine (docker) can use an OCI runtime that supports cgroups v2, but that is not enough because the docker daemon still expects cgroups v1.

(In reply to Sam from comment #10)
> > Running with cgroups v1 is still supported, just not the default.
> 
> I appreciate this. The engineers I work with will not use Fedora if they
> must touch kernel arguments.

Adding "systemd.unified_cgroup_hierarchy=0" to the GRUB_CMDLINE_LINUX option in /etc/sysconfig/grub
is trivial. And they can still use Fedora 30 for moby-engine if they do not want to touch kernel arguments.
Maybe moby-engine upstream will solve it in the meantime.

Comment 13 Adam Williamson 2019-10-16 01:04:22 UTC
Yeah, I think I'm -1 on this. Per the criteria we don't block on anything container-y, and if we were going to, it'd likely be podman, not docker.

Comment 14 Brian 'redbeard' Harrington 2019-10-16 01:41:58 UTC
-1

This sounds like errata to be documented given the scope of effect (only users who chose to use Docker for containerization) and numerous workarounds.

Comment 15 František Zatloukal 2019-10-16 14:40:54 UTC
-1 Blocker

Comment 16 Adam Williamson 2019-10-16 15:06:25 UTC
That's -4, so rejecting.

Comment 17 Adam Williamson 2019-10-29 02:00:51 UTC

*** This bug has been marked as a duplicate of bug 1757078 ***

Comment 18 Lukas Slebodnik 2019-10-31 21:42:04 UTC
*** Bug 1757078 has been marked as a duplicate of this bug. ***

Comment 19 Alexander von Gluck IV 2019-11-09 15:23:33 UTC
A quick bit of commentary late to the game.

I'd personally prefer to use podman for building my containers, however podman images are *not* compatible with docker 19.03 at the moment due to the following bug:
https://github.com/moby/moby/issues/39727

Pretty much:
  * Build Dockerfile with podman, no issues
  * Push image to hub.docker.com, no issues
  * Pull image to docker 19.03.x system to deploy:
    * Error response from daemon: mediaType in manifest should be 'application/vnd.docker.distribution.manifest.v2+json' not ''

That kind of sucks... so on Fedora 31 I can't generate container images compatible with docker without the cgroup v1 hack.

Comment 20 Adam Williamson 2019-11-12 23:55:45 UTC
"That kind of sucks... so on Fedora 31 I can't generate container images compatible with docker without the cgroup v1 hack."

I mean, to be clear, it's not a "hack". It's a configuration option. It is an entirely supported one that we expect people to use, and that's why it's there: we know some people will need cgroups v1, for legitimate reasons. You don't need to worry that you're doing something hacky or temporary or potentially broken or anything, just because you're picking this configuration option.

Comment 21 Lukas Slebodnik 2019-11-13 21:19:30 UTC
BTW Is there an upstream issue for moby and cgroups v2?

Comment 22 Sam 2019-11-15 06:42:32 UTC
Re Adam Williamson, it is true kernel arguments are not "hacks," but they are never addressed during system upgrades. In 5 months when I'm installing Fedora 32, I expect I will not be notified if cgroups v1 is required or not, nor that I can move to v2 when the Docker installation understands it. My kernel options may just be kicked to an rpmnew file and I'll be back to this ticket again.

At this very moment I'm staring at the SBT docker plugin's error message that it can't build a docker image now that I've aliased podman to docker. I don't want a sleeping configuration surprise in my grub configuration that will break my system 5 months from now when I've forgotten about cgroup namespacing.

I love that Fedora is often very cutting-edge, but breaking Docker is a significant every-day problem for me. Podman isn't sufficient. Kernel arguments are not well supported, and I frankly don't trust that v1 is fully regression tested so that it is _actually_ an expected configuration more than just a possible one that _should_ work.

Comment 23 Leonid Podolny 2019-11-16 16:04:00 UTC
To Sam:
(I am the person who proposed this ticket as a release blocker).

On a practical level, instead of aliasing, you could install a package called "podman-docker", which expressly mimics docker command line options. For me, it solved all the inconsistencies I saw thus far.
As to your philosophical argument, it's docker's rather than Fedora's fault; cgroups v2 isn't exactly new or a surprise to anyone. Docker just dropped the ball as an upstream. You could take the "I was trying to upgrade Fedora, so it's Fedora's regression" stand, but the thing is that Fedora has many thousands of upstreams, each with their own bugs, and it needs to find some kind of compromise between them. Here it seems to have chosen to remain close to systemd, which is a more important upstream than docker.

Comment 24 nicolasoliver03 2019-12-18 23:07:05 UTC
The containerd issue to enable cgroups v2 in docker has been pretty active in the last days (https://github.com/containerd/containerd/issues/3726)

Comment 25 AJ Seelund 2019-12-28 23:23:12 UTC
Late to the convo as well, but I'm just now getting to update my machine.

After a recent update to fc31, I find I can't build docker anymore and the service won't start.
Reading above, +1 on defect!

As a frontend dev, someone who lives on the client side, I develop using docker. This isn't just a backend thing, and it is now a blocker for my local development: I can't do it anymore. Now I need to ssh into a remote box with Ubuntu or hack away at my kernel or settings. I've been an FC user since FC12. Don't block me now, please =)

I'll check out whatever the docker community is doing, but this blows, sorry to say.

Comment 26 Bruno Meneguele 2020-01-03 11:14:58 UTC
(In reply to AJ Seelund from comment #25)
> Late to the convo as well, but I'm just now getting to update my machine. 
> 
> After a recent update to fc31, I find I can't build docker anymore and the
> service won't start.
> Reading above, +1 on defect!
> 

+1 on Docker's defect, and not Fedora's, as already stated. 

> As a frontend dev, someone who lives on the client side, I develop using
> docker. This isn't just a backend thing, and it is now a blocker for my local
> development: I can't do it anymore. Now I need to ssh into a remote box with
> Ubuntu or hack away at my kernel or settings. I've been an FC user since
> FC12. Don't block me now, please =)
> 
> I'll check out whatever the docker community is doing. but this blows sorry
> to say.

A distribution doesn't have control over decisions made by an application's upstream community; a distro can suggest or recommend, but the final decision not to support cgroups v2 was made on the docker side.

Did you try appending systemd.unified_cgroup_hierarchy=0 to the "GRUB_CMDLINE_LINUX=" variable in your /etc/sysconfig/grub?

GRUB_CMDLINE_LINUX="resume=UUID=270e3ca3-3f19-47cb-a99d-3bd30ab21b5f rhgb quiet systemd.unified_cgroup_hierarchy=0"

then, 

$ grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

I'm not a container guy, and whenever I need containers I prefer podman. OTOH, when working with docker I make sure to have this additional kernel option in place, which is far from "hacky": it is an actual solution to a problem in a userspace application (docker).
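
For anyone who would rather script the edit than do it by hand, here is a rough sketch. append_karg is a hypothetical helper that adds an argument inside the quotes of GRUB_CMDLINE_LINUX unless it is already there. It assumes GNU sed and an argument without `|` or `&` characters, and the demonstration runs on a scratch file, not on the real /etc/sysconfig/grub.

```shell
#!/bin/sh
# append_karg FILE ARG: append ARG to the GRUB_CMDLINE_LINUX="..." line in FILE.
# Hypothetical helper; assumes GNU sed (-i) and that ARG has no '|' or '&'.
append_karg() {
    file=$1
    arg=$2
    # Nothing to do if the argument is already on that line.
    if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qF -- "$arg"; then
        return 0
    fi
    # Re-emit the line with ARG added before the closing quote.
    sed -i "s|^GRUB_CMDLINE_LINUX=\"\(.*\)\"|GRUB_CMDLINE_LINUX=\"\1 ${arg}\"|" "$file"
}

# Demonstration on a scratch file with sample content.
tmp=$(mktemp)
printf '%s\n' 'GRUB_CMDLINE_LINUX="rhgb quiet"' > "$tmp"
append_karg "$tmp" "systemd.unified_cgroup_hierarchy=0"
append_karg "$tmp" "systemd.unified_cgroup_hierarchy=0"   # second call changes nothing
cat "$tmp"
rm -f "$tmp"
```

On a real system the grub configuration still has to be regenerated afterwards, as root, with grub2-mkconfig as shown above. The output path above is the UEFI one; BIOS installs typically use /boot/grub2/grub.cfg instead.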

Comment 27 Olivier Lemasle 2020-10-06 09:06:39 UTC
*** Bug 1885433 has been marked as a duplicate of this bug. ***

Comment 28 Ben Cotton 2020-11-03 15:32:39 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 29 Olivier Lemasle 2020-11-14 20:32:16 UTC
Docker 20.10 brings compatibility with cgroups v2, so this will be fixed in the next release.

Comment 30 Fedora Update System 2021-03-14 22:50:14 UTC
FEDORA-2021-d404cb481d has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-d404cb481d

Comment 31 Fedora Update System 2021-03-16 14:42:31 UTC
FEDORA-2021-d404cb481d has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-d404cb481d`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-d404cb481d

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 32 Olivier Lemasle 2021-03-16 18:12:31 UTC
moby-engine 20.10.5 (currently in updates-testing on f34) brings compatibility with cgroups v2.
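
To decide whether a given machine still needs the cgroups v1 kernel option, one could compare the installed version against 20.10. This is a sketch only: version_ge is a hypothetical helper that relies on the version sort (-V) of GNU sort, and the version string below is sample input.

```shell
#!/bin/sh
# version_ge A B: succeed if version A is greater than or equal to version B.
# Hypothetical helper; relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample input only. On a real host the version could be read with:
#   rpm -q --qf '%{VERSION}\n' moby-engine
installed="20.10.5"
if version_ge "$installed" "20.10"; then
    echo "moby-engine ${installed}: cgroups v2 supported"
else
    echo "moby-engine ${installed}: needs the cgroups v1 kernel option"
fi
```

Once the daemon is running, `docker info --format '{{.CgroupVersion}}'` should also report which cgroup version it is using on 20.10 and later.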

Comment 33 Fedora Update System 2021-03-19 20:15:47 UTC
FEDORA-2021-d404cb481d has been pushed to the Fedora 34 stable repository.
If the problem still persists, please make a note of it in this bug report.

