Bug 1320601 - Unable to start docker container because oci-register-machine fails
Summary: Unable to start docker container because oci-register-machine fails
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: docker
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Lokesh Mandvekar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1321618 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-03-23 15:28 UTC by Andy Goldstein
Modified: 2016-04-03 09:46 UTC (History)
15 users (show)

Fixed In Version: docker-1.10.3-4.gitf8a9a2a.fc25
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-29 18:21:15 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)

Description Andy Goldstein 2016-03-23 15:28:33 UTC
Description of problem: Docker containers fail to start because ici-register-machine fails.


Version-Release number of selected component (if applicable): docker-1.10.3-3.gitd93ee51.fc25.x86_64


How reproducible: 100%


Steps to Reproduce:
1. docker run --rm busybox date
2.
3.

Actual results: Container fails to start


Expected results: Container starts


Additional info: from the journal:

Mar 23 11:05:21 hack2015 docker[7326]: time="2016-03-23T11:05:21.655385103-04:00" level=debug msg="Calling POST /v1.22/containers/7b687c0c73f3824e1bc0f8145621b54dd46768f0d24bb
b02de1995895e12fa3c/start"
Mar 23 11:05:21 hack2015 docker[7326]: time="2016-03-23T11:05:21.655413547-04:00" level=debug msg="POST /v1.22/containers/7b687c0c73f3824e1bc0f8145621b54dd46768f0d24bbb02de199
5895e12fa3c/start"
...
[snip]
...
Mar 23 11:05:21 hack2015 oci-register-machine[7740]: 2016/03/23 11:05:21 Register machine: prestart %!d(string=7b687c0c73f3824e1bc0f8145621b54dd46768f0d24bbb02de1995895e12fa3c) %!s(int=7713) /var/lib/docker/devicemapper/mnt/6f9a36c0ab1a6dfeb991b51dbfc2f4636d31cfc71c346ce930ea6c42b396ab2a/rootfs
Mar 23 11:05:21 hack2015 oci-register-machine[7740]: 2016/03/23 11:05:21 Register machine failed: Failed to determine unit of process 7713 : No such device or address
Mar 23 11:05:21 hack2015 docker[7326]: time="2016-03-23T11:05:21.976319905-04:00" level=warning msg="signal: killed"
Mar 23 11:05:21 hack2015 oci-systemd-hook[7746]: systemdhook <debug>: Skipping as container command is date, not init or systemd
Mar 23 11:05:21 hack2015 oci-register-machine[7747]: 2016/03/23 11:05:21 Register machine: poststop %!d(string=7b687c0c73f3824e1bc0f8145621b54dd46768f0d24bbb02de1995895e12fa3c) %!s(int=0) /var/lib/docker/devicemapper/mnt/6f9a36c0ab1a6dfeb991b51dbfc2f4636d31cfc71c346ce930ea6c42b396ab2a/rootfs
Mar 23 11:05:21 hack2015 oci-register-machine[7747]: 2016/03/23 11:05:21 TerminateMachine failed: No machine '7b687c0c73f3824e1bc0f8145621b54dd46768f0d24bbb02de1995895e12fa3c'

Comment 1 Daniel Walsh 2016-03-23 18:21:14 UTC
This looks like a new feature of systemd?

Comment 2 Daniel Walsh 2016-03-23 18:23:00 UTC
systemd guys, is there some new requirement for registering a machine, that is just showing up in rawhide?

Andy could you attach the output of 

cat /proc/self/cgroup

From inside of the container.

Comment 3 Andy Goldstein 2016-03-23 18:26:42 UTC
I can't start any containers, so I can't provide that info. Do you want me to disable the oci-register-machine hook and then get the cgroup info? If so, here it is:

$ sudo docker run --rm busybox cat /proc/self/cgroup
11:hugetlb:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
10:devices:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
9:cpu,cpuacct:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
8:freezer:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
7:blkio:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
6:pids:/system.slice/docker.service
5:memory:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
4:perf_event:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
3:net_cls,net_prio:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
2:cpuset:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054
1:name=systemd:/docker/5b93dd25de4f0330f34ac997784f0610983187028ad3240922093f5d76306054


Note, I am running systemd-229-7.fc25.x86_64.

Comment 4 Michal Sekletar 2016-03-23 19:02:17 UTC
At the first glance this looks like a race. I think docker creates scope for the container but doesn't wait for systemd start job to complete and proceeds immediately with registration of the machine. This may fail because machined tries to determine cgroup of the container scope, however cgroup was not realized and processes were not migrated there yet.

Are you sure you wait for completion of scope start job?

Comment 5 Daniel Walsh 2016-03-23 19:43:13 UTC
Andy are you running with the standard docker unit file?

Is your docker daemon running with 

--exec-opt native.cgroupdriver=systemd 

Michal 
We are registering the machine after the process is created, so I figure it is registered with the cgroup.  But Mrunal would no for sure.

Is this a new check in systemd-229-7?  We have not seen this problem before.

Comment 6 Andy Goldstein 2016-03-23 19:47:53 UTC
I'm using the standard unit file from the rpm. The only change I made to /etc/sysconfig/docker was to enable debug logging (-D). It does not appear to be running with any --exec-opt settings. The output from ps is

/usr/bin/docker daemon --selinux-enabled --log-driver=journald -D

Comment 7 Daniel Walsh 2016-03-23 19:49:48 UTC
Could you add that option to /etc/sysconfig/docker and see if that fixes the issue.  docker-1.10 is no longer using systemd by default for setting up cgroups.  We are supposed to have changed the default back in our package.

Comment 8 Andy Goldstein 2016-03-23 20:00:39 UTC
Setting the cgroupdriver to systemd fixed the problem.

Comment 9 Daniel Walsh 2016-03-23 20:49:15 UTC
Lokesh could you make sure this flag is in /etc/sysconfig/docker

Comment 10 Lokesh Mandvekar 2016-03-28 17:43:00 UTC
*** Bug 1321618 has been marked as a duplicate of this bug. ***

Comment 12 Andreu Botella 2016-03-30 14:29:25 UTC
I'm having the same problem in Fedora 24. Any plans to backport the fix?

Comment 13 Daniel Walsh 2016-03-30 14:32:49 UTC
Yes we need to make this the default in F24 and all docker-1.10 packages.

Comment 14 Lokesh Mandvekar 2016-04-01 18:40:21 UTC
(In reply to Andreu Botella from comment #12)
> I'm having the same problem in Fedora 24. Any plans to backport the fix?

Yes, I've added this to f24 as well, the update can be found here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-976f49409c . Please add karma so we can move it to stable quicker.

Comment 15 Daniel Walsh 2016-04-03 09:46:53 UTC
Lokesh that fixes the problem, but you should have put the cgroup option into /etc/sysconfig/docker rather then in the unit file, so people could easily change it.


Note You need to log in before you can comment on or make changes to this bug.