Bug 1317991

Summary: cgroups: cgroups.proc no such file or directory error during docker build
Product: Red Hat Enterprise Linux 7 Reporter: Lokesh Mandvekar <lsm5>
Component: docker Assignee: Mrunal Patel <mpatel>
Status: CLOSED ERRATA QA Contact: atomic-bugs <atomic-bugs>
Severity: unspecified Docs Contact: Yoana Ruseva <yruseva>
Priority: unspecified    
Version: 7.2 CC: adimania, admiller, amurdaca, ccoleman, dwalsh, extras-qa, ichavero, jcajka, jchaloup, lnykryn, lsm5, lsu, marianne, miminar, mpatel, systemd-maint, vbatts
Target Milestone: rc Keywords: Extras
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: docker-1.9.1-22.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1317059 Environment:
Last Closed: 2016-03-31 23:24:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Lokesh Mandvekar 2016-03-15 17:21:32 UTC
+++ This bug was initially created as a clone of Bug #1317059 +++

Description of problem:

Errors such as
System error: open /sys/fs/cgroup/devices/system.slice/docker-fc2e8c0bfdef0d585ed13e784ccc1024ec33f4f01e3c4c992ea15b38abca58b7.scope/cgroup.procs: no such file or directory

are seen during OpenShift Origin tests.
For more information, see https://github.com/openshift/origin/issues/7927

Version-Release number of selected component (if applicable):
docker 1.9.1

How reproducible:
Seen intermittently during Jenkins test runs



Actual results:
docker build failure

Expected results:
docker build shouldn't fail with this cgroups error


Additional info:
This seems like a probable race of some kind with systemd cgroups support.

The failing code does the following:
1. Creates a systemd transient unit, e.g. /system.slice/docker-fc2e8c0bfdef0d585ed13e784ccc1024ec33f4f01e3c4c992ea15b38abca58b7.scope
2. Joins the devices cgroup manually by creating the directory
/sys/fs/cgroup/devices/system.slice/docker-fc2e8c0bfdef0d585ed13e784ccc1024ec33f4f01e3c4c992ea15b38abca58b7.scope
3. Writes the PID of the container process to
/sys/fs/cgroup/devices/system.slice/docker-fc2e8c0bfdef0d585ed13e784ccc1024ec33f4f01e3c4c992ea15b38abca58b7.scope/cgroup.procs (creating the file as well)
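
Steps 2 and 3 can be sketched roughly as follows. This is only an illustration, not the Docker code and not the change proposed for this bug: the function name join_cgroup and the retries parameter are hypothetical, and retrying after "no such file or directory" is just one assumed way to tolerate the directory disappearing between the mkdir and the write. The demo runs against a temporary directory rather than the real /sys/fs/cgroup.

```python
import os
import tempfile


def join_cgroup(cgroup_dir, pid, retries=3):
    """Create cgroup_dir if needed and write pid to its cgroup.procs file.

    Hypothetical helper: if the directory vanishes between the mkdir and
    the open (the suspected race), recreate it and try again.
    """
    procs_path = os.path.join(cgroup_dir, "cgroup.procs")
    last_error = None
    for _ in range(max(1, retries)):
        # Step 2: join the cgroup by creating its directory.
        os.makedirs(cgroup_dir, exist_ok=True)
        try:
            # Step 3: write the PID (the file is created if it is missing).
            with open(procs_path, "w") as f:
                f.write(str(pid))
            return procs_path
        except FileNotFoundError as exc:
            # The directory was removed underneath us; retry.
            last_error = exc
    raise last_error


if __name__ == "__main__":
    # Demo against a scratch directory standing in for /sys/fs/cgroup.
    with tempfile.TemporaryDirectory() as root:
        scope = os.path.join(root, "devices", "system.slice", "docker-test.scope")
        print(join_cgroup(scope, os.getpid()))
```

Without the retry loop, a directory removed between the two steps would surface as the same "no such file or directory" error reported above.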

--- Additional comment from Clayton Coleman on 2016-03-15 09:25:10 CDT ---

This blocks the OpenShift 3.2 release, given that it results in failures in roughly 1% of container launches on Docker 1.9 on RHEL.

--- Additional comment from Mrunal Patel on 2016-03-15 12:17:34 CDT ---

https://github.com/projectatomic/docker/pull/76 is a potential fix; we need an RPM built with it.

Comment 3 Luwen Su 2016-03-20 13:35:33 UTC
Works fine with
# mkdir container-test.scope
# pwd
/sys/fs/cgroup/systemd/system.slice/container-test.scope

root      2490  0.2  0.2 245988 16404 pts/0    Sl+  21:33   0:00 docker run -it rhel7 /bin/bash

# echo 2490 > cgroup.procs
Verified in docker-1.9.1-23.el7.x86_64; moving to VERIFIED.

Comment 5 errata-xmlrpc 2016-03-31 23:24:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0536.html