Bug 1317627 - cgroups: cpu.shares: no such file or directory error seen during openshift builds
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.2
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Mrunal Patel
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-03-14 17:21 UTC by Lokesh Mandvekar
Modified: 2019-03-06 01:25 UTC (History)
CC List: 14 users

Fixed In Version: docker-1.9.1-20.el7
Doc Type: Bug Fix
Doc Text:
Clone Of: 1317616
Environment:
Last Closed: 2016-03-31 23:23:58 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:0536 | 0 | normal | SHIPPED_LIVE | docker bug fix and enhancement update | 2016-04-01 03:19:56 UTC

Description Lokesh Mandvekar 2016-03-14 17:21:26 UTC
+++ This bug was initially created as a clone of Bug #1317616 +++

Description of problem:
I0225 07:28:08.539954     473 container.go:386] Start housekeeping for container "/system.slice/docker-c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57.scope"
E0225 07:28:08.540224     473 manager.go:1873] Error running pod "failing-dc-mid-1-deploy_test(503dd22a-db91-11e5-9ed3-0e652d436df1)" container "deployment": runContainer: API error (500): Cannot start container c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57: [8] System error: open /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57.scope/cpu.shares: no such file or directory
E0225 07:28:08.540286     473 pod_workers.go:138] Error syncing pod 503dd22a-db91-11e5-9ed3-0e652d436df1, skipping: failed to "StartContainer" for "deployment" with RunContainerError: "runContainer: API error (500): Cannot start container c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57: [8] System error: open /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57.scope/cpu.shares: no such file or directory\n"
I0225 07:28:08.540353     473 server.go:577] Event(api.ObjectReference{Kind:"Pod", Namespace:"test", Name:"failing-dc-mid-1-deploy", UID:"503dd22a-db91-11e5-9ed3-0e652d436df1", APIVersion:"v1", ResourceVersion:"664", FieldPath:"spec.containers{deployment}"}): type: 'Warning' reason: 'Failed' Failed to start container with docker id c6d21444deb7 with error: API error (500): Cannot start container c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57: [8] System error: open /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57.scope/cpu.shares: no such file or directory
I0225 07:28:08.540382     473 server.go:577] Event(api.ObjectReference{Kind:"Pod", Namespace:"test", Name:"failing-dc-mid-1-deploy", UID:"503dd22a-db91-11e5-9ed3-0e652d436df1", APIVersion:"v1", ResourceVersion:"664", FieldPath:""}): type: 'Warning' reason: 'FailedSync' Error syncing pod, skipping: failed to "StartContainer" for "deployment" with RunContainerError: "runContainer: API error (500): Cannot start container c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57: [8] System error: open /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-c6d21444deb7683205f69b73e0036e07e05598d54e47e5459d3d6af859408b57.scope/cpu.shares: no such file or directory\n"
I0225 07:28:08.657335     473 manager.go:1331] Container "8784d460e653120669f56085c4e5462664353358f25d1c08b9f4e365a0105132 test/failing-dc-2-deploy" exited after 202.851918ms
I0225 07:28:08.657335     473 manager.go:1331] Container "8784d460e653120669f56085c4e5462664353358f25d1c08b9f4e365a0105132 test/failing-dc-2-deploy" exited after 204.899194ms
W0225 07:28:08.657402     473 manager.go:1337] No ref for pod '"8784d460e653120669f56085c4e5462664353358f25d1c08b9f4e365a0105132 test/failing-dc-2-deploy"'
I

See https://github.com/openshift/origin/issues/7616

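It appears that the systemd transient unit (the docker-<id>.scope) sometimes does not join all of the cgroup controllers, so the scope directory is missing under cpu,cpuacct when docker tries to write cpu.shares.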
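
A minimal sketch of how this could be checked on an affected node, assuming a cgroup v1 layout mounted at /sys/fs/cgroup. The function name missing_scope_dirs is hypothetical and not part of docker or systemd; it only lists the controller hierarchies in which the scope directory does not exist.

```python
import os


def missing_scope_dirs(cgroup_root, scope_path):
    """Return the controller hierarchies under cgroup_root that lack scope_path.

    cgroup_root: mount point of the cgroup v1 hierarchies (assumed /sys/fs/cgroup).
    scope_path:  cgroup path of the unit, e.g. "system.slice/docker-<id>.scope".
    """
    missing = []
    for name in sorted(os.listdir(cgroup_root)):
        hierarchy = os.path.join(cgroup_root, name)
        # Skip symlink aliases (on many systemd hosts "cpu" and "cpuacct"
        # point at "cpu,cpuacct") and anything that is not a directory.
        if os.path.islink(hierarchy) or not os.path.isdir(hierarchy):
            continue
        # The scope has joined this controller only if its directory exists.
        if not os.path.isdir(os.path.join(hierarchy, scope_path.lstrip("/"))):
            missing.append(name)
    return missing
```

Called with "/sys/fs/cgroup" and the scope path from the error message, a non-empty result such as a list containing "cpu,cpuacct" would be consistent with the failure above; an empty list means the scope is present in every hierarchy.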

Version-Release number of selected component (if applicable):
1.9.1

How reproducible:
Seen during Jenkins tests on OpenShift


Steps to Reproduce:
1.
2.
3.

Actual results:
Container start fails with an error opening cgroup files (cpu.shares: no such file or directory)


Expected results:
No such errors

Additional info:

Comment 3 Luwen Su 2016-03-17 17:00:53 UTC
Tested with docker-1.9.1-23.el7.x86_64. From the docker side, I'd like to move this bug to VERIFIED.

GitHub issue: https://github.com/docker/docker/issues/9365
Fix patch: https://github.com/projectatomic/docker/commit/4d4cab6d2f16fe7b5f088c20a8daecd459005385

Steps:
# docker run -d --name test --rm busybox /bin/sh -c 'while true; do date; sleep 5; done'
# docker inspect -f '{{ .State.Pid }}' test
15303


## Every cgroup controller has an entry pointing at the container's scope

# cat /proc/15303/cgroup
10:freezer:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
9:net_cls:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
8:devices:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
7:blkio:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
6:perf_event:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
5:cpuacct,cpu:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
4:memory:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
3:hugetlb:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
2:cpuset:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
1:name=systemd:/system.slice/docker-b28d990eadfe4452fda2723e78847a7beb8b06b24009873abeda700d673c1630.scope
# ls /sys/fs/cgroup
blkio  cpuacct      cpuset   freezer  memory   perf_event
cpu    cpu,cpuacct  devices  hugetlb  net_cls  systemd
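
The manual check above can be sketched as a small parser. This assumes the cgroup v1 format of /proc/<pid>/cgroup shown in the output (hierarchy ID, controller list, cgroup path, separated by colons); the function name controllers_outside_scope is hypothetical and only illustrates the comparison.

```python
def controllers_outside_scope(cgroup_text, scope_name):
    """Return controller lists whose cgroup path does not end with scope_name.

    cgroup_text: contents of /proc/<pid>/cgroup (cgroup v1 format assumed).
    scope_name:  unit name to look for, e.g. "docker-<container-id>.scope".
    """
    outside = []
    for line in cgroup_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        _hier_id, controllers, path = line.split(":", 2)
        if not path.endswith(scope_name):
            outside.append(controllers)
    return outside
```

For the output pasted above, passing the file contents and the docker-b28d99...scope unit name should return an empty list, since every line ends with that scope; a process affected by this bug would be expected to show at least one controller, such as cpuacct,cpu, in the result.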

Comment 5 errata-xmlrpc 2016-03-31 23:23:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0536.html

