Description of problem:
Mounting the cgroup controllers to the 'all' context is not supported by aspects of lxc. In -online production this is not done; individual cgroup controllers are mounted instead (this is the default from libcgroup). This was discovered while testing docker+lxc on rhel6.5 devenv images.

Version-Release number of selected component (if applicable):
libcgroup-0.40.rc1-5.el6.x86_64
rhc-node-1.17.1-1.git.166.14a0780.el6.x86_64

How reproducible:
very

Steps to Reproduce:
1. create a devenv instance
2. install docker-io from epel
3. one terminal: `docker -d`
4. `docker run mattdm/fedora bash`

Actual results:
lxc-start: cgroup is not mounted
lxc-start: Error setting devices.deny to a for lxc/15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0
lxc-start: failed to setup the cgroups for '15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0'
lxc-start: failed to spawn '15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0'
lxc-start: Device or resource busy - failed to remove cgroup '/cgroup/all/lxc/15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0'

Expected results:
a running bash shell

Additional info:
The cgconfig.conf changes were originally due to https://bugzilla.redhat.com/show_bug.cgi?id=846445
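To tell which layout a node is using, one option is to look at the cgroup entries in /proc/mounts, where each cgroup v1 mount lists its attached controllers among the mount options. The following is a minimal sketch of that check, not something taken from this report: the function names and the sample lines are hypothetical, and the controller list is simply the set named in this bug's cgconfig.conf.

```python
# Controllers named in this report's cgconfig.conf; other mount options
# (rw, relatime, ...) are ignored.
CONTROLLERS = {
    "cpuset", "cpu", "cpuacct", "memory",
    "devices", "freezer", "net_cls", "blkio",
}


def cgroup_mounts(mounts_text):
    """Map each cgroup mount point to the set of controllers attached to it."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        options = fields[3].split(",")
        result[fields[1]] = {opt for opt in options if opt in CONTROLLERS}
    return result


def co_mounted(mounts_text):
    """Return only the mount points that carry more than one controller."""
    return {
        mount_point: controllers
        for mount_point, controllers in cgroup_mounts(mounts_text).items()
        if len(controllers) > 1
    }


# Hypothetical sample in /proc/mounts format (not captured from the affected host).
SAMPLE = """\
cgroup /cgroup/all cgroup rw,relatime,cpuset,devices,blkio 0 0
cgroup /cgroup/cpu cgroup rw,relatime,cpu 0 0
/dev/xvda1 / ext4 rw,relatime 0 0
"""

for mount_point, controllers in sorted(co_mounted(SAMPLE).items()):
    print(mount_point, ",".join(sorted(controllers)))
```

On a real node the same functions could be fed the contents of /proc/mounts instead of the sample string. Any mount point reported by `co_mounted` is a combined mount like the /cgroup/all one in the errors above.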
I have confirmed that reverting /etc/cgconfig.conf to mounting individual controllers no longer slows down the %postin scripts of 'rhc-node'. See the attached scripts, which produced the following output:

# wc -l /etc/passwd
39 /etc/passwd
# time sh rhc-node.sh
/sbin/restorecon: lstat(/etc/init.d/libra) failed: No such file or directory
Setting AVC Cache Threshold... [ OK ]
Stopping system message bus: [ OK ]
Starting system message bus: [ OK ]
Shutting down oddjobd: [ OK ]
Starting oddjobd: [ OK ]
Stopping Watchman Services: [ OK ]
Starting Watchman Services: [ OK ]

real 0m14.014s
user 0m8.822s
sys 0m1.083s
# time sh growapps.sh
[...]
1997
1998
1999
2000

real 23m16.357s
user 14m1.983s
sys 2m46.948s
# cat /etc/cgconfig.conf
mount {
 cpuset = /cgroup/cpuset;
 cpu = /cgroup/cpu;
 cpuacct = /cgroup/cpuacct;
 memory = /cgroup/memory;
 devices = /cgroup/devices;
 freezer = /cgroup/freezer;
 net_cls = /cgroup/net_cls;
 blkio = /cgroup/blkio;
}
group /openshift { cpu {} cpuacct {} memory {} net_cls {} freezer {} }
group /openshift { cpu {} cpuacct {} memory {} net_cls {} freezer {} }
group /openshift/529f9da38f05cccae500000e { perm { task { uid = 1000; gid = 1000; } admin { uid = root; gid = root; }} cpu { cpu.cfs_quota_us = 100000; cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912; memory.move_charge_at_immigrate = 1; memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66536; } freezer { freezer.state = THAWED; }}
# wc -l /etc/passwd
2039 /etc/passwd
# time sh rhc-node.sh
/sbin/restorecon: lstat(/etc/init.d/libra) failed: No such file or directory
Setting AVC Cache Threshold... [ OK ]
Stopping system message bus: [ OK ]
Starting system message bus: [ OK ]
Shutting down oddjobd: [ OK ]
Starting oddjobd: [ OK ]
Stopping Watchman Services: [ OK ]
Starting Watchman Services: [ OK ]

real 0m14.223s
user 0m9.089s
sys 0m1.157s
Created attachment 832891 [details] a script to create 2000 fake users/apps
Created attachment 832892 [details] the rhc-node %post scripts, less the logic to tamper with /etc/cgconfig.conf
also added an upstream lxc issue, to support 'all' https://github.com/lxc/lxc/issues/110
Created attachment 836496 [details]
unset the "all" mounts of cgconfig.conf

This script reverts the "all" mounts back to mounting individual controllers.
Here is the output of the restore script:

[root@ip-10-158-77-124 ~]# cat /etc/cgconfig.conf
#
# Copyright IBM Corporation. 2007
#
# Authors: Balbir Singh <balbir.ibm.com>
# This program is free software; you can redistribute it and/or modify it
# under the terms of version 2.1 of the GNU Lesser General Public License
# as published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See man cgconfig.conf for further details.
#
# By default, mount all controllers to /cgroup/<controller>
mount {
# cpuset = /cgroup/all;
 cpu = /cgroup/cpu;
 cpuacct = /cgroup/cpuacct;
 memory = /cgroup/memory;
# devices = /cgroup/all;
 freezer = /cgroup/freezer;
 net_cls = /cgroup/net_cls;
# blkio = /cgroup/all;
}
group /openshift { cpu {} cpuacct {} memory {} net_cls {} freezer {} }
group /openshift/52ab46a7c07028d18b000006 { perm { task { uid = 1000; gid = 1000; } admin { uid = root; gid = root; }} cpu { cpu.cfs_quota_us = 100000; cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912; memory.move_charge_at_immigrate = 1; memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66536; } freezer { freezer.state = THAWED; }}
group /openshift/52ab78f6c07028b252000001 { perm { task { uid = 1001; gid = 1001; } admin { uid = root; gid = root; }} cpu { cpu.cfs_quota_us = 100000; cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912; memory.move_charge_at_immigrate = 1; memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66537; } freezer { freezer.state = THAWED; }}
[root@ip-10-158-77-124 ~]# ruby restore_cgconfig_mounts.rb
backed up [/etc/cgconfig.conf] to [/etc/cgconfig.conf.bak.26602]
amending: cpuset
amending: devices
amending: blkio
compare with `diff -u /etc/cgconfig.conf.bak.26602 /etc/cgconfig.conf`
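For readers without access to the attachment, the "amending:" step can be illustrated roughly as follows. This is a hedged Python sketch, not the attached Ruby script: it assumes the amendment means pointing every `<controller> = /cgroup/all;` mount entry at `/cgroup/<controller>` while leaving any leading `#` untouched. The function name, regex, and sample text are hypothetical.

```python
import re

# Matches a mount entry such as "cpuset = /cgroup/all;", with an optional
# leading "#" (the entries in the transcript above are commented out).
MOUNT_ALL = re.compile(
    r"^(?P<prefix>[ \t]*#?[ \t]*)(?P<name>\w+)(?P<eq>[ \t]*=[ \t]*)/cgroup/all;",
    re.MULTILINE,
)


def restore_individual_mounts(text):
    """Return (new_text, amended): entries on /cgroup/all moved to /cgroup/<name>."""
    amended = []

    def repl(match):
        name = match.group("name")
        amended.append(name)
        return f"{match.group('prefix')}{name}{match.group('eq')}/cgroup/{name};"

    return MOUNT_ALL.sub(repl, text), amended


# Hypothetical excerpt modelled on the mount block shown above.
SAMPLE = """\
mount {
#    cpuset = /cgroup/all;
    cpu = /cgroup/cpu;
#    devices = /cgroup/all;
#    blkio = /cgroup/all;
}
"""

new_text, amended = restore_individual_mounts(SAMPLE)
for name in amended:
    print("amending:", name)
```

Entries that already point at their own directory, such as `cpu = /cgroup/cpu;`, are left as they are, and group definitions are never touched because the pattern only matches the `/cgroup/all;` target.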
https://github.com/vbatts/li/commit/9ca2b4e1dcbdfcef17d0fba2b6ffe26cf6500ceb
Created attachment 836512 [details]
restore cgconfig to individual controller mounts

Added a check for whether anything was amended: the script now removes the backup and writes nothing if nothing changed. Also added more info to the output.
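The "write nothing if nothing changed" behaviour can be sketched like this. It is an illustration in Python under stated assumptions, not the attached script: the helper name `amend_file` is hypothetical, the `.bak.<pid>` suffix is a guess based on the `.bak.26602` name in the earlier output, and this version avoids creating the backup in the first place rather than creating and then removing it.

```python
import os
import shutil
import tempfile


def amend_file(path, transform):
    """Apply transform to the file's text; back up and rewrite only on change.

    Returns the backup path, or None when the file was left untouched.
    """
    with open(path) as f:
        original = f.read()
    updated = transform(original)
    if updated == original:
        return None  # nothing amended: no backup, no write
    backup = f"{path}.bak.{os.getpid()}"  # assumed naming scheme
    shutil.copy2(path, backup)
    with open(path, "w") as f:
        f.write(updated)
    return backup


# Demo on a throwaway file, not on /etc/cgconfig.conf.
workdir = tempfile.mkdtemp()
conf = os.path.join(workdir, "cgconfig.conf")
with open(conf, "w") as f:
    f.write("devices = /cgroup/all;\n")

unchanged = amend_file(conf, lambda text: text)
changed = amend_file(conf, lambda text: text.replace("/cgroup/all", "/cgroup/devices"))
print(unchanged, changed)
```

Comparing the text before and after keeps repeated runs harmless: a second run against an already restored file leaves no stray backup behind.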
https://github.com/openshift/li/pull/2235
Created attachment 848376 [details]
also, uncomment the devices controller

Updated the restore script to uncomment the 'devices' controller. Also updated the 'li' pull request to not comment out the 'devices' controller: https://github.com/vbatts/li/commit/9bc85e493737f12fe14609a4259192d58e1b8a54
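The 'devices' controller matters here because lxc-start tries to set devices.deny, as in the error output in the description. A minimal sketch of the uncommenting step, again in Python rather than the attachment's Ruby, and with a hypothetical function name and sample, might look like this:

```python
import re


def uncomment_mount(text, controller):
    """Remove the leading "#" from the mount entry of one controller."""
    pattern = re.compile(
        rf"^([ \t]*)#([ \t]*{re.escape(controller)}[ \t]*=[^;\n]*;)",
        re.MULTILINE,
    )
    # Keep the indentation on both sides of the "#", drop only the "#" itself.
    return pattern.sub(r"\1\2", text)


# Hypothetical mount block with two commented-out controllers.
SAMPLE = """\
mount {
#    devices = /cgroup/devices;
    cpu = /cgroup/cpu;
#    blkio = /cgroup/blkio;
}
"""

restored = uncomment_mount(SAMPLE, "devices")
print(restored)
```

Only the named controller is touched, so other commented entries (blkio in the sample) stay commented.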
closing as this has been resolved.