Bug 1038328 - /etc/cgconfig.conf ought to match production (and be closer to default)
Summary: /etc/cgconfig.conf ought to match production (and be closer to default)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 1.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Vincent Batts
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-12-04 22:20 UTC by Vincent Batts
Modified: 2014-10-17 17:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-17 17:36:26 UTC
Target Upstream Version:


Attachments
a script to create 2000 fake users/apps (849 bytes, application/x-shellscript)
2013-12-04 22:26 UTC, Vincent Batts
the rhc-node %post scripts, less the logic to tamper with /etc/cgconfig.conf (1.86 KB, application/x-shellscript)
2013-12-04 22:27 UTC, Vincent Batts
unset the "all" mounts of cgconfig.conf (1.34 KB, text/x-ruby)
2013-12-13 20:56 UTC, Vincent Batts
restore cgconfig to individual controller mounts (1.58 KB, text/x-ruby)
2013-12-13 22:11 UTC, Vincent Batts
also, uncomment the devices controller (1.91 KB, application/x-ruby)
2014-01-10 20:06 UTC, Vincent Batts

Description Vincent Batts 2013-12-04 22:20:44 UTC
Description of problem:
Mounting the cgroup controllers together under the 'all' context is not supported by parts of lxc. -online production does not do this; it mounts each cgroup controller individually, which is the libcgroup default.
This was discovered while testing docker+lxc on rhel6.5 devenv images.
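
For illustration, a minimal Ruby sketch of how one might check whether a cgconfig.conf mount block places several controllers on the same path. The helper names (`mounts_by_path`, `shared_mounts`) and the sample input are hypothetical and not taken from any attachment; the `/cgroup/all` path is the one that appears in the lxc-start error output.

```ruby
# Hypothetical helper: group the controllers listed in a cgconfig.conf
# "mount { }" block by the path they are mounted on.
def mounts_by_path(conf_text)
  # Grab the body of the first mount block; empty string if there is none.
  block = conf_text[/^\s*mount\s*\{(.*?)\}/m, 1] || ''
  mounts = Hash.new { |hash, key| hash[key] = [] }
  block.each_line do |line|
    next if line.strip.start_with?('#') # skip commented-out mounts
    if line =~ /^\s*(\w+)\s*=\s*([^;\s]+)\s*;/
      mounts[$2] << $1
    end
  end
  mounts
end

# Hypothetical helper: keep only the paths shared by more than one controller.
def shared_mounts(conf_text)
  mounts_by_path(conf_text).select { |_path, controllers| controllers.size > 1 }
end

# Illustrative input only; an assumed shape for an "all"-style configuration.
sample = <<~CONF
  mount {
          cpuset  = /cgroup/all;
          cpu     = /cgroup/all;
          memory  = /cgroup/memory;
  }
CONF

shared_mounts(sample).each do |path, controllers|
  puts "#{path}: #{controllers.join(', ')}"
end
```

An empty result means every active controller has its own mount point, which is the layout this bug asks for.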

Version-Release number of selected component (if applicable):
libcgroup-0.40.rc1-5.el6.x86_64
rhc-node-1.17.1-1.git.166.14a0780.el6.x86_64


How reproducible:
very

Steps to Reproduce:
1. create a devenv instance
2. install docker-io from epel
3. in one terminal, run `docker -d`
4. in a second terminal, run `docker run mattdm/fedora bash`

Actual results:
lxc-start: cgroup is not mounted                                                
lxc-start: Error setting devices.deny to a for lxc/15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0 
lxc-start: failed to setup the cgroups for '15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0' 
lxc-start: failed to spawn '15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0' 
lxc-start: Device or resource busy - failed to remove cgroup '/cgroup/all/lxc/15509c8f85901ba2fc2cb883d777b836cee7cdeb2b46d6ebc07d9a1a5b1de0f0'

Expected results:
a running bash shell

Additional info:
The cgconfig.conf changes were originally made due to https://bugzilla.redhat.com/show_bug.cgi?id=846445

Comment 1 Vincent Batts 2013-12-04 22:22:50 UTC
I have confirmed that reverting /etc/cgconfig.conf to mount individual controllers no longer slows down the %post scripts of 'rhc-node'.
See the attached scripts, which produced the following output:

# wc -l /etc/passwd                                                              
39 /etc/passwd                                                                   

# time sh rhc-node.sh                                                            
/sbin/restorecon:  lstat(/etc/init.d/libra) failed:  No such file or directory   
Setting AVC Cache Threshold...                             [  OK  ]              
Stopping system message bus:                               [  OK  ]              
Starting system message bus:                               [  OK  ]              
Shutting down oddjobd:                                     [  OK  ]              
Starting oddjobd:                                          [  OK  ]              
Stopping Watchman Services:                                [  OK  ]              
Starting Watchman Services:                                [  OK  ]              
                                                                                 
real    0m14.014s                                                                
user    0m8.822s                                                                 
sys     0m1.083s                                                                     

# time sh growapps.sh                                                                
[...]                                                                                
1997 1998 1999 2000                                                                  
real    23m16.357s                                                                   
user    14m1.983s                                                                    
sys     2m46.948s                                                                    

# cat /etc/cgconfig.conf                                                             
mount {                                                                              
        cpuset  = /cgroup/cpuset;                                                    
        cpu     = /cgroup/cpu;                                                       
        cpuacct = /cgroup/cpuacct;                                                   
        memory  = /cgroup/memory;                                                    
        devices = /cgroup/devices;                                                   
        freezer = /cgroup/freezer;                                                 
        net_cls = /cgroup/net_cls;                                                 
        blkio   = /cgroup/blkio;
}                                                                                
group /openshift { cpu {} cpuacct {} memory {} net_cls {} freezer {} }           
group /openshift { cpu {} cpuacct {} memory {} net_cls {} freezer {} }           
group /openshift/529f9da38f05cccae500000e { perm { task { uid = 1000;  gid = 1000; } admin { uid = root;  gid = root; }} cpu { cpu.cfs_quota_us = 100000;  cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912;  memory.move_charge_at_immigrate = 1;  memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66536; } freezer { freezer.state = THAWED; }}  

# wc -l /etc/passwd                                                              
2039 /etc/passwd                                                                 

# time sh rhc-node.sh                                                            
/sbin/restorecon:  lstat(/etc/init.d/libra) failed:  No such file or directory   
Setting AVC Cache Threshold...                             [  OK  ]              
Stopping system message bus:                               [  OK  ]              
Starting system message bus:                               [  OK  ]              
Shutting down oddjobd:                                     [  OK  ]              
Starting oddjobd:                                          [  OK  ]              
Stopping Watchman Services:                                [  OK  ]              
Starting Watchman Services:                                [  OK  ]              
                                                                                 
real    0m14.223s                                                                
user    0m9.089s                                                                 
sys     0m1.157s

Comment 2 Vincent Batts 2013-12-04 22:26:04 UTC
Created attachment 832891 [details]
a script to create 2000 fake users/apps

Comment 3 Vincent Batts 2013-12-04 22:27:08 UTC
Created attachment 832892 [details]
the rhc-node %post scripts, less the logic to tamper with /etc/cgconfig.conf

Comment 4 Vincent Batts 2013-12-05 19:48:07 UTC
Also filed an upstream lxc issue asking for support of 'all' mounts:
https://github.com/lxc/lxc/issues/110

Comment 5 Vincent Batts 2013-12-13 20:56:23 UTC
Created attachment 836496 [details]
unset the "all" mounts of cgconfig.conf

This script reverts the "all" mounts back to mounting individual controllers.
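
For readers without access to the attachment, a rough Ruby sketch of the kind of rewrite described here. It is not the attached script: the function name `restore_individual_mounts`, the regular expression, and the choice to leave commented-out lines commented are all assumptions.

```ruby
# Assumed pattern: "<controller> = /cgroup/all;", optionally commented out.
ALL_MOUNT = %r{^(\s*#?\s*)(\w+)(\s*=\s*)/cgroup/all;}

# Sketch only: point every "/cgroup/all" mount line at /cgroup/<controller>.
# Returns the new text and the list of controller names that were amended.
def restore_individual_mounts(conf_text)
  amended = []
  new_text = conf_text.gsub(ALL_MOUNT) do
    prefix, name, equals = $1, $2, $3
    amended << name
    "#{prefix}#{name}#{equals}/cgroup/#{name};"
  end
  [new_text, amended]
end

# Illustrative input in the same layout as a cgconfig.conf mount block.
sample = <<~CONF
  mount {
  #       cpuset  = /cgroup/all;
          cpu     = /cgroup/cpu;
  #       devices = /cgroup/all;
  }
CONF

new_text, amended = restore_individual_mounts(sample)
amended.each { |name| puts "amending: #{name}" }
puts new_text
```

The sketch works on a string and never touches /etc/cgconfig.conf, so it can be tried safely on a copy of the file's contents.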

Comment 6 Vincent Batts 2013-12-13 21:17:43 UTC
Here is the output of the restore script:

[root@ip-10-158-77-124 ~]# cat /etc/cgconfig.conf
#
#  Copyright IBM Corporation. 2007
#
#  Authors:     Balbir Singh <balbir.ibm.com>
#  This program is free software; you can redistribute it and/or modify it
#  under the terms of version 2.1 of the GNU Lesser General Public License
#  as published by the Free Software Foundation.
#
#  This program is distributed in the hope that it would be useful, but
#  WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See man cgconfig.conf for further details.
#
# By default, mount all controllers to /cgroup/<controller>

mount {
#       cpuset  = /cgroup/all;
        cpu     = /cgroup/cpu;
        cpuacct = /cgroup/cpuacct;
        memory  = /cgroup/memory;
#       devices = /cgroup/all;
        freezer = /cgroup/freezer;
        net_cls = /cgroup/net_cls;
#       blkio   = /cgroup/all;
}

group /openshift { cpu {} cpuacct {} memory {} net_cls {} freezer {} }
group /openshift/52ab46a7c07028d18b000006 { perm { task { uid = 1000;  gid = 1000; } admin { uid = root;  gid = root; }} cpu { cpu.cfs_quota_us = 100000;  cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912;  memory.move_charge_at_immigrate = 1;  memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66536; } freezer { freezer.state = THAWED; }}
group /openshift/52ab78f6c07028b252000001 { perm { task { uid = 1001;  gid = 1001; } admin { uid = root;  gid = root; }} cpu { cpu.cfs_quota_us = 100000;  cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912;  memory.move_charge_at_immigrate = 1;  memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66537; } freezer { freezer.state = THAWED; }}
[root@ip-10-158-77-124 ~]# ruby restore_cgconfig_mounts.rb 
backed up [/etc/cgconfig.conf] to [/etc/cgconfig.conf.bak.26602]
amending: cpuset
amending: devices
amending: blkio
compare with `diff -u /etc/cgconfig.conf.bak.26602 /etc/cgconfig.conf`

Comment 8 Vincent Batts 2013-12-13 22:11:28 UTC
Created attachment 836512 [details]
restore cgconfig to individual controller mounts

Added a check for whether anything was amended: if nothing changed, the backup is removed and nothing is written.
Also added more information to the output.
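
The "only write when something changed" behaviour could look roughly like the Ruby sketch below. It is an illustration under assumptions, not the attached script: `amend_file` is a hypothetical name, and using the process ID as the backup suffix is a guess based on the `.bak.26602` name in the earlier output.

```ruby
require 'fileutils'
require 'tmpdir'

# Sketch: back up `path`, pass its contents to the block, and write the
# block's result back only if it differs. Returns true if the file changed.
def amend_file(path)
  backup = "#{path}.bak.#{Process.pid}" # assumed naming scheme
  FileUtils.cp(path, backup)
  original = File.read(path)
  amended = yield(original)
  if amended == original
    FileUtils.rm_f(backup) # nothing changed, so drop the backup
    return false
  end
  File.write(path, amended)
  puts "backed up [#{path}] to [#{backup}]"
  puts "compare with `diff -u #{backup} #{path}`"
  true
end

# Try it on a scratch copy rather than the real /etc/cgconfig.conf.
Dir.mktmpdir do |dir|
  conf = File.join(dir, 'cgconfig.conf')
  File.write(conf, "mount {\n        blkio   = /cgroup/all;\n}\n")
  changed = amend_file(conf) { |text| text.gsub('/cgroup/all;', '/cgroup/blkio;') }
  puts "changed: #{changed}"
end
```

Taking the transformation as a block keeps the backup handling separate from whichever mount rewrite is applied.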

Comment 9 Vincent Batts 2013-12-13 22:15:20 UTC
https://github.com/openshift/li/pull/2235

Comment 10 Vincent Batts 2014-01-10 20:06:02 UTC
Created attachment 848376 [details]
also, uncomment the devices controller

Updated the restore script to uncomment the 'devices' controller.

Also updated the 'li' pull request so that it no longer comments out the 'devices' controller: https://github.com/vbatts/li/commit/9bc85e493737f12fe14609a4259192d58e1b8a54
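
lxc-start failed while setting devices.deny, so the 'devices' controller has to be mounted for the container to start. A small Ruby sketch of uncommenting one controller's mount line follows; `uncomment_controller` is a hypothetical name and the attached script may do this differently.

```ruby
# Sketch: remove the leading "#" from a commented-out mount line for one
# controller. The "#" is replaced by a space so column alignment is kept.
def uncomment_controller(conf_text, name)
  pattern = /^([ \t]*)#([ \t]*#{Regexp.escape(name)}\s*=\s*[^;\s]+;)/
  conf_text.gsub(pattern) { "#{$1} #{$2}" }
end

# Illustrative input: two commented-out mount lines.
sample = "#       devices = /cgroup/devices;\n#       blkio   = /cgroup/blkio;\n"
puts uncomment_controller(sample, 'devices')
```

Only the named controller is touched; other commented-out lines stay as they are.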

Comment 11 Vincent Batts 2014-10-17 17:36:26 UTC
Closing, as this has been resolved.

