Bug 688616 - VDSM should disable cpu cgroups
Summary: VDSM should disable cpu cgroups
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Yotam Oron
QA Contact: Dafna Ron
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-17 14:04 UTC by Andrew Cathrow
Modified: 2014-09-07 22:54 UTC (History)

Fixed In Version: vdsm-4.9-62
Doc Type: Bug Fix
Doc Text:
VDSM disables cgroups on installation due to scalability issues.
Clone Of:
Environment:
Last Closed: 2011-12-06 07:09:33 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHEA-2011:1782
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: new packages: vdsm
Last Updated: 2011-12-06 11:55:51 UTC

Description Andrew Cathrow 2011-03-17 14:04:26 UTC
To work around the RHEL 6 cgroup issue (https://bugzilla.redhat.com/show_bug.cgi?id=623712), the VDSM RPM should disable cgroups.

Comment 1 Andrew Cathrow 2011-03-17 14:06:13 UTC
Setting needinfo for Yaniv to verify the correct mechanism for disabling cgroups, after discussing with platform QE.

Comment 3 Haim 2011-03-17 17:34:13 UTC
Tested several scenarios, all documented in the original bug.
In short, completely disabling cgroups is a workaround for this problem.

Comment 4 Andrew Cathrow 2011-03-17 18:52:32 UTC
Haim, would you mind updating this bug with the correct steps?
Thanks,
Aic

Comment 5 Haim 2011-03-20 07:21:45 UTC
Test cases and steps: 

- unpatched kernel with cgroups enabled - failed 
  * running a VM failed on a machine already running ~10 guests 
- unpatched kernel with the 'cpu' cgroup commented out in the conf file - failed  
  * running a VM failed on a machine already running ~10 guests 
- unpatched kernel with the 'cpu' cgroup disabled via kernel cmd line - pass with 
  exception
  * running a new VM succeeds ***but*** takes 20 times longer (needs 
    further investigation) 
- unpatched kernel without cgroups (service cgconfig stop) - pass  
  * managed to run 100 new VMs  
- patched kernel with cgroups on - pass 
  * managed to run 100 new VMs

Comment 6 Dan Kenigsberg 2011-03-22 13:24:44 UTC
We can easily add "cgroup_disable=cpu" argument to kernel on installation on RHEL. Alan, I believe a similar action should be done for RHEV-H?

I'm not saying that this is an ugly cowardly "solution" to the problem, since that's clear to all of us.
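
As an illustration of the idea above (not the actual vdsm patch), here is a minimal Python sketch that appends an argument to every kernel line of a GRUB legacy configuration. The function name add_kernel_arg and the sample grub.conf entry are hypothetical; the code works on text only and does not touch /boot/grub/grub.conf.

```python
def add_kernel_arg(grub_conf: str, arg: str = "cgroup_disable=cpu") -> str:
    """Append `arg` to every GRUB legacy `kernel` line that lacks it."""
    out = []
    for line in grub_conf.splitlines():
        tokens = line.split()
        # GRUB legacy boot entries contain lines such as:
        #   kernel /vmlinuz-<version> ro root=<device> ...
        if tokens and tokens[0] == "kernel" and arg not in tokens:
            line = line.rstrip() + " " + arg
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical grub.conf fragment, for illustration only.
sample = (
    "title Red Hat Enterprise Linux (hypothetical entry)\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz ro root=/dev/sda1 quiet\n"
    "\tinitrd /initramfs.img\n"
)

patched = add_kernel_arg(sample)
print(patched)
```

The function is idempotent, so running it again on an already patched file leaves the kernel line unchanged. An installer would still need to back up the real file and write the result atomically.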

Comment 7 Yaniv Kaul 2011-03-22 13:29:53 UTC
(In reply to comment #6)
> We can easily add "cgroup_disable=cpu" argument to kernel on installation on
> RHEL. Alan, I believe a similar action should be done for RHEV-H?
> 
> I'm not saying that this is an ugly cowardly "solution" to the problem, since
> that's clear to all of us.

Moreover, it'll remain there forever, long after the cgroups bug is fixed...

Comment 8 Andrew Cathrow 2011-03-22 13:30:25 UTC
Haim - comment #4
Please could you verify exactly what we need to do in the vdsm install?
Is it just chkconfig cgconfig off?

Comment 9 RHEL Program Management 2011-04-04 01:47:05 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 12 Yotam Oron 2011-04-06 06:36:26 UTC
Preintegration will check all the options for this issue and will come up with the best solution:
#236: [688616] - VDSM should disable cpu cgroup
--------------------------+------------------------
 Reporter:  hateya@…      |      Owner:  mgoldboi@…
     Type:  Feature       |     Status:  new
 Priority:  major         |  Milestone:  Rhev -2.3
Component:  VDSM-libvirt  |    Version:  Rhel6.1
 Keywords:                |
--------------------------+------------------------
 Due to bug 623712 (scalability problems with the 'cpu' cgroup on large SMP
 systems), which is a version/scalability blocker, vdsm is required to
 disable part or all of the cgroups functionality until the original bug is
 fixed in the kernel.

 Pre-integration should map all possible solutions and provide
 information on how each affects system functionality, so that VDSM can make
 the best decision.

 BZ#688616 <https://bugzilla.redhat.com/show_bug.cgi?id=688616> has a few
 possible resolutions.

 After speaking with Haim, we agreed that all of them should be tested
 so that the best solution (with the fewest or no side effects) is chosen
 and implemented.


Comment 13 Moran Goldboim 2011-04-06 08:29:29 UTC
Since bug 623712 isn't going to be fixed for 6.1, we recommend that vdsm disable all cgroup usage on the host.

Comment 14 Yotam Oron 2011-04-06 14:27:21 UTC
Not going to fix since 623712 should be fixed for 6.2

Comment 16 Andrew Cathrow 2011-04-11 01:02:00 UTC
(In reply to comment #14)
> Not going to fix since 623712 should be fixed for 6.2

No. We aren't going to live through this for beta.
Only when we have POSITIVE confirmation that the kernel has the fix, and it's verified in 6.2, can we consider undoing this.
I'm also removing the "Depends On" link, since this bug is the workaround for that issue; the bugzilla to re-enable cgroups would depend on bug 623712, not this one.

Comment 17 Yotam Oron 2011-04-11 09:06:01 UTC
Turned off all cgroups (cpuset,ns,cpu,cpuacct,memory,devices,freezer,net_cls,blkio) in kernel parameters, and also turned off cgconfig.

I think that we should also add a release note for that version, since a persistent user can undo all those changes and hit the scalability issue again.

Does that make sense?
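
A minimal Python sketch for checking that a kernel command line disables every controller listed above. It assumes the controllers are passed as a comma-separated cgroup_disable= value; this bug does not say which form the vdsm change actually used. The helper names and the sample command line are hypothetical.

```python
# Controllers named in this comment.
ALL_CONTROLLERS = [
    "cpuset", "ns", "cpu", "cpuacct", "memory",
    "devices", "freezer", "net_cls", "blkio",
]


def build_disable_arg(controllers) -> str:
    """Build one cgroup_disable= argument (assumed comma-separated form)."""
    return "cgroup_disable=" + ",".join(controllers)


def disabled_controllers(cmdline: str) -> set:
    """Collect controller names from all cgroup_disable= tokens in `cmdline`."""
    disabled = set()
    for token in cmdline.split():
        if token.startswith("cgroup_disable="):
            value = token.split("=", 1)[1]
            disabled.update(name for name in value.split(",") if name)
    return disabled


# Hypothetical command line; on a real host this text would come from /proc/cmdline.
cmdline = "ro root=/dev/sda1 " + build_disable_arg(ALL_CONTROLLERS)
missing = set(ALL_CONTROLLERS) - disabled_controllers(cmdline)
print("controllers still enabled:", sorted(missing))
```

A check like this only inspects the boot arguments. It says nothing about whether cgconfig is still enabled, which has to be verified separately.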

Comment 18 Dan Kenigsberg 2011-04-14 20:54:01 UTC
(In reply to comment #17)
> Turned off all cgroups
> (cpuset,ns,cpu,cpuacct,memory,devices,freezer,net_cls,blkio) in kernel
> parameters, and also turned off cgconfig.

Why should we turn them all off? As far as I understand from https://bugzilla.redhat.com/show_bug.cgi?id=623712#c33, disabling cpu alone should be enough.

Comment 19 Yotam Oron 2011-04-17 04:19:31 UTC
According to testing done by David Naori, it seems that all cgroups must be turned off for the scalability issue to go away; that's why I turned them all off.

Comment 22 Yotam Oron 2011-05-02 06:44:40 UTC
I'm committing a patch that will disable cgroups on installation, so libvirtd will be aware of the situation from the very beginning.

Comment 23 Yotam Oron 2011-05-02 06:44:40 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
VDSM disables cgroups on installation due to scalability issues.

Comment 24 Dafna Ron 2011-05-02 08:51:09 UTC
verified on ic114
vdsm-4.9-63.el6.x86_64
vdsm-debug-plugin-4.9-63.el6.x86_64
vdsm-cli-4.9-63.el6.x86_64
vdsm-debuginfo-4.9-63.el6.x86_64
qemu-kvm-0.12.1.2-2.160.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.160.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.160.el6.x86_64
libvirt-0.8.7-18.el6.x86_64
libvirt-client-0.8.7-18.el6.x86_64
libvirt-python-0.8.7-18.el6.x86_64
libvirt-debuginfo-0.8.7-18.el6.x86_64
libvirt-devel-0.8.7-18.el6.x86_64



cgroup mount is disabled:

[root@blond-vdsg cgroup]# lscgroup 
cgroups can't be listed: Cgroup is not mounted
[root@blond-vdsg cgroup]# cat /proc/mounts |grep cgroup
[root@blond-vdsg cgroup]#

Comment 26 Dafna Ron 2011-05-03 11:37:35 UTC
There is still a bug in bootstrap which requires a second reboot after host installation to disable cgroup. It will be solved in the next ic.
For vdsm, I am verifying this bug since cgroup is disabled by vdsm during install.

cgroup active: 

[root@blond-vdsg yum.repos.d]# cat /proc/mounts |grep cgroup
cgroup /cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /cgroup/memory cgroup rw,relatime,memory 0 0
cgroup /cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /cgroup/net_cls cgroup rw,relatime,net_cls 0 0
cgroup /cgroup/blkio cgroup rw,relatime,blkio 0 0


after vdsm-4.9-63.el6.x86_64 install:

[root@blond-vdsg vdsm]# cat /proc/mounts |grep cgroup
[root@blond-vdsg vdsm]#
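
The manual check above can also be scripted. The following Python sketch is one possible way to do it, not part of the verification that was run; the function name cgroup_mounts is hypothetical. It parses text in /proc/mounts format and returns the mounted cgroup hierarchies. The sample reuses three of the lines shown above, plus one made-up non-cgroup line.

```python
def cgroup_mounts(proc_mounts: str) -> dict:
    """Map mount point -> list of mount options for every cgroup filesystem."""
    mounts = {}
    for line in proc_mounts.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[2] == "cgroup":
            mounts[fields[1]] = fields[3].split(",")
    return mounts


sample = (
    "proc /proc proc rw,relatime 0 0\n"  # made-up non-cgroup line
    "cgroup /cgroup/cpuset cgroup rw,relatime,cpuset 0 0\n"
    "cgroup /cgroup/cpu cgroup rw,relatime,cpu 0 0\n"
    "cgroup /cgroup/memory cgroup rw,relatime,memory 0 0\n"
)

mounts = cgroup_mounts(sample)
for point, options in sorted(mounts.items()):
    print(point, options)
```

On a host, the input would be the contents of /proc/mounts. An empty result corresponds to the empty grep output shown after the vdsm install.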

Comment 27 Alan Pevec 2011-05-10 09:24:05 UTC
*** Bug 703095 has been marked as a duplicate of this bug. ***

Comment 28 Dan Kenigsberg 2011-07-27 09:26:43 UTC
Now that kernel -171 is finally in the hybrid repo, the code disabling cgroups on installation has been reverted.

http://gerrit.usersys.redhat.com/437

Note that if you have an old host, which had cgroup disabled, you would have to enable it manually by editing grub.conf.
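
For hosts that still carry the old setting, the manual grub.conf edit could look roughly like the Python sketch below. It assumes the disabling was done with cgroup_disable= arguments on the kernel lines, which this bug does not spell out; the function name and sample entry are hypothetical. It works on text only.

```python
def drop_cgroup_disable(grub_conf: str) -> str:
    """Remove every cgroup_disable=... token from GRUB legacy `kernel` lines."""
    out = []
    for line in grub_conf.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "kernel":
            kept = [t for t in tokens if not t.startswith("cgroup_disable=")]
            if len(kept) != len(tokens):
                # Keep the original indentation; inner whitespace on a
                # modified line is collapsed to single spaces.
                indent = line[: len(line) - len(line.lstrip())]
                line = indent + " ".join(kept)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical grub.conf fragment, for illustration only.
sample = (
    "title Red Hat Enterprise Linux (hypothetical entry)\n"
    "\tkernel /vmlinuz ro root=/dev/sda1 cgroup_disable=cpu quiet\n"
    "\tinitrd /initramfs.img\n"
)

restored = drop_cgroup_disable(sample)
print(restored)
```

A reboot is still needed for the change to take effect, and the cgconfig service may need to be re-enabled separately if it was switched off.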

Comment 29 Alan Pevec 2011-09-09 07:59:25 UTC
(In reply to comment #28)
> Now that kernel -171 is finally in the hybrid repo, the code disabling cgroups
> on installation has been reverted.
> 
> http://gerrit.usersys.redhat.com/437
> 
> Note that if you have an old host, which had cgroup disabled, you would have to
> enable it manually by editing grub.conf.

This was reverted in b48a4d9b2ce2162135b29695de66f1a250fecd3c
but we need it back for Beta3 with 6.2 kernel, right?

Comment 30 errata-xmlrpc 2011-12-06 07:09:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html