Bug 1689297

Summary: RFE: rewrite cgroups code to support v2 subsystem
Product: Red Hat Enterprise Linux 8
Reporter: Karen Noel <knoel>
Component: libvirt
Assignee: Pavel Hrdina <phrdina>
Status: CLOSED ERRATA
QA Contact: yisun
Severity: high
Priority: high
Version: 8.0
CC: ailan, berrange, dyuan, hpopal, jdenemar, jsuchane, jwboyer, kanderso, knoel, lhuang, lmen, mmcgrath, mrichter, mtessun, phrdina, rbalakri, wchadwic, weizhan, xuzhang, yafu, yalzhang, yisun
Target Milestone: rc
Keywords: FutureFeature
Target Release: 8.1
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-4.5.0-31.el8
Doc Type: Enhancement
Clone Of: 1513930
Last Closed: 2019-11-05 20:48:28 UTC
Type: Feature Request
Bug Depends On: 1401552, 1548266, 1548268, 1548272, 1548274, 1548276, 1656432, 1724617, 1741825, 1741837
Bug Blocks: 1701002, 1717394

Comment 9 Pavel Hrdina 2019-06-28 15:11:08 UTC
The cgroups v2 cpuset controller is blocked by systemd BZ 1724617, where support for it still needs to be added.

Comment 12 yisun 2019-09-09 08:42:25 UTC
Hi Pavel,
I see that systemd still does not enable the cpuset controller by default (Bug 1724617 - RFE: add support for cgroups v2 cpuset controller),
so I manually enabled the cpuset controller and ran a basic test, but it does not seem to work well.
In that BZ you mentioned that "applications should not enable any controllers that are not supported by systemd".
So do we still need to wait for bz1724617, or should we fully test the cpuset controller now?

Two problems were found so far; please help check them.
# mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate)

# pwd
/sys/fs/cgroup

# echo "+cpuset" > cgroup.subtree_control 
# echo "+cpuset" > machine.slice/cgroup.subtree_control 

# cat machine.slice/cgroup.subtree_control 
cpuset cpu io memory pids
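
For reference, a controller can only be enabled in a cgroup's cgroup.subtree_control if the parent cgroup lists it in its read-only cgroup.controllers file; a quick sanity check before the echo commands above (output shown is illustrative):
# cat cgroup.controllers
cpuset cpu io memory pids
# cat machine.slice/cgroup.controllers
cpuset cpu io memory pids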

(problem 1) If we enable the cpuset controller before the VM is started, the VM cannot be started:
# virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: Start job for unit machine-qemu\x2d11\x2davocado\x2dvt\x2dvm1.scope failed with 'failed'
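
A possible next step to get the systemd-side reason for the failed start job; the unit name is taken from the error above and the time window is only an example:
# journalctl -u 'machine-qemu\x2d11\x2davocado\x2dvt\x2dvm1.scope' --since '-5min'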


So let's remove the cpuset controller, start the VM again, and then add the cpuset controller back:
# echo "-cpuset" > machine.slice/cgroup.subtree_control 
# echo "-cpuset" > cgroup.subtree_control 
# virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

# echo "+cpuset" > cgroup.subtree_control 
# echo "+cpuset" > machine.slice/cgroup.subtree_control 
# echo "+cpuset" > machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/cgroup.subtree_control

Now the VM has the cpuset controller enabled:
# ll machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/ | grep cpuset
-rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus
-r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.effective
-rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.partition
-rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems
-r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems.effective

# ll machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/|grep cpuset
-rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus
-r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.effective
-rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.partition
-rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems
-r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems.effective

(problem 2) Using vcpupin to change the cpuset.cpus value does not seem to work:
# virsh vcpupin avocado-vt-vm1 0 1,2
# virsh vcpupin avocado-vt-vm1
VCPU: CPU Affinity
----------------------------------
   0: 1-2
   1: 0-3
# cat machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/cpuset.cpus
<=== nothing here
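
Note: an empty cpuset.cpus is the cgroup v2 default meaning "no restriction, inherit from the parent", and the CPUs actually usable are reported in cpuset.cpus.effective, so the empty file above suggests libvirt never wrote the pinning value at all. For example (output illustrative, assuming a 4-CPU host as in the affinity listing above):
# cat machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/cpuset.cpus.effective
0-3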

Comment 13 Pavel Hrdina 2019-09-09 09:12:48 UTC
(In reply to yisun from comment #12)
> Hi Pavel,
> I see that systemd still does not enable the cpuset controller by default
> (Bug 1724617 - RFE: add support for cgroups v2 cpuset controller),
> so I manually enabled the cpuset controller and ran a basic test, but it does
> not seem to work well.
> In that BZ you mentioned that "applications should not enable any controllers
> that are not supported by systemd".
> So do we still need to wait for bz1724617, or should we fully test the cpuset
> controller now?

It would probably be good to wait until the systemd BZs are fixed.

> Two problems were found so far; please help check them.
> # mount | grep cgroup
> cgroup2 on /sys/fs/cgroup type cgroup2
> (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate)
> 
> # pwd
> /sys/fs/cgroup
> 
> # echo "+cpuset" > cgroup.subtree_control 
> # echo "+cpuset" > machine.slice/cgroup.subtree_control 
> 
> # cat machine.slice/cgroup.subtree_control 
> cpuset cpu io memory pids
> 
> (problem 1) If we enable the cpuset controller before the VM is started, the
> VM cannot be started:
> # virsh start avocado-vt-vm1
> error: Failed to start domain avocado-vt-vm1
> error: Start job for unit machine-qemu\x2d11\x2davocado\x2dvt\x2dvm1.scope
> failed with 'failed'

I've already created a separate BZ 1724651 to track this issue.

> So let's remove the cpuset controller, start the VM again, and then add the
> cpuset controller back:
> # echo "-cpuset" > machine.slice/cgroup.subtree_control 
> # echo "-cpuset" > cgroup.subtree_control 
> # virsh start avocado-vt-vm1
> Domain avocado-vt-vm1 started
> 
> # echo "+cpuset" > cgroup.subtree_control 
> # echo "+cpuset" > machine.slice/cgroup.subtree_control 
> # echo "+cpuset" >
> machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/cgroup.
> subtree_control
> 
> Now the VM has the cpuset controller enabled:
> # ll machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/ |
> grep cpuset
> -rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus
> -r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.effective
> -rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.partition
> -rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems
> -r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems.effective
> 
> # ll
> machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/
> |grep cpuset
> -rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus
> -r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.effective
> -rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.cpus.partition
> -rw-r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems
> -r--r--r--. 1 root root 0 Sep  9 04:25 cpuset.mems.effective
> 
> (problem 2) Using vcpupin to change the cpuset.cpus value does not seem to work:
> # virsh vcpupin avocado-vt-vm1 0 1,2
> # virsh vcpupin avocado-vt-vm1
> VCPU: CPU Affinity
> ----------------------------------
>    0: 1-2
>    1: 0-3
> # cat
> machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/
> cpuset.cpus
> <=== nothing here

I'll look into it. If the cpuset controller is enabled after the VM is started,
libvirtd will not have detected it while the VM was starting up, and it probably will
not refresh it later when you try to set the values. However, in that case the command
should fail rather than silently do nothing, so there is something weird going on.
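
One way to separate the two possibilities is to write the mask directly into the cgroup file, bypassing libvirt; if the controller itself is set up correctly this should succeed (scope path taken from the report above, output illustrative):
# echo "1-2" > /sys/fs/cgroup/machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/cpuset.cpus
# cat /sys/fs/cgroup/machine.slice/machine-qemu\\x2d16\\x2davocado\\x2dvt\\x2dvm1.scope/vcpu0/cpuset.cpus
1-2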

Comment 14 yisun 2019-09-23 10:37:08 UTC
Verified on libvirt-4.5.0-35.module+el8.1.0+4227+b2722cb3.x86_64

Automated scripts to test blkiotune/memtune/schedinfo:
https://github.com/autotest/tp-libvirt/pull/2324
https://github.com/avocado-framework/avocado-vt/pull/2258

blkiotune/memtune/schedinfo passed in both the cgroup_v1 and cgroup_v2 environments:
cgroup_v1:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/libvirt-RHEL-8.1-runtest-x86_64-function-guest_resource_control/38/testReport/
cgroup_v2:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/libvirt-RHEL-8.1-runtest-x86_64-function-guest_resource_control/39/testReport/
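
For reference, the cgroup_v2 host is presumed to be booted with the unified hierarchy; on RHEL 8 this is typically done via the systemd.unified_cgroup_hierarchy=1 kernel argument and can be confirmed afterwards with stat:
# grubby --update-kernel=ALL --args=systemd.unified_cgroup_hierarchy=1
# reboot
# stat -fc %T /sys/fs/cgroup
cgroup2fs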

Since the cpuset controller still has issues in systemd, the related functionality cannot be tested on cgroup v2, so only a regression test was done on cgroup v1.
numatune job on cgroup v1:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-numa/20/testReport/rhel.virsh/numatune/

vcpu affinity job on cgroup v1:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/libvirt/view/RHEL-8.1%20x86_64/job/libvirt-RHEL-8.1-runtest-x86_64-function-cpu/24/testReport/
The failed cases are environment and script issues:
rhel.vcpu_affinity.positive_test.cputune.offline_hostcpu ==> env issue; not enough host CPUs
rhel.vcpu_affinity.negative_test.vcpu.outrange_cpuset ==> env issue; not enough host CPUs
rhel.vcpu_affinity.negative_test.cputune.outrange_cpuset ==> env issue; not enough host CPUs
rhel.vcpu_affinity.negative_test.cputune.offline_hostcpu ==> env issue; not enough host CPUs
rhel.virsh.setvcpus.error_test.shut_off_error_option.with_topology ==> known script issue; PR is ready
rhel.virsh.cpu_compare_xml.cpu_xml.invalid_test ==> error message for the negative command has changed
rhel.libvirt_qemu_cmdline.hypervisor_features.pv_eoi.enable ==> qemu command line output has a minor change
rhel.libvirt_qemu_cmdline.hypervisor_features.pv_eoi.disable ==> qemu command line output has a minor change

Device controller regression tests on cgroup v1 (run manually):
RHEL7-18222 - [Device.list] Start VM with host serial port
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL7-18222

RHEL7-98668 - [Device.list] Start VM with backing file in host storage device    
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL7-98668

RHEL7-98667 - [Device.list] Start VM with host storage device    
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL7-98667

RHEL7-98984 - [Device.list] Start VM with sound device when setting different value to 'vnc_allow_host_audio'
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL7-98984

RHEL7-18232 - [Cgroup_device_acl] Default value of device.list of VM for default configuration of cgroup_device_acl in qemu.conf
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL7-18232

RHEL7-23878 - [Cgroup_controllers] Disable memory 'cgroup_controllers' in qemu.conf    
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL7-23878


The test result is PASSED, and we will continue tracking the following issues in separate BZs.
cpuset controller not fully working due to a systemd BZ:
Bug 1724617 - RFE: add support for cgroups v2 cpuset controller
Existing lower-priority issues:
Bug 1740049 - [cgroup_v2] Libvirt crashed when do blkiotune to running vm with cgroup2 manually mounted
Bug 1734625 - [cgroup_v2]When disable 'io' controller, the error message is code-level when try to set values to blk related params
Bug 1717394 - RFE: add cgroups v2 BPF devices support

Comment 16 errata-xmlrpc 2019-11-05 20:48:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3345