Bug 1808940

Summary: cpuset controller group not created for qemu vm
Product: Red Hat Enterprise Linux 8 Reporter: yisun
Component: systemd Assignee: Michal Sekletar <msekleta>
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 8.2 CC: coli, dzheng, jiyan, jsynacek, lcheng, lhuang, lmen, lmiksik, michal.skrivanek, phrdina, smitterl, systemd-maint-list, systemd-maint, weizhan, xuzhang, yisun, ymankad
Target Milestone: rc Keywords: Automation, Regression, TestBlocker
Target Release: 8.0   
Hardware: x86_64   
OS: All   
Fixed In Version: systemd-239-28.el8
Last Closed: 2020-04-28 16:45:29 UTC Type: Bug
Bug Blocks: 1802014, 1809620, 1810605    

Description yisun 2020-03-02 04:04:00 UTC
Description of problem:
The cpuset controller cgroup is not created for a QEMU VM.

Version-Release number of selected component (if applicable):
systemd-239-27.el8.x86_64
qemu-kvm-2.12.0-98.module+el8.2.0+5698+10a84757.x86_64
libvirt-4.5.0-40.module+el8.2.0+5761+d16d25e7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM
[root@dell-per730-59 machine.slice]# virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

[root@dell-per730-59 machine.slice]# ps -ef | grep avocado-vt-vm1
qemu       82455       1 99 23:00 ?        00:00:06 /usr/libexec/qemu-kvm -name guest=avocado-vt-vm1

2. Check whether a cpuset cgroup was created for this VM
[root@dell-per730-59 machine.slice]# ll /sys/fs/cgroup/cpuset/machine.slice/ | grep machine | wc -l
0
<=== no cpuset cgroup was created for the VM
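
The same check can be repeated for every controller hierarchy rather than cpuset alone. The following is a minimal sketch, assuming the cgroup v1 layout used in this report (<root>/<controller>/machine.slice/) and assuming the scope directory name ends in the VM name with each "-" escaped as "\x2d". The function name vm_cgroup_controllers is hypothetical.

```shell
# Print each controller hierarchy under ROOT whose machine.slice
# contains a scope for the given VM.
# $1: cgroup root, e.g. /sys/fs/cgroup   $2: VM name as shown by virsh
vm_cgroup_controllers() {
    root=$1
    # Assumption: "-" in the VM name appears as "\x2d" in the scope name.
    escaped=$(printf '%s' "$2" | sed 's/-/\\x2d/g')
    for dir in "$root"/*/machine.slice; do
        [ -d "$dir" ] || continue
        for scope in "$dir"/*.scope; do
            case $scope in
                *"$escaped.scope")
                    # Report the controller directory name and stop scanning it.
                    basename "$(dirname "$dir")"
                    break
                    ;;
            esac
        done
    done
}
```

Running `vm_cgroup_controllers /sys/fs/cgroup avocado-vt-vm1` on an affected host would be expected to omit cpuset from its output.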

3. The virsh setvcpus command reports success but has no effect
[root@dell-per730-59 machine.slice]# virsh setvcpus avocado-vt-vm1 2 | echo $?
0
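
Note that in `cmd | echo $?` the shell expands `$?` before the pipeline runs, so the printed value is the status of the previously executed command, not of virsh. A minimal sketch of one way to capture the status of the command itself; the helper name run_and_report is hypothetical.

```shell
# Run the given command and print its own exit status.
run_and_report() {
    status=0
    "$@" || status=$?
    echo "exit status: $status"
}

# Usage (not executed here):
#   run_and_report virsh setvcpus avocado-vt-vm1 2
# The plain-shell equivalent is:
#   virsh setvcpus avocado-vt-vm1 2; echo $?
```

With this form, a failing command such as `false` yields "exit status: 1" instead of the status left over from an earlier command.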

Actual results:
As above, the cpuset controller does not work for the VM.

Additional info:
This does not reproduce on systemd-239-25.el8.x86_64 with the same qemu and libvirt versions, so I am filing it against systemd first for triage.

Comment 2 yisun 2020-03-02 04:13:05 UTC
Two other cgroup-related issues also occur in libvirt with systemd-239-27.el8.x86_64 but not with systemd-239-25.el8.x86_64:
Bug 1808293 - libvirtd crashed after setting blkio weight 
Bug 1808087 - [cgroups] resource controllers not mounted

Comment 7 Michal Sekletar 2020-03-04 14:09:06 UTC
I don't think we need to block the snapshot release on this BZ. AFAICT, the bug doesn't affect basic libvirt functionality (i.e. starting/stopping VMs). VM resource management is affected, but I suspect that partners don't do any VM-related performance testing on snapshots.

Comment 9 Pavel Hrdina 2020-03-05 17:39:39 UTC
*** Bug 1808087 has been marked as a duplicate of this bug. ***

Comment 11 Michal Sekletar 2020-03-06 14:32:30 UTC
I think I've fixed the issue. Here are the test packages,

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=27095924

Please retest.

Comment 12 yisun 2020-03-09 05:38:07 UTC
(In reply to Michal Sekletar from comment #11)
> I think I've fixed the issue. Here are the test packages,
> 
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=27095924
> 
> Please retest.
I reserved a test machine and ran the test; the result is PASS.

===========================
1. WITH PROBLEMATIC SYSTEMD VERSION TO REPRODUCE 
===========================
(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# rpm -qa| grep systemd-2
systemd-239-27.el8.x86_64

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 files]# avocado run --vt-type libvirt vcpu_hotpluggable.positive_test.plug.live
JOB ID     : 2d98dfd950bd82ba2a729c1a53c791c4578b7f91
JOB LOG    : /root/avocado/job-results/job-2020-03-09T00.38-2d98dfd/job.log
 (1/1) type_specific.io-github-autotest-libvirt.vcpu_hotpluggable.positive_test.plug.live: ERROR: Command 'lscgroup| grep cpuset| grep vm1| grep vcpu0' failed.\nstdout: b''\nstderr: b''\nadditional_info: None (46.16 s)
RESULTS    : PASS 0 | ERROR 1 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME   : 48.61 s

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 files]# virsh blkiotune avocado-vt-vm1
error: Unable to get blkio parameters
error: Requested operation is not valid: blkio cgroup isn't mounted

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 files]# virsh memtune avocado-vt-vm1
error: Unable to get memory parameters
error: Requested operation is not valid: cgroup memory controller is not mounted

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 files]# virsh schedinfo avocado-vt-vm1
Scheduler      : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted


(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# virsh blkiotune avocado-vt-vm1 --weight 200
error: Disconnected from qemu:///system due to end of file
error: Unable to change blkio parameters
error: End of file while reading data: Input/output error
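
Since the errors above say the controllers are "not mounted", one quick cross-check is to list which cgroup v1 hierarchies the host actually has mounted. A minimal sketch, assuming the usual /proc/mounts format in which each v1 hierarchy is a line with filesystem type "cgroup"; the function name mounted_v1_controllers is hypothetical, and this only inspects the mount table, not how libvirt itself detects controllers.

```shell
# Print the mount-point basename of every cgroup v1 controller hierarchy
# listed in a mounts file (defaults to /proc/mounts).
# Named hierarchies without controllers (e.g. name=systemd) are skipped.
mounted_v1_controllers() {
    awk '$3 == "cgroup" && $4 !~ /(^|,)name=/ {
             n = split($2, parts, "/")
             print parts[n]
         }' "${1:-/proc/mounts}"
}
```

If the hierarchies are listed here while virsh still reports them as not mounted, the problem is more likely in the per-VM cgroup setup than in the host mounts.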


===========================
2. WITH NEW PACKAGES
===========================
(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# rpm -qa | grep systemd-2
systemd-239-27.el8.cpuset_enable_mask.x86_64
(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# avocado run --vt-type libvirt vcpu_hotpluggable.positive_test.plug.live
JOB ID     : c4a95273fe6d94054a8a040d140687b147830f90
JOB LOG    : /root/avocado/job-results/job-2020-03-09T01.32-c4a9527/job.log
 (1/1) type_specific.io-github-autotest-libvirt.vcpu_hotpluggable.positive_test.plug.live: PASS (41.96 s)
RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME   : 44.72 s


(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# virsh blkiotune avocado-vt-vm1
weight         : 200
device_weight  :
device_read_iops_sec:
device_write_iops_sec:
device_read_bytes_sec:
device_write_bytes_sec:

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# virsh blkiotune avocado-vt-vm1 --weight 300

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# cat /sys/fs/cgroup/blkio/machine.slice/machine-qemu\\x2d1\\x2davocado\\x2dvt\\x2dvm1.scope/blkio.bfq.weight
300

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# virsh memtune avocado-vt-vm1
hard_limit     : unlimited
soft_limit     : unlimited
swap_hard_limit: unlimited

(.libvirt-ci-venv-ci-runtest-lPzIc4) [root@lenovo-sr630-13 ~]# virsh schedinfo avocado-vt-vm1
Scheduler      : posix
cpu_shares     : 1024
vcpu_period    : 100000
vcpu_quota     : -1
emulator_period: 100000
emulator_quota : -1
global_period  : 100000
global_quota   : -1
iothread_period: 100000
iothread_quota : -1

Comment 13 Lukáš Nykrýn 2020-03-09 13:00:41 UTC
The fix was merged to the GitHub master branch: https://github.com/systemd-rhel/rhel-8/pull/73

Comment 17 yisun 2020-03-17 03:38:53 UTC
The libvirt gating case passes with the latest systemd.
Moved to VERIFIED.

(.libvirt-ci-venv-ci-runtest-eoS6Nk) [root@libvirt-rhel-8 ~]# rpm -qa | egrep "^systemd-2|^libvirt-6"
libvirt-6.0.0-10.module+el8.2.0+5984+dce93708.x86_64
systemd-239-28.el8.x86_64

(.libvirt-ci-venv-ci-runtest-eoS6Nk) [root@libvirt-rhel-8 ~]# avocado run --vt-type libvirt vcpu_hotpluggable.positive_test.plug.live
JOB ID     : c0d08096142b04e106cc702279029c17b2391e57
JOB LOG    : /root/avocado/job-results/job-2020-03-16T23.36-c0d0809/job.log
 (1/1) type_specific.io-github-autotest-libvirt.vcpu_hotpluggable.positive_test.plug.live: PASS (49.95 s)
RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME   : 51.77 s

Comment 19 errata-xmlrpc 2020-04-28 16:45:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1794