Bug 2037998

Summary: [cgroup_v2] Mismatched range info for cpu_shares, vcpu_quota, emulator_quota, global_quota, and iothread_quota with cgroup v2 in the virsh schedinfo command prompt, man pages, and libvirt docs
Product: Red Hat Enterprise Linux 9
Reporter: liang cong <lcong>
Component: libvirt
Assignee: Pavel Hrdina <phrdina>
libvirt sub component: General
QA Contact: yisun
Status: CLOSED ERRATA
Docs Contact:
Severity: medium    
Priority: unspecified
CC: dzheng, jdenemar, jsuchane, lmen, pkrempa, smooney, virt-maint, xuzhang
Version: 9.0
Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-9.0.0-2.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-05-09 07:26:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version: 9.1.0
Embargoed:
Bug Depends On:    
Bug Blocks: 2035518, 2121158    

Description liang cong 2022-01-07 01:23:31 UTC
Description of problem:
The range info for cpu_shares, vcpu_quota, emulator_quota, global_quota, and iothread_quota under cgroup v2 does not match what the virsh schedinfo command prompt, man pages, and libvirt documentation report.

Version-Release number of selected component (if applicable):
libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d.x86_64
qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949.x86_64
systemd-239-51.el8.x86_64
Linux 4.18.0-348.4.el8.kpq0.x86_64

How reproducible:
100%

Steps to Reproduce:
1. cpu_shares
1.1. Boot a RHEL-8 machine with systemd.unified_cgroup_hierarchy=1 on the kernel cmdline so that it starts with cgroup v2.
1.2. Set cpu_shares to an invalid number; the command reports the valid range:
# virsh schedinfo vm --set cpu_shares=-1
Scheduler      : posix
error: invalid argument: shares '18446744073709551615' must be in range [2, 262144]

1.3. man virsh also shows the cpu_shares valid range as [2, 262144]:
Note: The cpu_shares parameter has a valid value range of 2-262144.

1.4. The libvirt documentation (https://libvirt.org/formatdomain.html#cpu-tuning) shows the cpu_shares valid range as [2, 262144]:
The value should be in range [2, 262144].

Actual results:
1.5. Setting cpu_shares to 10001, which is within the range implied by the steps above, fails:
# virsh schedinfo vm --set cpu_shares=10001
Scheduler      : posix
error: error from service: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Value specified in CPUWeight is out of range
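For reference, the limit that is actually enforced can be read from the guest's cgroup v2 hierarchy. A minimal sketch, assuming the guest above is named "vm" and runs under machine.slice (the glob stands in for the machine id encoded in the scope name); both the kernel's cpu.weight and systemd's CPUWeight accept [1, 10000], which is why 10001 is rejected even though it fits cgroup v1's cpu.shares range:

cat /sys/fs/cgroup/machine.slice/machine-qemu*vm.scope/libvirt/cpu.weight   # valid range per kernel docs: [1, 10000]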


2. vcpu_quota, emulator_quota, global_quota, iothread_quota
2.1. Boot a RHEL-8 machine with systemd.unified_cgroup_hierarchy=1 on the kernel cmdline so that it starts with cgroup v2.
2.2. man virsh shows the valid range for vcpu_quota, emulator_quota, and iothread_quota as [1000, 17592186044415] or less than 0:
the vcpu_quota, emulator_quota, and iothread_quota parameters have a valid value range  of 1000-17592186044415 or less than 0.

2.3. The libvirt documentation (https://libvirt.org/formatdomain.html#cpu-tuning) shows the valid range for vcpu_quota, emulator_quota, global_quota, and iothread_quota as [1000, 17592186044415] or less than 0:
The value should be in range [1000, 17592186044415] or less than 0.

Actual results:
2.4. Setting vcpu_quota, emulator_quota, global_quota, or iothread_quota to -1 fails with an error:
# virsh schedinfo vm --set vcpu_quota=-1
Scheduler      : posix
error: Invalid value '-1' for 'cpu.max': Invalid argument

# virsh schedinfo vm --set emulator_quota=-1
Scheduler      : posix
error: Invalid value '-1' for 'cpu.max': Invalid argument

# virsh schedinfo vm --set global_quota=-1
Scheduler      : posix
error: Invalid value '-1' for 'cpu.max': Invalid argument

# virsh schedinfo vm --set iothread_quota=-1
Scheduler      : posix
error: Invalid value '-1' for 'cpu.max': Invalid argument
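The failures above are consistent with how cgroup v2 encodes quotas: cpu.max holds "<quota> <period>", and "no limit" is spelled with the literal token "max" rather than -1, so -1 is rejected when libvirt writes it through. A minimal sketch with a hypothetical scope path (adjust the machine id and guest name):

VM_CGROUP='/sys/fs/cgroup/machine.slice/machine-qemu\x2d1\x2dvm.scope/libvirt'  # hypothetical path
cat "$VM_CGROUP/cpu.max"                   # "<quota> <period>", e.g. "max 100000"
echo "max 100000" > "$VM_CGROUP/cpu.max"   # valid: "max" means no quota limit
echo "-1 100000" > "$VM_CGROUP/cpu.max"    # rejected: -1 is not a valid quota value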


Expected results:
The virsh schedinfo command prompt, man pages, and libvirt documentation should describe the correct ranges for cpu_shares, vcpu_quota, emulator_quota, global_quota, and iothread_quota under cgroup v2.

Additional info:
cgroup v2:
The observed cpu_shares setting range is [2, 10000]
(https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html documents the cpu.weight range as [1, 10000]).

The observed range for vcpu_quota, emulator_quota, global_quota, and iothread_quota is [1000, 17592186044415], as well as 0.
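Side note: whether a host is actually on the unified (v2) hierarchy can be checked quickly; a minimal sketch:

stat -fc %T /sys/fs/cgroup   # prints "cgroup2fs" under cgroup v2, "tmpfs" under the v1/hybrid layout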

Comment 1 Peter Krempa 2022-06-09 10:29:55 UTC
Upstream issue referencing the same problem https://gitlab.com/libvirt/libvirt/-/issues/324

Comment 2 smooney 2022-06-14 09:58:20 UTC
For visibility: this is also breaking our OpenStack product.
https://bugzilla.redhat.com/show_bug.cgi?id=2035518

Comment 3 Pavel Hrdina 2023-01-17 08:55:06 UTC
The -1 issue with the quota limits was fixed upstream by this commit:

commit 9233f0fa8c8e031197c647f7bc980dee45283641
Author: antonios-f <anton.fadeev>
Date:   Thu Nov 17 09:53:23 2022 +0000

    src/util/vircgroupv2.c: interpret neg quota as "max"
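In practice this means that a negative quota requested through the API is now translated to the literal "max" token (no limit) before being written to cpu.max, instead of being passed through as-is. A minimal shell sketch of that mapping (an illustration, not libvirt's actual C code):

quota=-1                              # any negative quota now means "unlimited"
period=100000
if [ "$quota" -lt 0 ]; then
    value=max                         # cgroup v2 spells "no quota" as the literal string "max"
else
    value=$quota
fi
printf '%s %s\n' "$value" "$period"   # the string that ends up in cpu.max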

Looking into the other part of the bug report.

Comment 4 Pavel Hrdina 2023-01-17 10:02:53 UTC
The range check and documentation are now fixed upstream as well:

commit ead6e1b00285cbd98e0f0727efb8adcb29ebc1ba
Author: Pavel Hrdina <phrdina>
Date:   Tue Jan 17 10:33:22 2023 +0100

    docs: document correct cpu shares limits with both cgroups v1 and v2

commit 38af6497610075e5fe386734b87186731d4c17ac
Author: Pavel Hrdina <phrdina>
Date:   Tue Jan 17 10:08:08 2023 +0100

    domain_validate: drop cpu.shares cgroup check

commit cf3414a85b8383d71d6ae2a53daf63c331cc2230
Author: Pavel Hrdina <phrdina>
Date:   Tue Jan 17 10:02:07 2023 +0100

    vircgroupv2: fix cpu.weight limits check

These will have to be backported, but the fix mentioned in Comment 3 will be picked up by the latest rebase.

Comment 6 yisun 2023-01-20 03:00:38 UTC
Preverified with:
libvirt-9.1.0-1.fc37.x86_64
libvirt-docs-9.1.0-1.fc37.x86_64

====================
Document check:
====================
[root@yisun-prevbug ~]# man virsh
...
   schedinfo
...
       Note: The cpu_shares parameter has a valid value range of 2-262144 with cgroups v1, 1-10000 with cgroups v2.


[root@yisun-prevbug ~]# cat libvirt/docs/formatdomain.rst | grep "using cgroups v2" -a5
   The optional ``shares`` element specifies the proportional weighted share for
   the domain. If this is omitted, it defaults to the OS provided defaults. NB,
   There is no unit for the value, it's a relative measure based on the setting
   of other VM, e.g. A VM configured with value 2048 will get twice as much CPU
   time as a VM configured with value 1024. The value should be in range
   [2, 262144] using cgroups v1, [1, 10000] using cgroups v2. :since:`Since 0.9.0`
``period``
   The optional ``period`` element specifies the enforcement interval (unit:
   microseconds). Within ``period``, each vCPU of the domain will not be allowed
   to consume more than ``quota`` worth of runtime. The value should be in range
   [1000, 1000000]. A period with value 0 means no value. :since:`Only QEMU


====================
Feature check:
====================

**** check cfs_quota can be set to negative values ****
https://gitlab.com/libvirt/libvirt/-/merge_requests/206

1. Prepare a running VM with iothreads enabled:
# virsh dumpxml w10 | grep iothread -a
  <name>w10</name>
  <uuid>4f6fd482-438a-4af2-9965-d1bb4451ed29</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <iothreads>2</iothreads>
  <iothreadids>
    <iothread id='2'/>
    <iothread id='1'/>
  </iothreadids>

2. Set its CPU-quota-related parameters to 9999 and check that the cgroup parameters actually changed:
# virsh schedinfo w10 --set vcpu_quota=9999 emulator_quota=9999 global_quota=9999 iothread_quota=9999
Scheduler      : posix
cpu_shares     : 100
vcpu_period    : 100000
vcpu_quota     : 9999
emulator_period: 100000
emulator_quota : 9999
global_period  : 100000
global_quota   : 9999
iothread_period: 100000
iothread_quota : 9999

# pwd
/sys/fs/cgroup/machine.slice/machine-qemu\x2d6\x2dw10.scope

[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/cpu.max
9999 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/emulator/cpu.max
9999 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/vcpu0/cpu.max
9999 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/iothread2/cpu.max
9999 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/iothread1/cpu.max
9999 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/vcpu1/cpu.max
9999 100000

3. Set these parameters again, this time with negative values:
# virsh schedinfo w10 --set vcpu_quota=-1 emulator_quota=-22 global_quota=-333 iothread_quota=-4444
Scheduler      : posix
cpu_shares     : 100
vcpu_period    : 100000
vcpu_quota     : 17592186044415
emulator_period: 100000
emulator_quota : 17592186044415
global_period  : 100000
global_quota   : 17592186044415
iothread_period: 100000
iothread_quota : 17592186044415

4. Check that the cgroup parameters changed back to 'max':
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/cpu.max
max 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/emulator/cpu.max
max 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/vcpu0/cpu.max
max 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/iothread2/cpu.max
max 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/iothread1/cpu.max
max 100000
[root@yisun-prevbug machine-qemu\x2d6\x2dw10.scope]# cat libvirt/vcpu1/cpu.max
max 100000



**** check cpu shares error message is correct for cgroup2 ****
https://gitlab.com/redhat/rhel/src/libvirt/-/merge_requests/73

positive tests:
[root@yisun-prevbug ~]# virsh schedinfo w10 --set cpu_shares=1
Scheduler      : posix
cpu_shares     : 1
vcpu_period    : 100000
vcpu_quota     : 17592186044415
emulator_period: 100000
emulator_quota : 17592186044415
global_period  : 100000
global_quota   : 17592186044415
iothread_period: 100000
iothread_quota : 17592186044415

[root@yisun-prevbug ~]# virsh schedinfo w10 --set cpu_shares=10000
Scheduler      : posix
cpu_shares     : 10000
vcpu_period    : 100000
vcpu_quota     : 17592186044415
emulator_period: 100000
emulator_quota : 17592186044415
global_period  : 100000
global_quota   : 17592186044415
iothread_period: 100000
iothread_quota : 17592186044415

negative tests:
[root@yisun-prevbug ~]# virsh schedinfo w10 --set cpu_shares=10001
Scheduler      : posix
error: invalid argument: shares '10001' must be in range [1, 10000]

[root@yisun-prevbug ~]# virsh schedinfo w10 --set cpu_shares=0
Scheduler      : posix
error: invalid argument: shares '0' must be in range [1, 10000]

[root@yisun-prevbug ~]# virsh schedinfo w10 --set cpu_shares=a
Scheduler      : posix
error: invalid argument: Invalid value for field 'cpu_shares': expected unsigned long long

[root@yisun-prevbug ~]# virsh schedinfo w10 --set cpu_shares=-1
Scheduler      : posix
error: invalid argument: shares '18446744073709551615' must be in range [1, 10000]

====================
QE follow-up tasks:
====================
1. Modify the cgroup2 Polarion cases and corresponding automation scripts for **** check cfs_quota can be set to negative values ****.
2. For the documentation changes, there is no need to add cases.
3. For **** check cpu shares error message is correct for cgroup2 ****, there is no need to add or modify cases for now; cgroup2's range checks are not all covered by libvirt, and in the other situations a DBus error is reported, which is clear enough:
[root@yisun-prevbug ~]# virsh blkiotune w10 --weight 1234321421
error: Unable to change blkio parameters
error: error from service: GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Value specified in IOWeight is out of range

Comment 10 yisun 2023-02-02 02:29:22 UTC
Verified with:
libvirt-9.0.0-2.el9.x86_64
libvirt-docs-9.0.0-2.el9.x86_64

Result: passed. Steps are the same as in comment 6.

Comment 12 errata-xmlrpc 2023-05-09 07:26:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171