Bug 1798464

Summary: cpu.shares for existing scopes under machine.slice are reset to default when creating a new scope after daemon-reload
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Jaroslav Suchanek <jsuchane>
Component: libvirt
Assignee: Pavel Hrdina <phrdina>
Status: CLOSED ERRATA
QA Contact: yisun
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.4
CC: ben, dyuan, fjin, jdenemar, jinqi, jsuchane, lmen, msekleta, neil, phrdina, smitterl, systemd-maint-list, virt-bugs, virt-maint, xuzhang, yisun
Target Milestone: rc
Keywords: Triaged
Target Release: 8.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-7.0.0-5.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1798463
Environment:
Last Closed: 2021-05-25 06:41:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1789824, 1798463, 1927290, 1934484
Bug Blocks:

Description Jaroslav Suchanek 2020-02-05 12:24:52 UTC
+++ This bug was initially created as a clone of Bug #1798463 +++

+++ This bug was initially created as a clone of Bug #1789824 +++

Description of problem:

The cgroup cpu.shares limits applied to existing scopes (e.g. machine-qemu*, systemd-nspawn*) under /sys/fs/cgroup/cpu/machine.slice/ are reset to the default value (1024) the next time a new scope is created via machinectl or libvirt after a systemctl daemon-reload.

However, if you manually create a new scope under machine.slice, change its cpu.shares to a different value, and do not assign it a process that is registered with systemd-machined, the cpu.shares value is preserved.

This affects both (QEMU/KVM) VMs created via libvirt and nspawn containers created via machinectl.


Version-Release number of selected component (if applicable):

systemd-219-67.el7_7.2
libvirt-4.5.0-23.el7_7.3


How reproducible:

Consistently reproducible.


Steps to Reproduce:

1. virsh create test1.xml
2. systemctl daemon-reload
3. virsh create test2.xml
4. systemctl daemon-reload
5. virsh create test3.xml

(See attached libvirt XML definition)


Actual results:

After (1):

grep . /sys/fs/cgroup/cpu/machine.slice/*/cpu.shares
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d148\x2dtest1.scope/cpu.shares:2048

After (3):

grep . /sys/fs/cgroup/cpu/machine.slice/*/cpu.shares
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d148\x2dtest1.scope/cpu.shares:1024
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d150\x2dtest2.scope/cpu.shares:2048

After (5):

grep . /sys/fs/cgroup/cpu/machine.slice/*/cpu.shares
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d148\x2dtest1.scope/cpu.shares:1024
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d150\x2dtest2.scope/cpu.shares:1024
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d151\x2dtest3.scope/cpu.shares:2048


Expected results:

grep . /sys/fs/cgroup/cpu/machine.slice/*/cpu.shares
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d148\x2dtest1.scope/cpu.shares:2048
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d150\x2dtest2.scope/cpu.shares:2048
/sys/fs/cgroup/cpu/machine.slice/machine-qemu\x2d151\x2dtest3.scope/cpu.shares:2048
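
The comparison between actual and expected results can be scripted. The following is a minimal sketch, not something shipped with libvirt or systemd: the function name list_reset_shares is hypothetical, and it assumes the cgroup v1 layout shown above, where each scope directory contains a cpu.shares file.

```shell
# Hypothetical helper: list every scope under a cgroup v1 slice directory
# whose cpu.shares differs from the expected value.
# Usage: list_reset_shares <slice-dir> <expected-shares>
list_reset_shares() {
    local slice_dir expected file value
    slice_dir=$1
    expected=$2
    for file in "$slice_dir"/*/cpu.shares; do
        [ -e "$file" ] || continue      # the glob matched nothing
        value=$(cat "$file")
        if [ "$value" != "$expected" ]; then
            printf '%s: %s (expected %s)\n' "$file" "$value" "$expected"
        fi
    done
}
```

Running list_reset_shares /sys/fs/cgroup/cpu/machine.slice 2048 after each reproduction step should print nothing if the limits are preserved, and one line per reset scope otherwise.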


Additional info:

This seems similar in scope to the runc bug detailed in #1455071 and the systemd bug detailed in #1139223.

I've confirmed that the Delegate= directive is correctly set and applied to the existing machine scopes under machine.slice:

# cat /run/systemd/system/machine-qemu\\x2d148\\x2dtest1.scope.d/50-Delegate.conf
[Scope]
Delegate=yes

# systemctl show "machine-qemu\\x2d148\\x2dtest1.scope" | grep Delegate
Delegate=yes
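
To repeat this check for every machine scope rather than a single one, something along these lines could be used. It is only a sketch: the function name scopes_without_delegate is hypothetical, and it assumes each scope has a drop-in directory containing a 50-Delegate.conf file, as in the listing above.

```shell
# Hypothetical helper: print the unit name of every machine-*.scope drop-in
# directory under <dropin-root> whose 50-Delegate.conf is missing or does
# not contain a "Delegate=yes" line.
# Usage: scopes_without_delegate <dropin-root>
scopes_without_delegate() {
    local root dir
    root=$1
    for dir in "$root"/machine-*.scope.d; do
        [ -d "$dir" ] || continue       # the glob matched nothing
        if ! grep -qx 'Delegate=yes' "$dir/50-Delegate.conf" 2>/dev/null; then
            basename "$dir" .d          # strip the trailing ".d"
        fi
    done
}
```

For the setup in this report the call would be scopes_without_delegate /run/systemd/system, and empty output would mean every machine scope has the directive set in its drop-in.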

Comment 2 Pavel Hrdina 2021-02-19 09:00:33 UTC
Fixed in upstream:

6a1f5e8a4f vircgroup: correctly free nested virCgroupPtr
85099c3393 tests: add cgroup nested tests
184245f53b vircgroup: introduce nested cgroup to properly work with systemd
badc2bcc73 vircgroup: introduce virCgroupV1Exists and virCgroupV2Exists
382fa15cde vircgroupv2: move task into cgroup before enabling controllers
5f56dd7c83 vircgroupv1: refactor virCgroupV1DetectPlacement
9c1693eff4 vircgroup: use DBus call to systemd for some APIs
d3fb774b1e virsystemd: introduce virSystemdGetMachineUnitByPID
385704d5a4 virsystemd: introduce virSystemdGetMachineByPID
a51147d906 virsystemd: export virSystemdHasMachined

Comment 5 yisun 2021-02-21 03:17:56 UTC
Reproduced on libvirt-7.0.0-3.module+el8.4.0+9709+a99efd61.x86_64, qa_ack+

[root@dell-per730-59 libvirt-ci]# virsh schedinfo avocado-vt-vm1 --set cpu_shares=2048
Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : -1
emulator_period: 100000
emulator_quota : -1
global_period  : 100000
global_quota   : -1
iothread_period: 100000
iothread_quota : -1

[root@dell-per730-59 libvirt-ci]# virsh schedinfo avocado-vt-vm1
Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : -1
emulator_period: 100000
emulator_quota : -1
global_period  : 100000
global_quota   : -1
iothread_period: 100000
iothread_quota : -1

            
[root@dell-per730-59 libvirt-ci]# cat /sys/fs/cgroup/cpu\,cpuacct/machine.slice/machine-qemu\\x2d3\\x2davocado\\x2dvt\\x2dvm1.scope/cpu.shares 
2048
[root@dell-per730-59 libvirt-ci]# systemctl daemon-reload

[root@dell-per730-59 libvirt-ci]# cat /sys/fs/cgroup/cpu\,cpuacct/machine.slice/machine-qemu\\x2d3\\x2davocado\\x2dvt\\x2dvm1.scope/cpu.shares 
1024

[root@dell-per730-59 libvirt-ci]# virsh schedinfo avocado-vt-vm1
Scheduler      : posix
cpu_shares     : 1024
vcpu_period    : 100000
vcpu_quota     : -1
emulator_period: 100000
emulator_quota : -1
global_period  : 100000
global_quota   : -1
iothread_period: 100000
iothread_quota : -1

Comment 8 yisun 2021-02-24 09:31:21 UTC
Test result: PASS

cgroup v1:
==========
cpu.shares
==========
[root@dell-per740-18 yum.repos.d]# virsh schedinfo vm1 --set cpu_shares=2048
Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : -1
emulator_period: 100000
emulator_quota : -1
global_period  : 100000
global_quota   : -1
iothread_period: 100000
iothread_quota : -1

[root@dell-per740-18 yum.repos.d]# systemctl daemon-reload
[root@dell-per740-18 yum.repos.d]# virsh schedinfo vm1
Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : -1
emulator_period: 100000
emulator_quota : -1
global_period  : 100000
global_quota   : -1
iothread_period: 100000
iothread_quota : -1

[root@dell-per740-18 yum.repos.d]# cat /sys/fs/cgroup/cpu\,cpuacct/machine.slice/machine-qemu\\x2d5\\x2dvm1.scope/cpu.shares 
2048


blkio.bfq.weight
==========
[root@dell-per740-18 yum.repos.d]# virsh blkiotune vm1 --weight 123

[root@dell-per740-18 yum.repos.d]# systemctl daemon-reload
[root@dell-per740-18 yum.repos.d]# virsh blkiotune vm1
weight         : 123
device_weight  : 
device_read_iops_sec: 
device_write_iops_sec: 
device_read_bytes_sec: 
device_write_bytes_sec: 

[root@dell-per740-18 yum.repos.d]# cat /sys/fs/cgroup/blkio/machine.slice/machine-qemu\\x2d5\\x2dvm1.scope/blkio.bfq.weight
123


cgroup v2:
==========
cpu.weight
==========
[root@dell-per740-18 ~]# virsh schedinfo vm1 --set cpu_shares=2048
Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : 17592186044415
emulator_period: 100000
emulator_quota : 17592186044415
global_period  : 100000
global_quota   : 17592186044415
iothread_period: 100000
iothread_quota : 17592186044415

[root@dell-per740-18 ~]# systemctl daemon-reload
[root@dell-per740-18 ~]# virsh schedinfo vm1
Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : 17592186044415
emulator_period: 100000
emulator_quota : 17592186044415
global_period  : 100000
global_quota   : 17592186044415
iothread_period: 100000
iothread_quota : 17592186044415

[root@dell-per740-18 ~]# cat /sys/fs/cgroup/machine.slice/machine-qemu\\x2d1\\x2dvm1.scope/cpu.weight
2048

io.bfq.weight
==========
Failed because of Bug 1927290 (cgroup: setting IOWeight doesn't work with cgroups v2):
[root@dell-per740-18 ~]# virsh blkiotune vm1 --weight 123

[root@dell-per740-18 ~]# virsh blkiotune vm1
weight         : 100
device_weight  : 
device_read_iops_sec: 
device_write_iops_sec: 
device_read_bytes_sec: 
device_write_bytes_sec:

Comment 9 Pavel Hrdina 2021-02-24 16:05:26 UTC
Hi Yi,

thanks for the quick testing. This change also affects the remaining cgroups: their
configuration was moved to the "libvirt" sub-cgroup, so some of the current cgroup
tests will need to be modified.

Can you please verify the remaining cgroups that libvirt can modify as well?
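
When adapting tests to the nested layout, it may help to print a tunable at both levels side by side. The sketch below is only an illustration: the function name show_scope_and_nested is hypothetical, and it assumes the sub-cgroup is a directory named "libvirt" directly inside the scope directory. It makes no claim about which of the two values is the effective one for a given controller.

```shell
# Hypothetical helper: print a cgroup file as found directly in the scope
# directory and, if present, in its "libvirt" sub-cgroup (assumed location).
# Usage: show_scope_and_nested <scope-dir> <file-name>
show_scope_and_nested() {
    local scope_dir name path
    scope_dir=$1
    name=$2
    for path in "$scope_dir/$name" "$scope_dir/libvirt/$name"; do
        if [ -r "$path" ]; then
            printf '%s: %s\n' "$path" "$(cat "$path")"
        fi
    done
}
```

Called with a scope directory and a file name such as cpu.shares before and after systemctl daemon-reload, it shows which level, if any, changed.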

Comment 16 errata-xmlrpc 2021-05-25 06:41:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098