| Summary: | libvirtd does not pin vcpu processes to the nodeset returned by numad |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | libvirt |
| Reporter: | Jincheng Miao <jmiao> |
| Assignee: | John Ferlan <jferlan> |
| Status: | CLOSED ERRATA |
| QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium |
| Priority: | medium |
| Version: | 6.5 |
| CC: | dyuan, gsun, honzhang, jdenemar, jiahu, jtomko, mzhan, rbalakri, xuzhang |
| Target Milestone: | rc |
| Keywords: | Upstream |
| Hardware: | x86_64 |
| OS: | Linux |
| Fixed In Version: | libvirt-0.10.2-36.el6 |
| Doc Type: | Bug Fix |
| Type: | Bug |
| Last Closed: | 2014-10-14 04:17:16 UTC |
Fixed upstream by:
commit a39f69d2bb5494d661be917956baa437d01a4d13
Author: Osier Yang <jyang>
Date: Fri May 24 17:08:28 2013 +0800
qemu: Set cpuset.cpus for domain process
When either "cpuset" of <vcpu> is specified, or the "placement" of
<vcpu> is "auto", setting only cpuset.mems may cause guest startup
to fail. E.g. ("placement" of both <vcpu> and <numatune> is
"auto"):
1) Related XMLs
<vcpu placement='auto'>4</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
2) Host NUMA topology
% numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16374 MB
node 0 free: 11899 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15318 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 16384 MB
node 2 free: 15766 MB
node 3 cpus: 34 38 42 46 50 54 58 62
node 3 size: 16384 MB
node 3 free: 15347 MB
node 4 cpus: 3 7 11 15 19 23 27 31
node 4 size: 16384 MB
node 4 free: 15041 MB
node 5 cpus: 35 39 43 47 51 55 59 63
node 5 size: 16384 MB
node 5 free: 15202 MB
node 6 cpus: 1 5 9 13 17 21 25 29
node 6 size: 16384 MB
node 6 free: 15197 MB
node 7 cpus: 33 37 41 45 49 53 57 61
node 7 size: 16368 MB
node 7 free: 15669 MB
4) cpuset.cpus will be set as: (from debug log)
2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.cpus'
to '0-63'
5) The advisory nodeset got from querying numad (from debug log)
2013-05-09 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 :
Nodeset returned from numad: 1
6) cpuset.mems will be set as: (from debug log)
2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.mems'
to '0-7'
I.e., the domain process's memory is restricted to a single NUMA node,
yet it can use all of the CPUs, which will likely cause the domain
process to fail to start because the kernel fails to allocate
memory under the "strict" memory policy.
% tail -n 20 /var/log/libvirt/qemu/toy.log
...
2013-05-09 05:53:32.972+0000: 7318: debug : virCommandHandshakeChild:377 :
Handshake with parent is done
char device redirected to /dev/pts/2 (label charserial0)
kvm_init_vcpu failed: Cannot allocate memory
...
Signed-off-by: Peter Krempa <pkrempa>
commit b8b38321e724b5b1b7858c415566ab5e6e96ec8c
Author: Peter Krempa <pkrempa>
Date: Thu Jul 18 11:21:48 2013 +0200
caps: Add helpers to convert NUMA nodes to corresponding CPUs
These helpers use the remembered host capabilities to retrieve the CPU
map rather than query the host again. The intended use for these
helpers is to fix automatic NUMA placement with strict memory allocation.
The code doing the prepare step needs to pin the emulator process only
to CPUs belonging to the chosen subset of the host's NUMA nodes.
v1.1.0-254-ga39f69d
https://bugzilla.redhat.com/show_bug.cgi?id=949408#c16
Hi Jan,
I'm verifying this bug, but I can't start the guest on one NUMA host configured as in the bug description. This is the same host used in the description, with 8 NUMA cells. However, I can start the guest on another NUMA host that has only 2 NUMA cells.
Is this a new issue?
Test with the following packages:
libvirt-0.10.2-34.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.425.el6.x86_64
kernel-2.6.32-461.el6.x86_64
Steps:
1. Prepare a guest configured as follows:
# virsh dumpxml r6
......
<vcpu placement='auto'>20</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
......
2. Starting the guest fails with an error:
# virsh start r6
error: Failed to start domain r6
error: Unable to set cpuset.cpus for domain r6: Device or resource busy
Created attachment 892799: libvirtd debug log
This is the libvirtd debug log from the failed guest start.
According to my log for this problem:
2014-05-07 02:13:23.252+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.cpus' to '0-63'
2014-05-07 02:13:23.252+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.mems' to '0-7'
2014-05-07 02:13:23.252+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.mems' to '0-7'
2014-05-07 02:13:29.898+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/memory/libvirt/qemu/r6/memory.use_hierarchy' to '1'
2014-05-07 02:13:29.898+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/memory/libvirt/qemu/r6/memory.use_hierarchy' to '1'
2014-05-07 02:35:52.615+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.deny' to 'a'
2014-05-07 02:35:52.615+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 136:* rw'
2014-05-07 02:35:52.616+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:3 rw'
2014-05-07 02:35:52.616+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:7 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:5 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:8 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:8 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:9 rw'
2014-05-07 02:35:52.618+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 5:2 rw'
2014-05-07 02:35:52.618+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 10:232 rw'
2014-05-07 02:35:52.618+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 254:0 rw'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 10:228 rw'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 10:228 rw'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.mems' to '2,4,7'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/cpuset.cpus' to '1-3,5-7,9-11,13-15,17-19,21-23,25-27,29-31'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:335 : Failed to write value '1-3,5-7,9-11,13-15,17-19,21-23,25-27,29-31': Device or resource busy
The last virCgroupSetValueStr() call, targeting /cgroup/cpuset/libvirt/qemu/cpuset.cpus, is rejected with EBUSY because of a cgroup limitation in RHEL 6.
The libvirt problem here is: why does /cgroup/cpuset/libvirt/qemu/cpuset.cpus need to be set at all after numad has returned?
The affected code is in libvirt-rhel/src/qemu/qemu_cgroup.c:
int qemuSetupCgroup(struct qemud_driver *driver,
...
458 rc = virCgroupSetCpusetCpus(driver->cgroup, cpu_mask);
459 VIR_FREE(cpu_mask);
460 if (rc != 0) {
461 virReportSystemError(-rc,
462 _("Unable to set cpuset.cpus for domain %s"),
463 vm->def->name);
464 goto cleanup;
465 }
I thought we could just set /cgroup/cpuset/libvirt/qemu/r6/cpuset.cpus instead.
The latest libvirt-0.10.2-36.el6 fixes this bug:
# virsh edit rhel65
<domain>
...
<vcpu placement='auto'>24</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
...
</domain>
# virsh start rhel65
numad returns nodeset '2,4,7':
2014-05-20 03:59:50.930+0000: 40636: debug : qemuProcessStart:3858 : Nodeset returned from numad: 2,4,7
and libvirtd now pins vcpus only to the CPUs of the nodes numad returned.
# virsh vcpuinfo rhel65 | grep -w "CPU:"
CPU: 6
CPU: 9
CPU: 21
CPU: 1
CPU: 7
CPU: 11
CPU: 1
CPU: 19
CPU: 18
CPU: 27
CPU: 9
CPU: 26
CPU: 7
CPU: 7
CPU: 2
CPU: 1
CPU: 19
CPU: 17
CPU: 5
CPU: 10
CPU: 2
CPU: 9
CPU: 27
CPU: 11
The problem from comment 6 is also fixed; relevant lines from the log file:
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.cpus' to '0-63'
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.mems' to '0-7'
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/memory/libvirt/qemu/rhel65/memory.use_hierarchy' to '1'
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.deny' to 'a'
2014-05-20 03:59:50.934+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 136:* rw'
2014-05-20 03:59:50.934+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:3 rw'
2014-05-20 03:59:50.934+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:7 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:5 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:8 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:9 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 5:2 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 10:232 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 254:0 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 10:228 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.mems' to '2,4,7'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.cpus' to '1-3,5-7,9-11,13-15,17-19,21-23,25-27,29-31'
cpuset.cpus is now set on the target guest's own cgroup.
So I am changing the status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2014-1374.html
Description
libvirt.org says: "auto" indicates the domain process will be pinned to the advisory nodeset from querying numad. But libvirtd does not pin the vcpu processes to the nodeset numad returned.
Version:
libvirt-0.10.2-27.el6.x86_64
qemu-kvm-0.12.1.2-2.404.el6.x86_64
numad-0.5-9.20130814git.el6.x86_64
kernel-2.6.32-419.el6.x86_64
How reproducible: 100%
Steps to Reproduce:
1. Prepare the domain:
# virsh edit test
<domain>
...
<vcpu placement='auto'>24</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
...
</domain>
# virsh start test
Domain test started
2. Find the nodes the domain was allocated to (2,4-5,7):
# grep Nodeset /tmp/libvirtd.log
2013-09-27 03:23:45.443+0000: 52548: debug : qemuProcessStart:3781 : Nodeset returned from numad: 2,4-5,7
3. Find the CPU numbers of each node:
# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16349 MB
node 0 free: 15690 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15934 MB
node 2 cpus: 1 5 9 13 17 21 25 29
node 2 size: 16384 MB
node 2 free: 15913 MB
node 3 cpus: 33 37 41 45 49 53 57 61
node 3 size: 16384 MB
node 3 free: 15943 MB
node 4 cpus: 2 6 10 14 18 22 26 30
node 4 size: 16384 MB
node 4 free: 15818 MB
node 5 cpus: 34 38 42 46 50 54 58 62
node 5 size: 16384 MB
node 5 free: 16007 MB
node 6 cpus: 35 39 43 47 51 55 59 63
node 6 size: 16384 MB
node 6 free: 15913 MB
node 7 cpus: 3 7 11 15 19 23 27 31
node 7 size: 16367 MB
node 7 free: 15780 MB
# virsh vcpuinfo test | grep -w "CPU:"
CPU: 47
CPU: 43
CPU: 43
CPU: 24
CPU: 43
CPU: 10
CPU: 26
CPU: 5
CPU: 1
CPU: 9
CPU: 14
CPU: 6
CPU: 52
CPU: 6
CPU: 56
CPU: 63
CPU: 31
CPU: 14
CPU: 30
CPU: 14
CPU: 20
CPU: 4
CPU: 13
CPU: 55
The CPU each vcpu process runs on should belong to the nodeset numad returned, but CPU 47, for example, is not in nodes 2,4-5,7.
Expected result:
virsh vcpuinfo should show only CPUs belonging to the nodes returned from numad.