Bug 1012846 - libvirtd does not pin vCPU processes according to the nodeset returned by numad
Summary: libvirtd does not pin vCPU processes according to the nodeset returned by numad
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: John Ferlan
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-09-27 09:14 UTC by Jincheng Miao
Modified: 2019-06-13 07:55 UTC
CC: 9 users

Fixed In Version: libvirt-0.10.2-36.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-14 04:17:16 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd debug log (7.45 KB, text/plain), attached 2014-05-06 07:52 UTC by Xuesong Zhang


Links
Red Hat Product Errata RHBA-2014:1374 (normal, SHIPPED_LIVE): libvirt bug fix and enhancement update, last updated 2014-10-14 08:11:54 UTC

Description Jincheng Miao 2013-09-27 09:14:01 UTC

libvirt.org says: "auto" indicates the domain process will be pinned to the advisory nodeset from querying numad.
But libvirtd does not pin the vCPU processes to the nodeset that numad returned.

Version:
libvirt-0.10.2-27.el6.x86_64
qemu-kvm-0.12.1.2-2.404.el6.x86_64
numad-0.5-9.20130814git.el6.x86_64
kernel-2.6.32-419.el6.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Prepare the domain:
# virsh edit test
<domain>
  ...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
  ...
</domain>

# virsh start test
Domain test started

2. Find that the domain is allocated to nodes 2,4-5,7:
# grep Nodeset /tmp/libvirtd.log 
2013-09-27 03:23:45.443+0000: 52548: debug : qemuProcessStart:3781 : Nodeset returned from numad: 2,4-5,7
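
Note: the "Nodeset returned from numad" debug line only appears when libvirtd debug logging is enabled. A minimal sketch of one way to turn it on; log_level and log_outputs are real libvirtd.conf settings, and the log path here simply matches the file grepped above:

# echo 'log_level = 1' >> /etc/libvirt/libvirtd.conf
# echo 'log_outputs = "1:file:/tmp/libvirtd.log"' >> /etc/libvirt/libvirtd.conf
# service libvirtd restart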

3. Find the CPU numbers of each node:
# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16349 MB
node 0 free: 15690 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15934 MB
node 2 cpus: 1 5 9 13 17 21 25 29
node 2 size: 16384 MB
node 2 free: 15913 MB
node 3 cpus: 33 37 41 45 49 53 57 61
node 3 size: 16384 MB
node 3 free: 15943 MB
node 4 cpus: 2 6 10 14 18 22 26 30
node 4 size: 16384 MB
node 4 free: 15818 MB
node 5 cpus: 34 38 42 46 50 54 58 62
node 5 size: 16384 MB
node 5 free: 16007 MB
node 6 cpus: 35 39 43 47 51 55 59 63
node 6 size: 16384 MB
node 6 free: 15913 MB
node 7 cpus: 3 7 11 15 19 23 27 31
node 7 size: 16367 MB
node 7 free: 15780 MB

# virsh vcpuinfo test | grep -w "CPU:"
CPU:            47
CPU:            43
CPU:            43
CPU:            24
CPU:            43
CPU:            10
CPU:            26
CPU:            5
CPU:            1
CPU:            9
CPU:            14
CPU:            6
CPU:            52
CPU:            6
CPU:            56
CPU:            63
CPU:            31
CPU:            14
CPU:            30
CPU:            14
CPU:            20
CPU:            4
CPU:            13
CPU:            55

Each vCPU process should be running on a CPU that belongs to the nodeset numad returned.

Apparently, CPU 47 is not in nodes 2,4-5,7.

Expected result:
virsh vcpuinfo should show only CPUs from the nodes returned by numad.
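
For reference, a quick way to automate this check (a sketch only; 'test' is the domain name from the steps above). The first command lists the CPUs belonging to nodes 2,4-5,7, the second lists the distinct CPUs the vCPUs actually run on; the second list should be a subset of the first:

# numactl --hardware | awk '/^node (2|4|5|7) cpus:/ {for (i = 4; i <= NF; i++) print $i}' | sort -n
# virsh vcpuinfo test | awk '/^CPU:/ {print $2}' | sort -nu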

Comment 2 Ján Tomko 2014-03-31 12:46:10 UTC
Fixed upstream by:
commit a39f69d2bb5494d661be917956baa437d01a4d13
Author: Osier Yang <jyang>
Date:   Fri May 24 17:08:28 2013 +0800

    qemu: Set cpuset.cpus for domain process
    
    When either "cpuset" of <vcpu> is specified, or the "placement" of
    <vcpu> is "auto", only setting the cpuset.mems might cause the guest
    starting to fail. E.g. ("placement" of both <vcpu> and <numatune> is
    "auto"):
    
    1) Related XMLs
      <vcpu placement='auto'>4</vcpu>
      <numatune>
        <memory mode='strict' placement='auto'/>
      </numatune>
    
    2) Host NUMA topology
      % numactl --hardware
      available: 8 nodes (0-7)
      node 0 cpus: 0 4 8 12 16 20 24 28
      node 0 size: 16374 MB
      node 0 free: 11899 MB
      node 1 cpus: 32 36 40 44 48 52 56 60
      node 1 size: 16384 MB
      node 1 free: 15318 MB
      node 2 cpus: 2 6 10 14 18 22 26 30
      node 2 size: 16384 MB
      node 2 free: 15766 MB
      node 3 cpus: 34 38 42 46 50 54 58 62
      node 3 size: 16384 MB
      node 3 free: 15347 MB
      node 4 cpus: 3 7 11 15 19 23 27 31
      node 4 size: 16384 MB
      node 4 free: 15041 MB
      node 5 cpus: 35 39 43 47 51 55 59 63
      node 5 size: 16384 MB
      node 5 free: 15202 MB
      node 6 cpus: 1 5 9 13 17 21 25 29
      node 6 size: 16384 MB
      node 6 free: 15197 MB
      node 7 cpus: 33 37 41 45 49 53 57 61
      node 7 size: 16368 MB
      node 7 free: 15669 MB
    
    4) cpuset.cpus will be set as: (from debug log)
    
    2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
    Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.cpus'
    to '0-63'
    
    5) The advisory nodeset got from querying numad (from debug log)
    
    2013-05-09 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 :
    Nodeset returned from numad: 1
    
    6) cpuset.mems will be set as: (from debug log)
    
    2013-05-09 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 :
    Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/toy/cpuset.mems'
    to '0-7'
    
    I.e., the domain process's memory is restricted to the first NUMA node,
    however, it can use all of the CPUs, which will likely cause the domain
    process to fail to start, because the kernel fails to allocate memory
    under the "strict" memory policy.
    
    % tail -n 20 /var/log/libvirt/qemu/toy.log
    ...
    2013-05-09 05:53:32.972+0000: 7318: debug : virCommandHandshakeChild:377 :
    Handshake with parent is done
    char device redirected to /dev/pts/2 (label charserial0)
    kvm_init_vcpu failed: Cannot allocate memory
    ...
    
    Signed-off-by: Peter Krempa <pkrempa>

commit b8b38321e724b5b1b7858c415566ab5e6e96ec8c
Author: Peter Krempa <pkrempa>
Date:   Thu Jul 18 11:21:48 2013 +0200

    caps: Add helpers to convert NUMA nodes to corresponding CPUs
    
    These helpers use the remembered host capabilities to retrieve the cpu
    map rather than querying the host again. The intended usage for these
    helpers is to fix automatic NUMA placement with strict memory allocation.
    The code doing the prepare step needs to pin the emulator process only
    to CPUs belonging to a subset of the host's NUMA nodes.

v1.1.0-254-ga39f69d

https://bugzilla.redhat.com/show_bug.cgi?id=949408#c16
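
As an aside, the conversion these helpers perform, from a NUMA nodeset to the corresponding CPU list, can be approximated from the shell. This is a rough sketch, not the libvirt implementation; the nodeset "2 4 7" is just an illustrative example:

# for n in 2 4 7; do numactl --hardware | awk -v n=$n '$1 == "node" && $2 == n && $3 == "cpus:" {for (i = 4; i <= NF; i++) print $i}'; done | sort -n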

Comment 6 Xuesong Zhang 2014-05-06 07:51:19 UTC
Hi Jan,

I'm verifying this bug, but I can't start the guest on one NUMA host using the settings from the bug description.
This host is the one used in the description; it contains 8 NUMA cells.

However, I can start the guest on another NUMA host, which contains only 2 NUMA cells.

Is this a new issue?


Test with the following packages:
libvirt-0.10.2-34.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.425.el6.x86_64
kernel-2.6.32-461.el6.x86_64

Steps:
1. Prepare one guest like the following:
# virsh dumpxml r6
......
  <vcpu placement='auto'>20</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
......


2. Starting the guest fails with an error:
# virsh start r6
error: Failed to start domain r6
error: Unable to set cpuset.cpus for domain r6: Device or resource busy

Comment 7 Xuesong Zhang 2014-05-06 07:52:18 UTC
Created attachment 892799 [details]
libvirtd debug log

This is the libvirtd debug log from when the guest could not be started.

Comment 8 Jincheng Miao 2014-05-07 02:31:07 UTC
According to my log for this problem:
2014-05-07 02:13:23.252+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.cpus' to '0-63'
2014-05-07 02:13:23.252+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.mems' to '0-7'
2014-05-07 02:13:23.252+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.mems' to '0-7'
2014-05-07 02:13:29.898+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/memory/libvirt/qemu/r6/memory.use_hierarchy' to '1'
2014-05-07 02:13:29.898+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/memory/libvirt/qemu/r6/memory.use_hierarchy' to '1'
2014-05-07 02:35:52.615+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.deny' to 'a'
2014-05-07 02:35:52.615+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 136:* rw'
2014-05-07 02:35:52.616+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:3 rw'
2014-05-07 02:35:52.616+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:7 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:5 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:8 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:8 rw'
2014-05-07 02:35:52.617+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 1:9 rw'
2014-05-07 02:35:52.618+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 5:2 rw'
2014-05-07 02:35:52.618+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 10:232 rw'
2014-05-07 02:35:52.618+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 254:0 rw'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 10:228 rw'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/r6/devices.allow' to 'c 10:228 rw'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/r6/cpuset.mems' to '2,4,7'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/cpuset.cpus' to '1-3,5-7,9-11,13-15,17-19,21-23,25-27,29-31'
2014-05-07 02:35:52.619+0000: 27283: debug : virCgroupSetValueStr:335 : Failed to write value '1-3,5-7,9-11,13-15,17-19,21-23,25-27,29-31': Device or resource busy

The last virCgroupSetValueStr() call, writing to /cgroup/cpuset/libvirt/qemu/cpuset.cpus, is forbidden because of a cgroup problem in RHEL 6.

Here is a libvirt problem: why does it need to set /cgroup/cpuset/libvirt/qemu/cpuset.cpus after numad has returned?

The affected code is in libvirt-rhel/src/qemu/qemu_cgroup.c:

int qemuSetupCgroup(struct qemud_driver *driver,
...
458        rc = virCgroupSetCpusetCpus(driver->cgroup, cpu_mask);
459        VIR_FREE(cpu_mask);
460        if (rc != 0) {
461            virReportSystemError(-rc,
462                                 _("Unable to set cpuset.cpus for domain %s"),
463                                 vm->def->name);
464            goto cleanup;
465        }

I thought we could just set /cgroup/cpuset/libvirt/qemu/r6/cpuset.cpus instead.
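
For what it's worth, the EBUSY can be reproduced outside libvirt: cgroup-v1 cpuset refuses to remove CPUs from a parent group while a child group still includes them. A sketch, assuming the cpuset controller is mounted at /cgroup/cpuset as in the log above; the 'demo' group names are illustrative:

# mkdir -p /cgroup/cpuset/demo/child
# echo 0-3 > /cgroup/cpuset/demo/cpuset.cpus
# echo 0 > /cgroup/cpuset/demo/cpuset.mems
# echo 0-3 > /cgroup/cpuset/demo/child/cpuset.cpus
# echo 0 > /cgroup/cpuset/demo/child/cpuset.mems
# echo 0-1 > /cgroup/cpuset/demo/cpuset.cpus

The last write fails with "Device or resource busy" because the child still uses CPUs 2-3.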

Comment 11 Jincheng Miao 2014-05-20 03:57:05 UTC
The latest libvirt-0.10.2-36.el6 fixed this bug:
# virsh edit rhel65
<domain>
  ...
  <vcpu placement='auto'>24</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
  ...
</domain>

# virsh start rhel65

numad returns nodes '2,4,7':
2014-05-20 03:59:50.930+0000: 40636: debug : qemuProcessStart:3858 : Nodeset returned from numad: 2,4,7

and libvirtd pins the vCPUs only to CPUs from the nodes numad returned.
# virsh vcpuinfo rhel65 | grep -w "CPU:"
CPU:            6
CPU:            9
CPU:            21
CPU:            1
CPU:            7
CPU:            11
CPU:            1
CPU:            19
CPU:            18
CPU:            27
CPU:            9
CPU:            26
CPU:            7
CPU:            7
CPU:            2
CPU:            1
CPU:            19
CPU:            17
CPU:            5
CPU:            10
CPU:            2
CPU:            9
CPU:            27
CPU:            11

And the problem from comment 6 is also fixed; some info from the log file:
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.cpus' to '0-63'
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.mems' to '0-7'
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/memory/libvirt/qemu/rhel65/memory.use_hierarchy' to '1'
2014-05-20 03:59:50.933+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.deny' to 'a'
2014-05-20 03:59:50.934+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 136:* rw'
2014-05-20 03:59:50.934+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:3 rw'
2014-05-20 03:59:50.934+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:7 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:5 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:8 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 1:9 rw'
2014-05-20 03:59:50.935+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 5:2 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 10:232 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 254:0 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/rhel65/devices.allow' to 'c 10:228 rw'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.mems' to '2,4,7'
2014-05-20 03:59:50.936+0000: 40636: debug : virCgroupSetValueStr:331 : Set value '/cgroup/cpuset/libvirt/qemu/rhel65/cpuset.cpus' to '1-3,5-7,9-11,13-15,17-19,21-23,25-27,29-31'

The cpuset.cpus value is now set on the target guest's own cgroup, and it matches exactly the CPUs of nodes 2,4,7.
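
A quick way to double-check (a sketch; the distinct CPUs from vcpuinfo should all fall inside the per-domain mask):

# cat /cgroup/cpuset/libvirt/qemu/rhel65/cpuset.cpus
# virsh vcpuinfo rhel65 | awk '/^CPU:/ {print $2}' | sort -nu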

So I am changing the status to VERIFIED.

Comment 13 errata-xmlrpc 2014-10-14 04:17:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1374.html

