Bug 1161540 - kvm_init_vcpu failed for cpu hot-plugging in NUMA
Summary: kvm_init_vcpu failed for cpu hot-plugging in NUMA
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Martin Kletzander
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-11-07 10:23 UTC by Jincheng Miao
Modified: 2020-09-10 09:20 UTC
CC List: 9 users

Fixed In Version: libvirt-1.2.8-15.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-03-05 07:47:12 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:0323 0 normal SHIPPED_LIVE Low: libvirt security, bug fix, and enhancement update 2015-03-05 12:10:54 UTC

Description Jincheng Miao 2014-11-07 10:23:27 UTC
Description of problem:
Hot-plugging a vCPU into a guest on a NUMA host causes the guest to exit.
The guest log says:
"kvm_init_vcpu failed: Cannot allocate memory"

version:
libvirt-1.2.8-6.el7.x86_64
qemu-kvm-1.5.3-77.el7.x86_64
kernel-3.10.0-195.el7.x86_64

How reproducible:
100%

Steps to reproduce:

0. Prepare a NUMA host whose DMA32 zone is only on node 0:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65514 MB
node 0 free: 63531 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
node 1 free: 62750 MB
node distances:
node   0   1 
  0:  10  11 
  1:  11  10 

# grep "zone    DMA" /proc/zoneinfo
Node 0, zone    DMA32

1. Start a guest that currently uses 2 vCPUs and has automatic NUMA placement:
# virsh dumpxml r71
...
  <vcpu placement='auto' current='2'>4</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
...

# virsh start r71

numad suggests binding memory to node 1:
# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/cpuset.mems 
1
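
The placement that numad chose can also be queried through libvirt itself; the output below is illustrative for this scenario (strict mode, node 1 picked by numad):

# virsh numatune r71
numa_mode      : strict
numa_nodeset   : 1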


2. Hot-plug a vCPU:
# virsh setvcpus r71 3
error: Unable to read from monitor: Connection reset by peer
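
At this point the guest has already exited; this can be confirmed with virsh (illustrative output; the exact reason string depends on how QEMU died):

# virsh domstate r71 --reason
shut off (crashed)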


Expected result:
The hot-plug works.


Workaround:
Before hot-plugging a vCPU, widen the domain's cpuset memory pinning so that node 0 (the only node with a DMA32 zone) is allowed as well:
# echo 0-1 > /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/cpuset.mems

# echo 0-1 > /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/emulator/cpuset.mems

# virsh setvcpus r71 3

# virsh vcpucount r71
maximum      config         4
maximum      live           4
current      config         2
current      live           3
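
If you want to go back to the strict node 1 pinning afterwards, the previous value can be written into the same two files (memory already allocated on node 0 is not migrated back by default):

# echo 1 > /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/cpuset.mems

# echo 1 > /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/emulator/cpuset.mems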

Comment 1 Martin Kletzander 2014-12-15 08:05:01 UTC
Upstream patch proposed:

https://www.redhat.com/archives/libvir-list/2014-December/msg00718.html

Comment 9 Martin Kletzander 2014-12-23 12:00:32 UTC
There are still some problems with this, and they might be bigger than we thought. The latest ideas are discussed upstream:

https://www.redhat.com/archives/libvir-list/2014-December/msg00998.html

Comment 15 Jincheng Miao 2015-01-26 08:51:14 UTC
This bug is fixed in libvirt-1.2.8-15.el7:

1. Prepare a NUMA host with the older libvirt:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65514 MB
node 0 free: 62974 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
node 1 free: 62821 MB
node distances:
node   0   1 
  0:  10  11 
  1:  11  10 

# grep DMA32 /proc/zoneinfo 
Node 0, zone    DMA32

# rpm -q libvirt
libvirt-1.2.8-13.el7

2. Configure the guest XML to bind its memory to host node 1 (the node without a DMA32 zone):
# virsh edit rhel7
...
  <vcpu placement='static' current='2'>5</vcpu>
  <numatune>
    <memory mode='strict' nodeset='1'/>
  </numatune>
...

3. start guest
# virsh start rhel7

4. Hotplug vcpu
# virsh setvcpus rhel7 3
error: Unable to read from monitor: Connection reset by peer

5. With the guest running, upgrade to libvirt-1.2.8-15.el7:
# virsh start rhel7

# yum install libvirt

# rpm -q libvirt
libvirt-1.2.8-15.el7.x86_64

6. hotplug vcpu again
# virsh setvcpus rhel7 3

# virsh vcpucount rhel7
maximum      config         5
maximum      live           5
current      config         2
current      live           3

# virsh destroy rhel7
Domain rhel7 destroyed

7. restart guest and hotplug vcpu on libvirt-1.2.8-15.el7
# virsh start rhel7
Domain rhel7 started

# virsh setvcpus rhel7 3

# virsh vcpucount rhel7
maximum      config         5
maximum      live           5
current      config         2
current      live           3

# virsh destroy rhel7
Domain rhel7 destroyed

According to the above 7 steps, this bug is fixed, so I will change the status to VERIFIED.

Comment 17 errata-xmlrpc 2015-03-05 07:47:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html

