Red Hat Bugzilla – Bug 1161540
kvm_init_vcpu failed for cpu hot-plugging in NUMA
Last modified: 2017-08-17 02:14:08 EDT
Description of problem:
Hot-plugging a vCPU in a NUMA guest causes the guest to exit. The guest log says:
"kvm_init_vcpu failed: Cannot allocate memory"

Version:
libvirt-1.2.8-6.el7.x86_64
qemu-kvm-1.5.3-77.el7.x86_64
kernel-3.10.0-195.el7.x86_64

How reproducible: 100%

Steps to reproduce:
0. Prepare a NUMA host whose DMA32 zone is in node 0.
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65514 MB
node 0 free: 63531 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
node 1 free: 62750 MB
node distances:
node   0   1
  0:  10  11
  1:  11  10
# grep "zone DMA" /proc/zoneinfo
Node 0, zone    DMA32

1. Start a guest with 2 currently used vCPUs.
# virsh dumpxml r71
...
<vcpu placement='auto' current='2'>4</vcpu>
<numatune>
  <memory mode='strict' placement='auto'/>
</numatune>
...
# virsh start r71
numad suggests binding memory to node 1:
# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/cpuset.mems
1

2. Hot-plug a vCPU.
# virsh setvcpus r71 3
error: Unable to read from monitor: Connection reset by peer

Expected result: hot-plug works.

Workaround: before hot-plugging a vCPU, widen the domain's emulator memory binding:
# echo 0-1 > /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/cpuset.mems
# echo 0-1 > /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr71.scope/emulator/cpuset.mems
# virsh setvcpus r71 3
# virsh vcpucount r71
maximum      config         4
maximum      live           4
current      config         2
current      live           3
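The trigger for this bug is that the DMA32 zone lives on only one host node, and the strict memory binding chosen by numad excludes that node. The check in step 0 can be sketched as a small standalone script; the `zoneinfo_sample` here-string is a hypothetical stand-in for the real `/proc/zoneinfo` so the snippet runs anywhere:

```shell
#!/bin/sh
# Hypothetical sample of /proc/zoneinfo zone headers; on a real host you
# would read the file itself (e.g. grep 'zone *DMA32' /proc/zoneinfo).
zoneinfo_sample="Node 0, zone      DMA
Node 0, zone    DMA32
Node 0, zone   Normal
Node 1, zone   Normal"

# Extract the node number that hosts the DMA32 zone: the node id is the
# second field ("0,"), with a trailing comma to strip.
dma32_node=$(printf '%s\n' "$zoneinfo_sample" \
    | awk '/zone[[:space:]]+DMA32/ {print $2}' | tr -d ',')

echo "DMA32 zone is on node: $dma32_node"
```

If the domain's `cpuset.mems` does not include that node, a vCPU hot-plug will hit the `kvm_init_vcpu` allocation failure described above.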
Upstream patch proposed: https://www.redhat.com/archives/libvir-list/2014-December/msg00718.html
There are still some problems with this, and they might be bigger than we thought. The latest ideas are discussed upstream: https://www.redhat.com/archives/libvir-list/2014-December/msg00998.html
This bug is fixed in libvirt-1.2.8-15.el7:

1. Prepare a NUMA host with the older libvirt.
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 65514 MB
node 0 free: 62974 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 65536 MB
node 1 free: 62821 MB
node distances:
node   0   1
  0:  10  11
  1:  11  10
# grep DMA32 /proc/zoneinfo
Node 0, zone    DMA32
# rpm -q libvirt
libvirt-1.2.8-13.el7

2. Configure the guest XML to mbind to host node 1 (the node without a DMA32 zone).
# virsh edit rhel7
...
<vcpu placement='static' current='2'>5</vcpu>
<numatune>
  <memory mode='strict' nodeset='1'/>
</numatune>
...

3. Start the guest.
# virsh start rhel7

4. Hot-plug a vCPU.
# virsh setvcpus rhel7 3
error: Unable to read from monitor: Connection reset by peer

5. With a guest running, upgrade to libvirt-1.2.8-15.el7.
# virsh start rhel7
# yum install libvirt
# rpm -q libvirt
libvirt-1.2.8-15.el7.x86_64

6. Hot-plug a vCPU again.
# virsh setvcpus rhel7 3
# virsh vcpucount rhel7
maximum      config         5
maximum      live           5
current      config         2
current      live           3
# virsh destroy rhel7
Domain rhel7 destroyed

7. Restart the guest and hot-plug a vCPU on libvirt-1.2.8-15.el7.
# virsh start rhel7
Domain rhel7 started
# virsh setvcpus rhel7 3
# virsh vcpucount rhel7
maximum      config         5
maximum      live           5
current      config         2
current      live           3
# virsh destroy rhel7
Domain rhel7 destroyed

According to the 7 steps above, this bug is fixed, and I will change the status to VERIFIED.
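The verification above checks that `current live` rose from 2 to 3 after `virsh setvcpus`. A scripted version of that check can be sketched as follows; the `vcpucount_output` here-string is hypothetical sample output so the snippet is self-contained (in a real run you would capture `virsh vcpucount rhel7` instead):

```shell
#!/bin/sh
# Hypothetical sample of `virsh vcpucount <domain>` output after hotplug.
vcpucount_output="maximum      config         5
maximum      live           5
current      config         2
current      live           3"

# Parse the "current live" count (third whitespace-separated field).
current_live=$(printf '%s\n' "$vcpucount_output" \
    | awk '/current[[:space:]]+live/ {print $3}')

echo "current live vcpus: $current_live"
```

A verification harness would then compare `$current_live` against the value passed to `virsh setvcpus` and fail the test if they differ.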
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0323.html