Bug 1182467

Summary: guest with NUMA topology cannot start without numatune memory nodeset
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.1
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Jincheng Miao <jmiao>
Assignee: Peter Krempa <pkrempa>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dyuan, honzhang, lhuang, mzhan, pkrempa, rbalakri
Target Milestone: rc
Type: Bug
Fixed In Version: libvirt-1.2.13-1.el7
Doc Type: Bug Fix
Last Closed: 2015-11-19 06:08:13 UTC

Description Jincheng Miao 2015-01-15 08:25:40 UTC
Description of problem:
When one guest NUMA node is strictly bound (here backed by hugetlbfs) while the
others are not, and no <memory> nodeset is configured under <numatune>,
qemu-kvm-rhev rejects the generated command line with
"memdev option must be specified for either all or no nodes".

Version-Release number of selected component (if applicable):
libvirt-1.2.8-12.el7.x86_64
qemu-kvm-rhev-2.1.2-19.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Define a guest with only a <memnode> element under <numatune> (no <memory> element):
# virsh edit a
...
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <numatune>
    <memnode cellid='0' mode='strict' nodeset='1'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1048576'/>
      <cell id='1' cpus='2-3' memory='1048576'/>
    </numa>
  </cpu>
...

2. Start the guest:
# virsh start a
error: Failed to start domain a
error: internal error: early end of file from monitor: possible problem:
2015-01-15T07:43:29.666260Z qemu-kvm: -numa node,nodeid=1,cpus=2-3,mem=1024: qemu: memdev option must be specified for either all or no nodes

Expected result:
The guest starts successfully.

Additional info:
A workaround is to add <memory mode='strict' nodeset='0'/> under <numatune>.
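
A minimal sketch of the resulting <numatune> block, assuming the nodeset
values from the reproducer above:

  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='1'/>
  </numatune>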

Comment 1 Peter Krempa 2015-01-31 08:03:39 UTC
Fixed upstream:

commit b92a0037103efc15639dee9562866dbaffe302fb
Author: Peter Krempa <pkrempa>
Date:   Mon Jan 26 13:48:02 2015 +0100

    qemu: command: Don't combine old and modern NUMA node creation
    
    Change done by commit f309db1f4d51009bad0d32e12efc75530b66836b wrongly
    assumes that qemu can start with a combination of NUMA nodes specified
    with the "memdev" option and the appropriate backends, and the legacy
    way by specifying only "mem" as a size argument. QEMU rejects such
    commandline though:
    
    $ /usr/bin/qemu-system-x86_64 -S -M pc -m 1024 -smp 2 \
    -numa node,nodeid=0,cpus=0,mem=256 \
    -object memory-backend-ram,id=ram-node1,size=12345 \
    -numa node,nodeid=1,cpus=1,memdev=ram-node1
    qemu-system-x86_64: -numa node,nodeid=1,cpus=1,memdev=ram-node1: qemu: memdev option must be specified for either all or no nodes
    
    To fix this issue we need to check if any of the nodes requires the new
    definition with the backend and if so, then all other nodes have to use
    it too.

v1.2.12-64-gb92a003
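
For contrast with the rejected command line quoted above, a sketch of the
accepted all-memdev form of the same invocation, where every node gets a
memory backend (the 256M/768M split is illustrative, not from the report):

$ /usr/bin/qemu-system-x86_64 -S -M pc -m 1024 -smp 2 \
-object memory-backend-ram,id=ram-node0,size=256M \
-numa node,nodeid=0,cpus=0,memdev=ram-node0 \
-object memory-backend-ram,id=ram-node1,size=768M \
-numa node,nodeid=1,cpus=1,memdev=ram-node1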

Comment 3 Luyao Huang 2015-05-21 03:02:25 UTC
I can reproduce this issue with libvirt-1.2.8-13.el7.x86_64:

1. Prepare a VM whose XML contains:

# virsh dumpxml test3

  <numatune>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='cdrom'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0' memory='512000'/>
      <cell id='1' cpus='1' memory='512000'/>
    </numa>
  </cpu>

2. Try to start the VM:

# virsh start test3
error: Failed to start domain test3
error: internal error: process exited while connecting to monitor: 2015-05-21T02:51:21.280059Z qemu-kvm: -numa node,nodeid=1,cpus=1,mem=500: qemu: memdev option must be specified for either all or no nodes

3. Check the generated qemu command line; node 0 gets a memdev-backed definition while node 1 still uses the legacy mem= form:

-object memory-backend-ram,size=500M,id=ram-node0,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 -numa node,nodeid=1,cpus=1,mem=500
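
(The full generated command line can also be read back from the domain log;
the path below assumes a default libvirt installation:)

# grep -- -numa /var/log/libvirt/qemu/test3.log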

Then verify the fix with libvirt-1.2.15-2.el7.x86_64:

1. Prepare a VM whose XML contains:

# virsh dumpxml test3

  <numatune>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='cdrom'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0' memory='512000'/>
      <cell id='1' cpus='1' memory='512000'/>
    </numa>
  </cpu>

2. Start the VM:
# virsh start test3
Domain test3 started

3. Check the qemu command line; libvirt now creates a memory-backend-ram object for every NUMA cell, satisfying qemu's all-or-nothing memdev requirement:

-object memory-backend-ram,id=ram-node0,size=524288000,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 -object memory-backend-ram,id=ram-node1,size=524288000 -numa node,nodeid=1,cpus=1,memdev=ram-node1

4. Remove the <memnode> element from the XML:

  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='cdrom'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0' memory='512000'/>
      <cell id='1' cpus='1' memory='512000'/>
    </numa>
  </cpu>

5. Start the VM:
# virsh start test3
Domain test3 started

6. Check the qemu command line; with no per-node binding, both cells use the legacy mem= form:

-numa node,nodeid=0,cpus=0,mem=500 -numa node,nodeid=1,cpus=1,mem=500

7. Retest with hugepages bound to one node; the guest can start (see the sketch below).
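
(A sketch of the per-node hugepage binding used for this step, reusing the
<memoryBacking> fragment from the original report:)

  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>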

Comment 6 errata-xmlrpc 2015-11-19 06:08:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html