Bug 1534418

Summary: Guest can start with huge pages set to a non-existent guest NUMA node
Product: Red Hat Enterprise Linux 7
Reporter: yalzhang <yalzhang>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: chhu
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.6
CC: lhuang, mprivozn, xuzhang
Target Milestone: rc
Keywords: Upstream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-4.4.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-30 09:52:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description yalzhang@redhat.com 2018-01-15 08:21:57 UTC
Description of problem:
The guest can start with huge pages set to a non-existent guest NUMA node.

Version-Release number of selected component (if applicable):
libvirt-3.9.0-7.el7.x86_64
qemu-kvm-rhev-2.10.0-16.el7.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Set the guest XML to use huge pages with nodeset='3', with no 'numa' settings in the guest cpu element.

# virsh edit rhel
...
 <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>1024000</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='3'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static' current='6'>10</vcpu>
...
 <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>
...

2. Start the guest.

# virsh start rhel
Domain rhel started

# virsh dumpxml rhel 
...
 <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='3'/>
    </hugepages>
  </memoryBacking>
<vcpu placement='static' current='6'>10</vcpu>
....

3. Check the qemu command line; the guest does use hugepages:
... -m 1000 -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu/3-rhel -realtime mlock=off ...

Actual results:
In step 2, the guest starts with hugepages set to a non-existent guest NUMA node.

Expected results:
libvirt should fail XML validation during 'virsh edit', ignore the invalid 'nodeset' in the live XML, or refuse to start the guest.

Additional info:
If there are NUMA settings in the cpu element and hugepages are set to a non-existent guest node, the guest fails to start:
# virsh edit rhel
...
 <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='3'/>
    </hugepages>
  </memoryBacking>
...
 <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
    <numa>
      <cell id='0' cpus='0-4' memory='512000' unit='KiB'/>
      <cell id='1' cpus='5-9' memory='512000' unit='KiB'/>
    </numa>
  </cpu>
...
# virsh start rhel
error: Failed to start domain rhel
error: hugepages: node 3 not found
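
The check the reporter expects can be illustrated outside libvirt. The following is a minimal sketch, not libvirt code: it reads a domain XML and lists the guest node IDs named in <page nodeset=...> that have no matching <cell>. The function names and the trimmed sample XML are hypothetical, and the nodeset syntax handled here (comma-separated IDs, 'a-b' ranges, '^n' exclusions) is an assumption.

```python
import xml.etree.ElementTree as ET


def parse_nodeset(nodeset):
    """Expand a nodeset string such as '0-2,^1' into a set of node IDs.

    Assumed syntax: comma-separated IDs, 'a-b' ranges, '^n' exclusions.
    """
    included, excluded = set(), set()
    for part in nodeset.split(","):
        part = part.strip()
        if not part:
            continue
        target = excluded if part.startswith("^") else included
        part = part.lstrip("^")
        if "-" in part:
            low, high = part.split("-", 1)
            target.update(range(int(low), int(high) + 1))
        else:
            target.add(int(part))
    return included - excluded


def missing_hugepage_nodes(domain_xml):
    """Return sorted node IDs used by hugepage nodesets that have no <cell>."""
    root = ET.fromstring(domain_xml)
    # Assumption: a <cell> without an explicit id is numbered by its position.
    cells = {
        int(cell.get("id", index))
        for index, cell in enumerate(root.findall("./cpu/numa/cell"))
    }
    wanted = set()
    for page in root.findall("./memoryBacking/hugepages/page"):
        nodeset = page.get("nodeset")
        if nodeset:
            wanted |= parse_nodeset(nodeset)
    return sorted(wanted - cells)


# Trimmed-down version of the configuration from step 1 (no <numa> element).
NO_NUMA = """
<domain type='kvm'>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='3'/>
    </hugepages>
  </memoryBacking>
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>
</domain>
"""

print(missing_hugepage_nodes(NO_NUMA))  # [3]
```

An empty list means every node named in a hugepage nodeset has a matching guest NUMA cell.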

Comment 2 Michal Privoznik 2018-05-18 11:05:15 UTC
Patch posted upstream:

https://www.redhat.com/archives/libvir-list/2018-May/msg01408.html

Comment 3 Michal Privoznik 2018-05-23 07:05:12 UTC
To POST:

commit fa6bdf6afa878b8d7c5ed71664ee72be8967cdc5
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri May 18 12:54:46 2018 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Wed May 23 09:00:20 2018 +0200

    qemu: Deny hugepages for non-existent NUMA nodes
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1534418
    
    Just like ec982f6d929f3c23 denies hugepages for non-existent
    guest NUMA nodes in case there are some nodes configured.
    Unfortunately, when there are none, qemuBuildNumaArgStr() is not
    called and thus we have to have check in qemuBuildMemPathStr()
    too.
    
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

v4.3.0-256-gfa6bdf6afa
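
The control flow described in the commit message can be modelled roughly as follows. This is a toy sketch, not the libvirt implementation: the function names and the check_without_numa flag are hypothetical, and it assumes guest node IDs run from 0 to count-1 without gaps. The flag stands in for the check the commit adds for guests with no NUMA nodes.

```python
def first_missing_node(hugepage_nodes, numa_node_count):
    """Return the lowest requested node ID that does not exist, or None."""
    for node in sorted(hugepage_nodes):
        if node >= numa_node_count:
            return node
    return None


def check_hugepages(hugepage_nodes, numa_node_count, check_without_numa=True):
    """Toy model of the two start-up paths described in the commit message.

    With NUMA cells configured, the nodeset is always validated (the
    behaviour the commit message attributes to ec982f6d929f3c23). With
    none configured, validation only happens when check_without_numa is
    True, which stands in for the check added by this commit.
    """
    if numa_node_count > 0 or check_without_numa:
        missing = first_missing_node(hugepage_nodes, numa_node_count)
        if missing is not None:
            raise ValueError(f"hugepages: node {missing} not found")
    return "started"


# Before the fix (modelled): no NUMA cells, so nothing validates nodeset='3'.
print(check_hugepages({3}, 0, check_without_numa=False))

# After the fix (modelled): the same configuration is rejected.
try:
    check_hugepages({3}, 0)
except ValueError as err:
    print(err)
```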

Comment 5 chhu 2018-06-20 07:10:17 UTC
Verified on packages:
libvirt-4.4.0-2.el7.x86_64
qemu-kvm-rhev-2.12.0-3.el7.x86_64
kernel: 3.10.0-902.el7.x86_64

Steps:
1. Define and try to start a guest with huge pages set to nodeset='1' and no 'numa' settings in the guest cpu element; the start fails with an error.

  <memoryBacking>
    <hugepages>
      <page size='2147483648' unit='KiB' nodeset='1'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static' current='6'>10</vcpu>
...
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>
...

# virsh start r7
error: Failed to start domain r7
error: hugepages: node 1 not found

2. Define a guest with a huge page nodeset and matching NUMA settings; it starts successfully.
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='1'/>
    </hugepages>
  </memoryBacking>
...
  <cpu mode='host-model'>
    <numa>
      <cell id='0' cpus='0-3' memory='512000' unit='KiB' discard='yes'/>
      <cell id='1' cpus='4-7' memory='512000' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
...

# virsh start r7
Domain r7 started

# ps -ef|grep hugepage
...
 -object memory-backend-ram,id=ram-node0,size=524288000 -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/39-r7,share=yes,size=524288000 -numa node,nodeid=1,cpus=4-7,memdev=ram-node1
...

3. Destroy the guest, change the hugepage nodeset to '2', and start the guest again; the start fails with an error.
# virsh start r7
error: Failed to start domain r7
error: hugepages: node 2 not found
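
For anyone cross-checking the qemu command lines against the XML: the sizes in the XML are given in KiB, while the size= values on the command line match the same amounts in bytes, and -m in the original description matches MiB. A small sketch of the arithmetic (the helper names are hypothetical):

```python
KIB = 1024  # bytes per KiB


def kib_to_bytes(kib):
    """Convert a size in KiB to bytes."""
    return kib * KIB


def kib_to_mib(kib):
    """Convert a size in KiB to whole MiB (integer division)."""
    return kib // 1024


# <cell ... memory='512000' unit='KiB'/> versus size=524288000
print(kib_to_bytes(512000))   # 524288000

# <memory unit='KiB'>1024000</memory> versus -m 1000
print(kib_to_mib(1024000))    # 1000
```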

Comment 6 chhu 2018-06-20 07:25:33 UTC
4. Add a dimm device with node set to '2' and try to start the guest; the start fails, but the error message needs improvement.

# virsh dumpxml r7|grep dimm -A 8
    <memory model='dimm'>
      <source>
        <nodemask>1</nodemask>
        <pagesize unit='KiB'>2048</pagesize>
      </source>
      <target>
        <size unit='KiB'>256000</size>
        <node>2</node>
      </target>
      <address type='dimm' slot='0'/>
    </memory>
  </devices>
</domain>

# virsh start r7
error: Failed to start domain r7
error: unsupported configuration: can't add memory backend for guest node '2' as the guest has only '2' NUMA nodes configured

5. Delete the NUMA settings and try to start the guest; the start fails with an error.
# virsh start r7
error: Failed to start domain r7
error: unsupported configuration: At least one numa node has to be configured when enabling memory hotplug

6. Add the NUMA settings back and edit the dimm device with <node>2</node>; the guest starts successfully.

# virsh start r7
Domain r7 started

# ps -ef|grep qemu| grep dimm
...
-object memory-backend-ram,id=ram-node0,size=524288000 -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/53-r7,share=yes,size=524288000 -numa node,nodeid=1,cpus=4-7,memdev=ram-node1 -object memory-backend-file,id=memdimm0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/53-r7,share=yes,size=262144000,host-nodes=1,policy=bind -device pc-dimm,node=1,memdev=memdimm0,id=dimm0,slot=0
...

Comment 7 chhu 2018-06-20 07:27:48 UTC
(In reply to chhu from comment #6)
> 4. Add dimm device with node set to '2', try to start the guest, get error,
> however, the error message need improvement.
> 
> # virsh dumpxml r7|grep dimm -A 8
>     <memory model='dimm'>
>       <source>
>         <nodemask>1</nodemask>
>         <pagesize unit='KiB'>2048</pagesize>
>       </source>
>       <target>
>         <size unit='KiB'>256000</size>
>         <node>2</node>
>       </target>
>       <address type='dimm' slot='0'/>
>     </memory>
>   </devices>
> </domain>
> 
> # virsh start r7
> error: Failed to start domain r7
> error: unsupported configuration: can't add memory backend for guest node
> '2' as the guest has only '2' NUMA nodes configured
> 

Hi Michal,

Would you like to improve the above error message in this bug? Thank you!


Regards,
chhu

Comment 8 Michal Privoznik 2018-06-20 08:43:00 UTC
(In reply to chhu from comment #7)
> (In reply to chhu from comment #6)

> > # virsh start r7
> > error: Failed to start domain r7
> > error: unsupported configuration: can't add memory backend for guest node
> > '2' as the guest has only '2' NUMA nodes configured
> > 
> 
> Hi, Michal 
> 
> Would you like to improve above error message in this bug? Thank you!

I don't quite see what is wrong with it. Is it the confusion that numa nodes are numbered from 0 and thus if you have two numa nodes (#0 and #1) there is no node #2? This is perfectly okay. Anybody who is setting NUMA is expected to know this. If they don't then they should not set up NUMA in the first place.
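
The numbering point can be shown in a few lines. This is only an illustration with hypothetical helper names, assuming node IDs start at 0 and have no gaps: a guest with two NUMA nodes has nodes 0 and 1, so node 2 does not exist.

```python
def valid_node_ids(node_count):
    """Guest NUMA node IDs for node_count nodes (0-based, no gaps assumed)."""
    return list(range(node_count))


def node_exists(node, node_count):
    """True if node is one of the IDs 0 .. node_count - 1."""
    return node in valid_node_ids(node_count)


print(valid_node_ids(2))   # [0, 1]
print(node_exists(1, 2))   # True
print(node_exists(2, 2))   # False
```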

Comment 9 chhu 2018-06-20 09:38:33 UTC
> > > # virsh start r7
> > > error: Failed to start domain r7
> > > error: unsupported configuration: can't add memory backend for guest node
> > > '2' as the guest has only '2' NUMA nodes configured
> > > 
> > 
> > Hi, Michal 
> > 
> > Would you like to improve above error message in this bug? Thank you!
> 
> I don't quite see what is wrong with it. Is it the confusion that numa nodes
> are numbered from 0 and thus if you have two numa nodes (#0 and #1) there is
> no node #2? This is perfectly okay. Anybody who is setting NUMA is expected
> to know this. If they don't then they should not set up NUMA in the first
> place.

Ok, thank you!

Regards,
chhu

Comment 10 chhu 2018-06-25 01:24:06 UTC
According to comments 5, 6, and 8, setting the bug status to VERIFIED.

Comment 12 errata-xmlrpc 2018-10-30 09:52:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113