Bug 1126777

Summary: guest which set numa in xml can't start success
Product: Red Hat Enterprise Linux 7
Reporter: Luyao Huang <lhuang>
Component: qemu-kvm-rhev
Assignee: Eduardo Habkost <ehabkost>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 7.1
CC: dyuan, hhuang, huding, jiahu, jmiao, juzhang, knoel, lhuang, mprivozn, mrezanin, mzhan, rbalakri, virt-maint, xfu
Target Milestone: rc
Hardware: x86_64
OS: All
Fixed In Version: QEMU 2.1.0
Doc Type: Bug Fix
Last Closed: 2015-03-05 09:48:51 UTC
Type: Bug

Description Luyao Huang 2014-08-05 09:13:29 UTC
Description of problem:
A guest with NUMA cells defined in its XML fails to boot when each cell is set to memory='1012000'.

Version-Release number of selected component (if applicable):
libvirt-1.2.7-1.el7.x86_64
kernel-3.10.0-140.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest with:
#  virsh dumpxml test6
  <memory unit='KiB'>1042432</memory>
  <currentMemory unit='KiB'>1041857</currentMemory>
  <vcpu placement='static'>4</vcpu>
...........

  <cpu>
    <numa>
      <cell  cpus='0-1' memory='1012000'/>
      <cell  cpus='2-3' memory='1012000'/>
    </numa>
  </cpu>
........
2. # virsh start test6
Domain test6 started


3. Use virt-manager to check the guest's status: the guest fails to boot normally.
Error in guest: PANIC:early exception 06 rip 10:ffffffffff81c4adb6 error 0 cr2 0

Actual results:
The guest OS fails to boot, and the only error message shown in the guest is:
PANIC:early exception 06 rip 10:ffffffffff81c4adb6 error 0 cr2 0


Expected results:
The guest OS boots successfully.

Additional info:
With memory=512000 or memory=1312000, the guest boots normally.

Comment 1 Michal Privoznik 2014-08-05 09:33:12 UTC
I don't think this is a libvirt bug. The qemu process was started successfully. The only problem I can see from libvirt's point of view is that the sum of the memory defined under <numa/> exceeds the overall memory defined under <memory/>. But if that's a problem for qemu, it should refuse to start.
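
The mismatch described above can be checked with a short script. This is only a sketch: the domain XML below is a minimal stand-in rebuilt from the values in the description (the elided parts of the real XML are not reproduced), the function name numa_memory_excess_kib is hypothetical, and it assumes the cell memory attribute is in KiB, the same unit as <memory/>.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for the reporter's domain XML; only the values quoted
# in the description are used, everything else is omitted.
DOMAIN_XML = """
<domain type='kvm'>
  <name>test6</name>
  <memory unit='KiB'>1042432</memory>
  <cpu>
    <numa>
      <cell cpus='0-1' memory='1012000'/>
      <cell cpus='2-3' memory='1012000'/>
    </numa>
  </cpu>
</domain>
"""


def numa_memory_excess_kib(domain_xml):
    """Return (sum of NUMA cell memory) - (<memory/> value), in KiB.

    Assumes both values are expressed in KiB. A non-zero result means
    the NUMA layout does not add up to the guest's total memory.
    """
    root = ET.fromstring(domain_xml)
    total_kib = int(root.findtext("memory"))
    cells_kib = sum(int(cell.get("memory"))
                    for cell in root.findall("./cpu/numa/cell"))
    return cells_kib - total_kib


if __name__ == "__main__":
    excess = numa_memory_excess_kib(DOMAIN_XML)
    print("NUMA cells exceed <memory/> by %d KiB" % excess)
```

For the values in this report the two cells add up to 2024000 KiB against 1042432 KiB of total memory, so the cells claim almost twice the RAM the guest actually has.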

Comment 3 Eduardo Habkost 2014-08-06 18:24:25 UTC
Starting with QEMU 2.1, it will refuse to start when an invalid NUMA config like the above is provided.

In either case, QEMU is simply doing exactly what you asked for. Some guests will just complain about the meaningless NUMA config you provided, some guests may crash.

Comment 5 FuXiangChun 2014-08-27 07:34:30 UTC
Reproduced the bug with qemu-kvm-1.5.3-69.el7.x86_64.

Booted a guest with an invalid NUMA configuration (the NUMA nodes' total memory does not equal the RAM size):
...
-m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=509 -numa node,nodeid=1,cpus=2-3,mem=509
...

Result:
The guest starts normally.

Verified this bug with qemu-kvm-rhev-2.1.0-2.el7.x86_64.

Tested two scenarios.
1. Boot a guest with an invalid NUMA configuration (the NUMA nodes' total memory does not equal the RAM size)

-m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=509 -numa node,nodeid=1,cpus=2-3,mem=509

Result:
The guest does not start, and qemu-kvm prints the following message:
qemu-kvm: total memory for NUMA nodes (1067450368) should equal RAM size (80000000)

2. Boot a guest with a valid NUMA configuration (the NUMA nodes' total memory equals the RAM size)

-m 1018 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=509 -numa node,nodeid=1,cpus=2-3,mem=509

Result:
The guest starts successfully.


In addition, QE tested this with virt-manager and got the same results as above.
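
The arithmetic behind the two scenarios can be reproduced with a few lines of Python. This is an illustrative sketch, not QEMU's actual code: the helper name numa_total_matches_ram is hypothetical, and it assumes that -m and the -numa mem= values are both in MiB. Note that the second number in the qemu-kvm message (80000000) appears to be printed in hexadecimal, which matches 2048 MiB expressed in bytes.

```python
MIB = 1024 * 1024


def numa_total_matches_ram(ram_mib, node_mems_mib):
    """Compare the summed NUMA node memory with the RAM size.

    Both arguments are taken to be in MiB (assumption). Returns a tuple
    (matches, numa_bytes, ram_bytes).
    """
    numa_bytes = sum(node_mems_mib) * MIB
    ram_bytes = ram_mib * MIB
    return numa_bytes == ram_bytes, numa_bytes, ram_bytes


if __name__ == "__main__":
    # Scenario 1: -m 2048 with two nodes of mem=509 (invalid).
    ok, numa_bytes, ram_bytes = numa_total_matches_ram(2048, [509, 509])
    print(ok, numa_bytes, format(ram_bytes, "x"))

    # Scenario 2: -m 1018 with two nodes of mem=509 (valid).
    ok, numa_bytes, ram_bytes = numa_total_matches_ram(1018, [509, 509])
    print(ok, numa_bytes, ram_bytes)
```

In scenario 1 the nodes sum to 1067450368 bytes, the same figure as in the qemu-kvm message, while the RAM size is 2147483648 bytes (0x80000000).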

Comment 8 errata-xmlrpc 2015-03-05 09:48:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html