Description of problem:

Start with the following NUMA VM:

<domain type='kvm'>
  <name>2node0</name>
  <uuid>4a95fbdf-9fcf-4668-97da-5b7cdb0cc6c8</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0-1' memory='2097152' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='2097152' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <memballoon model='none'/>
  </devices>
</domain>

This allocates both nodes with no shared pages, locked on host node0.

This VM works; however, if we want to split the guest nodes between host nodes, we need memnode attributes, so we make the following change:

@@ -9,7 +9,8 @@
   </memoryBacking>
   <vcpu placement='static'>4</vcpu>
   <numatune>
-    <memory mode='strict' nodeset='0'/>
+    <memnode cellid='0' mode='strict' nodeset='0'/>
+    <memnode cellid='1' mode='strict' nodeset='1'/>
   </numatune>
   <os>
     <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>

(nodeset can be the same to reproduce on a single node host)

This results in the following full XML:

<domain type='kvm'>
  <name>2node0-1</name>
  <uuid>abc65d75-e745-4ed0-8cf9-1b849c1f031f</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <numatune>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0-1' memory='2097152' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='2097152' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <memballoon model='none'/>
  </devices>
</domain>

(name and uuid also changed)

This VM definition will not start and gets the following error:

2015-06-30 14:39:57.013+0000: starting up libvirt version: 1.2.16, package: 1.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-06-04-04:03:16, x86-034.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-6.el7)
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name 2node0-1 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off,mem-merge=off -cpu host -m 4096 -realtime mlock=on -smp 4,sockets=4,cores=1,threads=1 -object memory-backend-ram,id=ram-node0,size=2147483648,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-ram,id=ram-node1,size=2147483648,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 -uuid abc65d75-e745-4ed0-8cf9-1b849c1f031f -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/2node0-1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -msg timestamp=on
Domain id=37 is tainted: host-cpu
qemu-kvm: util/qemu-option.c:387: qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed.
2015-06-30 14:40:03.938+0000: shutting down

gdb gives the following backtrace:

#0  0x00007ffff09905d7 in raise () from /lib64/libc.so.6
#1  0x00007ffff0991cc8 in abort () from /lib64/libc.so.6
#2  0x00007ffff0989546 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff09895f2 in __assert_fail () from /lib64/libc.so.6
#4  0x0000555555857e3f in qemu_opt_get_bool_helper (opts=0x5555561677e0, name=name@entry=0x55555588502e "mem-merge", defval=defval@entry=true, del=del@entry=false) at util/qemu-option.c:387
#5  0x000055555585818a in qemu_opt_get_bool (opts=<optimized out>, name=name@entry=0x55555588502e "mem-merge", defval=defval@entry=true) at util/qemu-option.c:397
#6  0x00005555556f2b04 in host_memory_backend_init (obj=0x555556179a90) at backends/hostmem.c:234
#7  0x00005555557a2d39 in object_init_with_type (obj=0x555556179a90, ti=0x55555613ac80) at qom/object.c:309
#8  0x00005555557a31ef in object_initialize_with_type (data=data@entry=0x555556179a90, size=<optimized out>, type=type@entry=0x55555613ac80) at qom/object.c:343
#9  0x00005555557a3341 in object_new_with_type (type=0x55555613ac80) at qom/object.c:429
#10 0x00005555557a33b5 in object_new (typename=typename@entry=0x555556179730 "memory-backend-ram") at qom/object.c:439
#11 0x00005555556e1db5 in object_add (type=0x555556179730 "memory-backend-ram", id=0x5555561796c0 "ram-node0", qdict=qdict@entry=0x5555561783b0, v=0x555556179850, errp=errp@entry=0x7fffffffdbc8) at qmp.c:643
#12 0x00005555556cf144 in object_create (opts=<optimized out>, opaque=<optimized out>) at vl.c:2632
#13 0x0000555555858edb in qemu_opts_foreach (list=<optimized out>, func=func@entry=0x5555556cefa0 <object_create>, opaque=opaque@entry=0x0, abort_on_failure=abort_on_failure@entry=0) at util/qemu-option.c:1059
#14 0x00005555555e0857 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4040

This implicates the mem-merge option, which is controlled via <nosharepages/>. Removing the option from the XML allows the VM to start, but should not be necessary.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-6.el7.x86_64
libvirt-1.2.16-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Define the above VMs
2.
3.

Actual results:
Cannot use <nosharepages/> in combination with //numatune/memnode

Expected results:
//numatune/memnode should be compatible with <nosharepages/>

Additional info:
This works with qemu-kvm-rhev-2.1.2-23.el7_1.4.x86_64, so I am adding the Regression keyword.
Here's the history afaict, 2.1.0 was broken wrt numa pinning and fixed by:

commit 288d3322022d6ad646407f3ca6f1a6a746565b9a
Author: Michael S. Tsirkin <mst>
Date:   Wed Aug 13 13:50:24 2014 +0200

    hostmem: set MPOL_MF_MOVE

    When memory is allocated on a wrong node, MPOL_MF_STRICT
    doesn't move it - it just fails the allocation.
    A simple way to reproduce the failure is with mlock=on
    realtime feature.

    The code comment actually says: "ensure policy won't be ignored"
    so setting MPOL_MF_MOVE seems like a better way to do this.

    Cc: qemu-stable
    Signed-off-by: Michael S. Tsirkin <mst>

This was part of v2.1.1 and therefore included in our v2.1.2

The VM described in comment 0 continued to work until:

commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6
Author: Marcel Apfelbaum <marcel.a>
Date:   Tue Dec 16 16:58:05 2014 +0000

    machine: remove qemu_machine_opts global list

    QEMU has support for options per machine, keeping
    a global list of options is no longer necessary.

    Signed-off-by: Marcel Apfelbaum <marcel.a>
    Reviewed-by: Alexander Graf <agraf>
    Reviewed-by: Greg Bellows <greg.bellows>
    Message-id: 1418217570-15517-2-git-send-email-marcel.a
    Signed-off-by: Peter Maydell <peter.maydell>

At which point libvirt would error with:

error: Failed to start domain 2node0-1
error: unsupported configuration: disable shared memory is not available with this QEMU binary

An attempt was made to fix this in:

commit 0a7cf217d81161e36af2344e911d56d4f9fef9c5
Author: Marcel Apfelbaum <marcel>
Date:   Wed Apr 1 19:47:21 2015 +0300

    util/qemu-config: fix regression of qmp_query_command_line_options

    Commit 49d2e64 (machine: remove qemu_machine_opts global list)
    made machine options specific to machine sub-type, leaving
    the qemu_machine_opts desc array empty. Sadly this is the place
    qmp_query_command_line_options is looking for supported options.

    As a fix for for 2.3 the machine_qemu_opts (the generic ones)
    are restored only for qemu-config scope.
    We need to find a better fix for 2.4.

    Reported-by: Tony Krowiak <akrowiak.ibm.com>
    Signed-off-by: Marcel Apfelbaum <marcel>
    Message-Id: <1427906841-1576-1-git-send-email-marcel>
    Signed-off-by: Paolo Bonzini <pbonzini>

But this just gets us to the current situation where QEMU v2.3 is broken, as well as current upstream, with the following error:

error: Failed to start domain 2node0-1
error: internal error: process exited while connecting to monitor: qemu-system-x86_64: util/qemu-option.c:387: qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed.
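
To make the failure mode above easier to follow, here is a toy model in Python. It is not QEMU code: the class and method names (`OptDesc`, `Opt`, `Opts`, `set_opt`, `get_bool`) are hypothetical, and the behaviour is only inferred from the assertion text and the backtrace in comment 0. The assumption is that an option stored in a list with no matching descriptor carries no type information, so a typed boolean lookup on it trips the check, while an option that was never set simply yields the default.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OptDesc:
    """Hypothetical descriptor: declares an option name and its type."""
    name: str
    type: str  # "bool" or "string" in this toy model


@dataclass
class Opt:
    """A parsed option; desc is None when no descriptor matched its name."""
    name: str
    value: str
    desc: Optional[OptDesc]


@dataclass
class Opts:
    """Toy option list: known descriptors plus the options actually set."""
    descs: List[OptDesc] = field(default_factory=list)
    opts: Dict[str, Opt] = field(default_factory=dict)

    def set_opt(self, name: str, value: str) -> None:
        # Attach the matching descriptor if one exists, otherwise None.
        desc = next((d for d in self.descs if d.name == name), None)
        self.opts[name] = Opt(name, value, desc)

    def get_bool(self, name: str, defval: bool) -> bool:
        opt = self.opts.get(name)
        if opt is None:
            # Option never given on the command line: fall back to default.
            return defval
        # Mirrors the shape of the failing check:
        #   opt->desc && opt->desc->type == QEMU_OPT_BOOL
        if opt.desc is None or opt.desc.type != "bool":
            raise AssertionError(f"{name}: no bool descriptor for this option")
        return opt.value == "on"


if __name__ == "__main__":
    # Descriptor list left empty, as the commit message describes.
    machine_opts = Opts()
    print(machine_opts.get_bool("mem-merge", True))  # unset, default is used
    machine_opts.set_opt("mem-merge", "off")
    try:
        machine_opts.get_bool("mem-merge", True)
    except AssertionError as exc:
        print("assertion:", exc)
```

In this model the default path never reaches the check, which would be consistent with the VM starting once <nosharepages/> is removed from the XML.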
Hi Alex,

QE found that mem-merge=off triggers this bug. With mem-merge=on (the default value), the qemu-kvm process works. Previously, QE tested "-object memory-backend-ram" only with mem-merge's default value. Does QE need to cover both scenarios (mem-merge=on|off)? If so, I will add this scenario to the test plan and test cases. Thanks.

From the qemu-kvm manual:

mem-merge=on|off
    Enables or disables memory merge support. This feature, when supported by the host, de-duplicates identical memory pages among VMs instances (enabled by default)
AFAICT it's a supported option and something a user might reasonably turn on, especially if they're using hugepages or device assignment.
(In reply to Alex Williamson from comment #6)
> AFAICT it's a supported option and something a user might reasonably turn
> on, especially if they're using hugepages or device assignment.

Got it. According to "man qemu-kvm" the default should be on, but I would like to double-check with you: is that right? (QE cannot find this option via 'info qtree' on HMP.)
(In reply to FuXiangChun from comment #7)
> (In reply to Alex Williamson from comment #6)
> > AFAICT it's a supported option and something a user might reasonably turn
> > on, especially if they're using hugepages or device assignment.
>
> Got it, the default should be on according to "man qemu-kvm" but I would
> like to double confirm with you, it's right? (QE can not find this option
> via 'info qtree' on HMP)

Yes, the <nosharepages/> libvirt XML option translates to mem-merge=off on the QEMU commandline. The default is 'on'.
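
For illustration, the XML-to-command-line mapping described in this thread can be sketched in Python. This is not libvirt's implementation: the helper names (`mem_merge_setting`, `memory_backend_args`) and the `POLICY_BY_MODE` table are hypothetical, only the 'strict' to 'bind' pairing seen in comment 0 is covered, and cell memory is assumed to be given in KiB as in the XML above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the failing domain XML from comment 0 (relevant parts only).
DOMAIN_XML = """
<domain type='kvm'>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <numatune>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
  </numatune>
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0-1' memory='2097152' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='2097152' unit='KiB'/>
    </numa>
  </cpu>
</domain>
"""

# Assumption: only the pairing visible in this report is modelled.
POLICY_BY_MODE = {"strict": "bind"}


def mem_merge_setting(domain):
    """Return 'off' when <nosharepages/> is present, else the default 'on'."""
    nosharepages = domain.find("./memoryBacking/nosharepages")
    return "off" if nosharepages is not None else "on"


def memory_backend_args(domain):
    """Build one -object/-numa pair per guest NUMA cell with a <memnode>."""
    memnodes = {m.get("cellid"): m for m in domain.findall("./numatune/memnode")}
    args = []
    for cell in domain.findall("./cpu/numa/cell"):
        cell_id = cell.get("id")
        memnode = memnodes.get(cell_id)
        if memnode is None:
            continue  # cells without a memnode are out of scope for this sketch
        size_bytes = int(cell.get("memory")) * 1024  # assumes unit='KiB'
        policy = POLICY_BY_MODE[memnode.get("mode")]
        args += [
            "-object",
            f"memory-backend-ram,id=ram-node{cell_id},size={size_bytes},"
            f"host-nodes={memnode.get('nodeset')},policy={policy}",
            "-numa",
            f"node,nodeid={cell_id},cpus={cell.get('cpus')},"
            f"memdev=ram-node{cell_id}",
        ]
    return args


if __name__ == "__main__":
    domain = ET.fromstring(DOMAIN_XML)
    print("mem-merge=" + mem_merge_setting(domain))
    print(" ".join(memory_backend_args(domain)))
```

The combination this sketch produces, mem-merge=off on -machine together with memory-backend-ram objects, is the one that hits the assertion in qemu-kvm-rhev-2.3.0-6.el7.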
Fix included in qemu-kvm-rhev-2.3.0-13.el7
Summary:
I re-tested this bug with qemu-kvm-rhev-2.3.0-13.el7.x86_64, and the results show that it has been fixed.

Details (both mem-merge=off and mem-merge=on are OK):

# /usr/libexec/qemu-kvm -name 2node0-1 -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off,mem-merge=off -cpu host -m 4096 -realtime mlock=on -smp 4,sockets=4,cores=1,threads=1 -object memory-backend-ram,id=ram-node0,size=2147483648,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-ram,id=ram-node1,size=2147483648,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 -uuid abc65d75-e745-4ed0-8cf9-1b849c1f031f -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/2node0-1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -msg timestamp=on -monitor stdio
QEMU 2.3.0 monitor - type 'help' for more information
(qemu) info status
VM status: running

# /usr/libexec/qemu-kvm -name 2node0-1 -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off,mem-merge=on -cpu host -m 4096 -realtime mlock=on -smp 4,sockets=4,cores=1,threads=1 -object memory-backend-ram,id=ram-node0,size=2147483648,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-ram,id=ram-node1,size=2147483648,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 -uuid abc65d75-e745-4ed0-8cf9-1b849c1f031f -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/2node0-1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -msg timestamp=on -monitor stdio
QEMU 2.3.0 monitor - type 'help' for more information
(qemu) info status
VM status: running
According to comment 12, setting this issue to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2546.html