Bug 1170093
Summary: | guest NUMA failed to migrate when machine is rhel6.5.0 | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jincheng Miao <jmiao> | |
Component: | qemu-kvm-rhev | Assignee: | Eduardo Habkost <ehabkost> | |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 7.1 | CC: | amit.shah, dgilbert, dyuan, ehabkost, hhuang, honzhang, huding, jen, juzhang, knoel, lersek, lhuang, lmiksik, mprivozn, mzhan, qiguo, quintela, virt-maint, vivianzhang, xfu | |
Target Milestone: | rc | Keywords: | Regression | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | qemu-kvm-rhev-2.1.2-17.el7 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1175397 | Environment: | |
Last Closed: | 2015-03-05 09:59:12 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1175397 |
Description
Jincheng Miao
2014-12-03 08:53:01 UTC
It is caused by the hack added to fix bug 1027565, which breaks when using NUMA and memory-backend objects.

Hi Eduardo Habkost,

I found a similar issue. Please help confirm whether it has the same cause and can be fixed with the same patch. Thanks.

Description:
Migration fails with an error when the guest is configured with OVMF BIOS + machine type=rhel6.5.0. When the machine type is set lower than rhel6.5.0, such as rhel6.4.0, migration fails with the same error.

Product version:
libvirt-1.2.8-10.el7.x86_64
qemu-kvm-rhev-2.1.2-15.el7.x86_64
OVMF-20140822-7.git9ece15a.el7.x86_64

How reproducible:
100%

Steps:
1. Prepare a migration environment with an NFS image shared between the source and target hosts.

2. Make sure OVMF is installed on both the source and target hosts.
# rpm -q OVMF
OVMF-20140822-7.git9ece15a.el7.x86_64

3. Install a UEFI guest with virt-manager. Make sure the guest XML sets machine type='rhel6.5.0' and the OVMF BIOS:
# virsh dumpxml rhel7new
...
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS.fd'>/var/lib/libvirt/qemu/nvram/rhel7new_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
...

4. Start the guest; it works well.
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 27    rhel7new                       running

5. Migrate the guest; qemu-kvm reports an error.
# virsh migrate rhel7new --live qemu+ssh://10.66.6.205/system --verbose
root@10.66.6.205's password:
Migration: [100 %]error: internal error: early end of file from monitor: possible problem:
RHEL-6 compat: ich9-usb-uhci1: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci2: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci3: irq_pin = 3
qemu-kvm: /builddir/build/BUILD/qemu-2.1.2/savevm.c:906: shadow_bios: Assertion `bios != ((void *)0)' failed.

6. When the machine type is changed to pc-i440fx-rhel7.1.0 or pc-i440fx-rhel7.0.0, migration succeeds.

7. When the OVMF BIOS configuration <nvram template='/usr/share/OVMF/OVMF_VARS.fd'></nvram> is deleted, migration also succeeds.

Actual result:
Migration fails with an error.

Expected result:
Migration should succeed when the guest is configured with OVMF BIOS + machine type=rhel6.5.0.

Fix included in qemu-kvm-rhev-2.1.2-17.el7

Hello Jeff,

I tried the fixed build qemu-kvm-rhev-2.1.2-17.el7, but migration still fails with OVMF BIOS + machine type=rhel6.5.0.

# virsh migrate rhel7new --live qemu+ssh://10.66.6.205/system --verbose
root@10.66.6.205's password:
Migration: [100 %]error: internal error: early end of file from monitor: possible problem:
RHEL-6 compat: ich9-usb-uhci1: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci2: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci3: irq_pin = 3
2014-12-17T05:41:51.612642Z qemu-kvm: usb-redir warning: usb-redir connection broken during migration
qemu-kvm: /builddir/build/BUILD/qemu-2.1.2/savevm.c:906: shadow_bios: Assertion `bios != ((void *)0)' failed.

So I filed a new bug 1175099 to track this.

OVMF is completely unsupported on the rhel6.5.0 machine type.

As explained in Comment 10, the issue described in Comment 6 (and again in Comment 9) is not a bug, it's an invalid configuration.

Please re-verify using the configuration as described in the original problem report.
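The comments above conclude that OVMF on the rhel6.5.0 machine type is an invalid configuration rather than a bug. A minimal sketch of how that combination could be spotted in `virsh dumpxml` output before attempting a migration follows. The function names are hypothetical, and the `rhel6.` prefix test is an assumption about how the RHEL-6 compat machine types are named; it is not a list taken from qemu-kvm-rhev or libvirt.

```python
import xml.etree.ElementTree as ET


def is_rhel6_machine(machine: str) -> bool:
    # Assumption: RHEL-6 compat machine types are named "rhel6.<minor>.<z>".
    return machine.startswith("rhel6.")


def uses_ovmf_on_rhel6(domain_xml: str) -> bool:
    """Return True if <os> combines a rhel6.* machine type with a pflash loader."""
    root = ET.fromstring(domain_xml)
    os_el = root.find("os")
    if os_el is None:
        return False
    type_el = os_el.find("type")
    loader_el = os_el.find("loader")
    # Compare against None explicitly: childless elements are falsy.
    machine = type_el.get("machine", "") if type_el is not None else ""
    is_pflash = loader_el is not None and loader_el.get("type") == "pflash"
    return is_rhel6_machine(machine) and is_pflash


if __name__ == "__main__":
    # Trimmed version of the <os> element from the report, wrapped in <domain>.
    sample = """
    <domain type='kvm'>
      <os>
        <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
      </os>
    </domain>
    """
    if uses_ovmf_on_rhel6(sample):
        print("unsupported: OVMF (pflash loader) with a rhel6.* machine type")
```

The check only reads the `machine` attribute and the loader `type`, which are the two settings the report identifies as relevant; it does not contact libvirt or validate anything else in the domain.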
I've proposed patch on the libvirt's upstream list:

https://www.redhat.com/archives/libvir-list/2014-December/msg00931.html

And just pushed the patch:

commit f309db1f4d51009bad0d32e12efc75530b66836b
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Dec 18 12:36:48 2014 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Fri Dec 19 07:44:44 2014 +0100

    qemu: Create memory-backend-{ram,file} iff needed

    Libvirt BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1175397
    QEMU BZ:    https://bugzilla.redhat.com/show_bug.cgi?id=1170093

    In qemu there are two interesting arguments:

    1) -numa to create a guest NUMA node

    2) -object memory-backend-{ram,file} to tell qemu which memory
    region on which host's NUMA node it should allocate the guest
    memory from.

    Combining these two together we can instruct qemu to create a
    guest NUMA node that is tied to a host NUMA node. And it works
    just fine. However, depending on machine type used, there might
    be some issued during migration when OVMF is enabled (see QEMU
    BZ). While this truly is a QEMU bug, we can help avoiding it. The
    problem lies within the memory backend objects somewhere. Having
    said that, fix on our side consists on putting those objects on
    the command line if and only if needed. For instance, while
    previously we would construct this (in all ways correct) command
    line:

        -object memory-backend-ram,size=256M,id=ram-node0 \
        -numa node,nodeid=0,cpus=0,memdev=ram-node0

    now we create just:

        -numa node,nodeid=0,cpus=0,mem=256

    because the backend object is obviously not tied to any specific
    host NUMA node.

    Signed-off-by: Michal Privoznik <mprivozn>

v1.2.11-60-gf309db1

Reproduced this bug with qemu-kvm-rhev-2.1.2-13.el7.x86_64.

Steps:
1. Boot the guest with memory-backend objects and NUMA:
# /usr/libexec/qemu-kvm -cpu Penryn -machine rhel6.5.0,accel=kvm,usb=off \
    -m 3072 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 \
    -object memory-backend-ram,size=1024M,id=ram-node0 \
    -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
    -object memory-backend-ram,size=2048M,id=ram-node1 \
    -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 \
    -enable-kvm -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x3 \
    -name test -nodefaults -nodefconfig \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
    -spice disable-ticketing,port=5001 -vga qxl \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -monitor stdio \
    -drive file=/mnt/rhel7-64.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,aio=native,id=scsi-disk0 \
    -device virtio-scsi-pci,id=bus2,bus=pci.0,addr=0x5 \
    -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 \
    -netdev tap,id=netdev1,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,bus=pci.0,addr=0x6,netdev=netdev1,id=vn2,mac=02:48:a7:f1:00:48 \
    -boot menu=on -qmp unix:/tmp/q1,server,nowait -monitor unix:/tmp/m1,server,nowait

2. Launch the destination qemu in listening mode on the destination node:
# /usr/libexec/qemu-kvm -cpu Penryn -machine rhel6.5.0,accel=kvm,usb=off \
    -m 3072 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 \
    -object memory-backend-ram,size=1024M,id=ram-node0 \
    -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
    -object memory-backend-ram,size=2048M,id=ram-node1 \
    -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 \
    -enable-kvm -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x3 \
    -name test -nodefaults -nodefconfig \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
    -spice disable-ticketing,port=5001 -vga qxl \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -monitor stdio \
    -drive file=/mnt/rhel7-64.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,aio=native,id=scsi-disk0 \
    -device virtio-scsi-pci,id=bus2,bus=pci.0,addr=0x5 \
    -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 \
    -netdev tap,id=netdev1,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,bus=pci.0,addr=0x6,netdev=netdev1,id=vn2,mac=02:48:a7:f1:00:48 \
    -boot menu=on -qmp unix:/tmp/q1,server,nowait -monitor unix:/tmp/m1,server,nowait \
    -incoming tcp:0:4444

3. Migrate.

Result:
On the destination, qemu dumped core:
qemu-kvm: /builddir/build/BUILD/qemu-2.1.2/savevm.c:904: shadow_bios: Assertion `ram != ((void *)0)' failed.
Aborted (core dumped)

So this bug is reproduced.

Verified this bug with qemu-kvm-rhev-2.1.2-17.el7.x86_64.

Steps:
Try to boot the guest with the rhel6.5.0 machine type together with NUMA memdev:
# /usr/libexec/qemu-kvm -cpu Penryn -machine rhel6.5.0,accel=kvm,usb=off \
    -m 3072 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 \
    -object memory-backend-ram,size=1024M,id=ram-node0 \
    -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
    -object memory-backend-ram,size=2048M,id=ram-node1 \
    -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 \
    -enable-kvm -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x3 \
    -name test -nodefaults -nodefconfig \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
    -spice disable-ticketing,port=5001 -vga qxl \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -monitor stdio \
    -drive file=/mnt/rhel7-64.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,aio=native,id=scsi-disk0 \
    -device virtio-scsi-pci,id=bus2,bus=pci.0,addr=0x5 \
    -device scsi-hd,bus=bus2.0,drive=scsi-disk0,id=disk0 \
    -netdev tap,id=netdev1,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,bus=pci.0,addr=0x6,netdev=netdev1,id=vn2,mac=02:48:a7:f1:00:48 \
    -boot menu=on -qmp unix:/tmp/q1,server,nowait -monitor unix:/tmp/m1,server,nowait

Results:
(qemu) qemu-kvm: -numa memdev is not supported by machine rhel6.5.0

The fixed qemu-kvm-rhev rejects this configuration, so this bug is fixed.

Additionally, when booting only with
-numa node,nodeid=0,cpus=0-1,memdev=2048M -numa node,nodeid=1,cpus=2-3,memdev=2048M
but without -object, migration finishes successfully with both the fixed and the unfixed versions.

(In reply to Jeff Nelson from comment #11)
> As explained in Comment 10, the issue described in Comment 6 (and again in
> Comment 9) is not a bug, it's an invalid configuration.
>
> Please re-verify using the configuration as described in the original
> problem report.

Hello Jeff,

Sorry for my late response. I have used the original configuration described in this bug to verify this issue. From the libvirt side I get the result below. Please help check whether it is the expected behaviour.

Version:
libvirt-1.2.8-11.el7.x86_64
qemu-kvm-rhev-2.1.2-17.el7.x86_64

Steps:
1. Prepare a guest with NUMA and -M rhel6.5.0:
# virsh dumpxml rhel7
....
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <loader readonly='yes' type='rom'>/usr/share/seabios/bios.bin</loader>
    <boot dev='hd'/>
    <bootmenu enable='yes' timeout='3000'/>
  </os>
...
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1048576'/>
    </numa>
  </cpu>
...

2. Start the guest; an error is reported and the guest is blocked from booting:
# virsh start rhel7
error: Failed to start domain rhel7
error: internal error: early end of file from monitor: possible problem:
2014-12-24T01:30:15.569450Z qemu-kvm: -numa memdev is not supported by machine rhel6.5.0

Looks OK to me, but deferring to BZ owner to confirm.

If you need migration to work using libvirt + rhel6.5.0 machine-type + NUMA, you need the fix for bug 1175397. That means using libvirt-1.2.8-12.el7 or newer.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html
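The libvirt commit quoted in this report describes emitting a memory-backend object only when the guest NUMA node actually has to be tied to host memory, and falling back to the plain `-numa ...,mem=` form otherwise. A minimal sketch of that decision follows. It is an illustration of the idea, not libvirt's implementation: the function name and the `host_nodes` parameter are hypothetical, the real condition in libvirt may consider more than host-node binding, and the `host-nodes` and `policy=bind` properties on the backend object are an assumption about how a binding would be expressed (they do not appear in the report).

```python
from typing import List, Optional


def numa_args(node_id: int, cpus: str, size_mib: int,
              host_nodes: Optional[str] = None) -> List[str]:
    """Build QEMU arguments for one guest NUMA node.

    host_nodes (hypothetical parameter) is a host node list such as "0-1"
    when the guest node must be bound to host NUMA nodes, or None when no
    binding is requested.
    """
    if host_nodes is None:
        # No host binding requested: plain -numa with mem=, no backend object.
        return ["-numa", f"node,nodeid={node_id},cpus={cpus},mem={size_mib}"]

    # Binding requested: create the backend object and reference it by id.
    backend_id = f"ram-node{node_id}"
    backend = (
        f"memory-backend-ram,size={size_mib}M,id={backend_id},"
        f"host-nodes={host_nodes},policy=bind"  # assumed binding properties
    )
    return [
        "-object", backend,
        "-numa", f"node,nodeid={node_id},cpus={cpus},memdev={backend_id}",
    ]


if __name__ == "__main__":
    # Unbound node: matches the shorter command line from the commit message.
    print(" ".join(numa_args(0, "0", 256)))
    # Bound node: the backend object is kept because it carries the binding.
    print(" ".join(numa_args(0, "0", 256, host_nodes="0")))
```

With this split, a guest on the rhel6.5.0 machine type that does not request host binding never gets a `memdev=` argument, which is the form the fixed qemu-kvm-rhev rejects for that machine type.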