Red Hat Bugzilla – Bug 1074219
qemu core dump when install a RHEL.7 guest(xhci) with migration
Last modified: 2015-03-05 03:04:30 EST
Description of problem:
qemu core dump when install a RHEL.7 guest(xhci) with migration

Version-Release number of selected component (if applicable):
kernel-3.10.0-99.el7.x86_64
qemu-kvm-rhev-1.5.3-52.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Install a RHEL.7 guest with xhci device
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=03 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.0-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=04 \
    -device virtio-net-pci,mac=9a:cd:ce:cf:d0:d1,id=idRrA4Ny,netdev=idVQ8RcV,bus=pci.0,addr=05 \
2. Keep ping-pong migration during installation
    -incoming tcp:0:5200
    (qemu) migrate -d tcp:0:5200
3.

Actual results:
qemu core dump during installation w/ ping-pong migration

Expected results:
guest installation successful w/o any error

Additional info:
1. core dump:
Core was generated by `/home/staf-kvm-devel/autotest-devel/client/tests/virt/qemu/qemu -S -name virt-t'.
Program terminated with signal 11, Segmentation fault.
#0  xhci_lookup_uport (xhci=xhci@entry=0x7f89e4538010, slot_ctx=slot_ctx@entry=0x7f8a1404bc60) at hw/usb/hcd-xhci.c:2104
2104        port = xhci->ports[port-1].uport->index+1;
(gdb) bt
#0  xhci_lookup_uport (xhci=xhci@entry=0x7f89e4538010, slot_ctx=slot_ctx@entry=0x7f8a1404bc60) at hw/usb/hcd-xhci.c:2104
#1  0x00007f8a141f6388 in usb_xhci_post_load (opaque=0x7f89e4538010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3457
#2  0x00007f8a142d5180 in vmstate_load_state (f=0x7f8a1673d120, vmsd=0x7f8a14721360 <vmstate_xhci>, opaque=0x7f89e4538010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1742
#3  0x00007f8a142d5d26 in qemu_loadvm_state (f=f@entry=0x7f8a1673d120) at /usr/src/debug/qemu-1.5.3/savevm.c:2257
#4  0x00007f8a142110de in process_incoming_migration_co (opaque=0x7f8a1673d120) at migration.c:105
#5  0x00007f8a1417c35a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6  0x00007f8a0ed2c570 in ?? () from /usr/lib64/libc-2.17.so
#7  0x00007fff4bb733a0 in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb) q

2. QEMU CML:
/home/staf-kvm-devel/autotest-devel/client/tests/virt/qemu/qemu \
    -S \
    -name 'virt-tests-vm1' \
    -sandbox off \
    -M pc \
    -nodefaults \
    -vga cirrus \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20140307-172300-C4VjMdFZ,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20140307-172300-C4VjMdFZ,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20140307-172300-C4VjMdFZ,path=/tmp/seabios-20140307-172300-C4VjMdFZ,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20140307-172300-C4VjMdFZ,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=03 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.0-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=04 \
    -device virtio-net-pci,mac=9a:cd:ce:cf:d0:d1,id=idRrA4Ny,netdev=idVQ8RcV,bus=pci.0,addr=05 \
    -netdev tap,id=idVQ8RcV,vhost=on,vhostfd=23,fd=22 \
    -m 8192 \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -cpu 'Westmere',+kvm_pv_unhalt \
    -drive id=drive_cd1,if=none,snapshot=off,aio=native,media=cdrom,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/isos/linux/RHEL7.0-Server-x86_64.iso \
    -device ide-cd,id=cd1,drive=drive_cd1,bootindex=2,bus=ide.0,unit=0 \
    -drive id=drive_unattended,if=none,snapshot=off,aio=native,media=cdrom,file=/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/rhel70-64/ks.iso \
    -device ide-cd,id=unattended,drive=drive_unattended,bootindex=3,bus=ide.0,unit=1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -kernel '/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/rhel70-64/vmlinuz' \
    -append 'ksdevice=link ks=cdrom:/dev/sr1:/ks.cfg nicdelay=60 console=ttyS0,115200 console=tty0' \
    -initrd '/home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/rhel70-64/initrd.img' \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=d,menu=off \
    -no-kvm-pit-reinjection \
    -no-shutdown \
    -enable-kvm

3. cpuinfo:
processor       : 23
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
stepping        : 2
microcode       : 0x15
cpu MHz         : 1596.000
cache size      : 12288 KB
physical id     : 1
siblings        : 12
core id         : 10
cpu cores       : 6
apicid          : 53
initial apicid  : 53
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5333.19
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
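Note on the faulting statement: frame #0 stops at `port = xhci->ports[port-1].uport->index+1;`, reached from `usb_xhci_post_load` on the incoming side of the migration. A segfault there would be consistent with either an out-of-range `port` or a NULL `uport`, but the report does not say which. The following is only a minimal sketch of a defensive lookup under those assumptions. All `model_*` types and names are hypothetical, simplified stand-ins; this is not QEMU's code and not necessarily the fix that was eventually applied.

```c
#include <stddef.h>

#define MODEL_MAX_PORTS 4   /* hypothetical limit for this sketch */

/* Hypothetical, simplified stand-in for a USB port object. */
struct model_usb_port {
    int index;              /* 0-based index of the port */
};

/* Hypothetical stand-in for one xHCI root port; uport may be NULL. */
struct model_xhci_port {
    struct model_usb_port *uport;
};

/* Hypothetical stand-in for the controller state. */
struct model_xhci {
    unsigned numports;      /* number of valid entries in ports[] */
    struct model_xhci_port ports[MODEL_MAX_PORTS];
};

/*
 * Map a 1-based root hub port number (as it would arrive in
 * guest-provided or migrated state) to uport->index + 1.
 * Returns -1 instead of dereferencing when the port number is
 * out of range or no USB port is attached.
 */
static int model_lookup_port(const struct model_xhci *xhci, unsigned port)
{
    if (port == 0 || port > xhci->numports || port > MODEL_MAX_PORTS) {
        return -1;          /* would index outside ports[] */
    }
    if (xhci->ports[port - 1].uport == NULL) {
        return -1;          /* would dereference a NULL pointer */
    }
    return xhci->ports[port - 1].uport->index + 1;
}
```

The design choice in this sketch is to treat the port number as untrusted input and report failure to the caller, so that a load path could reject the state rather than crash.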
This could be fixed by upstream commit f6969b9fef543da1ffa975d24f4d7b75dc369b03.
Fix included in qemu-kvm-1.5.3-66.el7
Reproduced this bug with the following versions:

Host:
# uname -r
3.10.0-99.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-1.5.3-59.el7ev.x86_64

Steps:
1. Install a RHEL7 guest with xhci device
2. Keep ping-pong migration during installation

Result:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/bin/qemu-kvm -S -name virt-tests-vm1 -sandbox off -M pc -nodefaults -vga cirru'.
Program terminated with signal 11, Segmentation fault.
#0  xhci_lookup_uport (xhci=xhci@entry=0x7fb50c738010, slot_ctx=slot_ctx@entry=0x7fb520e98c60) at hw/usb/hcd-xhci.c:2104
2104        port = xhci->ports[port-1].uport->index+1;
(gdb) bt
#0  xhci_lookup_uport (xhci=xhci@entry=0x7fb50c738010, slot_ctx=slot_ctx@entry=0x7fb520e98c60) at hw/usb/hcd-xhci.c:2104
#1  0x00007fb521047d78 in usb_xhci_post_load (opaque=0x7fb50c738010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3457
#2  0x00007fb521126bd0 in vmstate_load_state (f=0x7fb5239d5980, vmsd=0x7fb521574320 <vmstate_xhci>, opaque=0x7fb50c738010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1749
#3  0x00007fb521127776 in qemu_loadvm_state (f=f@entry=0x7fb5239d5980) at /usr/src/debug/qemu-1.5.3/savevm.c:2264
#4  0x00007fb521062ace in process_incoming_migration_co (opaque=0x7fb5239d5980) at migration.c:105
#5  0x00007fb520fcdd2a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6  0x00007fb51baff570 in ?? () from /usr/lib64/libc-2.17.so
#7  0x00007fffc172ab50 in ?? ()
#8  0x0000000000000000 in ?? ()

Test on latest version:
# uname -r
3.10.0-147.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-69.el7.x86_64
Machine: model name : AMD Opteron(tm) Processor 6128

Steps: same as in the reproduction above.

Results:
#0  xhci_lookup_uport (xhci=xhci@entry=0x7f0786b38010, slot_ctx=slot_ctx@entry=0x7f09bfa86c60) at hw/usb/hcd-xhci.c:2104
2104        port = xhci->ports[port-1].uport->index+1;
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-3.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-lib-2.1.26-17.el7.x86_64 cyrus-sasl-md5-2.1.26-17.el7.x86_64 cyrus-sasl-plain-2.1.26-17.el7.x86_64 dbus-libs-1.6.12-8.el7.x86_64 flac-libs-1.3.0-4.el7.x86_64 glib2-2.36.3-5.el7.x86_64 glibc-2.17-55.el7.x86_64 glusterfs-api-3.4.0.59rhs-1.el7.x86_64 glusterfs-libs-3.4.0.59rhs-1.el7.x86_64 gmp-5.1.1-5.el7.x86_64 gnutls-3.1.18-8.el7.x86_64 gsm-1.0.13-11.el7.x86_64 json-c-0.11-3.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.11.3-49.el7.x86_64 libICE-1.0.8-7.el7.x86_64 libSM-1.2.1-7.el7.x86_64 libX11-1.6.0-2.1.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXext-1.3.2-2.1.el7.x86_64 libXi-1.7.2-2.1.el7.x86_64 libXtst-1.2.2-2.1.el7.x86_64 libaio-0.3.109-12.el7.x86_64 libasyncns-0.8-7.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-4.el7.x86_64 libdb-5.3.21-17.el7.x86_64 libgcc-4.8.2-16.el7.x86_64 libgcrypt-1.5.3-4.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-1.1.7-6.el7.x86_64 libiscsi-1.9.0-6.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libnl-1.1.4-3.el7.x86_64 libogg-1.3.0-7.el7.x86_64 libpng-1.5.13-5.el7.x86_64 librdmacm-1.0.17.1-1.el7.x86_64 libseccomp-2.1.1-2.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libsndfile-1.0.25-9.el7.x86_64 libtasn1-3.3-3.el7.x86_64 libusbx-1.0.15-4.el7.x86_64 libuuid-2.23.2-16.el7.x86_64 libvorbis-1.3.3-8.el7.x86_64 libxcb-1.9-5.el7.x86_64 nettle-2.7.1-2.el7.x86_64 nspr-4.10.2-4.el7.x86_64 nss-3.15.4-6.el7.x86_64 nss-softokn-freebl-3.15.4-2.el7.x86_64 nss-util-3.15.4-2.el7.x86_64 openssl-libs-1.0.1e-34.el7.x86_64 p11-kit-0.18.7-4.el7.x86_64 pcre-8.32-12.el7.x86_64 pixman-0.32.4-3.el7.x86_64 pulseaudio-libs-3.0-22.el7.x86_64 tcp_wrappers-libs-7.6-77.el7.x86_64 usbredir-0.6-7.el7.x86_64 xz-libs-5.1.2-8alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  xhci_lookup_uport (xhci=xhci@entry=0x7f0786b38010, slot_ctx=slot_ctx@entry=0x7f09bfa86c60) at hw/usb/hcd-xhci.c:2104
#1  0x00007f09bfc2e758 in usb_xhci_post_load (opaque=0x7f0786b38010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3457
#2  0x00007f09bfd0ca00 in vmstate_load_state (f=0x7f09c157e770, vmsd=0x7f09c0159480 <vmstate_xhci>, opaque=0x7f0786b38010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1914
#3  0x00007f09bfd0d5b6 in qemu_loadvm_state (f=f@entry=0x7f09c157e770) at /usr/src/debug/qemu-1.5.3/savevm.c:2472
#4  0x00007f09bfc4951e in process_incoming_migration_co (opaque=0x7f09c157e770) at migration.c:105
#5  0x00007f09bfbb490a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6  0x00007f09ba767570 in ?? () from /lib64/libc.so.6
#7  0x00007fff78b6a6c0 in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb)

Additional info:
1. I cannot reproduce this bug with a manual test; the automated script (repeated ping-pong migration during guest installation) can hit it.

According to the above test, this bug has not been fixed.
Please retest with qemu-kvm-1.5.3-76.el7 or newer.
(In reply to Gerd Hoffmann from comment #13)
> Please retest with qemu-kvm-1.5.3-76.el7 or newer.

Hi Flang,

Could you retest it?

Best Regards,
Junyi
Tested on the latest version and hit the same problem.

Version:
Host:
# rpm -q qemu-kvm
qemu-kvm-1.5.3-77.el7.x86_64
# uname -r
3.10.0-191.el7.x86_64

Steps:
1. Install a RHEL7 guest with xhci device
2. Keep ping-pong migration during installation

Results:
Core was generated by `/bin/qemu-kvm -S -name virt-tests-vm1 -sandbox off -M pc -nodefaults -vga cirru'.
Program terminated with signal 11, Segmentation fault.
#0  xhci_lookup_uport (xhci=xhci@entry=0x7fe1d9b52010, slot_ctx=slot_ctx@entry=0x7fe1fccc9c60) at hw/usb/hcd-xhci.c:2257
2257        port = xhci->ports[port-1].uport->index+1;
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-3.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-lib-2.1.26-17.el7.x86_64 cyrus-sasl-md5-2.1.26-17.el7.x86_64 cyrus-sasl-plain-2.1.26-17.el7.x86_64 dbus-libs-1.6.12-8.el7.x86_64 flac-libs-1.3.0-4.el7.x86_64 glib2-2.36.3-5.el7.x86_64 glibc-2.17-55.el7.x86_64 glusterfs-api-3.6.0.29-2.el7.x86_64 glusterfs-libs-3.6.0.29-2.el7.x86_64 gmp-5.1.1-5.el7.x86_64 gnutls-3.1.18-8.el7.x86_64 gsm-1.0.13-11.el7.x86_64 json-c-0.11-3.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.11.3-49.el7.x86_64 libICE-1.0.8-7.el7.x86_64 libSM-1.2.1-7.el7.x86_64 libX11-1.6.0-2.1.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXext-1.3.2-2.1.el7.x86_64 libXi-1.7.2-2.1.el7.x86_64 libXtst-1.2.2-2.1.el7.x86_64 libaio-0.3.109-12.el7.x86_64 libasyncns-0.8-7.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-4.el7.x86_64 libdb-5.3.21-17.el7.x86_64 libgcc-4.8.2-16.el7.x86_64 libgcrypt-1.5.3-4.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-1.1.7-6.el7.x86_64 libiscsi-1.9.0-6.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libnl-1.1.4-3.el7.x86_64 libogg-1.3.0-7.el7.x86_64 libpng-1.5.13-5.el7.x86_64 librdmacm-1.0.17.1-1.el7.x86_64 libseccomp-2.1.1-2.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libsndfile-1.0.25-9.el7.x86_64 libtasn1-3.3-3.el7.x86_64 libusbx-1.0.15-4.el7.x86_64 libuuid-2.23.2-16.el7.x86_64 libvorbis-1.3.3-8.el7.x86_64 libxcb-1.9-5.el7.x86_64 nettle-2.7.1-2.el7.x86_64 nspr-4.10.2-4.el7.x86_64 nss-3.15.4-6.el7.x86_64 nss-softokn-freebl-3.15.4-2.el7.x86_64 nss-util-3.15.4-2.el7.x86_64 openssl-libs-1.0.1e-34.el7.x86_64 p11-kit-0.18.7-4.el7.x86_64 pcre-8.32-12.el7.x86_64 pixman-0.32.4-3.el7.x86_64 pulseaudio-libs-3.0-22.el7.x86_64 tcp_wrappers-libs-7.6-77.el7.x86_64 usbredir-0.6-7.el7.x86_64 xz-libs-5.1.2-8alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  xhci_lookup_uport (xhci=xhci@entry=0x7fe1d9b52010, slot_ctx=slot_ctx@entry=0x7fe1fccc9c60) at hw/usb/hcd-xhci.c:2257
#1  0x00007fe1fce7497a in usb_xhci_post_load (opaque=0x7fe1d9b52010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3639
#2  0x00007fe1fcf53c60 in vmstate_load_state (f=0x7fe1fe287d60, vmsd=0x7fe1fd3a0420 <vmstate_xhci>, opaque=0x7fe1d9b52010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1914
#3  0x00007fe1fcf54816 in qemu_loadvm_state (f=f@entry=0x7fe1fe287d60) at /usr/src/debug/qemu-1.5.3/savevm.c:2472
#4  0x00007fe1fce9039e in process_incoming_migration_co (opaque=0x7fe1fe287d60) at migration.c:105
#5  0x00007fe1fcdfb22a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6  0x00007fe1f79a6570 in ?? () from /lib64/libc.so.6
#7  0x00007fff3f4bb4a0 in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb)
Hmm, doesn't reproduce. Do you hit it on the first migration? Or is it a bit random? Could you upload a coredump? As this seems to be autotest, can you upload the autotest logs too?
(In reply to Gerd Hoffmann from comment #16)
> Hmm, doesn't reproduce.
>
> Do you hit it on the first migration?
> Or is it a bit random?

Not on the first migration; the script runs ping-pong migration repeatedly. It is not easy to reproduce with a manual test.

> Could you upload a coredump?
> As this seems to be autotest, can you upload the autotest logs too?

Sure, please see the attachments.
Created attachment 954325 [details] the autotest debug
Tested on the latest qemu-kvm-rhev version; hit this problem.

# uname -r
3.10.0-191.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.1.2-6.el7.x86_64

Results: see attachment

I will open a new bug for the qemu-kvm-rhev component to track this issue.
Created attachment 954772 [details] the log of core
http://people.redhat.com/ghoffman/bz1074219 please test
http://patchwork.ozlabs.org/patch/410006/
patch posted.
(In reply to Gerd Hoffmann from comment #21)
> http://people.redhat.com/ghoffman/bz1074219
>
> please test

Tested the fixed version; did not hit the core dump problem.

Version:
# uname -r
3.10.0-205.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-77.el7.bz1074219.2.x86_64

Guest: rhel7

Results:
Migration succeeded; no core dump.
Fix included in qemu-kvm-1.5.3-83.el7
Test on latest version:

Host:
# uname -r
3.10.0-210.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-83.el7.x86_64

Guest: rhel7

Steps:
1. Install a RHEL7 guest with xhci device
2. Keep ping-pong migration during installation

Results:
Migration succeeded. I will also try to test on an AMD machine.

According to the above test, this bug has been fixed.
Tested on an AMD machine; it works well.

# uname -r
3.10.0-210.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-83.el7.x86_64

I also tried a Windows guest. It hits a different issue, not the core dump:
Bug 1047242 - Live migration during Windows installation causes guest desktop to freeze

So this bug is verified on Intel and AMD machines.
*** Bug 1172950 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0349.html