Bug 1074219
Summary: | qemu core dump when install a RHEL.7 guest(xhci) with migration | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | CongLi <coli>
Component: | qemu-kvm | Assignee: | Gerd Hoffmann <kraxel>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 7.0 | CC: | bhamrick, coli, dgilbert, flang, hhuang, huding, jkurik, juli, juzhang, knoel, kraxel, michen, mrezanin, rbalakri, rmainz, sdenham, shuang, virt-bugs, virt-maint, xwei
Target Milestone: | rc | Keywords: | ZStream
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-1.5.3-83.el7 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1161397 1180410 | Environment: |
Last Closed: | 2015-03-05 08:04:30 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 833649, 1103193, 1146483, 1146486, 1161397, 1180410 | |
Attachments: | | |
Description
CongLi
2014-03-09 01:52:49 UTC
could be upstream commit f6969b9fef543da1ffa975d24f4d7b75dc369b03

Fix included in qemu-kvm-1.5.3-66.el7

Reproduced this bug on the following version:
Host:
# uname -r
3.10.0-99.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-1.5.3-59.el7ev.x86_64

Steps:
1. Install a RHEL7 guest with xhci device
2. Keep ping-pong migration during installation

Result:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/bin/qemu-kvm -S -name virt-tests-vm1 -sandbox off -M pc -nodefaults -vga cirru'.
Program terminated with signal 11, Segmentation fault.
#0 xhci_lookup_uport (xhci=xhci@entry=0x7fb50c738010, slot_ctx=slot_ctx@entry=0x7fb520e98c60) at hw/usb/hcd-xhci.c:2104
2104 port = xhci->ports[port-1].uport->index+1;
(gdb) bt
#0 xhci_lookup_uport (xhci=xhci@entry=0x7fb50c738010, slot_ctx=slot_ctx@entry=0x7fb520e98c60) at hw/usb/hcd-xhci.c:2104
#1 0x00007fb521047d78 in usb_xhci_post_load (opaque=0x7fb50c738010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3457
#2 0x00007fb521126bd0 in vmstate_load_state (f=0x7fb5239d5980, vmsd=0x7fb521574320 <vmstate_xhci>, opaque=0x7fb50c738010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1749
#3 0x00007fb521127776 in qemu_loadvm_state (f=f@entry=0x7fb5239d5980) at /usr/src/debug/qemu-1.5.3/savevm.c:2264
#4 0x00007fb521062ace in process_incoming_migration_co (opaque=0x7fb5239d5980) at migration.c:105
#5 0x00007fb520fcdd2a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6 0x00007fb51baff570 in ?? () from /usr/lib64/libc-2.17.so
#7 0x00007fffc172ab50 in ?? ()
#8 0x0000000000000000 in ?? ()

Test on latest version:
# uname -r
3.10.0-147.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-69.el7.x86_64
machine: model name : AMD Opteron(tm) Processor 6128

Steps: same as for reproducing

Results:
#0 xhci_lookup_uport (xhci=xhci@entry=0x7f0786b38010, slot_ctx=slot_ctx@entry=0x7f09bfa86c60) at hw/usb/hcd-xhci.c:2104
2104 port = xhci->ports[port-1].uport->index+1;
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-3.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-lib-2.1.26-17.el7.x86_64 cyrus-sasl-md5-2.1.26-17.el7.x86_64 cyrus-sasl-plain-2.1.26-17.el7.x86_64 dbus-libs-1.6.12-8.el7.x86_64 flac-libs-1.3.0-4.el7.x86_64 glib2-2.36.3-5.el7.x86_64 glibc-2.17-55.el7.x86_64 glusterfs-api-3.4.0.59rhs-1.el7.x86_64 glusterfs-libs-3.4.0.59rhs-1.el7.x86_64 gmp-5.1.1-5.el7.x86_64 gnutls-3.1.18-8.el7.x86_64 gsm-1.0.13-11.el7.x86_64 json-c-0.11-3.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.11.3-49.el7.x86_64 libICE-1.0.8-7.el7.x86_64 libSM-1.2.1-7.el7.x86_64 libX11-1.6.0-2.1.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXext-1.3.2-2.1.el7.x86_64 libXi-1.7.2-2.1.el7.x86_64 libXtst-1.2.2-2.1.el7.x86_64 libaio-0.3.109-12.el7.x86_64 libasyncns-0.8-7.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-4.el7.x86_64 libdb-5.3.21-17.el7.x86_64 libgcc-4.8.2-16.el7.x86_64 libgcrypt-1.5.3-4.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-1.1.7-6.el7.x86_64 libiscsi-1.9.0-6.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libnl-1.1.4-3.el7.x86_64 libogg-1.3.0-7.el7.x86_64 libpng-1.5.13-5.el7.x86_64 librdmacm-1.0.17.1-1.el7.x86_64 libseccomp-2.1.1-2.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libsndfile-1.0.25-9.el7.x86_64 libtasn1-3.3-3.el7.x86_64 libusbx-1.0.15-4.el7.x86_64 libuuid-2.23.2-16.el7.x86_64 libvorbis-1.3.3-8.el7.x86_64 libxcb-1.9-5.el7.x86_64 nettle-2.7.1-2.el7.x86_64 nspr-4.10.2-4.el7.x86_64 nss-3.15.4-6.el7.x86_64 nss-softokn-freebl-3.15.4-2.el7.x86_64 nss-util-3.15.4-2.el7.x86_64 openssl-libs-1.0.1e-34.el7.x86_64 p11-kit-0.18.7-4.el7.x86_64 pcre-8.32-12.el7.x86_64 pixman-0.32.4-3.el7.x86_64 pulseaudio-libs-3.0-22.el7.x86_64 tcp_wrappers-libs-7.6-77.el7.x86_64 usbredir-0.6-7.el7.x86_64 xz-libs-5.1.2-8alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0 xhci_lookup_uport (xhci=xhci@entry=0x7f0786b38010, slot_ctx=slot_ctx@entry=0x7f09bfa86c60) at hw/usb/hcd-xhci.c:2104
#1 0x00007f09bfc2e758 in usb_xhci_post_load (opaque=0x7f0786b38010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3457
#2 0x00007f09bfd0ca00 in vmstate_load_state (f=0x7f09c157e770, vmsd=0x7f09c0159480 <vmstate_xhci>, opaque=0x7f0786b38010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1914
#3 0x00007f09bfd0d5b6 in qemu_loadvm_state (f=f@entry=0x7f09c157e770) at /usr/src/debug/qemu-1.5.3/savevm.c:2472
#4 0x00007f09bfc4951e in process_incoming_migration_co (opaque=0x7f09c157e770) at migration.c:105
#5 0x00007f09bfbb490a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6 0x00007f09ba767570 in ?? () from /lib64/libc.so.6
#7 0x00007fff78b6a6c0 in ?? ()
#8 0x0000000000000000 in ?? ()
(gdb)

Additional info:
1. I cannot reproduce this bug with a manual test; an automated script (repeated ping-pong migration during guest installation) can hit it.

According to the above test, this bug has not been fixed.

Please retest with qemu-kvm-1.5.3-76.el7 or newer.

(In reply to Gerd Hoffmann from comment #13)
> Please retest with qemu-kvm-1.5.3-76.el7 or newer.

Hi Flang,
Could you retest it?

Best Regards,
Junyi

Tested on the latest version, hit the same problem.

Version:
Host:
# rpm -q qemu-kvm
qemu-kvm-1.5.3-77.el7.x86_64
# uname -r
3.10.0-191.el7.x86_64

Steps:
1. Install a RHEL7 guest with xhci device
2. Keep ping-pong migration during installation

Results:
Core was generated by `/bin/qemu-kvm -S -name virt-tests-vm1 -sandbox off -M pc -nodefaults -vga cirru'.
Program terminated with signal 11, Segmentation fault.
#0 xhci_lookup_uport (xhci=xhci@entry=0x7fe1d9b52010, slot_ctx=slot_ctx@entry=0x7fe1fccc9c60) at hw/usb/hcd-xhci.c:2257
2257 port = xhci->ports[port-1].uport->index+1;
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-3.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-lib-2.1.26-17.el7.x86_64 cyrus-sasl-md5-2.1.26-17.el7.x86_64 cyrus-sasl-plain-2.1.26-17.el7.x86_64 dbus-libs-1.6.12-8.el7.x86_64 flac-libs-1.3.0-4.el7.x86_64 glib2-2.36.3-5.el7.x86_64 glibc-2.17-55.el7.x86_64 glusterfs-api-3.6.0.29-2.el7.x86_64 glusterfs-libs-3.6.0.29-2.el7.x86_64 gmp-5.1.1-5.el7.x86_64 gnutls-3.1.18-8.el7.x86_64 gsm-1.0.13-11.el7.x86_64 json-c-0.11-3.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.11.3-49.el7.x86_64 libICE-1.0.8-7.el7.x86_64 libSM-1.2.1-7.el7.x86_64 libX11-1.6.0-2.1.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXext-1.3.2-2.1.el7.x86_64 libXi-1.7.2-2.1.el7.x86_64 libXtst-1.2.2-2.1.el7.x86_64 libaio-0.3.109-12.el7.x86_64 libasyncns-0.8-7.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-4.el7.x86_64 libdb-5.3.21-17.el7.x86_64 libgcc-4.8.2-16.el7.x86_64 libgcrypt-1.5.3-4.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-1.1.7-6.el7.x86_64 libiscsi-1.9.0-6.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libnl-1.1.4-3.el7.x86_64 libogg-1.3.0-7.el7.x86_64 libpng-1.5.13-5.el7.x86_64 librdmacm-1.0.17.1-1.el7.x86_64 libseccomp-2.1.1-2.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libsndfile-1.0.25-9.el7.x86_64 libtasn1-3.3-3.el7.x86_64 libusbx-1.0.15-4.el7.x86_64 libuuid-2.23.2-16.el7.x86_64 libvorbis-1.3.3-8.el7.x86_64 libxcb-1.9-5.el7.x86_64 nettle-2.7.1-2.el7.x86_64 nspr-4.10.2-4.el7.x86_64 nss-3.15.4-6.el7.x86_64 nss-softokn-freebl-3.15.4-2.el7.x86_64 nss-util-3.15.4-2.el7.x86_64 openssl-libs-1.0.1e-34.el7.x86_64 p11-kit-0.18.7-4.el7.x86_64 pcre-8.32-12.el7.x86_64 pixman-0.32.4-3.el7.x86_64 pulseaudio-libs-3.0-22.el7.x86_64 tcp_wrappers-libs-7.6-77.el7.x86_64 usbredir-0.6-7.el7.x86_64 xz-libs-5.1.2-8alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0 xhci_lookup_uport (xhci=xhci@entry=0x7fe1d9b52010, slot_ctx=slot_ctx@entry=0x7fe1fccc9c60) at hw/usb/hcd-xhci.c:2257
#1 0x00007fe1fce7497a in usb_xhci_post_load (opaque=0x7fe1d9b52010, version_id=<optimized out>) at hw/usb/hcd-xhci.c:3639
#2 0x00007fe1fcf53c60 in vmstate_load_state (f=0x7fe1fe287d60, vmsd=0x7fe1fd3a0420 <vmstate_xhci>, opaque=0x7fe1d9b52010, version_id=1) at /usr/src/debug/qemu-1.5.3/savevm.c:1914
#3 0x00007fe1fcf54816 in qemu_loadvm_state (f=f@entry=0x7fe1fe287d60) at /usr/src/debug/qemu-1.5.3/savevm.c:2472
#4 0x00007fe1fce9039e in process_incoming_migration_co (opaque=0x7fe1fe287d60) at migration.c:105
#5 0x00007fe1fcdfb22a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:118
#6 0x00007fe1f79a6570 in ?? () from /lib64/libc.so.6
#7 0x00007fff3f4bb4a0 in ?? ()
#8 0x0000000000000000 in ?? ()
(gdb)

Hmm, doesn't reproduce.

Do you hit it on the first migration?
Or is it a bit random?

Could you upload a coredump?
As this seems to be autotest, can you upload the autotest logs too?

(In reply to Gerd Hoffmann from comment #16)
> Hmm, doesn't reproduce.
>
> Do you hit it on the first migration?
> Or is it a bit random?
>
Not on the first migration; the script runs ping-pong migration repeatedly. It is not easy to reproduce with a manual test.

> Could you upload a coredump?
> As this seems to be autotest, can you upload the autotest logs too?
Sure, please see attachments.

Created attachment 954325
the autotest debug
Tested on the latest qemu-kvm-rhev version and hit this problem.
# uname -r
3.10.0-191.el7.x86_64
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.1.2-6.el7.x86_64

Results: see attachment.

I will open a new bug for the qemu-kvm-rhev component to track this issue.

Created attachment 954772
the log of core
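
Every backtrace above stops on the same statement, `port = xhci->ports[port-1].uport->index+1;`, reached from `usb_xhci_post_load` on the migration destination. The following is a minimal sketch of how that expression can fault and how it can be guarded. It is not the QEMU source and not the patch that fixed this bug: the `Model*` types, `MODEL_MAX_PORTS`, and `model_lookup_uport_index` are hypothetical stand-ins, and the two failure causes it checks for (a port number outside the valid range, or a port whose `uport` pointer is NULL) are assumptions, since the report does not state the root cause. The port number is read from bits 16..23 of the second slot-context dword, which is where the xHCI slot context keeps the root hub port number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PORTS 4 /* hypothetical array size, for illustration only */

/* Simplified stand-ins for the controller structures (not QEMU's types). */
typedef struct ModelUSBPort {
    int index;
} ModelUSBPort;

typedef struct ModelXHCIPort {
    ModelUSBPort *uport; /* assumed to be possibly NULL in this model */
} ModelXHCIPort;

typedef struct ModelXHCIState {
    uint32_t numports;
    ModelXHCIPort ports[MODEL_MAX_PORTS];
} ModelXHCIState;

/*
 * Resolve the root hub port named by a slot context.
 * Returns the 1-based index of the backing USB port, or -1 when the
 * slot context cannot be resolved, instead of dereferencing blindly.
 */
int model_lookup_uport_index(const ModelXHCIState *xhci,
                             const uint32_t *slot_ctx)
{
    assert(xhci != NULL && slot_ctx != NULL);

    /* Root hub port number: bits 16..23 of slot context dword 1. */
    uint32_t port = (slot_ctx[1] >> 16) & 0xFF;

    /* Guard 1: reject port numbers outside the controller's range. */
    if (port < 1 || port > xhci->numports || port > MODEL_MAX_PORTS) {
        return -1;
    }

    /* Guard 2: reject ports that have no backing USB port object. */
    const ModelUSBPort *uport = xhci->ports[port - 1].uport;
    if (uport == NULL) {
        return -1;
    }

    return uport->index + 1;
}
```

Without the two guards, a slot context that arrives in the migration stream with a stale or zero port number would index outside `ports[]` or dereference a NULL `uport`, which would be consistent with the signal 11 seen above. Whether either condition is what actually happened here is not established by this report.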
http://people.redhat.com/ghoffman/bz1074219

please test

patch posted.

(In reply to Gerd Hoffmann from comment #21)
> http://people.redhat.com/ghoffman/bz1074219
>
> please test

Tested the fixed version; did not hit the core dump problem.

Version:
# uname -r
3.10.0-205.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-77.el7.bz1074219.2.x86_64
Guest: rhel7

Results:
Migration succeeded, no core dump.

Fix included in qemu-kvm-1.5.3-83.el7

Test on latest version:
Host:
# uname -r
3.10.0-210.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-83.el7.x86_64
Guest: rhel7

Steps:
1. Install a RHEL7 guest with xhci device
2. Keep ping-pong migration during installation

Results:
Migration succeeded. I will try to test on an AMD machine.

According to the above test, this bug has been fixed.

Tested on an AMD machine, works well.
# uname -r
3.10.0-210.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.3-83.el7.x86_64

Tried a Windows guest; it hits another issue, but no core dump:
Bug 1047242 - Live migration during Windows installation causes guest desktop to freeze

So this bug is verified on Intel/AMD machines.

*** Bug 1172950 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0349.html