Bug 1019583

Summary: qemu-kvm core dumps after several unsuccessful hot-plug/hot-unplug attempts of a virtio block device
Product: Red Hat Enterprise Linux 7
Reporter: Jun Li <juli>
Component: qemu-kvm
Assignee: Markus Armbruster <armbru>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: low
Docs Contact:
Priority: low
Version: 7.0
CC: acathrow, akong, hhuang, juzhang, michen, sluo, virt-maint, xfu
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-14 02:20:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jun Li 2013-10-16 06:15:29 UTC
Description of problem:
After several unsuccessful attempts to hot-plug/hot-unplug a virtio block device, qemu-kvm dumps core.

Version-Release number of selected component (if applicable):
qemu-img-1.5.3-9.el7.x86_64
3.10.0-34.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest.
# gdb --args /usr/libexec/qemu-kvm -monitor stdio -chardev socket,id=serial_id_20120515-041452-KkUY,path=/tmp/serial-20120515-041452-KkUY,server,nowait -device isa-serial,chardev=serial_id_20120515-041452-KkUY -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x4 \
-drive file=/mnt/rhel7base.qcow2_v3,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,boot=on,snapshot=off,readonly=off,format=qcow2,aio=native \
-device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 \
-device virtio-net-pci,netdev=hostnet0,mac=9a:6e:47:a6:d8:f9,id=ndev00idLYjg29,bus=pci.0,addr=0x3 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown \
-m 2048 -smp 4,cores=2,threads=1,sockets=2 \
-cpu SandyBridge -drive index=1,if=none,id=drive-ide0-0-0,media=cdrom,boot=off,snapshot=off,readonly=on,format=raw \
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0 -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=slew -M pc -boot order=cdn,once=c,menu=off -enable-kvm -monitor unix:/tmp/monitor-unix,server,nowait 

2. Run the following shell script on the host.
# cat hotplug-unplug.sh 
#!/bin/sh
i=1
while true
do
#echo "info pci" | nc -U /tmp/monitor-unix
echo "drive_add pci_addr=auto id=hot1,file=/root/hot1.raw,format=raw,media=disk" | nc -U /tmp/monitor-unix
echo "device_add virtio-blk-pci,id=hot_virtio,drive=hot1" | nc -U /tmp/monitor-unix;
sleep 8
#echo "info pci" | nc -U /tmp/monitor-unix
echo "device_del hot_virtio" | nc -U /tmp/monitor-unix
#echo "info pci" | nc -U /tmp/monitor-unix
#echo "stop" | nc -U /tmp/monitor-unix
#echo "cont" | nc -U /tmp/monitor-unix
echo $i time########
i=$(($i+1))
done

3. After about ten minutes, check the qemu-kvm monitor.
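
The loop in step 2 runs forever and has to be interrupted by hand. A bounded variant may be easier to reuse when re-testing. The following is only a sketch: the function names (drive_add_cmd, device_add_cmd, device_del_cmd, send_hmp, run_cycles) and the default cycle count are hypothetical and not part of the original reproducer. The monitor socket path, the HMP command strings, and the 8-second delay are taken from the script above.

```shell
#!/bin/sh
# Sketch of a bounded hot-plug/hot-unplug loop (helper names are hypothetical).
# Monitor socket path as used in step 1; override via the environment if needed.
MONITOR_SOCKET="${MONITOR_SOCKET:-/tmp/monitor-unix}"

# Build the HMP command that adds the backing drive.
# $1 = drive id, $2 = image file
drive_add_cmd() {
    echo "drive_add pci_addr=auto id=$1,file=$2,format=raw,media=disk"
}

# Build the HMP command that hot-plugs the virtio block device.
# $1 = device id, $2 = drive id
device_add_cmd() {
    echo "device_add virtio-blk-pci,id=$1,drive=$2"
}

# Build the HMP command that hot-unplugs the device.
# $1 = device id
device_del_cmd() {
    echo "device_del $1"
}

# Send one command to the monitor socket, one connection per command,
# as the original script does.
send_hmp() {
    echo "$1" | nc -U "$MONITOR_SOCKET"
}

# Run a fixed number of plug/unplug cycles.
# $1 = number of cycles (assumed default: 100), $2 = delay in seconds (default: 8)
run_cycles() {
    count="${1:-100}"
    delay="${2:-8}"
    i=1
    while [ "$i" -le "$count" ]; do
        send_hmp "$(drive_add_cmd hot1 /root/hot1.raw)"
        send_hmp "$(device_add_cmd hot_virtio hot1)"
        sleep "$delay"
        send_hmp "$(device_del_cmd hot_virtio)"
        echo "cycle $i done"
        i=$((i + 1))
    done
}
```

Sourcing the file and calling, for example, `run_cycles 75 8` would run roughly ten minutes of cycles, matching the time frame in step 3.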

Actual results:
(qemu) 
Program received signal SIGSEGV, Segmentation fault.
0x000055555567d7f2 in pci_unplug_device (qdev=<optimized out>)
    at hw/pci/pci.c:1759
1759	    return dev->bus->hotplug(dev->bus->hotplug_qdev, dev,
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.0.27.2-1.el7.x86_64 celt051-0.5.1.3-6.el7.x86_64 cyrus-sasl-lib-2.1.26-12.1.el7.x86_64 cyrus-sasl-md5-2.1.26-12.1.el7.x86_64 cyrus-sasl-plain-2.1.26-12.1.el7.x86_64 cyrus-sasl-scram-2.1.26-12.1.el7.x86_64 dbus-libs-1.6.12-5.el7.x86_64 flac-libs-1.3.0-2.el7.x86_64 glib2-2.36.3-2.el7.x86_64 glibc-2.17-33.el7.x86_64 glusterfs-api-3.4.0.34rhs-1.el7.x86_64 glusterfs-libs-3.4.0.34rhs-1.el7.x86_64 gmp-5.1.1-2.el7.x86_64 gnutls-3.1.13-1.el7.x86_64 gsm-1.0.13-9.el7.x86_64 json-c-0.11-1.el7.x86_64 keyutils-libs-1.5.8-1.el7.x86_64 krb5-libs-1.11.3-23.el7.x86_64 libICE-1.0.8-5.el7.x86_64 libSM-1.2.1-5.el7.x86_64 libX11-1.6.0-1.el7.x86_64 libXau-1.0.8-1.el7.x86_64 libXext-1.3.2-1.el7.x86_64 libXi-1.7.2-1.el7.x86_64 libXtst-1.2.2-1.el7.x86_64 libaio-0.3.109-9.el7.x86_64 libasyncns-0.8-5.el7.x86_64 libattr-2.4.46-10.el7.x86_64 libcap-2.22-6.el7.x86_64 libcom_err-1.42.8-2.el7.x86_64 libdb-5.3.21-11.el7.x86_64 libgcc-4.8.1-11.el7.x86_64 libgcrypt-1.5.3-1.el7.x86_64 libgpg-error-1.12-1.el7.x86_64 libiscsi-1.9.0-2.el7.x86_64 libjpeg-turbo-1.2.90-2.el7.x86_64 libogg-1.3.0-5.el7.x86_64 libpng-1.5.13-2.el7.x86_64 libseccomp-2.1.0-0.el7.x86_64 libselinux-2.1.13-21.el7.x86_64 libsndfile-1.0.25-7.el7.x86_64 libtasn1-3.3-1.el7.x86_64 libusbx-1.0.15-2.el7.x86_64 libuuid-2.23.2-6.el7.x86_64 libvorbis-1.3.3-4.el7.x86_64 libxcb-1.9-3.el7.x86_64 nettle-2.6-2.el7.x86_64 nspr-4.10-3.el7.x86_64 nss-3.15.1-3.el7.x86_64 nss-softokn-freebl-3.15.1-2.el7.x86_64 nss-util-3.15.1-2.el7.x86_64 openssl-libs-1.0.1e-21.el7.x86_64 p11-kit-0.18.5-1.el7.x86_64 pcre-8.32-7.el7.x86_64 pixman-0.30.0-1.el7.x86_64 pulseaudio-libs-3.0-10.el7.x86_64 spice-server-0.12.4-2.el7.x86_64 tcp_wrappers-libs-7.6-75.el7.x86_64 usbredir-0.6-5.el7.x86_64 zlib-1.2.7-10.el7.x86_64
(gdb) bt
#0  0x000055555567d7f2 in pci_unplug_device (qdev=<optimized out>)
    at hw/pci/pci.c:1759
#1  0x000055555563ca8f in qdev_unplug (dev=0x5555567183b0, 
    errp=errp@entry=0x7fffffffc770) at hw/core/qdev.c:219
#2  0x00005555556e4072 in qmp_device_del (id=<optimized out>, 
    errp=errp@entry=0x7fffffffc770) at qdev-monitor.c:628
#3  0x000055555561ba3b in hmp_device_del (mon=0x555556554360, 
    qdict=<optimized out>) at hmp.c:1177
#4  0x0000555555793439 in handle_user_command (mon=mon@entry=0x555556554360, 
    cmdline=<optimized out>) at /usr/src/debug/qemu-1.5.3/monitor.c:4006
#5  0x000055555579373b in monitor_command_cb (mon=0x555556554360, 
    cmdline=<optimized out>, opaque=<optimized out>)
    at /usr/src/debug/qemu-1.5.3/monitor.c:4622
#6  0x00005555556f9b60 in readline_handle_byte (rs=0x55555665c770, 
    ch=<optimized out>) at readline.c:374
#7  0x00005555557936a4 in monitor_read (opaque=<optimized out>, 
    buf=<optimized out>, size=<optimized out>)
    at /usr/src/debug/qemu-1.5.3/monitor.c:4608
#8  0x00005555556e7af9 in qemu_chr_be_write (len=<optimized out>, 
    buf=0x7fffffffc920 "\n.LVUU", s=0x5555564c3650) at qemu-char.c:167
#9  tcp_chr_read (chan=<optimized out>, cond=<optimized out>, 
    opaque=0x5555564c3650) at qemu-char.c:2493
#10 0x00007ffff76ede06 in g_main_context_dispatch ()
   from /lib64/libglib-2.0.so.0
#11 0x00005555556beeaa in glib_pollfds_poll () at main-loop.c:187
#12 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:232
#13 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:464
#14 0x00005555555c3d51 in main_loop () at vl.c:1986
#15 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4379

Expected results:
qemu-kvm keeps running and does not crash.

Additional info:

Comment 2 Amos Kong 2014-01-14 02:20:38 UTC
The device was not freed correctly in the error state.
I can't reproduce this bug after applying this patch [1].


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1046248#c3

*** This bug has been marked as a duplicate of bug 1046248 ***

Comment 3 Jun Li 2014-01-14 05:06:54 UTC
(In reply to Amos Kong from comment #2)
> The device wasn't rightly free in error state.
> I can't reproduce this bug after applied this patch[1]
> 
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1046248#c3
> 
> *** This bug has been marked as a duplicate of bug 1046248 ***

Why is this bug marked as a duplicate of bug 1046248? This bug was filed before bug 1046248.

Comment 4 Amos Kong 2014-01-14 05:48:37 UTC
(In reply to Jun Li from comment #3)
> (In reply to Amos Kong from comment #2)
> > The device wasn't rightly free in error state.
> > I can't reproduce this bug after applied this patch[1]
> > 
> > 
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1046248#c3
> > 
> > *** This bug has been marked as a duplicate of bug 1046248 ***
> 
> Why this bug is duplicate to bug 1046248, this bug is filed before bug
> 1046248.

Yes, we normally mark the newer bug as a duplicate of the existing one.

But bug 1046248 contains some useful description and analysis, and it is already fixed. I only saw this bug today.

Another reason is that the two bugs were assigned to two different people, and some further work still needs to be done: tracking the upstream patch and backporting it.