Bug 1452512

Summary: qemu coredump when add more than 12 usb-storage devices to ehci
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.4
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Target Release: ---
Reporter: yduan
Assignee: Gerd Hoffmann <kraxel>
QA Contact: hachen <hachen>
Docs Contact:
CC: aliang, chayang, coli, jinzhao, juzhang, knoel, lmiksik, michen, mrezanin, ovasik, qzhang, virt-maint, xfu, xuma, yduan
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.9.0-9.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-02 04:41:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description yduan 2017-05-19 06:12:44 UTC
Description of problem:
qemu dumps core when more than 12 usb-storage devices are attached to ehci.

Version-Release number of selected component (if applicable):
Host:
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-5.el7.x86_64
# uname -r
3.10.0-663.el7.x86_64
Guest:
3.10.0-663.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with more than 12 usb-storage devices attached to ehci:
/usr/libexec/qemu-kvm \
 -S \
 -name 'RHEL7.4' \
 -machine pc \
 -m 4096 \
 -smp 2,maxcpus=2,sockets=1,cores=2,threads=1 \
 -cpu SandyBridge,enforce \
 -rtc base=localtime,clock=host,driftfix=slew \
 -nodefaults \
 -device AC97 \
 -vga qxl \
 -chardev socket,id=seabioslog_log,path=/tmp/seabios-log,server,nowait \
 -device isa-debugcon,chardev=seabioslog_log,iobase=0x402 \
 -boot menu=on \
 -enable-kvm \
 -monitor stdio \
 -spice port=5900,disable-ticketing \
 -qmp tcp:0:9999,server,nowait \
 -netdev tap,id=netdev0,vhost=on \
 -device virtio-net-pci,mac=BA:BC:13:83:4F:BD,id=net0,netdev=netdev0,status=on \
 -drive file=images/rhel74-64-virtio.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device virtio-blk-pci,drive=drive_sysdisk,id=device_sysdisk,bootindex=0 \
 -readconfig ich9-ehci-uhci.cfg \
 -drive file=images/stg1.qcow2,id=drive_usb1,format=qcow2,if=none,cache=none,aio=native -device usb-storage,drive=drive_usb1,id=device_usb1 \
 -drive file=images/stg2.qcow2,id=drive_usb2,format=qcow2,if=none,cache=none,aio=native -device usb-storage,drive=drive_usb2,id=device_usb2 \
 ......
 -drive file=images/stg13.qcow2,id=drive_usb13,format=qcow2,if=none,cache=none,aio=native -device usb-storage,drive=drive_usb13,id=device_usb13

# cat ich9-ehci-uhci.cfg
[device "ehci"]
  driver = "ich9-usb-ehci1"
  addr = "1d.7"
  multifunction = "on"

[device "uhci-1"]
  driver = "ich9-usb-uhci1"
  addr = "1d.0"
  multifunction = "on"
  masterbus = "ehci.0"
  firstport = "0"

[device "uhci-2"]
  driver = "ich9-usb-uhci2"
  addr = "1d.1"
  multifunction = "on"
  masterbus = "ehci.0"
  firstport = "2"

[device "uhci-3"]
  driver = "ich9-usb-uhci3"
  addr = "1d.2"
  multifunction = "on"
  masterbus = "ehci.0"
  firstport = "4"
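
The -drive/-device pairs in step 1 follow one pattern (stg1, stg2, ... stg13), so they can be generated instead of typed out. The following is a minimal sketch, not the reporter's actual uhci_ehci.sh: the function name usb_storage_args is made up for illustration, and it assumes the qcow2 images already exist under images/ and that no path contains whitespace.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original reproducer script):
# print the -drive/-device argument pair for usb-storage disks 1..N,
# copying the pattern of the stg1, stg2 and stg13 lines in the report.
usb_storage_args() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s\n' \
            "-drive file=images/stg${i}.qcow2,id=drive_usb${i},format=qcow2,if=none,cache=none,aio=native" \
            "-device usb-storage,drive=drive_usb${i},id=device_usb${i}"
        i=$((i + 1))
    done
}

# Example: 13 disks, the count used in the reproducer.
usb_storage_args 13
```

The output can be appended to the qemu-kvm command line with an unquoted $(usb_storage_args 13), which relies on shell word splitting and therefore on the no-whitespace assumption.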

Actual results:
qemu aborted and dumped core.

# sh uhci_ehci.sh 
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) ehci: Bad asynchronous state 0. Resetting to active
**
ERROR:hw/usb/hcd-ehci.c:2155:ehci_advance_async_state: code should not be reached
uhci_ehci.sh: line 37: 14539 Aborted                 (core dumped)

Expected results:
VM should boot up successfully.

Additional info:
# gdb core.14539
(gdb) bt
#0  0x00007ff5b2e691d7 in raise () at /lib64/libc.so.6
#1  0x00007ff5b2e6a8c8 in abort () at /lib64/libc.so.6
#2  0x00007ff5b49402a5 in g_assertion_message () at /lib64/libglib-2.0.so.0
#3  0x00007ff5b494033a in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
#4  0x0000560e7905a5b2 in ehci_advance_async_state (ehci=0x560e80019df0) at hw/usb/hcd-ehci.c:2155
#5  0x0000560e7905a90c in ehci_frame_timer (opaque=0x560e80019df0) at hw/usb/hcd-ehci.c:2299
#6  0x0000560e791830c1 in aio_bh_poll (bh=0x560e804d7260) at util/async.c:90
#7  0x0000560e791830c1 in aio_bh_poll (ctx=ctx@entry=0x560e7b055700) at util/async.c:118
#8  0x0000560e79186164 in aio_poll (ctx=ctx@entry=0x560e7b055700, blocking=<optimized out>) at util/aio-posix.c:682
#9  0x0000560e79112f74 in bdrv_drain_recurse (bs=bs@entry=0x560e7da92800) at block/io.c:164
#10 0x0000560e7911378d in bdrv_drained_begin (bs=bs@entry=0x560e7da92800) at block/io.c:248
#11 0x0000560e79113989 in bdrv_drain (bs=0x560e7da92800) at block/io.c:282
#12 0x0000560e79108016 in blk_drain (blk=<optimized out>) at block/block-backend.c:1383
#13 0x0000560e7904be2f in scsi_device_purge_requests (sdev=sdev@entry=0x560e808c2000, sense=...) at hw/scsi/scsi-bus.c:1942
#14 0x0000560e79044077 in scsi_disk_reset (dev=0x560e808c2000) at hw/scsi/scsi-disk.c:2218
#15 0x0000560e78fe1269 in qdev_reset_one (dev=<optimized out>, opaque=<optimized out>) at hw/core/qdev.c:310
#16 0x0000560e78fe4068 in qbus_walk_children (bus=0x560e808baed0, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/bus.c:59
#17 0x0000560e78fe0bd8 in qdev_walk_children (dev=0x560e808b9800, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:617
#18 0x0000560e78fe4068 in qbus_walk_children (bus=0x560e80019df0, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/bus.c:59
#19 0x0000560e78fe0bd8 in qdev_walk_children (dev=0x560e80019400, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:617
#20 0x0000560e78fe4068 in qbus_walk_children (bus=0x560e7b0f4d00, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/bus.c:59
#21 0x0000560e78fe0bd8 in qdev_walk_children (dev=0x560e7e174000, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:617
#22 0x0000560e78fe4068 in qbus_walk_children (bus=0x560e7b032380, pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x560e78fe1260 <qdev_reset_one>, post_busfn=0x560e78fdf9a0 <qbus_reset_one>, opaque=0x0) at hw/core/bus.c:59
#23 0x0000560e78fe41fd in qemu_devices_reset () at hw/core/reset.c:69
#24 0x0000560e78f11206 in pc_machine_reset () at /usr/src/debug/qemu-2.9.0/hw/i386/pc.c:2236
#25 0x0000560e78f92046 in qemu_system_reset (report=report@entry=false) at vl.c:1697
#26 0x0000560e78e71b11 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4690
(gdb)

Comment 2 Ademar Reis 2017-05-22 17:01:38 UTC
Is this a regression?

Comment 3 yduan 2017-05-23 08:10:56 UTC
Hi Ademar,

  I can also reproduce this bug with qemu-kvm-rhev-2.8.0-6.el7.x86_64, but only when adding more than 13 usb-storage devices to ehci.
  So I think it's not a regression.

Thanks,
yduan

Comment 4 Gerd Hoffmann 2017-05-23 08:48:49 UTC
https://patchwork.ozlabs.org/patch/765821/

Comment 5 Gerd Hoffmann 2017-06-06 14:04:08 UTC
upstream commit 26022652c6fd067b9fa09280f5a6d6284a21c73f

Comment 6 Gerd Hoffmann 2017-06-06 15:24:05 UTC
backport posted.

Comment 7 Miroslav Rezanina 2017-06-08 16:27:47 UTC
Fix included in qemu-kvm-rhev-2.9.0-9.el7
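
To check whether a host already carries this build, the installed version-release can be compared against 2.9.0-9.el7. The following is a rough sketch only: has_fix is a made-up helper name, and it uses GNU sort -V as an approximation of RPM version ordering rather than RPM's own comparison algorithm.

```shell
#!/bin/sh
# First build that contains the fix, per this bug report.
fixed="2.9.0-9.el7"

# Hypothetical helper: succeed if the given version-release string sorts
# at or after $fixed. Uses GNU "sort -V", which only approximates RPM's
# version comparison; good enough for simple N.N.N-R.el7 strings.
has_fix() {
    installed=$1
    lowest=$(printf '%s\n%s\n' "$fixed" "$installed" | sort -V | head -n 1)
    [ "$lowest" = "$fixed" ]
}

# Example with the version from the original description.
if has_fix "2.9.0-5.el7"; then
    echo "fix present"
else
    echo "fix missing"
fi
```

On a real host the argument could come from rpm -q --qf '%{VERSION}-%{RELEASE}' qemu-kvm-rhev.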

Comment 9 hachen 2017-06-13 05:41:38 UTC
Tested on:
Host:
qemu-kvm-rhev-2.9.0-9.el7.x86_64
kernel-3.10.0-679.el7.x86_64
Guest:
kernel-3.10.0-679.el7.x86_64

Booted a guest with 16 usb-storage devices on ehci; qemu works fine.

Bug verified.

Comment 11 errata-xmlrpc 2017-08-02 04:41:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392