RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1449609 - qemu coredump when dd on multiple usb-storage devices concurrently in guest
Summary: qemu coredump when dd on multiple usb-storage devices concurrently in guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: x86_64
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Gerd Hoffmann
QA Contact: Gu Nini
URL:
Whiteboard:
Depends On:
Blocks: 1473046
TreeView+ depends on / blocked
 
Reported: 2017-05-10 11:13 UTC by yduan
Modified: 2018-04-11 00:21 UTC (History)
14 users (show)

Fixed In Version: qemu-kvm-rhev-2.10.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-11 00:19:31 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Full_CMD (3.63 KB, text/plain)
2017-05-10 11:13 UTC, yduan
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:1104 0 None None None 2018-04-11 00:21:56 UTC

Description yduan 2017-05-10 11:13:52 UTC
Created attachment 1277600 [details]
Full_CMD

Description of problem:
qemu coredump when dd on multiple usb-storage devices concurrently in guest

Version-Release number of selected component (if applicable):
Host:
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-3.el7.x86_64
# uname -r
3.10.0-664.el7.x86_64
Guest:
3.10.0-663.el7.x86_64

How reproducible:
1/2

Steps to Reproduce:
1.Boot a VM with multiple usb-storage devices:
/usr/libexec/qemu-kvm \
 ......
 -device usb-ehci,id=usb1 \
 -drive file=images/stg1.qcow2,format=qcow2,id=drive_disk1,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device usb-storage,drive=drive_disk1,id=device_disk1 \
 ......
 -drive file=images/stg6.qcow2,format=qcow2,id=drive_disk6,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device usb-storage,drive=drive_disk6,id=device_disk6 \
 -device usb-ehci,id=usb2 \
 -drive file=images/stg7.qcow2,format=qcow2,id=drive_disk7,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device usb-storage,drive=drive_disk7,id=device_disk7 \
 ......
 -drive file=images/stg12.qcow2,format=qcow2,id=drive_disk12,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device usb-storage,drive=drive_disk12,id=device_disk12 \
 -device usb-ehci,id=usb3 \
 -drive file=images/stg13.qcow2,format=qcow2,id=drive_disk13,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device usb-storage,drive=drive_disk13,id=device_disk13 \
 ......
 -drive file=images/stg16.qcow2,format=qcow2,id=drive_disk16,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device usb-storage,drive=drive_disk16,id=device_disk16 \

 Full command line is as attachment.

2."fdisk -l" in the guest:
# fdisk -l

Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000cdb05

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *        2048     2099199     1048576   83  Linux
/dev/vda2         2099200    41943039    19921920   8e  Linux LVM

Disk /dev/mapper/rhel_bootp--73--199--233-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/rhel_bootp--73--199--233-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sda: 209 MB, 209715200 bytes, 409600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


......


Disk /dev/sdp: 209 MB, 209715200 bytes, 409600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

3. # sh dd
# cat dd
dd if=/dev/zero of=/dev/sda bs=1M count=100 &
dd if=/dev/zero of=/dev/sdb bs=1M count=100 &
dd if=/dev/zero of=/dev/sdc bs=1M count=100 &
dd if=/dev/zero of=/dev/sdd bs=1M count=100 &
dd if=/dev/zero of=/dev/sde bs=1M count=100 &
dd if=/dev/zero of=/dev/sdf bs=1M count=100 &
dd if=/dev/zero of=/dev/sdg bs=1M count=100 &
dd if=/dev/zero of=/dev/sdh bs=1M count=100 &
dd if=/dev/zero of=/dev/sdi bs=1M count=100 &
dd if=/dev/zero of=/dev/sdj bs=1M count=100 &
dd if=/dev/zero of=/dev/sdk bs=1M count=100 &
dd if=/dev/zero of=/dev/sdl bs=1M count=100 &
dd if=/dev/zero of=/dev/sdm bs=1M count=100 &
dd if=/dev/zero of=/dev/sdn bs=1M count=100 &
dd if=/dev/zero of=/dev/sdo bs=1M count=100 &
dd if=/dev/zero of=/dev/sdp bs=1M count=100 &

Actual results:
qemu-kvm: hw/usb/core.c:492: usb_cancel_packet: Assertion `usb_packet_is_inflight(p)' failed.
max_usb.sh: line 60: 16187 Aborted                 (core dumped)

Expected results:
There should be no coredump.

Additional info:
# gdb core.16187
(gdb) bt
#0  0x00007f2dd6afd1d7 in raise () from /lib64/libc.so.6
#1  0x00007f2dd6afe8c8 in abort () from /lib64/libc.so.6
#2  0x00007f2dd6af6146 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f2dd6af61f2 in __assert_fail () from /lib64/libc.so.6
#4  0x000055bbd4c8250d in usb_cancel_packet (p=0x55bbd7dbcb80) at hw/usb/core.c:492
#5  0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522
#6  0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580
#7  0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607
#8  0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656
#9  0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144
#10 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299
#11 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90
#12 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118
#13 0x000055bbd4db6f44 in aio_poll (ctx=0x55bbd7d85700, blocking=blocking@entry=true) at util/aio-posix.c:682
#14 0x000055bbd4d465d1 in bdrv_aio_cancel (acb=0x55bbdbb0c800) at block/io.c:2129
#15 0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522
#16 0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580
#17 0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607
#18 0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656
#19 0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144
#20 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299
#21 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90
#22 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118
#23 0x000055bbd4db6bb0 in aio_dispatch (ctx=0x55bbd7d85700) at util/aio-posix.c:429
#24 0x000055bbd4db3d7e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#25 0x00007f2dd83abd7a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#26 0x000055bbd4db5e6c in glib_pollfds_poll () at util/main-loop.c:213
#27 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
#28 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517
#29 0x000055bbd4aa701c in main_loop () at vl.c:1898
#30 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4720



(gdb) bt full
#0  0x00007f2dd6afd1d7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f2dd6afe8c8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007f2dd6af6146 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007f2dd6af61f2 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x000055bbd4c8250d in usb_cancel_packet (p=0x55bbd7dbcb80) at hw/usb/core.c:492
        callback = false
        __PRETTY_FUNCTION__ = "usb_cancel_packet"
#5  0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522
No locals.
#6  0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580
        p = <optimized out>
        packets = 0
#7  0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607
        head = 0x55bbddde1140
        cancelled = <optimized out>
#8  0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656
        head = 0x55bbddde1140
        q = <optimized out>
        tmp = 0x55bbd7d7c7f0
#9  0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144
        ehci = 0x55bbddde09f0
#10 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299
        ehci = 0x55bbddde09f0
        need_timer = 1
        expire_time = <optimized out>
        t_now = 3074643889872
        ns_elapsed = <optimized out>
        uframes = 6
        skipped_uframes = <optimized out>
        i = <optimized out>
#11 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90
No locals.
#12 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118
        bh = <optimized out>
        bhp = <optimized out>
        next = 0x55bbddaf00d0
---Type <return> to continue, or q <return> to quit---
        ret = 1
        deleted = false
#13 0x000055bbd4db6f44 in aio_poll (ctx=0x55bbd7d85700, blocking=blocking@entry=true) at util/aio-posix.c:682
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = <optimized out>
        start = 0
        __PRETTY_FUNCTION__ = "aio_poll"
#14 0x000055bbd4d465d1 in bdrv_aio_cancel (acb=0x55bbdbb0c800) at block/io.c:2129
        __PRETTY_FUNCTION__ = "bdrv_aio_cancel"
#15 0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522
No locals.
#16 0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580
        p = <optimized out>
        packets = 0
#17 0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607
        head = 0x55bbddde1140
        cancelled = <optimized out>
#18 0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656
        head = 0x55bbddde1140
        q = <optimized out>
        tmp = 0x55bbd7d7c7f0
#19 0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144
        ehci = 0x55bbddde09f0
#20 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299
        ehci = 0x55bbddde09f0
        need_timer = 1
        expire_time = <optimized out>
        t_now = 3074643076714
        ns_elapsed = <optimized out>
        uframes = 1
        skipped_uframes = <optimized out>
        i = <optimized out>
#21 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90
No locals.
#22 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118
        bh = <optimized out>
        bhp = <optimized out>
---Type <return> to continue, or q <return> to quit---
        next = 0x55bbddaf00d0
        ret = 1
        deleted = false
#23 0x000055bbd4db6bb0 in aio_dispatch (ctx=0x55bbd7d85700) at util/aio-posix.c:429
No locals.
#24 0x000055bbd4db3d7e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
        ctx = <optimized out>
#25 0x00007f2dd83abd7a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
No symbol table info available.
#26 0x000055bbd4db5e6c in glib_pollfds_poll () at util/main-loop.c:213
        context = 0x55bbd7de4a50
        pfds = <optimized out>
#27 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
        context = 0x55bbd7de4a50
        ret = 1
        spin_counter = 0
#28 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517
        ret = 1
        timeout = 4294967295
        timeout_ns = <optimized out>
#29 0x000055bbd4aa701c in main_loop () at vl.c:1898
No locals.
#30 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4720
        i = <optimized out>
        snapshot = <optimized out>
        linux_boot = <optimized out>
        initrd_filename = <optimized out>
        kernel_filename = <optimized out>
        kernel_cmdline = <optimized out>
        boot_order = <optimized out>
        boot_once = 0x0
        cyls = <optimized out>
        heads = <optimized out>
        secs = <optimized out>
        translation = <optimized out>
        opts = <optimized out>
        machine_opts = <optimized out>
        hda_opts = <optimized out>
        icount_opts = <optimized out>
        accel_opts = <optimized out>
---Type <return> to continue, or q <return> to quit---
        olist = <optimized out>
        optind = 113
        optarg = 0x7ffc77472497 "usb-storage,drive=drive_disk16,id=device_disk16"
        loadvm = <optimized out>
        machine_class = 0x0
        cpu_model = <optimized out>
        vga_model = 0x7ffc774717be "qxl"
        qtest_chrdev = <optimized out>
        qtest_log = <optimized out>
        pid_file = <optimized out>
        incoming = <optimized out>
        defconfig = <optimized out>
        userconfig = <optimized out>
        nographic = <optimized out>
        display_type = <optimized out>
        display_remote = <optimized out>
        log_mask = <optimized out>
        log_file = <optimized out>
        trace_file = <optimized out>
        maxram_size = <optimized out>
        ram_slots = <optimized out>
        vmstate_dump_file = <optimized out>
        main_loop_err = 0x0
        err = 0x0
        list_data_dirs = <optimized out>
        bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffc774707c0}
        __func__ = "main"
        __FUNCTION__ = "main"

Comment 4 Gerd Hoffmann 2017-05-11 14:22:29 UTC
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13179248
please test

Comment 5 hachen 2017-05-19 09:30:44 UTC
I tried 5 times on
kernel-3.10.0-656.el7.x86_64
qemu-kvm-rhev-2.9.0-5.el7.x86_64.
I could not reproduce this bug.

Sorry Gerd,I did not save the build you provided in comment 4.
Could you please provide another build that fixed this issue?
Many thanks.

Comment 6 Gerd Hoffmann 2017-05-19 10:28:34 UTC
It's a race, so there is no reliable reproducer.

new scratch build:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13238187

Comment 7 hachen 2017-05-22 05:00:52 UTC
I tested 5 times on 
kernel-3.10.0-656.el7.x86_64
qemu-kvm-rhev-2.9.0-3.el7.bz1449609.1.x86_64 (provided in the comment 6)

It works fine. I didn't encounter the qemu dump issue.

And I tired 3 times on the reporter's machine, try to reproduce it.
But it did not happen.

Comment 8 Gerd Hoffmann 2017-05-22 06:24:10 UTC
> And I tired 3 times on the reporter's machine, try to reproduce it.
> But it did not happen.

Well, it's a race ...

You can try to put some load to the test machine, that might make it more likely to happen.

Comment 9 hachen 2017-05-22 06:28:37 UTC
(In reply to Gerd Hoffmann from comment #8)
> > And I tired 3 times on the reporter's machine, try to reproduce it.
> > But it did not happen.
> 
> Well, it's a race ...
> 
> You can try to put some load to the test machine, that might make it more
> likely to happen.

I understand that.
Since there is no reliable reproducer, I will not to try to reproduce it.
The comment 7 is just to show that the fix works, at least in my 5 tests.
Thanks.

Comment 10 Gerd Hoffmann 2017-05-22 08:30:22 UTC
submitted upstream
https://patchwork.ozlabs.org/patch/764642/

Comment 11 CongLi 2017-05-22 09:28:27 UTC
(In reply to Gerd Hoffmann from comment #6)
> It's a race, so there is no reliable reproducer.
> 
> new scratch build:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13238187

Hi Yanbin,

Could you please try to test this bug with the build above?

Thanks.

Comment 12 yduan 2017-05-23 04:30:58 UTC
Hi,

Can not reproduce this bug no matter with qemu-kvm-rhev-2.9.0-3.el7.x86_64 or qemu-kvm-rhev-2.9.0-3.el7.bz1449609.1.x86_64.

I met this problem once.

Thanks.

Comment 13 Gerd Hoffmann 2017-06-06 14:00:28 UTC
upstream commit 3bfecee2cb71f21cd39d6183f18b446c01917573
(also pick 3ae7eb88c472799f17beb653f4bc3fb8b124c72e while being at it).

Comment 14 Miroslav Rezanina 2017-06-08 16:27:22 UTC
Fix included in qemu-kvm-rhev-2.9.0-9.el7

Comment 18 Gerd Hoffmann 2017-06-12 08:38:52 UTC
https://patchwork.ozlabs.org/patch/774447/

Comment 20 Gerd Hoffmann 2017-07-12 13:21:21 UTC
(In reply to Gerd Hoffmann from comment #18)
> https://patchwork.ozlabs.org/patch/774447/

commit ad3c5412f2704672bb212bb82035c9b1a72db782

Comment 25 errata-xmlrpc 2018-04-11 00:19:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104


Note You need to log in before you can comment on or make changes to this bug.