Bug 1449609
Summary: | qemu coredump when dd on multiple usb-storage devices concurrently in guest | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | yduan | ||||
Component: | qemu-kvm-rhev | Assignee: | Gerd Hoffmann <kraxel> | ||||
Status: | CLOSED ERRATA | QA Contact: | Gu Nini <ngu> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 7.4 | CC: | chayang, coli, hachen, jinzhao, juzhang, knoel, kraxel, lmiksik, mrezanin, ngu, ovasik, virt-maint, xfu, yduan | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | qemu-kvm-rhev-2.10.0-1.el7 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2018-04-11 00:19:31 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1473046 | ||||||
Attachments: |
|
I tried 5 times on kernel-3.10.0-656.el7.x86_64 qemu-kvm-rhev-2.9.0-5.el7.x86_64. I could not reproduce this bug. Sorry Gerd,I did not save the build you provided in comment 4. Could you please provide another build that fixed this issue? Many thanks. It's a race, so there is no reliable reproducer. new scratch build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13238187 I tested 5 times on kernel-3.10.0-656.el7.x86_64 qemu-kvm-rhev-2.9.0-3.el7.bz1449609.1.x86_64 (provided in the comment 6) It works fine. I didn't encounter the qemu dump issue. And I tired 3 times on the reporter's machine, try to reproduce it. But it did not happen. > And I tired 3 times on the reporter's machine, try to reproduce it.
> But it did not happen.
Well, it's a race ...
You can try to put some load to the test machine, that might make it more likely to happen.
(In reply to Gerd Hoffmann from comment #8) > > And I tired 3 times on the reporter's machine, try to reproduce it. > > But it did not happen. > > Well, it's a race ... > > You can try to put some load to the test machine, that might make it more > likely to happen. I understand that. Since there is no reliable reproducer, I will not to try to reproduce it. The comment 7 is just to show that the fix works, at least in my 5 tests. Thanks. submitted upstream https://patchwork.ozlabs.org/patch/764642/ (In reply to Gerd Hoffmann from comment #6) > It's a race, so there is no reliable reproducer. > > new scratch build: > https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13238187 Hi Yanbin, Could you please try to test this bug with the build above? Thanks. Hi, Can not reproduce this bug no matter with qemu-kvm-rhev-2.9.0-3.el7.x86_64 or qemu-kvm-rhev-2.9.0-3.el7.bz1449609.1.x86_64. I met this problem once. Thanks. upstream commit 3bfecee2cb71f21cd39d6183f18b446c01917573 (also pick 3ae7eb88c472799f17beb653f4bc3fb8b124c72e while being at it). Fix included in qemu-kvm-rhev-2.9.0-9.el7 (In reply to Gerd Hoffmann from comment #18) > https://patchwork.ozlabs.org/patch/774447/ commit ad3c5412f2704672bb212bb82035c9b1a72db782 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1104 |
Created attachment 1277600 [details] Full_CMD Description of problem: qemu coredump when dd on multiple usb-storage devices concurrently in guest Version-Release number of selected component (if applicable): Host: # rpm -q qemu-kvm-rhev qemu-kvm-rhev-2.9.0-3.el7.x86_64 # uname -r 3.10.0-664.el7.x86_64 Guest: 3.10.0-663.el7.x86_64 How reproducible: 1/2 Steps to Reproduce: 1.Boot a VM with multiple usb-storage devices: /usr/libexec/qemu-kvm \ ...... -device usb-ehci,id=usb1 \ -drive file=images/stg1.qcow2,format=qcow2,id=drive_disk1,if=none,cache=none,aio=native,werror=stop,rerror=stop \ -device usb-storage,drive=drive_disk1,id=device_disk1 \ ...... -drive file=images/stg6.qcow2,format=qcow2,id=drive_disk6,if=none,cache=none,aio=native,werror=stop,rerror=stop \ -device usb-storage,drive=drive_disk6,id=device_disk6 \ -device usb-ehci,id=usb2 \ -drive file=images/stg7.qcow2,format=qcow2,id=drive_disk7,if=none,cache=none,aio=native,werror=stop,rerror=stop \ -device usb-storage,drive=drive_disk7,id=device_disk7 \ ...... -drive file=images/stg12.qcow2,format=qcow2,id=drive_disk12,if=none,cache=none,aio=native,werror=stop,rerror=stop \ -device usb-storage,drive=drive_disk12,id=device_disk12 \ -device usb-ehci,id=usb3 \ -drive file=images/stg13.qcow2,format=qcow2,id=drive_disk13,if=none,cache=none,aio=native,werror=stop,rerror=stop \ -device usb-storage,drive=drive_disk13,id=device_disk13 \ ...... -drive file=images/stg16.qcow2,format=qcow2,id=drive_disk16,if=none,cache=none,aio=native,werror=stop,rerror=stop \ -device usb-storage,drive=drive_disk16,id=device_disk16 \ Full command line is as attachment. 2."fdisk -l" in the guest: # fdisk -l Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk label type: dos Disk identifier: 0x000cdb05 Device Boot Start End Blocks Id System /dev/vda1 * 2048 2099199 1048576 83 Linux /dev/vda2 2099200 41943039 19921920 8e Linux LVM Disk /dev/mapper/rhel_bootp--73--199--233-root: 18.2 GB, 18249416704 bytes, 35643392 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk /dev/mapper/rhel_bootp--73--199--233-swap: 2147 MB, 2147483648 bytes, 4194304 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk /dev/sda: 209 MB, 209715200 bytes, 409600 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ...... Disk /dev/sdp: 209 MB, 209715200 bytes, 409600 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes 3. # sh dd # cat dd dd if=/dev/zero of=/dev/sda bs=1M count=100 & dd if=/dev/zero of=/dev/sdb bs=1M count=100 & dd if=/dev/zero of=/dev/sdc bs=1M count=100 & dd if=/dev/zero of=/dev/sdd bs=1M count=100 & dd if=/dev/zero of=/dev/sde bs=1M count=100 & dd if=/dev/zero of=/dev/sdf bs=1M count=100 & dd if=/dev/zero of=/dev/sdg bs=1M count=100 & dd if=/dev/zero of=/dev/sdh bs=1M count=100 & dd if=/dev/zero of=/dev/sdi bs=1M count=100 & dd if=/dev/zero of=/dev/sdj bs=1M count=100 & dd if=/dev/zero of=/dev/sdk bs=1M count=100 & dd if=/dev/zero of=/dev/sdl bs=1M count=100 & dd if=/dev/zero of=/dev/sdm bs=1M count=100 & dd if=/dev/zero of=/dev/sdn bs=1M count=100 & dd if=/dev/zero of=/dev/sdo bs=1M count=100 & dd if=/dev/zero of=/dev/sdp bs=1M count=100 & Actual results: qemu-kvm: hw/usb/core.c:492: usb_cancel_packet: Assertion `usb_packet_is_inflight(p)' failed. max_usb.sh: line 60: 16187 Aborted (core dumped) Expected results: There should be no coredump. Additional info: # gdb core.16187 (gdb) bt #0 0x00007f2dd6afd1d7 in raise () from /lib64/libc.so.6 #1 0x00007f2dd6afe8c8 in abort () from /lib64/libc.so.6 #2 0x00007f2dd6af6146 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007f2dd6af61f2 in __assert_fail () from /lib64/libc.so.6 #4 0x000055bbd4c8250d in usb_cancel_packet (p=0x55bbd7dbcb80) at hw/usb/core.c:492 #5 0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522 #6 0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580 #7 0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607 #8 0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656 #9 0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144 #10 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299 #11 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90 #12 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118 #13 0x000055bbd4db6f44 in aio_poll (ctx=0x55bbd7d85700, blocking=blocking@entry=true) at util/aio-posix.c:682 #14 0x000055bbd4d465d1 in bdrv_aio_cancel (acb=0x55bbdbb0c800) at block/io.c:2129 #15 0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522 #16 0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580 #17 0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607 #18 0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656 #19 0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144 #20 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299 #21 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90 #22 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118 #23 0x000055bbd4db6bb0 in aio_dispatch (ctx=0x55bbd7d85700) at util/aio-posix.c:429 #24 0x000055bbd4db3d7e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 #25 0x00007f2dd83abd7a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #26 0x000055bbd4db5e6c in glib_pollfds_poll () at util/main-loop.c:213 #27 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261 #28 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517 #29 0x000055bbd4aa701c in main_loop () at vl.c:1898 #30 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4720 (gdb) bt full #0 0x00007f2dd6afd1d7 in raise () from /lib64/libc.so.6 No symbol table info available. #1 0x00007f2dd6afe8c8 in abort () from /lib64/libc.so.6 No symbol table info available. #2 0x00007f2dd6af6146 in __assert_fail_base () from /lib64/libc.so.6 No symbol table info available. #3 0x00007f2dd6af61f2 in __assert_fail () from /lib64/libc.so.6 No symbol table info available. #4 0x000055bbd4c8250d in usb_cancel_packet (p=0x55bbd7dbcb80) at hw/usb/core.c:492 callback = false __PRETTY_FUNCTION__ = "usb_cancel_packet" #5 0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522 No locals. #6 0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580 p = <optimized out> packets = 0 #7 0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607 head = 0x55bbddde1140 cancelled = <optimized out> #8 0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656 head = 0x55bbddde1140 q = <optimized out> tmp = 0x55bbd7d7c7f0 #9 0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144 ehci = 0x55bbddde09f0 #10 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299 ehci = 0x55bbddde09f0 need_timer = 1 expire_time = <optimized out> t_now = 3074643889872 ns_elapsed = <optimized out> uframes = 6 skipped_uframes = <optimized out> i = <optimized out> #11 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90 No locals. #12 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118 bh = <optimized out> bhp = <optimized out> next = 0x55bbddaf00d0 ---Type <return> to continue, or q <return> to quit--- ret = 1 deleted = false #13 0x000055bbd4db6f44 in aio_poll (ctx=0x55bbd7d85700, blocking=blocking@entry=true) at util/aio-posix.c:682 node = <optimized out> i = <optimized out> ret = 0 progress = <optimized out> timeout = <optimized out> start = 0 __PRETTY_FUNCTION__ = "aio_poll" #14 0x000055bbd4d465d1 in bdrv_aio_cancel (acb=0x55bbdbb0c800) at block/io.c:2129 __PRETTY_FUNCTION__ = "bdrv_aio_cancel" #15 0x000055bbd4c8a658 in ehci_free_packet (p=0x55bbd7dbcb40) at hw/usb/hcd-ehci.c:522 No locals. #16 0x000055bbd4c8a875 in ehci_cancel_queue (q=q@entry=0x55bbd7d7ceb0) at hw/usb/hcd-ehci.c:580 p = <optimized out> packets = 0 #17 0x000055bbd4c8a8cb in ehci_free_queue (q=0x55bbd7d7ceb0, warn=0x0) at hw/usb/hcd-ehci.c:607 head = 0x55bbddde1140 cancelled = <optimized out> #18 0x000055bbd4c8ac0a in ehci_queues_rip_unseen (ehci=ehci@entry=0x55bbddde09f0, async=1) at hw/usb/hcd-ehci.c:656 head = 0x55bbddde1140 q = <optimized out> tmp = 0x55bbd7d7c7f0 #19 0x000055bbd4c8c2b7 in ehci_advance_async_state (ehci=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2144 ehci = 0x55bbddde09f0 #20 0x000055bbd4c8c59c in ehci_frame_timer (opaque=0x55bbddde09f0) at hw/usb/hcd-ehci.c:2299 ehci = 0x55bbddde09f0 need_timer = 1 expire_time = <optimized out> t_now = 3074643076714 ns_elapsed = <optimized out> uframes = 1 skipped_uframes = <optimized out> i = <optimized out> #21 0x000055bbd4db3ea1 in aio_bh_call (bh=0x55bbddaf1150) at util/async.c:90 No locals. #22 aio_bh_poll (ctx=ctx@entry=0x55bbd7d85700) at util/async.c:118 bh = <optimized out> bhp = <optimized out> ---Type <return> to continue, or q <return> to quit--- next = 0x55bbddaf00d0 ret = 1 deleted = false #23 0x000055bbd4db6bb0 in aio_dispatch (ctx=0x55bbd7d85700) at util/aio-posix.c:429 No locals. #24 0x000055bbd4db3d7e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 ctx = <optimized out> #25 0x00007f2dd83abd7a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 No symbol table info available. #26 0x000055bbd4db5e6c in glib_pollfds_poll () at util/main-loop.c:213 context = 0x55bbd7de4a50 pfds = <optimized out> #27 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261 context = 0x55bbd7de4a50 ret = 1 spin_counter = 0 #28 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517 ret = 1 timeout = 4294967295 timeout_ns = <optimized out> #29 0x000055bbd4aa701c in main_loop () at vl.c:1898 No locals. #30 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4720 i = <optimized out> snapshot = <optimized out> linux_boot = <optimized out> initrd_filename = <optimized out> kernel_filename = <optimized out> kernel_cmdline = <optimized out> boot_order = <optimized out> boot_once = 0x0 cyls = <optimized out> heads = <optimized out> secs = <optimized out> translation = <optimized out> opts = <optimized out> machine_opts = <optimized out> hda_opts = <optimized out> icount_opts = <optimized out> accel_opts = <optimized out> ---Type <return> to continue, or q <return> to quit--- olist = <optimized out> optind = 113 optarg = 0x7ffc77472497 "usb-storage,drive=drive_disk16,id=device_disk16" loadvm = <optimized out> machine_class = 0x0 cpu_model = <optimized out> vga_model = 0x7ffc774717be "qxl" qtest_chrdev = <optimized out> qtest_log = <optimized out> pid_file = <optimized out> incoming = <optimized out> defconfig = <optimized out> userconfig = <optimized out> nographic = <optimized out> display_type = <optimized out> display_remote = <optimized out> log_mask = <optimized out> log_file = <optimized out> trace_file = <optimized out> maxram_size = <optimized out> ram_slots = <optimized out> vmstate_dump_file = <optimized out> main_loop_err = 0x0 err = 0x0 list_data_dirs = <optimized out> bdo_queue = {sqh_first = 0x0, sqh_last = 0x7ffc774707c0} __func__ = "main" __FUNCTION__ = "main"