Description of problem:
Boot a guest with a data-plane disk that uses the NBD protocol. After the guest boots, hot-unplug this NBD data disk; both qemu-kvm and the guest hang.

Version-Release number of selected component (if applicable):
Host kernel: 4.11.0-28.el7a.ppc64le
qemu-kvm: qemu-kvm-2.9.0-21.el7a
Guest kernel: 4.11.0-22.el7a.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Export an image file on the host, which also acts as the NBD server:

# qemu-img create -f qcow2 -o preallocation=full /home/yilzhang/NBD-server-virt5.qcow2 5G
# qemu-nbd -f raw /home/yilzhang/NBD-server-virt5.qcow2 -p 9000 -t &

2. Boot a guest that attaches the above disk image as a data disk:

/usr/libexec/qemu-kvm \
  -smp 8,sockets=2,cores=4,threads=1 -m 8192 \
  -serial unix:/tmp/3dp-serial.log,server,nowait \
  -nodefaults \
  -rtc base=localtime,clock=host \
  -boot menu=on \
  -monitor stdio \
  -monitor unix:/tmp/monitor1,server,nowait \
  -qmp tcp:0:777,server,nowait \
  \
  -device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,bus=bridge1,addr=0x1f,id=scsi0,iothread=iothread0 \
  -drive file=rhel74-64-virtio.qcow2,media=disk,if=none,cache=none,id=drive_sysdisk,aio=native,format=qcow2,werror=stop,rerror=stop \
  -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \
  \
  -drive file=nbd:10.66.10.208:9000,if=none,cache=none,id=drive_ddisk_3,aio=native,format=qcow2,werror=stop,rerror=stop \
  -device scsi-hd,drive=drive_ddisk_3,bus=scsi0.0,id=ddisk_3 \
  \
  -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
  -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c3:e7:8a,bus=bridge1,addr=0x1e

3. After the guest boots, log in and check that the data disk exists and works:

[guest]# dd if=/dev/zero of=/dev/sdb bs=1M count=1 oflag=sync

4. Hot-unplug the data disk:

{ "execute": "__com.redhat_drive_del", "arguments": { "id": "drive_ddisk_3" }}

Actual results:
Both qemu-kvm and the guest hang; neither responds.

Expected results:
No hang, and the hot-unplug should succeed; after that, writing to the deleted data disk should fail with EIO.

Additional info:
If the same NBD device is attached without data-plane, hot-unplugging it causes no hang (everything works well).
This can also be reproduced on x86 and P8.

P8 version:
Host kernel: 3.10.0-693.el7.ppc64le
qemu version: qemu-kvm-rhev-2.9.0-16.el7_4.3.ppc64le
Guest kernel: 3.10.0-693.el7.ppc64le

x86 version:
Host kernel: 3.10.0-693.el7.x86_64
qemu version: qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64
Guest kernel: 3.10.0-648.el7.x86_64
When qemu-kvm hangs, pressing Ctrl+C to kill the process can make qemu-kvm dump core. I hit this twice out of 20 attempts (once on Power9, once on x86).

[root@c155f2-u23 dataplane]# gdb /usr/libexec/qemu-kvm core.4109
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
[New LWP 4111]
[New LWP 4109]
[New LWP 4136]
[New LWP 4134]
[New LWP 4133]
[New LWP 4130]
[New LWP 4132]
[New LWP 4135]
[New LWP 4129]
[New LWP 4110]
[New LWP 4143]
[New LWP 4137]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -smp 8,sockets=2,cores=4,threads=1 -m 8192 -serial unix:/'.
Program terminated with signal 6, Aborted.
#0  0x00003fffa74deff0 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.ppc64le cyrus-sasl-lib-2.1.26-21.el7.ppc64le cyrus-sasl-plain-2.1.26-21.el7.ppc64le elfutils-libelf-0.168-8.el7.ppc64le elfutils-libs-0.168-8.el7.ppc64le glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le gmp-6.0.0-15.el7.ppc64le gnutls-3.3.26-9.el7.ppc64le gperftools-libs-2.4-8.el7.ppc64le keyutils-libs-1.5.8-3.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le libaio-0.3.109-13.el7.ppc64le libattr-2.4.46-12.el7.ppc64le libcap-2.22-9.el7.ppc64le libcom_err-1.42.9-10.el7.ppc64le libcurl-7.29.0-42.el7.ppc64le libdb-5.3.21-20.el7.ppc64le libfdt-1.4.3-1.el7.ppc64le libffi-3.0.13-18.el7.ppc64le libgcc-4.8.5-16.el7.ppc64le libgcrypt-1.5.3-14.el7.ppc64le libgpg-error-1.12-3.el7.ppc64le libibverbs-13-7.el7.ppc64le libidn-1.28-4.el7.ppc64le libiscsi-1.9.0-7.el7.ppc64le libnl3-3.2.28-4.el7.ppc64le libpng-1.5.13-7.el7_2.ppc64le librdmacm-13-7.el7.ppc64le libseccomp-2.3.1-3.el7.ppc64le libselinux-2.5-11.el7.ppc64le libssh2-1.4.3-10.el7_2.1.ppc64le libstdc++-4.8.5-16.el7.ppc64le libtasn1-4.10-1.el7.ppc64le libusbx-1.0.20-1.el7.ppc64le lzo-2.06-8.el7.ppc64le nettle-2.7.1-8.el7.ppc64le nspr-4.13.1-1.0.el7_3.ppc64le nss-3.28.4-12.el7_4.ppc64le nss-softokn-freebl-3.28.3-8.el7_4.ppc64le nss-util-3.28.4-3.el7.ppc64le numactl-libs-2.0.9-6.el7_2.ppc64le openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le p11-kit-0.23.5-3.el7.ppc64le pcre-8.32-17.el7.ppc64le pixman-0.34.0-1.el7.ppc64le snappy-1.1.0-3.el7.ppc64le systemd-libs-219-42.el7.ppc64le xz-libs-5.2.2-1.el7.ppc64le zlib-1.2.7-17.el7.ppc64le

(gdb) bt
#0  0x00003fffa74deff0 in raise () from /lib64/libc.so.6
#1  0x00003fffa74e136c in abort () from /lib64/libc.so.6
#2  0x00003fffa74d4c44 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00003fffa74d4d34 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000043b40990 in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
#5  0x0000000043c0d42c in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=0x577cde80)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
#6  virtio_scsi_handle_cmd_req_prepare (req=0x58b64380, s=0x58abc510)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:558
#7  virtio_scsi_handle_cmd_vq (s=0x58abc510, vq=0x58b40100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:598
#8  0x0000000043c0e5d0 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x58b40100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi-dataplane.c:60
#9  0x0000000043c1c19c in virtio_queue_notify_aio_vq (vq=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1510
#10 0x0000000043c1d014 in virtio_queue_notify_aio_vq (vq=0x58b40100)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1506
#11 virtio_queue_host_notifier_aio_poll (opaque=0x58b40168)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:2410
#12 0x0000000043f58f14 in run_poll_handlers_once (ctx=0x576f17c0) at util/aio-posix.c:490
#13 0x0000000043f5a088 in run_poll_handlers (max_ns=<optimized out>, ctx=0x576f17c0) at util/aio-posix.c:527
#14 try_poll_mode (blocking=<optimized out>, ctx=0x576f17c0) at util/aio-posix.c:555
#15 aio_poll (ctx=0x576f17c0, blocking=<optimized out>) at util/aio-posix.c:595
#16 0x0000000043d40548 in iothread_run (opaque=0x57840840) at iothread.c:59
#17 0x00003fffa7698af4 in start_thread () from /lib64/libpthread.so.0
#18 0x00003fffa75c4ef4 in clone () from /lib64/libc.so.6

(gdb) bt full
#0  0x00003fffa74deff0 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00003fffa74e136c in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00003fffa74d4c44 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00003fffa74d4d34 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x0000000043b40990 in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
No locals.
#5  0x0000000043c0d42c in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=0x577cde80)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
No locals.
#6  virtio_scsi_handle_cmd_req_prepare (req=0x58b64380, s=0x58abc510)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:558
        vs = 0x58abc510
        rc = <optimized out>
#7  virtio_scsi_handle_cmd_vq (s=0x58abc510, vq=0x58b40100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:598
        req = 0x58b64380
        next = <optimized out>
        ret = <optimized out>
        progress = true
        reqs = {tqh_first = 0x0, tqh_last = 0x3fffa590def8}
#8  0x0000000043c0e5d0 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x58b40100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi-dataplane.c:60
        progress = <optimized out>
        s = 0x58abc510
#9  0x0000000043c1c19c in virtio_queue_notify_aio_vq (vq=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1510
        vdev = <optimized out>
#10 0x0000000043c1d014 in virtio_queue_notify_aio_vq (vq=0x58b40100)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1506
No locals.
#11 virtio_queue_host_notifier_aio_poll (opaque=0x58b40168)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:2410
        n = 0x58b40168
        vq = 0x58b40100
        progress = <optimized out>
#12 0x0000000043f58f14 in run_poll_handlers_once (ctx=0x576f17c0) at util/aio-posix.c:490
        progress = <optimized out>
        node = 0x578689a0
#13 0x0000000043f5a088 in run_poll_handlers (max_ns=<optimized out>, ctx=0x576f17c0) at util/aio-posix.c:527
        progress = <optimized out>
        end_time = 174854958775656
#14 try_poll_mode (blocking=<optimized out>, ctx=0x576f17c0) at util/aio-posix.c:555
        max_ns = <optimized out>
#15 aio_poll (ctx=0x576f17c0, blocking=<optimized out>) at util/aio-posix.c:595
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = <optimized out>
        start = 174854958757593
        __PRETTY_FUNCTION__ = "aio_poll"
#16 0x0000000043d40548 in iothread_run (opaque=0x57840840) at iothread.c:59
        iothread = 0x57840840
#17 0x00003fffa7698af4 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#18 0x00003fffa75c4ef4 in clone () from /lib64/libc.so.6
No symbol table info available.
Libvirt currently does not allow the hot-unplug of an NBD device. Tested by adding this to a libvirt domain:

+    <disk type='network' device='disk'>
+      <driver name='qemu' type='qcow2'/>
+      <source protocol='nbd' name='bar'>
+        <host name='localhost' port='10809'/>
+      </source>
+      <backingStore/>
+      <target dev='sdc' bus='ide'/>
+    </disk>

then booting, proving that the guest can access the NBD disk, then in the host:

# virsh detach-disk $dom sdc --live
error: Failed to detach disk
error: Operation not supported: This type of disk cannot be hot unplugged

So, by that measure, this bug is less important to RHEL (since we tend to require libvirt as the driving force). But I'm still investigating whether there is a bug in qemu that needs fixing.
I spoke a bit soon - IDE devices can't be hot-unplugged, but SCSI devices can. Changing the <disk> XML slightly to:

+    <disk type='network' device='disk'>
+      <driver name='qemu' type='qcow2'/>
+      <source protocol='nbd' name='bar'>
+        <host name='localhost' port='10809'/>
+      </source>
+      <target dev='sdc' bus='scsi'/>
+    </disk>

lets libvirt hot-unplug the disk, so this DOES look relevant to RHEV after all.

However, when running the test under Fedora 26 with qemu-kvm-2.10.0-1.fc26.x86_64, things don't fail. So now I'm trying to reproduce the test on RHEV, in which case it may be something that upstream has already fixed.
(In reply to yilzhang from comment #0)
> Description of problem:
> Boot up a guest with a data-plane disk using NBD protocol, after guest boots
> up, hot-unplug this NBD data disk, then both qemu-kvm and guest will hang.
>
> 3. After guest boots up, login guest and check that this data disk exists
> and works well:
> [guest]# dd if=/dev/zero of=/dev/sdb bs=1M count=1 oflag=sync
>
> 4. Hot-unplug this data disk
> { "execute": "__com.redhat_drive_del", "arguments": { "id": "drive_ddisk_3" }}

Are you unplugging the disk WHILE the guest is in the middle of the dd, or did the dd complete first? I can't seem to reproduce a hang while the guest is not actively accessing the device, so I'm wondering if I'm missing something in my attempts to reproduce this.
The dd process completes first (in step 3 I just checked that the data disk works), and then I unplugged the disk. I did not hot-unplug the NBD data disk WHILE the guest was in the middle of the dd.
*** Bug 1539537 has been marked as a duplicate of this bug. ***
The hang is occurring in SCSI aio code; stepping through it, I came up with this trace as the trigger:

Thread 1 (Thread 0x7ffff7f8c280 (LWP 26222)):
#0  0x00007fffeddbd4d6 in ppoll () at /lib64/libc.so.6
#1  0x0000555555c27cfd in qemu_poll_ns (fds=0x555556828f50, nfds=1, timeout=-1) at util/qemu-timer.c:322
#2  0x0000555555c2aa29 in aio_poll (ctx=0x5555568053e0, blocking=true) at util/aio-posix.c:629
#3  0x0000555555b82d42 in bdrv_flush (bs=0x55555681d240) at block/io.c:2308
#4  0x0000555555b14e43 in bdrv_close (bs=0x55555681d240) at block.c:3164
#5  0x0000555555b15522 in bdrv_delete (bs=0x55555681d240) at block.c:3354
#6  0x0000555555b1747e in bdrv_unref (bs=0x55555681d240) at block.c:4347
#7  0x0000555555b128c8 in bdrv_root_unref_child (child=0x555556828370) at block.c:2073
#8  0x0000555555b6ad8f in blk_remove_bs (blk=0x55555681cfe0) at block/block-backend.c:674
#9  0x00005555558dd8a4 in qmp___com_redhat_drive_del (id=0x555556cd94f0 "drive_ddisk_3", errp=0x7fffffffc240) at blockdev.c:3024

where bdrv_flush() finds it is not in coroutine context, and so ends up polling for completion in aio_poll() via BDRV_POLL_WHILE:

    if (qemu_in_coroutine()) {
        /* Fast-path if already in coroutine context */
        bdrv_flush_co_entry(&flush_co);
    } else {
        co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
        bdrv_coroutine_enter(bs, co);
        BDRV_POLL_WHILE(bs, flush_co.ret == NOT_DONE);
    }

This also matches the crash backtrace from comment #3:

> #0  0x00003fffa74deff0 in raise () from /lib64/libc.so.6
> #1  0x00003fffa74e136c in abort () from /lib64/libc.so.6
> #2  0x00003fffa74d4c44 in __assert_fail_base () from /lib64/libc.so.6
> #3  0x00003fffa74d4d34 in __assert_fail () from /lib64/libc.so.6
> #4  0x0000000043b40990 in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
> #5  0x0000000043c0d42c in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=0x577cde80) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
> #6  virtio_scsi_handle_cmd_req_prepare (req=0x58b64380, s=0x58abc510) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:558
> #7  virtio_scsi_handle_cmd_vq (s=0x58abc510, vq=0x58b40100) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:598
> #8  0x0000000043c0e5d0 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x58b40100) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi-dataplane.c:60
> #9  0x0000000043c1c19c in virtio_queue_notify_aio_vq (vq=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1510
> #10 0x0000000043c1d014 in virtio_queue_notify_aio_vq (vq=0x58b40100) at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1506
> #11 virtio_queue_host_notifier_aio_poll (opaque=0x58b40168) at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:2410
> #12 0x0000000043f58f14 in run_poll_handlers_once (ctx=0x576f17c0) at util/aio-posix.c:490
> #13 0x0000000043f5a088 in run_poll_handlers (max_ns=<optimized out>, ctx=0x576f17c0) at util/aio-posix.c:527
> #14 try_poll_mode (blocking=<optimized out>, ctx=0x576f17c0) at util/aio-posix.c:555
> #15 aio_poll (ctx=0x576f17c0, blocking=<optimized out>) at util/aio-posix.c:595

so it looks like something with iothreads is making the aio code not play well with hot-unplug, and NBD is just an easy way to trigger the problem.
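The shape of the hang — a caller busy-polling one context while the completion it is waiting for sits on a different context that nobody runs — can be illustrated with a toy model. This is illustrative Python only, not QEMU code; ToyAioContext and flush() are invented names standing in for AioContext and BDRV_POLL_WHILE:

```python
import queue

class ToyAioContext:
    """Toy stand-in for an AioContext: a bag of pending callbacks
    that only run when somebody polls this particular context."""
    def __init__(self):
        self.pending = queue.Queue()

    def schedule(self, fn):
        self.pending.put(fn)

    def poll_once(self):
        try:
            self.pending.get_nowait()()
        except queue.Empty:
            pass

def flush(completion_ctx, polled_ctx, max_polls=100):
    """Model of the poll loop: the flush completion lands on
    completion_ctx, but the caller keeps polling polled_ctx.
    Returns True if the flush finishes; False models the hang
    (we cap the loop instead of spinning forever)."""
    done = [False]
    completion_ctx.schedule(lambda: done.__setitem__(0, True))
    for _ in range(max_polls):
        if done[0]:
            return True
        polled_ctx.poll_once()
    return done[0]

main_ctx, iothread_ctx = ToyAioContext(), ToyAioContext()
print(flush(main_ctx, main_ctx))      # same context: completes -> True
print(flush(iothread_ctx, main_ctx))  # wrong context: never completes -> False
```

The second call models the bug: with a data-plane disk, the node's completion belongs to the iothread's context, but the unplug path polls from the main loop, so the condition never becomes true.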
For my own reference, a smaller reproducer:

$ qemu-img create -f qcow2 -o preallocation=full file4 1G
$ ./qemu-nbd -f raw file4 -p 9000 -t &
$ ./x86_64-softmmu/qemu-system-x86_64 -m 8192 -nodefaults -nographic \
  -qmp stdio -device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,bus=bridge1,addr=0x1f,id=scsi0,iothread=iothread0 \
  -drive file=nbd:localhost:9000,if=none,cache=none,id=drive_ddisk_3,aio=native,format=qcow2,werror=stop,rerror=stop \
  -device scsi-hd,drive=drive_ddisk_3,bus=scsi0.0,id=ddisk_3

{'execute':'qmp_capabilities'}
{'execute':'__com.redhat_drive_del','arguments':{'id':'drive_ddisk_3'}}

Reproduced the hang on v2.10.0-19.el7.

__com.redhat_drive_del is downstream-only; if given a node name it tries the same thing as blockdev-del (but drive_ddisk_3 is not a node name), so this may be a downstream-only problem.
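For anyone scripting this check instead of typing into the QMP prompt: detecting the hang amounts to sending a command and giving up after a timeout. A rough sketch of such a probe, assuming qemu was started with a TCP QMP server (e.g. -qmp tcp:0:4444,server,nowait instead of -qmp stdio); the helper name is my own, not a QEMU or libvirt API:

```python
import json
import socket

def qmp_command_hangs(host, port, command, timeout=5.0):
    """Connect to a QMP TCP server, negotiate capabilities, send one
    command, and report True if qemu never answers within `timeout`
    seconds (the hang described in this bug)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        f = sock.makefile("rw")
        json.loads(f.readline())                 # QMP greeting banner
        f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
        f.flush()
        json.loads(f.readline())                 # {"return": {}}
        f.write(json.dumps(command) + "\n")
        f.flush()
        try:
            f.readline()                         # reply, if qemu is alive
            return False
        except (socket.timeout, TimeoutError):
            return True
```

On a broken build, probing with {"execute": "__com.redhat_drive_del", "arguments": {"id": "drive_ddisk_3"}} would come back True; on a fixed one, False.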
(In reply to Eric Blake from comment #10)
> For my own reference, a smaller reproducer:
>
> $ qemu-img create -f qcow2 -o preallocation=full file4 1G
> $ ./qemu-nbd -f raw file4 -p 9000 -t &
> $ ./x86_64-softmmu/qemu-system-x86_64 -m 8192 -nodefaults -nographic \
>   -qmp stdio -device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
>   -object iothread,id=iothread0 \
>   -device virtio-scsi-pci,bus=bridge1,addr=0x1f,id=scsi0,iothread=iothread0 \
>   -drive file=nbd:localhost:9000,if=none,cache=none,id=drive_ddisk_3,aio=native,format=qcow2,werror=stop,rerror=stop \
>   -device scsi-hd,drive=drive_ddisk_3,bus=scsi0.0,id=ddisk_3
> {'execute':'qmp_capabilities'}
> {'execute':'__com.redhat_drive_del','arguments':{'id':'drive_ddisk_3'}}
>
> reproduced hang on v2.10.0-19.el7
>
> __com.redhat_drive_del is downstream only; if given a node name it tries the
> same thing as blockdev-del (but drive_ddisk_3 is not a node name), so this
> may be a downstream-only problem.

Given that __com.redhat_drive_del is going away with blockdev (it is, right?), how relevant is this now?
__com.redhat_drive_del still exists in RHEL 7, but we should be doing whatever we can to make sure RHEL 8 does not keep it. However, I am not sure what our rules for cross-version compatibility are in RHEL 7 - whether we have reached a point where we can assume that libvirt uses the up-to-date interfaces instead of the downstream-only one. If this is not affecting customers, can we just push it off to RHEL 8, where it goes away at the same time as the downstream interface?
Hi coli,

With blockdev + virtio-scsi + NBD + iothread, the device can be unplugged successfully, but when deleting the node-name, qemu hangs while the guest keeps working.

Host: kernel-4.18.0-67.el8.x86_64
      qemu-kvm-2.12.0-61.module+el8+2786+5afd5ae3.x86_64
Guest: 4.18.0-67.el8.x86_64

Steps:
1) Create the nbd device:
# qemu-img create -f qcow2 -o preallocation=full data.qcow2 10G
# qemu-nbd -f raw data.qcow2 -p 9000 -t &

2) Boot the guest with the above device as a data disk:
/usr/libexec/qemu-kvm \
  -S \
  -name 'rhel-8.0' \
  -sandbox off \
  -machine pc \
  -nodefaults \
  -device qxl-vga \
  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -device virtio-scsi-pci,id=scsi0,iothread=iothread0 \
  -blockdev driver=file,cache.direct=on,cache.no-flush=off,node-name=file_win1,filename=rhel80-64-virtio.qcow2,aio=native \
  -blockdev driver=qcow2,node-name=drive_win1,file=file_win1 \
  -device scsi-hd,id=image2,drive=drive_win1,bootindex=0 \
  -blockdev driver=nbd,cache.direct=on,cache.no-flush=off,node-name=file_data1,server.host=10.73.130.201,server.port=9000,server.type=inet \
  -blockdev driver=raw,node-name=drive_data,file=file_data1 \
  -device scsi-hd,id=data1,drive=drive_data \
  -device virtio-net-pci,mac=6c:ae:8b:20:80:70,id=iddd,vectors=4,netdev=idttt \
  -netdev tap,id=idttt,vhost=on \
  -m 4G \
  -smp 12,maxcpus=12,cores=6,threads=1,sockets=2 \
  -cpu 'SandyBridge' \
  -rtc base=utc,clock=host,driftfix=slew \
  -enable-kvm \
  -monitor stdio \
  -device qemu-xhci,id=usb1 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -qmp tcp:0:4441,server,nowait \
  -vnc :1

3) Check the device in the guest, and run dd on it:
# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb      8:16   0   10G  0 disk
# dd if=/dev/zero of=/dev/sdb bs=1M count=100 oflag=sync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 10.5793 s, 9.9 MB/s

4) Unplug the device and its backends in QMP:
{ 'execute':'device_del','arguments':{'id':'data1'}}
{"timestamp": {"seconds": 1550560722, "microseconds": 129107}, "event": "DEVICE_DELETED", "data": {"device": "data1", "path": "/machine/peripheral/data1"}}
{"return": {}}
{ 'execute':'blockdev-del','arguments':{'node-name':'drive_data'}}
{"return": {}}
{ 'execute':'blockdev-del','arguments':{'node-name':'file_data1'}}
=> no response
=> The HMP terminal also stops accepting any input.

P.S. With a virtio-blk device, this issue did not reproduce.
Command line:
  -blockdev driver=nbd,cache.direct=on,cache.no-flush=off,node-name=file_data1,server.host=10.73.130.201,server.port=9000,server.type=inet \
  -blockdev driver=raw,node-name=drive_data,file=file_data1 \
  -device virtio-blk-pci,id=data1,drive=drive_data,iothread=iothread1 \

Unplug the device and its backends:
{ 'execute':'device_del','arguments':{'id':'data1'}}
{"return": {}}
{"timestamp": {"seconds": 1550560356, "microseconds": 875152}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/data1/virtio-backend"}}
{"timestamp": {"seconds": 1550560356, "microseconds": 878512}, "event": "DEVICE_DELETED", "data": {"device": "data1", "path": "/machine/peripheral/data1"}}
{ 'execute':'blockdev-del','arguments':{'node-name':'drive_data'}}
{"return": {}}
{ 'execute':'blockdev-del','arguments':{'node-name':'file_data1'}}
{"return": {}}
This issue has been fixed by the AioContext-related patches introduced in qemu-4.0.0. Just in case it's needed as a reference, some of those patches have been backported to 2.12.0 too, and I can't reproduce this on qemu-kvm-2.12.0-86.module+el8.1.0+4146+4ed2d185 either.
Can we get the qa_ack+ on this?
Verified on qemu-kvm-4.1.0-10.module+el8.1.0+4234+33aa4f57.x86_64. Both qemu and the guest work well after drive_del.

QEMU command line:
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4,iothread=iothread0 \
  -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel810-64-virtio-scsi.qcow2 \
  -device scsi-hd,id=image1,drive=drive_image1 \
  -drive id=drive_image2,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=nbd:127.0.0.1:10809 \
  -device scsi-hd,id=image2,drive=drive_image2 \

(qemu) info block
drive_image1 (#block110): /home/kvm_autotest_root/images/rhel810-64-virtio-scsi.qcow2 (qcow2)
    Attached to:      image1
    Cache mode:       writeback, direct

drive_image2 (#block361): nbd://127.0.0.1:10809 (raw)
    Attached to:      image2
    Cache mode:       writeback, direct

(qemu) drive_del drive_image2
(qemu) info status
VM status: running
(qemu) info block
drive_image1 (#block110): /home/kvm_autotest_root/images/rhel810-64-virtio-scsi.qcow2 (qcow2)
    Attached to:      image1
    Cache mode:       writeback, direct

image2: [not inserted]
    Attached to:      image2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3723