Description of problem:
qemu crash when hot-unplugging a disk that is doing IO.

02:45:13 INFO | Start to unplug devices "stg0" by monitor qmpmonitor1.
02:45:13 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Sending command 'device_del'
02:45:13 DEBUG| Send command: {'execute': 'device_del', 'arguments': {'id': 'stg0'}, 'id': 'NPwoGmu6'}
02:45:13 INFO | [qemu output] qemu-kvm: /builddir/build/BUILD/qemu-4.2.0/hw/scsi/virtio-scsi.c:250: virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed.
02:45:36 WARNI| registers is not alive. Can't query the avocado-vt-vm1 status
02:45:42 DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 360s)
02:45:42 INFO | [qemu output] /tmp/aexpect_UtyR1AXR/aexpect-rly3turi.sh: line 1: 268982 Aborted (core dumped) MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm -S -name 'avocado-vt-vm1' -sandbox on -machine q35 -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 -nodefaults -device VGA,bus=pcie.0,addr=0x2 -m 30720 -smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2 -cpu 'Skylake-Server',+kvm_pv_unhalt -chardev socket,server,path=/var/tmp/avocado_6j4emk2s/monitor-qmpmonitor1-20200605-023351-cxpRnbd9,id=qmp_id_qmpmonitor1,nowait -mon chardev=qmp_id_qmpmonitor1,mode=control -chardev socket,server,path=/var/tmp/avocado_6j4emk2s/monitor-catch_monitor-20200605-023351-cxpRnbd9,id=qmp_id_catch_monitor,nowait -mon chardev=qmp_id_catch_monitor,mode=control -device pvpanic,ioport=0x505,id=id0yBSp5 -chardev socket,server,path=/var/tmp/avocado_6j4emk2s/serial-serial0-20200605-023351-cxpRnbd9,id=chardev_serial0,nowait -device isa-serial,id=serial0,chardev=chardev_serial0 -chardev socket,id=seabioslog_id_20200605-023351-cxpRnbd9,path=/var/tmp/avocado_6j4emk2s/seabios-20200605-023351-cxpRnbd9,server,nowait -device isa-debugcon,chardev=seabioslog_id_20200605-023351-cxpRnbd9,iobase=0x402 -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -object iothread,id=iothread0 -object iothread,id=iothread1 -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0,iothread=iothread0 -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 -device scsi-hd,id=image1,drive=drive_image1,write-cache=on -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 -device virtio-net-pci,mac=9a:03:3d:56:61:b2,id=idYARjM3,netdev=idhVJiQ3,bus=pcie-root-port-3,addr=0x0 -netdev tap,id=idhVJiQ3,vhost=on,vhostfd=21,fd=6 -vnc :0 -rtc base=utc,clock=host,driftfix=slew -boot menu=off,order=cdn,once=c,strict=off -enable-kvm -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5

Backtrace:
#0  0x00007fa9476d370f in raise () at /lib64/libc.so.6
#1  0x00007fa9476bdb25 in abort () at /lib64/libc.so.6
#2  0x00007fa9476bd9f9 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007fa9476cbcc6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x0000561e59c587c8 in virtio_scsi_ctx_check (d=0x561e5c347970, s=<optimized out>, s=<optimized out>) at /usr/src/debug/qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64/hw/scsi/virtio-scsi.c:250
#5  0x0000561e59c587c8 in virtio_scsi_ctx_check (s=0x561e5d049f70, s=0x561e5d049f70, d=0x561e5c347970) at /usr/src/debug/qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64/hw/scsi/virtio-scsi.c:247
#6  0x0000561e59c587c8 in virtio_scsi_handle_cmd_req_prepare (req=0x7fa930012ad0, s=0x561e5d049f70) at /usr/src/debug/qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64/hw/scsi/virtio-scsi.c:569
#7  0x0000561e59c587c8 in virtio_scsi_handle_cmd_vq (s=s@entry=0x561e5d049f70, vq=vq@entry=0x7fa91c6a5140) at /usr/src/debug/qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64/hw/scsi/virtio-scsi.c:612
#8  0x0000561e59c5948e in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x7fa91c6a5140) at /usr/src/debug/qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64/hw/scsi/virtio-scsi-dataplane.c:60
#9  0x0000561e59c66fbe in virtio_queue_notify_aio_vq (vq=<optimized out>) at /usr/src/debug/qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64/hw/virtio/virtio.c:2243
#10 0x0000561e59f2a046 in run_poll_handlers_once (ctx=ctx@entry=0x561e5be1bb70, timeout=timeout@entry=0x7fa93f70e658) at util/aio-posix.c:517
#11 0x0000561e59f2afba in run_poll_handlers (timeout=0x7fa93f70e658, max_ns=4000, ctx=0x561e5be1bb70) at util/aio-posix.c:562
#12 0x0000561e59f2afba in try_poll_mode (timeout=0x7fa93f70e658, ctx=0x561e5be1bb70) at util/aio-posix.c:597
#13 0x0000561e59f2afba in aio_poll (ctx=0x561e5be1bb70, blocking=blocking@entry=true) at util/aio-posix.c:639
#14 0x0000561e59d057f4 in iothread_run (opaque=0x561e5be02b60) at iothread.c:75
#15 0x0000561e59f2cd84 in qemu_thread_start (args=0x561e5be1c080) at util/qemu-thread-posix.c:519
#16 0x00007fa947a662de in start_thread () at /lib64/libpthread.so.0
#17 0x00007fa947797e83 in clone () at /lib64/libc.so.6

Version-Release number of selected component (if applicable):

How reproducible:
70%, happened in automation.
Steps to Reproduce:
1. Boot the VM:
/usr/libexec/qemu-kvm \
    -S \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 30720 \
    -smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2 \
    -cpu 'Skylake-Server',+kvm_pv_unhalt \
    -chardev socket,server,path=/var/tmp/avocado_6j4emk2s/monitor-qmpmonitor1-20200605-023351-cxpRnbd9,id=qmp_id_qmpmonitor1,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,server,path=/var/tmp/avocado_6j4emk2s/monitor-catch_monitor-20200605-023351-cxpRnbd9,id=qmp_id_catch_monitor,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id0yBSp5 \
    -chardev socket,server,path=/var/tmp/avocado_6j4emk2s/serial-serial0-20200605-023351-cxpRnbd9,id=chardev_serial0,nowait \
    -device isa-serial,id=serial0,chardev=chardev_serial0 \
    -chardev socket,id=seabioslog_id_20200605-023351-cxpRnbd9,path=/var/tmp/avocado_6j4emk2s/seabios-20200605-023351-cxpRnbd9,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20200605-023351-cxpRnbd9,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0,iothread=iothread0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:03:3d:56:61:b2,id=idYARjM3,netdev=idhVJiQ3,bus=pcie-root-port-3,addr=0x0 \
    -netdev tap,id=idhVJiQ3,vhost=on,vhostfd=21,fd=6 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5

2. Hotplug one disk:
{"execute": "blockdev-add", "arguments": {"node-name": "file_stg0", "driver": "file", "aio": "threads", "filename": "/home/kvm_autotest_root/images/storage0.qcow2", "cache": {"direct": true, "no-flush": false}}, "id": "bfCXHvne"}
{"execute": "blockdev-add", "arguments": {"node-name": "drive_stg0", "driver": "qcow2", "cache": {"direct": true, "no-flush": false}, "file": "file_stg0"}, "id": "i2JYBkct"}
{"execute": "device_add", "arguments": {"driver": "scsi-hd", "id": "stg0", "drive": "drive_stg0", "write-cache": "on", "bus": "virtio_scsi_pci0.0"}, "id": "ODAFuMrY"}

3. Run an IO test on the new disk in the guest:
/home/iozone_inst/src/current/iozone -az -g 10G -y 32k -i 0 -i 1 -I -f /mnt/sdb1/iozone_test

4. Hot-unplug the disk:
{"execute": "device_del", "arguments": {"id": "stg0"}, "id": "O6HOCEti"}

5. Remove the blockdev nodes:
{"execute": "blockdev-del", "arguments": {"node-name": "drive_stg0"}, "id": "xVEiOfoF"}
{"execute": "blockdev-del", "arguments": {"node-name": "file_stg0"}, "id": "wlFMohtd"}

6. Repeat steps 2-5 50 times.

Actual results:
Crash happened on step

Expected results:
No crash.

Additional info:
Automation script:
python3 ConfigTest.py --testcase=hotplug_unplug_during_io_repeat.default.q35 --iothread_scheme=roundrobin --nr_iothreads=2 --platform=x86_64 --guestname=RHEL.8.3.0 --driveformat=virtio_scsi --nicmodel=virtio_net --imageformat=qcow2 --machines=q35 --customsparams="image_aio=threads" --clone=no
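For anyone re-running the loop outside avocado, the QMP traffic in steps 2-5 can be driven with a short script. This is a hypothetical sketch, not the actual ConfigTest.py automation: the socket path must match the -chardev from step 1, and a robust version should wait for the DEVICE_DELETED event after device_del before issuing blockdev-del.

```python
import json
import socket

def qmp_command(execute, arguments=None):
    """Build a QMP command dict shaped like the ones in steps 2-5."""
    cmd = {"execute": execute}
    if arguments is not None:
        cmd["arguments"] = arguments
    return cmd

def hotplug_cmds():
    """Step 2: attach file node, format node, then the scsi-hd device."""
    return [
        qmp_command("blockdev-add", {
            "node-name": "file_stg0", "driver": "file", "aio": "threads",
            "filename": "/home/kvm_autotest_root/images/storage0.qcow2",
            "cache": {"direct": True, "no-flush": False}}),
        qmp_command("blockdev-add", {
            "node-name": "drive_stg0", "driver": "qcow2",
            "cache": {"direct": True, "no-flush": False},
            "file": "file_stg0"}),
        qmp_command("device_add", {
            "driver": "scsi-hd", "id": "stg0", "drive": "drive_stg0",
            "write-cache": "on", "bus": "virtio_scsi_pci0.0"}),
    ]

def unplug_cmds():
    """Steps 4-5: detach the device, then delete both blockdev nodes."""
    return [
        qmp_command("device_del", {"id": "stg0"}),
        qmp_command("blockdev-del", {"node-name": "drive_stg0"}),
        qmp_command("blockdev-del", {"node-name": "file_stg0"}),
    ]

def run_loop(sock_path, iterations=50):
    """Step 6: repeat the plug/unplug cycle over the QMP unix socket."""
    with socket.socket(socket.AF_UNIX) as s:
        s.connect(sock_path)
        f = s.makefile("rw")
        f.readline()                       # QMP greeting banner
        f.write(json.dumps(qmp_command("qmp_capabilities")) + "\n")
        f.flush()
        f.readline()                       # capabilities reply
        for _ in range(iterations):
            for cmd in hotplug_cmds() + unplug_cmds():
                f.write(json.dumps(cmd) + "\n")
                f.flush()
                f.readline()               # reply (events would need filtering)
```

The IO load from step 3 still has to run in the guest in parallel; the crash window is the device_del racing against in-flight requests.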
Version info:
host: 4.18.0-193.2.1.el8_2.x86_64
qemu-kvm-core-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64

Test log:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qbugs/1844343/2020-06-05-0324/job-2020-06-05T02.33-7695316/

Coredump file:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qbugs/1844343/core.qemu-kvm.0.a044351966c04920bece765d4fd501e7.268982.1591339513000000.lz4
Hit same issue on {'kvm_version': '4.18.0-193.13.el8.x86_64', 'qemu_version': 'qemu-kvm-4.2.0-19.module+el8.3.0+6478+69f490bb.x86_64'}
Did not hit this issue on 3.10.0-1127.el7.x86_64 with qemu-kvm-common-rhev-2.12.0-44.el7_8.2.x86_64
I'm getting a "403 Forbidden" when trying to download the coredump from http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qbugs/1844343/core.qemu-kvm.0.a044351966c04920bece765d4fd501e7.268982.1591339513000000.lz4. Could you please give me access or upload it to some other space? Thanks, Sergio.
I've reproduced the issue and started a discussion about the root cause upstream: - https://lists.nongnu.org/archive/html/qemu-block/2020-06/msg00634.html
The root cause is that virtio-scsi may access a SCSI disk that is not actually available. Maxim sent a patchset that addresses this issue:

https://patchew.org/QEMU/20200511160951.8733-1-mlevitsk@redhat.com/

I tried this on my test environment (where I reproduced the issue before), and it seems to fix the problem.
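For context on the assertion in frame #4: `blk_get_aio_context(d->conf.blk) == s->ctx` fails because unplug moves the disk's BlockBackend out of the dataplane iothread's AioContext while a request for it is still being handled there. Below is a toy Python model of that race; the class, names, and the string representation of an AioContext are invented for illustration and do not mirror QEMU's actual data structures:

```python
MAIN_CTX, IOTHREAD_CTX = "main-loop", "iothread0"

class Disk:
    """Stand-in for a SCSIDevice whose BlockBackend tracks an AioContext."""
    def __init__(self, ctx):
        self.blk_ctx = ctx

def ctx_check(disk, dataplane_ctx):
    # Same shape as the check at hw/scsi/virtio-scsi.c:250: the request
    # handler expects the device's BlockBackend to still live in the
    # controller's dataplane context.
    assert disk.blk_ctx == dataplane_ctx

disk = Disk(IOTHREAD_CTX)
ctx_check(disk, IOTHREAD_CTX)       # in-flight IO while plugged: fine

# device_del moves the BlockBackend back to the main-loop context while a
# request is still queued on the iothread...
disk.blk_ctx = MAIN_CTX

raced = False
try:
    ctx_check(disk, IOTHREAD_CTX)   # ...so the next check would abort qemu
except AssertionError:
    raced = True                    # here we just record the mismatch
```

In real QEMU the AssertionError has no handler, so the iothread calls abort() and the guest dies, which is what the backtraces show.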
(In reply to Sergio Lopez from comment #8)
> The root cause is that virtio-scsi may access a scsi disk that it's not
> actually available. Maxim sent a patchset that addresses this issue:
>
> https://patchew.org/QEMU/20200511160951.8733-1-mlevitsk@redhat.com/
>
> I tried this on my test environment (where I reproduced the issue before),
> and seems to fix the problem.

See bug 1812399 for reference - we might be able to duplicate to that. Markus has also done a lot of work recently upstream related to device realization timing, i.e. when a device is really ready for use.
(In reply to John Ferlan from comment #10)
> (In reply to Sergio Lopez from comment #8)
> > The root cause is that virtio-scsi may access a scsi disk that it's not
> > actually available. Maxim sent a patchset that addresses this issue:
> >
> > https://patchew.org/QEMU/20200511160951.8733-1-mlevitsk@redhat.com/
> >
> > I tried this on my test environment (where I reproduced the issue before),
> > and seems to fix the problem.
>
> See bug 1812399 for reference - we might be able to duplicate to that.
> Markus has also done a lot of work recently upstream related to device
> realization timing with when a device is really ready for use.

I'd say the root cause is the same, but the reproducers are different and trigger the issue from different code paths, so perhaps it's worth keeping both BZs.
Hit similar issue on {'kvm_version': '4.18.0-211.el8.x86_64', 'qemu_version': 'qemu-kvm-core-5.0.0-0.module+el8.3.0+6620+5d5e1420.x86_64'}

http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qlogs/%5b8.3-AV%5d-2-Q35+Seabios+8.3+Luks+Virtio_scsi+Local+qemu-5.0/test-results/012-Host_RHEL.m8.u3.product_av.luks.virtio_scsi.up.virtio_net.Guest.RHEL.8.3.0.x86_64.io-github-autotest-qemu.hotplug_unplug_during_io_repeat.q35/
Hit same issue on qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64

#0  0x00007f5ee90d77ff in raise () at /lib64/libc.so.6
#1  0x00007f5ee90c1c35 in abort () at /lib64/libc.so.6
#2  0x00007f5ee90c1b09 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f5ee90cfde6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x0000556962420748 in virtio_scsi_ctx_check (d=0x556964205d80, s=<optimized out>, s=<optimized out>) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/hw/scsi/virtio-scsi.c:250
#5  0x0000556962420748 in virtio_scsi_ctx_check (s=0x55696500eec0, s=0x55696500eec0, d=0x556964205d80) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/hw/scsi/virtio-scsi.c:247
#6  0x0000556962420748 in virtio_scsi_handle_cmd_req_prepare (req=0x7f5ed403cae0, s=0x55696500eec0) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/hw/scsi/virtio-scsi.c:569
#7  0x0000556962420748 in virtio_scsi_handle_cmd_vq (s=s@entry=0x55696500eec0, vq=vq@entry=0x7f5ee002d140) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/hw/scsi/virtio-scsi.c:612
#8  0x000055696242149e in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x7f5ee002d140) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/hw/scsi/virtio-scsi-dataplane.c:60
#9  0x000055696242f0ae in virtio_queue_notify_aio_vq (vq=<optimized out>) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/hw/virtio/virtio.c:2324
#10 0x0000556962700b09 in aio_dispatch_handler (ctx=ctx@entry=0x556963dbf510, node=0x7f5b30006e50) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/util/aio-posix.c:328
#11 0x00005569627015bc in aio_dispatch_ready_handlers (ready_list=<optimized out>, ctx=<optimized out>) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/util/aio-posix.c:358
#12 0x00005569627015bc in aio_poll (ctx=0x556963dbf510, blocking=blocking@entry=true) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/util/aio-posix.c:653
#13 0x00005569624cf544 in iothread_run (opaque=0x556963ce6360) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/iothread.c:75
#14 0x0000556962703db4 in qemu_thread_start (args=0x556963dc3570) at /usr/src/debug/qemu-kvm-5.0.0-0.module+el8.3.0+6612+6b86f0c9.x86_64/util/qemu-thread-posix.c:519
#15 0x00007f5ee947014a in start_thread () at /lib64/libpthread.so.0
#16 0x00007f5ee919d123 in clone () at /lib64/libc.so.6

http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qbugs/1844343/core.qemu-kvm.0.84149c9a23614ecd9afae03eb38e4c7a.371720.1593679868000000.lz4
*** Bug 1869994 has been marked as a duplicate of this bug. ***
Since the patches for bug 1812399 have been shown by comment 21 to resolve this issue, I am duplicating this bug to there. It's perhaps notable that bug 1888131 was cloned/generated to be included in the first zstream for RHEL 8.3.0.z which thus would get the resolution to the customer sooner than the RHEL-AV 8.3.1 release. I'll leave it up to Marina to adjust the customer tracker. *** This bug has been marked as a duplicate of bug 1812399 ***