Description of problem:

qemu in Rawhide fails when I test injecting EIO errors into requests using nbdkit:

nbdkit: memory[1]: debug: error: pread count=1024 offset=102400 flags=0x0
nbdkit: memory[1]: error: injecting EIO error into pread
nbdkit: memory[1]: debug: sending error reply: Input/output error
qemu-system-x86_64: /builddir/build/BUILD/qemu-3.1.0-rc1/hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' failed.

Version-Release number of selected component (if applicable):

qemu 2:3.1.0-0.1.rc1.fc30

How reproducible:

Unknown.

Steps to Reproduce:

1. Unknown, I'll try to come up with a reproducer if I can make one work locally.
Created attachment 1506918: build.log

build.log from Koji showing the failure
Actually yes, this is easily reproducible with qemu from git.

(1) nbdkit -f -v --filter=error memory size=64M error-rate=100%

(2) x86_64-softmmu/qemu-system-x86_64 -device virtio-scsi,id=scsi -drive file=nbd:localhost:10809,format=raw,id=hd0,if=none -device scsi-hd,drive=hd0

qemu-system-x86_64: hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' failed.
Aborted (core dumped)

Stack trace:

(gdb) bt
#0  0x00007f7f18d4253f in raise () at /lib64/libc.so.6
#1  0x00007f7f18d2c895 in abort () at /lib64/libc.so.6
#2  0x00007f7f18d2c769 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f7f18d3a9f6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x000055ce0f920fb0 in scsi_req_complete (req=<optimized out>, status=<optimized out>) at hw/scsi/scsi-bus.c:1374
#5  0x000055ce0f91b850 in scsi_dma_complete_noio (r=0x55ce116ea090, ret=<optimized out>) at hw/scsi/scsi-disk.c:281
#6  0x000055ce0f91b8ff in scsi_dma_complete (opaque=0x55ce116ea090, ret=-5) at hw/scsi/scsi-disk.c:302
#7  0x000055ce0f8103c7 in dma_complete (ret=-5, dbs=0x55ce11d36c00) at dma-helpers.c:116
#8  0x000055ce0f8103c7 in dma_blk_cb (opaque=0x55ce11d36c00, ret=-5) at dma-helpers.c:138
#9  0x000055ce0fa42cce in blk_aio_complete (acb=0x55ce10d36300) at block/block-backend.c:1345
#10 0x000055ce0fafce6b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
#11 0x00007f7f18d58200 in __start_context () at /lib64/libc.so.6
#12 0x00007fff50b87130 in  ()
#13 0x0000000000000000 in  ()
40dce4ee61c68395f6d463fae792f61b7c003bce is the first bad commit
commit 40dce4ee61c68395f6d463fae792f61b7c003bce
Author: Paolo Bonzini <pbonzini>
Date:   Sat Oct 13 11:52:34 2018 +0200

    scsi-disk: fix rerror/werror=ignore

    rerror=ignore was returning true from scsi_handle_rw_error but the
    callers were not calling scsi_req_complete when rerror=ignore returns
    true (this is the correct thing to do when true is returned after
    executing a passthrough command). Fix this by calling it in
    scsi_handle_rw_error.

    Signed-off-by: Paolo Bonzini <pbonzini>

:040000 040000 311386b9b91d77840a849459ab6ae41a37fd7f42 8adcda67d7487bcc18966f096c9923da3b8dc0b9 M	hw
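For readers unfamiliar with this kind of assertion, here is a toy Python model of one plausible reading of the failure: a request whose status was already set by an error handler is completed a second time by its caller. This is only a sketch, not QEMU code. Every name below (`Request`, `complete`, `handle_rw_error`, the two `dma_complete_*` callers) is hypothetical, and the assumption that -1 means "no status recorded yet" is inferred from the assertion text in the report, not confirmed against the source.

```python
# Toy model only. Names and bodies are assumptions for illustration,
# not taken from hw/scsi/scsi-bus.c or hw/scsi/scsi-disk.c.

STATUS_UNSET = -1      # assumed sentinel: request has not been completed yet
GOOD = 0               # illustrative "success" status
CHECK_CONDITION = 2    # illustrative "error reported to guest" status


class Request:
    """Minimal stand-in for a SCSI request with a one-shot status."""

    def __init__(self):
        self.status = STATUS_UNSET


def complete(req, status):
    """Record the final status; a request may only be completed once."""
    if req.status != STATUS_UNSET:
        # Plays the role of the failing assertion in the report.
        raise AssertionError("req.status == -1 failed: completed twice")
    req.status = status


def handle_rw_error(req, ret):
    """Hypothetical handler: on failure it completes the request itself
    and returns True so the caller knows to stop."""
    if ret < 0:
        complete(req, CHECK_CONDITION)
        return True
    return False


def dma_complete_unguarded(req, ret):
    """Caller that ignores the handler's return value (the bug pattern)."""
    handle_rw_error(req, ret)
    complete(req, GOOD)    # second completion when ret < 0


def dma_complete_guarded(req, ret):
    """Caller that stops once the handler has completed the request."""
    if handle_rw_error(req, ret):
        return
    complete(req, GOOD)


if __name__ == "__main__":
    ok = Request()
    dma_complete_guarded(ok, -5)          # -5 is the ret value seen in the backtrace
    print("guarded status:", ok.status)

    try:
        dma_complete_unguarded(Request(), -5)
    except AssertionError as exc:
        print("unguarded:", exc)
```

The point of the sketch is the contract between handler and caller: whichever side completes the request, the other must not. Whether the regression in the bisected commit broke that contract in exactly this way is not established by this report.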
Rich reported this upstream: https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg03508.html
Reported upstream: https://bugs.launchpad.net/qemu/+bug/1804323
This was fixed in 3.1.0-rc3. Since 3.1.0 (final) was released a few months ago and is present in Fedora 30 and Rawhide, I'm going to close this bug now.