RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1665903 - hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' failed.
Summary: hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' fai...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: 8.2
Assignee: Philippe Mathieu-Daudé
QA Contact: qing.wang
URL:
Whiteboard:
Depends On: 1650975
Blocks: TRACKER-bugs-affecting-libguestfs 1755139
TreeView+ depends on / blocked
 
Reported: 2019-01-14 12:10 UTC by Xueqiang Wei
Modified: 2020-01-30 17:08 UTC (History)
13 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1650975
Environment:
Last Closed: 2020-01-30 16:25:59 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Xueqiang Wei 2019-01-14 12:10:53 UTC
+++ This bug was initially created as a clone of Bug #1650975 +++

Description of problem:

qemu in Rawhide fails when I test injecting EIO errors into requests
using nbdkit:

nbdkit: memory[1]: debug: error: pread count=1024 offset=102400 flags=0x0
nbdkit: memory[1]: error: injecting EIO error into pread
nbdkit: memory[1]: debug: sending error reply: Input/output error
qemu-system-x86_64: /builddir/build/BUILD/qemu-3.1.0-rc1/hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' failed.

Version-Release number of selected component (if applicable):

qemu 2:3.1.0-0.1.rc1.fc30

How reproducible:

Unknown.

Steps to Reproduce:
1. Unknown, I'll try to come up with a reproducer if I can make one work locally.

--- Additional comment from Richard W.M. Jones on 2018-11-18 11:37 UTC ---

build.log from Koji showing the failure

--- Additional comment from Richard W.M. Jones on 2018-11-18 11:53:50 UTC ---

Actually yes this is easily reproducible with qemu from git.

(1) nbdkit -f -v --filter=error memory size=64M error-rate=100%

(2) x86_64-softmmu/qemu-system-x86_64 -device virtio-scsi,id=scsi -drive file=nbd:localhost:10809,format=raw,id=hd0,if=none -device scsi-hd,drive=hd0

qemu-system-x86_64: hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' failed.
Aborted (core dumped)

Stack trace:

(gdb) bt
#0  0x00007f7f18d4253f in raise () at /lib64/libc.so.6
#1  0x00007f7f18d2c895 in abort () at /lib64/libc.so.6
#2  0x00007f7f18d2c769 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f7f18d3a9f6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x000055ce0f920fb0 in scsi_req_complete (req=<optimized out>, status=<optimized out>) at hw/scsi/scsi-bus.c:1374
#5  0x000055ce0f91b850 in scsi_dma_complete_noio (r=0x55ce116ea090, ret=<optimized out>) at hw/scsi/scsi-disk.c:281
#6  0x000055ce0f91b8ff in scsi_dma_complete (opaque=0x55ce116ea090, ret=-5)
    at hw/scsi/scsi-disk.c:302
#7  0x000055ce0f8103c7 in dma_complete (ret=-5, dbs=0x55ce11d36c00)
    at dma-helpers.c:116
#8  0x000055ce0f8103c7 in dma_blk_cb (opaque=0x55ce11d36c00, ret=-5)
    at dma-helpers.c:138
#9  0x000055ce0fa42cce in blk_aio_complete (acb=0x55ce10d36300)
    at block/block-backend.c:1345
#10 0x000055ce0fafce6b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
#11 0x00007f7f18d58200 in __start_context () at /lib64/libc.so.6
#12 0x00007fff50b87130 in  ()
#13 0x0000000000000000 in  ()

--- Additional comment from Richard W.M. Jones on 2018-11-18 15:11:57 UTC ---

40dce4ee61c68395f6d463fae792f61b7c003bce is the first bad commit
commit 40dce4ee61c68395f6d463fae792f61b7c003bce
Author: Paolo Bonzini <pbonzini>
Date:   Sat Oct 13 11:52:34 2018 +0200

    scsi-disk: fix rerror/werror=ignore
    
    rerror=ignore was returning true from scsi_handle_rw_error but the callers were not
    calling scsi_req_complete when rerror=ignore returns true (this is the correct thing
    to do when true is returned after executing a passthrough command).  Fix this by
    calling it in scsi_handle_rw_error.
    
    Signed-off-by: Paolo Bonzini <pbonzini>

:040000 040000 311386b9b91d77840a849459ab6ae41a37fd7f42 8adcda67d7487bcc18966f096c9923da3b8dc0b9 M	hw

--- Additional comment from Cole Robinson on 2018-11-19 15:00:22 UTC ---

Rich reported this upstream: https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg03508.html

--- Additional comment from Richard W.M. Jones on 2018-11-20 22:16:43 UTC ---

Reported upstream:
https://bugs.launchpad.net/qemu/+bug/1804323

Comment 1 Xueqiang Wei 2019-01-14 12:34:03 UTC
Versions:

kernel-4.18.0-58.el8.x86_64
qemu-kvm-2.12.0-57.module+el8+2683+02b3b955




1. Configure the multipath device on host 

# multipath -ll
mpatha (36001405c6eefcbb279645ef9d02f3f91) dm-3 LIO-ORG,stor0
size=40G features='0' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 15:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=50 status=enabled
  `- 16:0:0:0 sdc 8:32 active ready running

2. passthrough it withount "werror=stop,rerror=stop"

    -device virtio-scsi-pci,id=scsi1,bus=pci.0 \
    -drive file=/dev/sdb,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,aio=threads \
    -device scsi-block,bus=scsi1.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \

3. create partition on passthrough disk


after step 3, qemu core dump



if add "werror=stop,rerror=stop" to cmd lines, qemu does not core dump but failed to create partition. It will prompt "the requested resource is in use" when create partition.


This issue block bug [1], so set status to high.

[1] Bug 1633179 - Hit io-error when run validate SCSI-3 Persistent Reservation test on win2012 guest

Comment 3 Ademar Reis 2019-01-15 20:53:34 UTC
Rich Jones reported it as a regression from 40dce4ee61[1] in Bug #1650975, but then Xueqiang Wei reproduced it with QEMU-2.12 in RHEL8. We probably have different issues here.

Xueqiang Wei: why do you think your problem (comment #1) is the same as the one reported by Rich?

The commit in question is not in RHEL-8, it's in QEMU-3.1 (RHEL-8 virt fast train).

1:
commit 40dce4ee61c68395f6d463fae792f61b7c003bce
Author: Paolo Bonzini <pbonzini>
Date:   Sat Oct 13 11:52:34 2018 +0200

    scsi-disk: fix rerror/werror=ignore
    
    rerror=ignore was returning true from scsi_handle_rw_error but the callers were not
    calling scsi_req_complete when rerror=ignore returns true (this is the correct thing
    to do when true is returned after executing a passthrough command).  Fix this by
    calling it in scsi_handle_rw_error.
    
    Signed-off-by: Paolo Bonzini <pbonzini>

Comment 4 Xueqiang Wei 2019-01-16 02:59:48 UTC
(In reply to Ademar Reis from comment #3)
> Rich Jones reported it as a regression from 40dce4ee61[1] in Bug #1650975,
> but then Xueqiang Wei reproduced it with QEMU-2.12 in RHEL8. We probably
> have different issues here.
> 
> Xueqiang Wei: why do you think your problem (comment #1) is the same as the
> one reported by Rich?
> 
> The commit in question is not in RHEL-8, it's in QEMU-3.1 (RHEL-8 virt fast
> train).
> 
> 1:
> commit 40dce4ee61c68395f6d463fae792f61b7c003bce
> Author: Paolo Bonzini <pbonzini>
> Date:   Sat Oct 13 11:52:34 2018 +0200
> 
>     scsi-disk: fix rerror/werror=ignore
>     
>     rerror=ignore was returning true from scsi_handle_rw_error but the
> callers were not
>     calling scsi_req_complete when rerror=ignore returns true (this is the
> correct thing
>     to do when true is returned after executing a passthrough command).  Fix
> this by
>     calling it in scsi_handle_rw_error.
>     
>     Signed-off-by: Paolo Bonzini <pbonzini>


The output and core dump file log are same.

QEMU 2.12.0 monitor - type 'help' for more information
(qemu) qemu-kvm: hw/scsi/scsi-bus.c:1374: scsi_req_complete: Assertion `req->status == -1' failed.


(gdb) bt
#0  0x00007f3e4bcf193f in raise () at /lib64/libc.so.6
#1  0x00007f3e4bcdbc95 in abort () at /lib64/libc.so.6
#2  0x00007f3e4bcdbb69 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f3e4bce9df6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x000055a83bea7040 in scsi_req_complete
    (req=<optimized out>, status=<optimized out>) at hw/scsi/scsi-bus.c:1374
#5  0x000055a83bea138a in scsi_handle_rw_error
    (acct_failed=false, error=0, r=0x55a83e852880) at hw/scsi/scsi-disk.c:471
#6  0x000055a83bea138a in scsi_disk_req_check_error
    (r=0x55a83e852880, ret=<optimized out>, acct_failed=<optimized out>)
    at hw/scsi/scsi-disk.c:196
#7  0x000055a83bea2225 in scsi_dma_complete_noio
    (r=0x55a83e852880, ret=<optimized out>) at hw/scsi/scsi-disk.c:269
#8  0x000055a83bea2303 in scsi_dma_complete (opaque=0x55a83e852880, ret=0)
    at hw/scsi/scsi-disk.c:300
#9  0x000055a83bdea4d2 in dma_complete (ret=0, dbs=0x7f3e38091ee0)
    at dma-helpers.c:116
#10 0x000055a83bdea4d2 in dma_blk_cb (opaque=0x7f3e38091ee0, ret=0)
    at dma-helpers.c:138
#11 0x000055a83bf69a9e in blk_aio_complete (acb=0x7f3e30000f20)
    at block/block-backend.c:1346
#12 0x000055a83c0080f3 in coroutine_trampoline
    (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
--Type <RET> for more, q to quit, c to continue without paging--
#13 0x00007f3e4bd07600 in __start_context () at /lib64/libc.so.6
#14 0x00007f3e22ffbc10 in  ()
#15 0x0000000000000000 in  ()

Comment 6 Paolo Bonzini 2019-02-05 12:12:39 UTC
There is no fix yet.  We can delay it.

Comment 8 Paolo Bonzini 2019-09-20 12:51:53 UTC
The bug has been fixed for a long time in RHEL AV, so moving the bug for simplicity to RHEL qemu-kvm.

It's fixed by upstream commit 1c7f618f689b0b5b6bbed23a7c159e7dad7b996f.

Comment 11 Philippe Mathieu-Daudé 2020-01-30 11:11:08 UTC
Reassigning (with ack from Paolo).

Comment 13 Richard W.M. Jones 2020-01-30 17:08:04 UTC
Fixed in current release, right?


Note You need to log in before you can comment on or make changes to this bug.