Bug 1480202

Summary: Occurred core dump with multi-object when quitted qemu during doing IO
Product: Red Hat Enterprise Linux 7
Reporter: Yongxue Hong <yhong>
Component: qemu-kvm-rhev
Assignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA
QA Contact: aihua liang <aliang>
Severity: high
Docs Contact:
Priority: high
Version: 7.4-Alt
CC: aliang, chayang, coli, famz, jen, juzhang, knoel, michen, mrezanin, mtessun, ngu, qzhang, virt-maint, xianwang
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1566586
Environment:
Last Closed: 2018-11-01 11:01:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1558351, 1566586

Description Yongxue Hong 2017-08-10 11:57:45 UTC
Description of problem:
QEMU dumps core when it is quit during guest I/O while multiple iothread objects are defined.

Version-Release number of selected component (if applicable):
Host : 4.11.0-22.el7a.ppc64le
Guest : 4.11.0-22.el7a.ppc64le
Qemu-kvm : qemu-kvm-2.9.0-20.el7a

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with a SCSI data disk and two iothread objects, one of which is not attached to any device.
e.g.:
[root@c155f1-u15 command]# cat rhel74-alt-8395.sh
/usr/libexec/qemu-kvm \
-name 'yhong-rhel74-alt-8395' \
-machine pseries-rhel7.4.0alt \
-m 8G \
-nodefaults \
-smp 4,cores=4,threads=1,sockets=1 \
-boot order=cdn,once=d,menu=on,strict=on  \
-device nec-usb-xhci,id=xhci \
-device usb-tablet,id=usb-tablet0 \
-device usb-kbd,id=usb-kbd0 \
-enable-kvm \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-device virtio-scsi-pci,bus=pci.0,addr=0x06,id=scsi-pci-0 \
-device virtio-scsi-pci,bus=pci.0,addr=0x07,id=scsi-pci-1,iothread=iothread1 \
-drive file=/home/yhong/image/data-disk-30G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-1 \
-device scsi-hd,bus=scsi-pci-1.0,id=scsi-hd-1,drive=drive-1,channel=0,scsi-id=0,lun=1 \
-drive file=/home/yhong/image/rhel74-alt-8379-sys-disk-20G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 \
-device scsi-hd,bus=scsi-pci-0.0,id=scsi-hd-0,drive=drive-0,channel=0,scsi-id=0,lun=0,bootindex=0 \
-netdev tap,id=hostnet0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=40:f2:e9:5d:9c:13 \
-qmp tcp:0:3003,server,nowait \
-chardev socket,id=serial_id_serial,path=/var/tmp/serial-yhong8382,server,nowait \
-device spapr-vty,reg=0x30000000,chardev=serial_id_serial \
-monitor stdio
2. Run a dd test on the data disk.
e.g.:
[root@localhost ~]# dd if=/dev/zero of=/dev/sdb bs=4K count=10000 oflag=direct status=progress

3. Quit QEMU while the I/O is still in progress.
e.g.:
[root@c155f1-u15 command]# sh rhel74-alt-8395.sh
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) VNC server running on ::1:5900

(qemu) 
(qemu) q
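
The command line in step 1 is long, so the following is a minimal sketch of the options that the steps call out as essential: two iothread objects, of which only one is attached to a virtio-scsi controller, plus a data disk behind that controller. The function name `build_qemu_args` and its parameters are hypothetical, every option is copied from the script above, and this trimmed variant has not been verified to still reproduce the crash.

```python
import shlex
from typing import List


def build_qemu_args(system_disk: str, data_disk: str,
                    qemu: str = "/usr/libexec/qemu-kvm") -> List[str]:
    """Return a trimmed argv based on the reproducer script (hypothetical helper)."""
    drive_opts = "format=qcow2,if=none,cache=none,werror=stop,rerror=stop"
    return [
        qemu,
        "-nodefaults",
        "-enable-kvm",
        "-m", "8G",
        # iothread0 is created but deliberately attached to no device.
        "-object", "iothread,id=iothread0",
        "-object", "iothread,id=iothread1",
        # System disk controller: no iothread property.
        "-device", "virtio-scsi-pci,id=scsi-pci-0",
        # Data disk controller: handled by iothread1.
        "-device", "virtio-scsi-pci,id=scsi-pci-1,iothread=iothread1",
        "-drive", f"file={system_disk},{drive_opts},id=drive-0",
        "-device", "scsi-hd,bus=scsi-pci-0.0,drive=drive-0,bootindex=0",
        "-drive", f"file={data_disk},{drive_opts},id=drive-1",
        "-device", "scsi-hd,bus=scsi-pci-1.0,drive=drive-1",
        "-monitor", "stdio",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line; the image paths are placeholders.
    print(shlex.join(build_qemu_args("/path/to/sys-disk.qcow2",
                                     "/path/to/data-disk.qcow2")))
```

Machine type, USB, network and serial options from the original script are omitted here on the assumption that they are unrelated to the failure; add them back if the trimmed guest does not boot.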


Actual results:
[root@c155f1-u15 command]# sh rhel74-alt-8395.sh
QEMU 2.9.0 monitor - type 'help' for more information
(qemu) VNC server running on ::1:5900

(qemu) 
(qemu) q
qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c:245: virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed.
rhel74-alt-8395.sh: line 25: 16315 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'yhong-rhel74-alt-8395' -machine pseries-rhel7.4.0alt -m 8G -nodefaults -smp 4,cores=4,threads=1,sockets=1 -boot order=cdn,once=d,menu=on,strict=on -device nec-usb-xhci,id=xhci -device usb-tablet,id=usb-tablet0 -device usb-kbd,id=usb-kbd0 -enable-kvm -object iothread,id=iothread0 -object iothread,id=iothread1 -device virtio-scsi-pci,bus=pci.0,addr=0x06,id=scsi-pci-0 -device virtio-scsi-pci,bus=pci.0,addr=0x07,id=scsi-pci-1,iothread=iothread1 -drive file=/home/yhong/image/data-disk-30G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-1 -device scsi-hd,bus=scsi-pci-1.0,id=scsi-hd-1,drive=drive-1,channel=0,scsi-id=0,lun=1 -drive file=/home/yhong/image/rhel74-alt-8379-sys-disk-20G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 -device scsi-hd,bus=scsi-pci-0.0,id=scsi-hd-0,drive=drive-0,channel=0,scsi-id=0,lun=0,bootindex=0 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=40:f2:e9:5d:9c:13 -qmp tcp:0:3003,server,nowait -chardev socket,id=serial_id_serial,path=/var/tmp/serial-yhong8382,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial -monitor stdio
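
When triaging many runs it can help to pull the location and expression out of the assertion line automatically. The sketch below assumes the glibc layout visible in the backtrace format string (program name, file, line, function, expression); the names `AssertionFailure` and `parse_assertion` are made up for this example.

```python
import re
from typing import NamedTuple, Optional


class AssertionFailure(NamedTuple):
    program: str
    file: str
    line: int
    function: str
    expression: str


# Matches "<prog>: <file>:<line>: <func>: Assertion `<expr>' failed."
_ASSERT_RE = re.compile(
    r"^(?P<program>[^:]+): (?P<file>[^:]+):(?P<line>\d+): "
    r"(?P<function>[^:]+): Assertion `(?P<expression>.*)' failed\.$"
)


def parse_assertion(line: str) -> Optional[AssertionFailure]:
    """Parse one assertion-failure line; return None if it does not match."""
    m = _ASSERT_RE.match(line.strip())
    if m is None:
        return None
    return AssertionFailure(m["program"], m["file"], int(m["line"]),
                            m["function"], m["expression"])


if __name__ == "__main__":
    sample = ("qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c:245: "
              "virtio_scsi_ctx_check: Assertion "
              "`blk_get_aio_context(d->conf.blk) == s->ctx' failed.")
    print(parse_assertion(sample))
```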


Expected results:
No core dump

Additional info:
backtrace info:
[root@c155f1-u15 command]# gdb -c core.7730
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
[New LWP 7733]
[New LWP 7870]
[New LWP 7730]
[New LWP 7750]
[New LWP 7759]
[New LWP 7942]
[New LWP 7849]
[New LWP 7874]
[New LWP 7946]
[New LWP 7731]
[New LWP 7941]
[New LWP 7935]
[New LWP 7948]
[New LWP 7752]
[New LWP 7945]
[New LWP 7938]
[New LWP 7952]
[New LWP 7936]
[New LWP 7947]
[New LWP 7943]
[New LWP 7956]
[New LWP 7937]
[New LWP 7953]
[New LWP 7949]
[New LWP 7960]
[New LWP 7939]
[New LWP 7954]
[New LWP 7958]
[New LWP 7964]
[New LWP 7940]
[New LWP 7957]
[New LWP 7962]
[New LWP 7968]
[New LWP 7944]
[New LWP 7961]
[New LWP 7966]
[New LWP 7970]
[New LWP 7950]
[New LWP 7965]
[New LWP 7969]
[New LWP 7973]
[New LWP 7951]
[New LWP 7971]
[New LWP 7974]
[New LWP 7978]
[New LWP 7955]
[New LWP 7976]
[New LWP 7981]
[New LWP 7749]
[New LWP 7751]
[New LWP 7977]
[New LWP 7959]
[New LWP 7984]
[New LWP 7963]
[New LWP 7967]
[New LWP 7972]
[New LWP 7975]
[New LWP 7979]
[New LWP 7980]
[New LWP 7734]
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
Missing separate debuginfo for 
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/2a/cc80a904b442f4d66ca26c87fff77f740d88b2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name yhong-rhel74-alt-8395 -machine pseries-rhel7.4.0alt'.
Program terminated with signal 6, Aborted.
#0  0x00003fff82b6eff0 in __GI_raise (sig=<optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.ppc64le cyrus-sasl-gssapi-2.1.26-21.el7.ppc64le cyrus-sasl-lib-2.1.26-21.el7.ppc64le cyrus-sasl-md5-2.1.26-21.el7.ppc64le cyrus-sasl-plain-2.1.26-21.el7.ppc64le elfutils-libelf-0.168-8.el7.ppc64le elfutils-libs-0.168-8.el7.ppc64le glib2-2.50.3-3.el7.ppc64le gmp-6.0.0-15.el7.ppc64le gnutls-3.3.26-9.el7.ppc64le gperftools-libs-2.4-8.el7.ppc64le keyutils-libs-1.5.8-3.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le libaio-0.3.109-13.el7.ppc64le libattr-2.4.46-12.el7.ppc64le libcap-2.22-9.el7.ppc64le libcom_err-1.42.9-10.el7.ppc64le libcurl-7.29.0-42.el7.ppc64le libdb-5.3.21-20.el7.ppc64le libfdt-1.4.3-1.el7.ppc64le libffi-3.0.13-18.el7.ppc64le libgcc-4.8.5-16.el7.ppc64le libgcrypt-1.5.3-14.el7.ppc64le libgpg-error-1.12-3.el7.ppc64le libibverbs-13-7.el7.ppc64le libidn-1.28-4.el7.ppc64le libiscsi-1.9.0-7.el7.ppc64le libnl3-3.2.28-4.el7.ppc64le libpng-1.5.13-7.el7_2.ppc64le librdmacm-13-7.el7.ppc64le libseccomp-2.3.1-3.el7.ppc64le libselinux-2.5-11.el7.ppc64le libssh2-1.4.3-10.el7_2.1.ppc64le libstdc++-4.8.5-16.el7.ppc64le libtasn1-4.10-1.el7.ppc64le libusbx-1.0.20-1.el7.ppc64le lzo-2.06-8.el7.ppc64le nettle-2.7.1-8.el7.ppc64le nspr-4.13.1-1.0.el7_3.ppc64le nss-3.28.4-8.el7.ppc64le nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le numactl-libs-2.0.9-6.el7_2.ppc64le openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le p11-kit-0.23.5-3.el7.ppc64le pcre-8.32-17.el7.ppc64le pixman-0.34.0-1.el7.ppc64le snappy-1.1.0-3.el7.ppc64le systemd-libs-219-41.el7.ppc64le xz-libs-5.2.2-1.el7.ppc64le zlib-1.2.7-17.el7.ppc64le
(gdb) bt
#0  0x00003fff82b6eff0 in __GI_raise (sig=<optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00003fff82b7136c in __GI_abort () at abort.c:90
#2  0x00003fff82b64c44 in __assert_fail_base (
    fmt=0x3fff82cc4410 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x433e2d68 "blk_get_aio_context(d->conf.blk) == s->ctx", 
    file=0x433e2d18 "/builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c", 
    line=<optimized out>, function=<optimized out>) at assert.c:92
#3  0x00003fff82b64d34 in __GI___assert_fail (
    assertion=0x433e2d68 "blk_get_aio_context(d->conf.blk) == s->ctx", 
    file=0x433e2d18 "/builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c", 
    line=<optimized out>, 
    function=0x433e2a40 <__PRETTY_FUNCTION__.30729> "virtio_scsi_ctx_check")
    at assert.c:101
#4  0x0000000042f90990 in virtio_scsi_ctx_check (s=<optimized out>, 
    s=<optimized out>, d=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
#5  0x000000004305d42c in virtio_scsi_ctx_check (s=<optimized out>, 
    s=<optimized out>, d=0x466c3e80)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
#6  virtio_scsi_handle_cmd_req_prepare (req=0x44971cc0, s=0x4617a510)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:558
#7  virtio_scsi_handle_cmd_vq (s=0x4617a510, vq=0x46320100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:598
#8  0x000000004305e5d0 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, 
    vq=0x46320100) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi-dataplane.c:60
#9  0x000000004306c19c in virtio_queue_notify_aio_vq (vq=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1510
#10 0x00000000433a8f3c in aio_dispatch_handlers (ctx=0x44971900)
    at util/aio-posix.c:399
#11 0x00000000433a9dd4 in aio_poll (ctx=0x44971900, blocking=<optimized out>)
    at util/aio-posix.c:685
#12 0x00000000431903c8 in iothread_run (opaque=0x44ac08f0) at iothread.c:59
#13 0x00003fff82d28af4 in start_thread (arg=0x3fff8079eb10) at pthread_create.c:310
#14 0x00003fff82c54ef4 in clone ()
    at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:109

(gdb) bt full
#0  0x00003fff82b6eff0 in __GI_raise (sig=<optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        r4 = 7733
        r7 = 1
        arg2 = 7733
        r5 = 6
        r8 = 0
        arg3 = 6
        r0 = 250
        r3 = 0
        r6 = 8
        arg1 = 0
        sc_err = <optimized out>
        sc_ret = <optimized out>
        pd = 0x3fff8079eb10
        pid = 0
        selftid = 7733
#1  0x00003fff82b7136c in __GI_abort () at abort.c:90
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x44976825, sa_sigaction = 
    0x44976825}, sa_mask = {__val = {1150773184, 1150773184, 1150773340, 
              1150773484, 1150773184, 1150773484, 0, 0, 0, 0, 0, 0, 0, 1, 
              1130052756, 1128472528}}, sa_flags = 1128702592, sa_restorer = 0x0}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00003fff82b64c44 in __assert_fail_base (
    fmt=0x3fff82cc4410 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x433e2d68 "blk_get_aio_context(d->conf.blk) == s->ctx", 
    file=0x433e2d18 "/builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c", 
    line=<optimized out>, function=<optimized out>) at assert.c:92
        str = 0x44927940 ""
        total = 65536
#3  0x00003fff82b64d34 in __GI___assert_fail (
    assertion=0x433e2d68 "blk_get_aio_context(d->conf.blk) == s->ctx", 
    file=0x433e2d18 "/builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c", 
    line=<optimized out>, 
    function=0x433e2a40 <__PRETTY_FUNCTION__.30729> "virtio_scsi_ctx_check")
    at assert.c:101
No locals.
#4  0x0000000042f90990 in virtio_scsi_ctx_check (s=<optimized out>, 
    s=<optimized out>, d=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
No locals.
#5  0x000000004305d42c in virtio_scsi_ctx_check (s=<optimized out>, 
    s=<optimized out>, d=0x466c3e80)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
No locals.
#6  virtio_scsi_handle_cmd_req_prepare (req=0x44971cc0, s=0x4617a510)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:558
        vs = 0x4617a510
        rc = <optimized out>
#7  virtio_scsi_handle_cmd_vq (s=0x4617a510, vq=0x46320100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:598
        req = 0x44971cc0
        next = <optimized out>
        ret = <optimized out>
        progress = true
        reqs = {tqh_first = 0x0, tqh_last = 0x3fff8079df18}
#8  0x000000004305e5d0 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, 
    vq=0x46320100) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi-dataplane.c:60
        progress = <optimized out>
        s = 0x4617a510
#9  0x000000004306c19c in virtio_queue_notify_aio_vq (vq=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/hw/virtio/virtio.c:1510
        vdev = <optimized out>
#10 0x00000000433a8f3c in aio_dispatch_handlers (ctx=0x44971900)
    at util/aio-posix.c:399
        revents = <optimized out>
        node = <optimized out>
        tmp = 0x44ae99c0
        progress = <optimized out>
#11 0x00000000433a9dd4 in aio_poll (ctx=0x44971900, blocking=<optimized out>)
    at util/aio-posix.c:685
        node = <optimized out>
        i = <optimized out>
        ret = <optimized out>
        progress = false
        timeout = <optimized out>
        start = 106371359052906
        __PRETTY_FUNCTION__ = "aio_poll"
#12 0x00000000431903c8 in iothread_run (opaque=0x44ac08f0) at iothread.c:59
        iothread = 0x44ac08f0
#13 0x00003fff82d28af4 in start_thread (arg=0x3fff8079eb10) at pthread_create.c:310
        pd = 0x3fff8079eb10
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1803972963672581207, 
                70366644240640, -1803972963716272039, 0, 0, 0, 0, 1, 1130052756, 
                1128472528, 1128702592, 1128098112, 1128603936, 1149640096, 
                1152125168, 1125712704, 0, 70368074959936, 4001536, 70366644046320, 
                70366604681536, 2449998063175991296, 0 <repeats 42 times>}, 
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {
              prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#14 0x00003fff82c54ef4 in clone ()
    at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:109
No locals.
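
The backtrace above shows the assertion firing on the iothread path (iothread_run down to virtio_scsi_ctx_check). A small helper like the following can condense pasted `bt` output into that call chain. It is only a sketch: `parse_frames` and `call_chain` are hypothetical names, and the regular expression assumes the `#N [0xADDR in] func (` frame layout seen in this report.

```python
import re
from typing import List, Tuple

# "#12 0x00000000431903c8 in iothread_run (..." or "#6  virtio_scsi_... (..."
_FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S*)\s*\(")


def parse_frames(bt_text: str) -> List[Tuple[int, str]]:
    """Return (frame number, function name) pairs; other lines are ignored."""
    frames = []
    for line in bt_text.splitlines():
        m = _FRAME_RE.match(line)
        if m:
            # gdb prints an empty name when it has no symbol for the frame.
            frames.append((int(m.group(1)), m.group(2) or "??"))
    return frames


def call_chain(frames: List[Tuple[int, str]]) -> List[str]:
    """Function names from outermost to innermost, consecutive repeats collapsed."""
    names: List[str] = []
    for _, name in sorted(frames, reverse=True):
        if not names or names[-1] != name:
            names.append(name)
    return names


if __name__ == "__main__":
    sample = """\
#4  0x0000000042f90990 in virtio_scsi_ctx_check (s=<optimized out>,
    s=<optimized out>, d=<optimized out>)
#6  virtio_scsi_handle_cmd_req_prepare (req=0x44971cc0, s=0x4617a510)
#12 0x00000000431903c8 in iothread_run (opaque=0x44ac08f0) at iothread.c:59
"""
    print(" -> ".join(call_chain(parse_frames(sample))))
```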

Comment 1 Yongxue Hong 2017-08-10 12:10:14 UTC
It can also be reproduced on x86 and P8.

Version of x86:
     Host : 4.11.0-22.el7a.x86_64
     Guest : 4.11.0-22.el7a.x86_64
     Qemu-kvm : qemu-kvm-2.9.0-19.el7a

Version of P8:
     Host : 3.10.0-693.el7.ppc64le
     Guest : 3.10.0-693.el7.ppc64le
     Qemu-kvm-rhev : qemu-kvm-rhev-2.9.0-14.el7

Comment 2 aihua liang 2017-08-11 09:23:54 UTC
Reproduced on x86; the same problem exists.

Test version:
  kernel version:3.10.0-693.el7.x86_64
  qemu-kvm-rhev:qemu-kvm-rhev-2.9.0-17.el7a.x86_64

Test steps:
  1. Start the guest with the command:
      /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-machine pc \
-vga std  \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=03,iothread=iothread0 \
-drive id=drive_image1,if=none,snapshot=on,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1,bus=scsi0.0,lun=0 \
-device virtio-net-pci,mac=9a:b2:b3:b4:b5:b6,id=iduCv1Ln,vectors=4,netdev=idKgexFk,bus=pci.0,addr=05  \
-netdev tap,id=idKgexFk,vhost=on \
-m 4096  \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
-cpu host \
-vnc :1  \
-enable-kvm \
-monitor stdio \
-device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=04,iothread=iothread1 \
-drive id=data_image1,if=none,werror=stop,rerror=stop,cache=none,format=qcow2,file=/home/test.qcow2 \
-device scsi-hd,id=data1,drive=data_image1,bus=scsi1.0,lun=0 \

 2. Run a dd test on the data disk in the guest.
    dd if=/dev/zero of=/dev/sdb bs=4K count=1000000000 oflag=direct status=progress

 3. While the I/O test is running, quit QEMU.
(qemu)quit

Test Result:
   QEMU dumps core with the message: qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/hw/scsi/virtio-scsi.c:245: virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed.
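
For readers unfamiliar with the failing check, the toy model below illustrates what the asserted expression compares: the AioContext of the disk's block backend against the context the virtio-scsi controller runs in. This is not QEMU code. All class and function names are stand-ins, and the step that changes the disk's context is purely illustrative; the report itself does not state how the mismatch arises.

```python
class AioContext:
    """Stand-in for a QEMU AioContext (main loop or an iothread)."""

    def __init__(self, name: str) -> None:
        self.name = name


class Disk:
    """Stand-in for a SCSI disk; blk_ctx models blk_get_aio_context(d->conf.blk)."""

    def __init__(self, blk_ctx: AioContext) -> None:
        self.blk_ctx = blk_ctx


class Controller:
    """Stand-in for the virtio-scsi controller; ctx models s->ctx."""

    def __init__(self, ctx: AioContext) -> None:
        self.ctx = ctx


def ctx_check(s: Controller, d: Disk) -> None:
    """Model of the asserted invariant: both sides must be the same context."""
    if d.blk_ctx is not s.ctx:
        raise AssertionError("blk_get_aio_context(d->conf.blk) == s->ctx")


if __name__ == "__main__":
    main_ctx = AioContext("main")
    io_ctx = AioContext("iothread1")
    s, d = Controller(io_ctx), Disk(io_ctx)

    ctx_check(s, d)  # contexts agree, so the check passes

    # Assumption for illustration only: something moves the disk's backend to
    # another context while the controller still handles requests in io_ctx.
    d.blk_ctx = main_ctx
    try:
        ctx_check(s, d)
    except AssertionError as exc:
        print("assertion failed:", exc)
```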

gdb info:
[root@intel-e31225-16-3 home]# gdb -c core.16257
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
....
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name avocado-vt-vm1 -machine pc -vga std -object iothrea'.
Program terminated with signal 6, Aborted.
#0  0x00007f6abf37b1f7 in raise () from /lib64/libc.so.6
.....



(gdb) bt
#0  0x00007f6abf37b1f7 in raise () at /lib64/libc.so.6
#1  0x00007f6abf37c8e8 in abort () at /lib64/libc.so.6
#2  0x00007f6abf374266 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007f6abf374312 in  () at /lib64/libc.so.6
#4  0x000055def692ac50 in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=0x55defafa7c00)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
#5  0x000055def69b4e16 in virtio_scsi_handle_cmd_vq (s=<optimized out>, s=<optimized out>, d=0x55defafa7c00)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:245
#6  0x000055def69b4e16 in virtio_scsi_handle_cmd_vq (req=0x55def8d98b40, s=0x55defb000510) at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:558
#7  0x000055def69b4e16 in virtio_scsi_handle_cmd_vq (s=s@entry=0x55defb000510, vq=vq@entry=0x55defb0ba100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi.c:598
#8  0x000055def69b59fa in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x55defb0ba100)
    at /usr/src/debug/qemu-2.9.0/hw/scsi/virtio-scsi-dataplane.c:60
#9  0x000055def6c404c8 in aio_dispatch_handlers (ctx=ctx@entry=0x55def8069980) at util/aio-posix.c:399
#10 0x000055def6c40f0a in aio_poll (ctx=0x55def8069980, blocking=blocking@entry=true) at util/aio-posix.c:685
#11 0x000055def6a4933e in iothread_run (opaque=0x55def80c8bb0) at iothread.c:59
#12 0x00007f6abf710e25 in start_thread () at /lib64/libpthread.so.0
#13 0x00007f6abf43e34d in clone () at /lib64/libc.so.6

Comment 5 Yongxue Hong 2018-01-16 06:43:27 UTC
It is also reproducible with an NBD backend on a P9 host.

host : 4.14.0-24.el7a.ppc64le
guest : 4.14.0-24.el7a.ppc64le
qemu : qemu-kvm-rhev-2.10.0-16.el7.ppc64le

[root@ibm-p9z-09 commands]# sh guest-9328.sh
QEMU 2.10.0 monitor - type 'help' for more information
(qemu) q
qemu-kvm: /builddir/build/BUILD/qemu-2.10.0/hw/scsi/virtio-scsi.c:246: virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed.
guest-9328.sh: line 30: 31098 Aborted                 /usr/libexec/qemu-kvm -name 'guest' -machine pseries-rhel7.5.0 -m 16G -nodefaults -smp 4,cores=4,threads=1,sockets=1 -boot order=cdn,once=d,menu=off,strict=off -device nec-usb-xhci,id=xhci0 -device usb-tablet,id=usb-tablet0 -device usb-kbd,id=usb-kbd0 -device VGA,id=vga0 -chardev socket,id=qmp_id_qmpmonitor,path=/var/tmp/qmp-cmd-monitor-yhong,server,nowait -mon chardev=qmp_id_qmpmonitor,mode=control -enable-kvm -object iothread,id=iothread0 -object iothread,id=iothread1 -device virtio-scsi-pci,id=scsi0,iothread=iothread0 -device virtio-scsi-pci,id=scsi1,iothread=iothread1 -drive file=nbd:10.19.19.53:10086,format=qcow2,aio=native,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive_system -device scsi-hd,bus=scsi0.0,drive=drive_system,id=system,bootindex=0 -drive file=nbd:10.19.19.53:20000,format=qcow2,aio=native,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive_data0 -device scsi-hd,bus=scsi1.0,drive=drive_data0,id=data0 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=40:f2:e9:5d:9c:03 -qmp tcp:0:3000,server,nowait -chardev socket,id=serial_id_serial,path=/var/tmp/serial-yhong,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial -monitor stdio -vnc :30

Comment 6 aihua liang 2018-01-19 06:16:49 UTC
Can reproduce on: 
  kernel:3.10.0-826.el7.x86_64 + qemu-kvm-rhev:qemu-kvm-rhev-2.10.0-16.el7.x86_64

Test steps:
  Same as comment 2.

Test result:
  QEMU dumps core (core.1094).
  
[root@intel-3323-24-1 home]# gdb -c core.1094
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-109.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
[New LWP 1097]
[New LWP 1196]
[New LWP 1096]
[New LWP 1276]
[New LWP 1194]
[New LWP 1269]
[New LWP 1204]
[New LWP 1277]
[New LWP 1094]
[New LWP 1197]
[New LWP 1195]
[New LWP 1095]
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name avocado-vt-vm1 -machine pc -vga std -object iothrea'.
Program terminated with signal 6, Aborted.
#0  0x00007f1a6bf941a7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install boost-system-1.53.0-27.el7.x86_64 boost-thread-1.53.0-27.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-gssapi-2.1.26-23.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 cyrus-sasl-md5-2.1.26-23.el7.x86_64 cyrus-sasl-plain-2.1.26-23.el7.x86_64 elfutils-libelf-0.170-3.el7.x86_64 elfutils-libs-0.170-3.el7.x86_64 glib2-2.54.2-2.el7.x86_64 glibc-2.17-220.el7.x86_64 glusterfs-api-3.8.4-53.el7.x86_64 glusterfs-libs-3.8.4-53.el7.x86_64 gmp-6.0.0-15.el7.x86_64 gnutls-3.3.26-9.el7.x86_64 gperftools-libs-2.6.1-1.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-18.el7.x86_64 libacl-2.2.51-14.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-49.el7.x86_64 libcacard-2.5.2-2.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-11.el7.x86_64 libcurl-7.29.0-46.el7.x86_64 libdb-5.3.21-22.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-25.el7.x86_64 libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-15-2.el7.x86_64 libidn-1.28-4.el7.x86_64 libiscsi-1.9.0-7.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libmount-2.23.2-49.el7.x86_64 libnl3-3.2.28-4.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 librados2-0.94.5-2.el7.x86_64 librbd1-0.94.5-2.el7.x86_64 librdmacm-15-2.el7.x86_64 libseccomp-2.3.1-3.el7.x86_64 libselinux-2.5-12.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 libstdc++-4.8.5-25.el7.x86_64 libtasn1-4.10-1.el7.x86_64 libusbx-1.0.21-1.el7.x86_64 libuuid-2.23.2-49.el7.x86_64 lz4-1.7.5-2.el7.x86_64 lzo-2.06-8.el7.x86_64 nettle-2.7.1-8.el7.x86_64 nspr-4.17.0-1.el7.x86_64 nss-3.34.0-1.el7.x86_64 nss-softokn-freebl-3.34.0-1.el7.x86_64 nss-util-3.34.0-1.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64 openldap-2.4.44-10.el7.x86_64 openssl-libs-1.0.2k-12.el7.x86_64 opus-1.0.2-6.el7.x86_64 p11-kit-0.23.5-3.el7.x86_64 pcre-8.32-17.el7.x86_64 pixman-0.34.0-1.el7.x86_64 snappy-1.1.0-3.el7.x86_64 spice-server-0.14.0-2.el7.x86_64 
systemd-libs-219-51.el7.x86_64 usbredir-0.7.1-3.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x00007f1a6bf941a7 in raise () at /lib64/libc.so.6
#1  0x00007f1a6bf95898 in abort () at /lib64/libc.so.6
#2  0x00007f1a6bf8cfc8 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007f1a6bf8d074 in  () at /lib64/libc.so.6
#4  0x000055eeaa026447 in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=0x55eeade85400)
    at /usr/src/debug/qemu-2.10.0/hw/scsi/virtio-scsi.c:246
#5  0x000055eeaa0ab9a6 in virtio_scsi_handle_cmd_vq (s=<optimized out>, s=<optimized out>, d=0x55eeade85400)
    at /usr/src/debug/qemu-2.10.0/hw/scsi/virtio-scsi.c:246
#6  0x000055eeaa0ab9a6 in virtio_scsi_handle_cmd_vq (req=0x55eead80a780, s=0x55eeaea6a170)
    at /usr/src/debug/qemu-2.10.0/hw/scsi/virtio-scsi.c:559
#7  0x000055eeaa0ab9a6 in virtio_scsi_handle_cmd_vq (s=s@entry=0x55eeaea6a170, vq=vq@entry=0x55eeaea72100)
    at /usr/src/debug/qemu-2.10.0/hw/scsi/virtio-scsi.c:599
#8  0x000055eeaa0ac58a in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x55eeaea72100)
    at /usr/src/debug/qemu-2.10.0/hw/scsi/virtio-scsi-dataplane.c:60
#9  0x000055eeaa0b8db6 in virtio_queue_host_notifier_aio_poll (vq=0x55eeaea72100)
    at /usr/src/debug/qemu-2.10.0/hw/virtio/virtio.c:1506
#10 0x000055eeaa0b8db6 in virtio_queue_host_notifier_aio_poll (opaque=0x55eeaea72168)
    at /usr/src/debug/qemu-2.10.0/hw/virtio/virtio.c:2420
#11 0x000055eeaa34c77e in run_poll_handlers_once (ctx=ctx@entry=0x55eeac69bcc0) at util/aio-posix.c:497
#12 0x000055eeaa34d1c5 in aio_poll (blocking=true, ctx=0x55eeac69bcc0) at util/aio-posix.c:573
#13 0x000055eeaa34d1c5 in aio_poll (ctx=0x55eeac69bcc0, blocking=blocking@entry=true) at util/aio-posix.c:602
#14 0x000055eeaa1407c6 in iothread_run (opaque=0x55eeac6a5340) at iothread.c:59
#15 0x00007f1a6c332dd5 in start_thread () at /lib64/libpthread.so.0
#16 0x00007f1a6c05c94d in clone () at /lib64/libc.so.6

Comment 7 Stefan Hajnoczi 2018-01-30 15:46:51 UTC
Patch sent upstream:
https://patchwork.ozlabs.org/patch/867549/

Comment 8 Stefan Hajnoczi 2018-02-20 13:28:05 UTC
I hit more race conditions after backporting the fix from Comment#7.  After additional debugging I sent another patch upstream.

Patch sent upstream:
https://patchwork.ozlabs.org/patch/875530/

Comment 9 Stefan Hajnoczi 2018-03-13 15:31:45 UTC
*** Bug 1550335 has been marked as a duplicate of this bug. ***

Comment 17 aihua liang 2018-05-09 02:26:06 UTC
Verified: the problem has been resolved. Setting the status to "Verified".

Test version:
  kernel:3.10.0-879.el7.x86_64
  qemu-kvm-rhev:qemu-kvm-rhev-2.12.0-1.el7.x86_64

Test Steps:
  1. Start guest with qemu cmds:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-machine pc \
-vga std  \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=03,iothread=iothread0 \
-drive id=drive_image1,if=none,snapshot=on,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel75-64-virtio.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1,bus=scsi0.0,lun=0 \
-device virtio-net-pci,mac=9a:b2:b3:b4:b5:b6,id=iduCv1Ln,vectors=4,netdev=idKgexFk,bus=pci.0,addr=05  \
-netdev tap,id=idKgexFk,vhost=on \
-m 4096  \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
-cpu host \
-vnc :1  \
-enable-kvm \
-monitor stdio \
-device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=04,iothread=iothread1 \
-drive id=data_image1,if=none,werror=stop,rerror=stop,cache=none,format=qcow2,file=/home/test.qcow2 \
-device scsi-hd,id=data1,drive=data_image1,bus=scsi1.0,lun=0 \

2. Run an I/O test in the guest
(guest)#dd if=/dev/zero of=/dev/sdb bs=4K count=1000000000 oflag=direct status=progress

3. Quit QEMU while the I/O test is running
 (qemu)quit

Test Result:
 QEMU quits without any error.

Comment 18 errata-xmlrpc 2018-11-01 11:01:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443