+++ This bug was initially created as a clone of Bug #1683937 +++

Description of problem:

After creating a snapshot of a live guest image (data plane enabled) with blockdev-create/blockdev-add, while specifying the image as the backing file via 'backing-file'/'backing', qemu core dumped:

# ./vm12.sh rhel76be.qcow2
QEMU 3.1.0 monitor - type 'help' for more information
(qemu) (qemu) qemu-kvm: /builddir/build/BUILD/qemu-3.1.0/hw/scsi/virtio-scsi.c:246: virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed.
./vm12.sh: line 28: 68870 Aborted (core dumped) /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pseries \
    -nodefaults \
    -vga std \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado1,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -smp 8,cores=2,threads=2,sockets=2 \
    -object iothread,id=iothread1 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x4 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
    -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
    -device virtserialport,bus=virtio-serial0.0,chardev=qga0,id=qemu-ga0,name=org.qemu.guest_agent.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x6,iothread=iothread1 \
    -blockdev node-name=file0,driver=file,filename=$1 \
    -blockdev node-name=disk0,driver=qcow2,file=file0 \
    -device scsi-hd,drive=disk0,id=image0 \
    -device virtio-net-pci,mac=9a:78:79:7a:7b:6c,id=id8e5D72,netdev=idrYUYaH \
    -netdev tap,id=idrYUYaH,vhost=on \
    -m 1024 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device usb-kbd,id=usb-kbd1,bus=usb1.0,port=2 \
    -vnc :10 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot menu=off,strict=off,order=cdn \
    -enable-kvm \
    -monitor stdio

The script used to create the snapshot is as follows:

# cat imagecreate_scratch.sh
#!/bin/bash
echo -e "{'execute':'qmp_capabilities'} {'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/home/ngu/inc1.qcow2','size':32212254720},'job-id':'job1'}}" | nc -U /var/tmp/avocado1
sleep 2
echo -e "{'execute':'qmp_capabilities'} {'execute':'blockdev-add','arguments':{'driver':'file','node-name':'iinc11','filename':'/home/ngu/inc1.qcow2'}}" | nc -U /var/tmp/avocado1
sleep 2
echo -e "{'execute':'qmp_capabilities'} {'execute':'blockdev-create', 'arguments': {'job-id': 'job2', 'options': {'driver': 'qcow2', 'file': 'iinc11', 'size': 32212254720,'backing-fmt':'qcow2','backing-file':'rhel76be.qcow2'}}}" | nc -U /var/tmp/avocado1
sleep 5
echo -e "{'execute':'qmp_capabilities'} {'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'tmp','file':'iinc11','backing':'disk0'}}" | nc -U /var/tmp/avocado1
sleep 5
echo -e "{'execute':'qmp_capabilities'} {'execute':'job-dismiss','arguments':{'id':'job1'}}" | nc -U /var/tmp/avocado1
sleep 2
echo -e "{'execute':'qmp_capabilities'} {'execute':'job-dismiss','arguments':{'id':'job2'}}" | nc -U /var/tmp/avocado1

Version-Release number of selected component (if applicable):
Host kernel: 4.18.0-71.el8.ppc64le
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot up a guest with a system disk (node-name disk0, data plane enabled).
2. Try to create a scratch disk as the snapshot of disk0 with blockdev-create/blockdev-add (as the script imagecreate_scratch.sh shows).
3. After the snapshot creation finishes, qemu core dumps; sometimes you need to issue a system_reset to trigger the core dump.

Actual results:
In step 3, qemu core dumped.

Expected results:
No core dump occurs.

Additional info:
Both a system disk and a data disk trigger the bug.

--- Additional comment from Gu Nini on 2019-02-28 08:39 UTC ---

Issue reproduced on x86 platform.
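Note that each `nc` invocation in the script above opens a fresh QMP session, which is why every command is prefixed with another 'qmp_capabilities'. For reference, the same sequence can be driven over a single QMP session. The sketch below is illustrative only: the socket path, node names, backing file and sizes are copied from the reproducer, and it does not wait for the blockdev-create jobs to finish (which the `sleep`s in the script approximate).

```python
# Sketch: the snapshot sequence from imagecreate_scratch.sh over one QMP session.
import json
import socket


def snapshot_commands(filename, file_node, overlay_node, backing_node,
                      backing_file, size):
    """Build the QMP command sequence used by imagecreate_scratch.sh."""
    return [
        {"execute": "blockdev-create",
         "arguments": {"job-id": "job1",
                       "options": {"driver": "file",
                                   "filename": filename,
                                   "size": size}}},
        {"execute": "blockdev-add",
         "arguments": {"driver": "file",
                       "node-name": file_node,
                       "filename": filename}},
        {"execute": "blockdev-create",
         "arguments": {"job-id": "job2",
                       "options": {"driver": "qcow2",
                                   "file": file_node,
                                   "size": size,
                                   "backing-fmt": "qcow2",
                                   "backing-file": backing_file}}},
        {"execute": "blockdev-add",
         "arguments": {"driver": "qcow2",
                       "node-name": overlay_node,
                       "file": file_node,
                       "backing": backing_node}},
        {"execute": "job-dismiss", "arguments": {"id": "job1"}},
        {"execute": "job-dismiss", "arguments": {"id": "job2"}},
    ]


def run(qmp_path, commands):
    """Connect to the QMP unix socket, negotiate capabilities once, then
    send each command and print its reply (asynchronous events are skipped)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(qmp_path)
        f = sock.makefile("rw")
        json.loads(f.readline())                 # QMP greeting banner
        for cmd in [{"execute": "qmp_capabilities"}] + commands:
            f.write(json.dumps(cmd) + "\n")
            f.flush()
            while True:
                reply = json.loads(f.readline())
                if "event" not in reply:         # skip QMP event messages
                    print(reply)
                    break


if __name__ == "__main__":
    run("/var/tmp/avocado1",
        snapshot_commands("/home/ngu/inc1.qcow2", "iinc11", "tmp",
                          "disk0", "rhel76be.qcow2", 32212254720))
```

In a real test harness one would also watch for the JOB_STATUS_CHANGE events of job1/job2 rather than sleeping, but the fixed delays match the original script's behavior.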
Tested on qemu-kvm-rhev-2.12.0-23.el7.x86_64; the same issue is hit there.
Coredump info on RHEL7:

(gdb) bt full
#0  0x00007f3aec1b6207 in raise () at /lib64/libc.so.6
#1  0x00007f3aec1b78f8 in abort () at /lib64/libc.so.6
#2  0x00007f3aec1af026 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007f3aec1af0d2 in () at /lib64/libc.so.6
#4  0x000055e924d99a91 in virtio_scsi_ctx_check (s=<optimized out>, s=<optimized out>, d=0x55e928c32280) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi.c:246
#5  0x000055e924e34cc6 in virtio_scsi_handle_cmd_vq (s=<optimized out>, s=<optimized out>, d=0x55e928c32280) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi.c:246
        vs = 0x55e929690170
        rc = <optimized out>
        req = 0x55e928f4f2c0
        next = <optimized out>
        ret = <optimized out>
        progress = true
        reqs = {tqh_first = 0x0, tqh_last = 0x7f3ae42a96d0}
#6  0x000055e924e34cc6 in virtio_scsi_handle_cmd_vq (req=0x55e928f4f2c0, s=0x55e929690170) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi.c:559
        vs = 0x55e929690170
        rc = <optimized out>
        req = 0x55e928f4f2c0
        next = <optimized out>
        ret = <optimized out>
        progress = true
        reqs = {tqh_first = 0x0, tqh_last = 0x7f3ae42a96d0}
#7  0x000055e924e34cc6 in virtio_scsi_handle_cmd_vq (s=s@entry=0x55e929690170, vq=vq@entry=0x55e929698100) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi.c:599
        req = 0x55e928f4f2c0
        next = <optimized out>
        ret = <optimized out>
        progress = true
        reqs = {tqh_first = 0x0, tqh_last = 0x7f3ae42a96d0}
#8  0x000055e924e357da in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x55e929698100) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi-dataplane.c:60
        progress = <optimized out>
        s = 0x55e929690170
#9  0x000055e924e438f6 in virtio_queue_host_notifier_aio_poll (vq=0x55e929698100) at /usr/src/debug/qemu-2.12.0/hw/virtio/virtio.c:1512
        n = 0x55e929698168
        vq = 0x55e929698100
        progress = <optimized out>
#10 0x000055e924e438f6 in virtio_queue_host_notifier_aio_poll (opaque=0x55e929698168) at /usr/src/debug/qemu-2.12.0/hw/virtio/virtio.c:2427
        n = 0x55e929698168
        vq = 0x55e929698100
        progress = <optimized out>
#11 0x000055e9250fe72e in run_poll_handlers_once (ctx=ctx@entry=0x55e927e89900, timeout=timeout@entry=0x7f3ae42a97d8) at util/aio-posix.c:501
        progress = false
        node = 0x55e928074780
#12 0x000055e9250ff594 in aio_poll (timeout=0x7f3ae42a97d8, max_ns=16000, ctx=0x55e927e89900) at util/aio-posix.c:539
        progress = <optimized out>
        start_time = 255621519877114
        elapsed_time = <optimized out>
        max_ns = 16000
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = -1
        start = 255621519875089
        __PRETTY_FUNCTION__ = "aio_poll"
#13 0x000055e9250ff594 in aio_poll (timeout=0x7f3ae42a97d8, ctx=0x55e927e89900) at util/aio-posix.c:574
        max_ns = 16000
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = -1
        start = 255621519875089
        __PRETTY_FUNCTION__ = "aio_poll"
#14 0x000055e9250ff594 in aio_poll (ctx=0x55e927e89900, blocking=blocking@entry=true) at util/aio-posix.c:615
        node = <optimized out>
        i = <optimized out>
        ret = 0
        progress = <optimized out>
        timeout = -1
        start = 255621519875089
        __PRETTY_FUNCTION__ = "aio_poll"
#15 0x000055e924ece8ae in iothread_run (opaque=0x55e927ea7ce0) at iothread.c:64
        iothread = 0x55e927ea7ce0
#16 0x00007f3aec554dd5 in start_thread () at /lib64/libpthread.so.0
#17 0x00007f3aec27dead in clone () at /lib64/libc.so.6
Reproduce steps:

1. Start the guest paused (-S) with data plane enabled on the scsi disk.

/usr/libexec/qemu-kvm \
    -S \
    -name 'avocado-vt-vm1' \
    -machine pc \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190123-032240-rOoB4cgD,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190123-032240-rOoB4cgD,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id8Ec4Bn \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190123-032240-rOoB4cgD,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20190123-032240-rOoB4cgD,path=/var/tmp/seabios-20190123-032240-rOoB4cgD,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190123-032240-rOoB4cgD,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
    -object iothread,id=iothread0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3,iothread=iothread0 \
    -blockdev driver=file,node-name=file_base,filename=/root/sn1,auto-read-only=on \
    -blockdev driver=qcow2,file=file_base,node-name=drive_image1,auto-read-only=on \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device virtio-net-pci,mac=9a:39:3a:3b:3c:3d,id=id1JNQsL,vectors=4,netdev=idVpZZ6A,bus=pci.0,addr=0x4 \
    -netdev tap,id=idVpZZ6A,vhost=on \
    -m 4096 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
    -cpu 'Penryn',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:3000,server,nowait

2. Create a snapshot image on drive_image1:

{'execute':'blockdev-create','arguments':{'options': {'driver':'file','filename':'/root/sn$i','size':21474836480},'job-id':'job1'}}
{'execute':'blockdev-add','arguments':{'driver':'file','node-name':'drive_sn$i','filename':'/root/sn$i'}}
{'execute':'blockdev-create','arguments':{'options': {'driver': 'qcow2','file':'drive_sn$i','size':21474836480,'backing-file':'/home/kvm_autotest_root/images/rhel77-64-virtio.qcow2','backing-fmt':'qcow2'},'job-id':'job2'}}
{'execute':'blockdev-add','arguments':{'driver':'qcow2','node-name':'sn$i','file':'drive_sn$i','backing':'drive_image1'}}
{'execute':'job-dismiss','arguments':{'id':'job1'}}
{'execute':'job-dismiss','arguments':{'id':'job2'}}

3. Continue the vm:

(qemu) cont

After step 3, qemu core dumps with:
'virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed'

Additional info: virtio_blk works well.
This is what happens:

1. The virtio device first puts the nodes 'drive_image1' and 'file_base' into the AioContext of the iothread.
2. blockdev-add creates a new node 'sn1' in the main AioContext.
3. 'drive_image1' is added as a backing file to 'sn1'. Because a backing file must be in the same AioContext as its overlay, this pulls 'drive_image1' back into the main AioContext (see bdrv_open_backing_file()).
4. When processing the next request, virtio-scsi notices that the block node is not in the iothread AioContext any more, and the assertion in virtio_scsi_ctx_check() fails.

What should happen in step 3 is that the overlay is instead attached to the iothread AioContext. If there are conflicting requirements with respect to the AioContext a node must be in, attaching it should fail.
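The four steps above can be illustrated with a toy model. This is not QEMU code: the Node class, context strings and helper names are invented for the sketch, and the real logic lives in QEMU's block layer; only the invariant (a device asserts that its root node stays in the device's AioContext, while attaching a backing node forces backing and overlay into the same context) is taken from the analysis above.

```python
# Toy model of the AioContext mismatch; names mirror the reproducer.

class Node:
    """A block node tagged with the AioContext it currently lives in."""
    def __init__(self, name, ctx):
        self.name = name
        self.ctx = ctx
        self.backing = None

    def attach_backing(self, backing):
        # Buggy behavior: a backing file must share its overlay's AioContext,
        # so attaching pulls the backing node into the overlay's context
        # (the equivalent of bdrv_open_backing_file() in step 3).
        self.backing = backing
        backing.ctx = self.ctx


def ctx_check(device_ctx, node):
    # Plays the role of the virtio_scsi_ctx_check() assertion in step 4.
    return node.ctx == device_ctx


# Step 1: virtio-scsi has moved its node into the iothread context.
drive_image1 = Node("drive_image1", ctx="iothread")
assert ctx_check("iothread", drive_image1)

# Step 2: blockdev-add creates the overlay in the main context.
sn1 = Node("sn1", ctx="main")

# Step 3: attaching drive_image1 as backing pulls it back to "main"...
sn1.attach_backing(drive_image1)

# Step 4: ...so the device's context check now fails.
assert not ctx_check("iothread", drive_image1)
```

Under the behavior proposed in the last paragraph, attach_backing would instead move the overlay into the backing node's context (sn1.ctx = backing.ctx), or refuse the attach when the context requirements conflict.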