Red Hat Bugzilla – Bug 1440667
The guest exits abnormally with data-plane when "block-job-complete" is run after "drive-mirror" in QMP.
Last modified: 2017-08-02 00:35:59 EDT
Description of problem:
The guest exits abnormally with data-plane when "block-job-complete" is run after "drive-mirror" in QMP.

Version-Release number of selected component (if applicable):
host: 3.10.0-643.el7.ppc64le
guest: 3.10.0-643.el7.ppc64le
qemu-kvm: version 2.8.92 (qemu-kvm-rhev-2.9.0-0.el7.patchwork201703291116)

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with data-plane, e.g.:
/usr/libexec/qemu-kvm \
    -name rhel7_4-9343 \
    -M pseries-rhel7.4.0 \
    -m 8G \
    -nodefaults \
    -smp 4,sockets=4,cores=1,threads=1 \
    -boot menu=on,order=cd \
    -device VGA,id=vga0 \
    -device nec-usb-xhci,id=xhci \
    -device usb-tablet,id=usb-tablet0 \
    -device usb-kbd,id=usb-kbd0 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device virtio-scsi-pci,id=scsi-pci-0 \
    -drive file=/home/hyx/iso/RHEL-7.4-20170330.1-Server-ppc64le-dvd1.iso,if=none,media=cdrom,id=cd-0 \
    -device scsi-cd,bus=scsi-pci-0.0,id=scsi-cd-0,drive=cd-0,channel=0,scsi-id=0,lun=0,bootindex=1 \
    -drive file=/home/hyx/image/rhel-7_4-9330-20G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 \
    -device virtio-blk-pci,bus=pci.0,addr=0x03,drive=drive-0,id=virtio-blk-0,iothread=iothread0,bootindex=0 \
    -drive file=/home/hyx/image/rhel-7_4-9330-30G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-1 \
    -device virtio-blk-pci,bus=pci.0,addr=0x04,drive=drive-1,id=virtio-blk-1,iothread=iothread1 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:00 \
    -monitor stdio \
    -serial unix:./sock3,server,nowait \
    -qmp tcp:0:3003,server,nowait \
    -vnc :3

2. Run "drive-mirror" in QMP:
{ "execute": "drive-mirror", "arguments": { "device": "drive-1", "target": "/home/hyx/image/rhel-7_4-9343-mirror-30G.qcow2", "sync": "full", "format": "qcow2" } }

3. Run "block-job-complete" in QMP:
{ "execute": "block-job-complete", "arguments": { "device": "drive-1" } }

Actual results:
The guest exits and the HMP monitor shows:
qemu-kvm: block/io.c:164: bdrv_drain_recurse: Assertion `qemu_get_current_aio_context() == qemu_get_aio_context()' failed.

Expected results:
The guest runs normally.

Additional info:
This also reproduces on x86_64.
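Both QMP steps can be scripted against the QMP socket exposed by the command line above (-qmp tcp:0:3003,server,nowait). Below is a minimal reproducer sketch in C under those assumptions (QEMU listening on localhost:3003, drive-1 as the mirror source); reply handling is deliberately naive, with no JSON parsing and a fixed sleep instead of waiting for the BLOCK_JOB_READY event, so treat it as an illustration rather than test tooling.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one QMP command, then print whatever the server sends back next
 * (greeting, reply, or event); no JSON parsing. */
static void qmp_send(int fd, const char *json)
{
    char buf[4096];
    ssize_t n;

    if (write(fd, json, strlen(json)) < 0) {
        perror("write");
        return;
    }
    n = read(fd, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("%s", buf);
    }
}

int main(void)
{
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(3003),    /* matches -qmp tcp:0:3003,server,nowait */
    };
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    /* QMP requires the capabilities handshake before other commands. */
    qmp_send(fd, "{\"execute\": \"qmp_capabilities\"}\n");
    qmp_send(fd, "{\"execute\": \"drive-mirror\", \"arguments\": "
                 "{\"device\": \"drive-1\", "
                 "\"target\": \"/home/hyx/image/rhel-7_4-9343-mirror-30G.qcow2\", "
                 "\"sync\": \"full\", \"format\": \"qcow2\"}}\n");
    sleep(60);  /* crude stand-in for waiting on BLOCK_JOB_READY */
    qmp_send(fd, "{\"execute\": \"block-job-complete\", "
                 "\"arguments\": {\"device\": \"drive-1\"}}\n");
    sleep(5);   /* give QEMU time to hit the assertion */
    close(fd);
    return 0;
}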
The backtrace:

qemu-kvm: block/io.c:164: bdrv_drain_recurse: Assertion `qemu_get_current_aio_context() == qemu_get_aio_context()' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x3fffb40feaa0 (LWP 20251)]
0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install libdb-5.3.21-20.el7.ppc64le
(gdb) bt
#0  0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
#1  0x00003fffb6f30f4c in abort () from /lib64/libc.so.6
#2  0x00003fffb6f24b44 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00003fffb6f24c34 in __assert_fail () from /lib64/libc.so.6
#4  0x000000004a88a264 in bdrv_drain_recurse (bs=0x4bad6c00) at block/io.c:164
#5  0x000000004a88abd8 in bdrv_drained_begin (bs=0x4bad6c00) at block/io.c:231
#6  0x000000004a82deac in bdrv_child_cb_drained_begin (child=<optimized out>) at block.c:719
#7  0x000000004a88ac74 in bdrv_parent_drained_begin (bs=0x4bacd000) at block/io.c:53
#8  bdrv_drained_begin (bs=0x4bacd000) at block/io.c:228
#9  0x000000004a88b6ac in bdrv_co_drain_bh_cb (opaque=0x3ffda5adfd80) at block/io.c:190
#10 0x000000004a928d58 in aio_bh_call (bh=0x4b7c7d40) at util/async.c:90
#11 aio_bh_poll (ctx=0x4b8c1b80) at util/async.c:118
#12 0x000000004a92d934 in aio_poll (ctx=0x4b8c1b80, blocking=<optimized out>) at util/aio-posix.c:682
#13 0x000000004a70cde8 in iothread_run (opaque=0x4ba209a0) at iothread.c:59
#14 0x00003fffb70e8728 in start_thread () from /lib64/libpthread.so.0
#15 0x00003fffb70113d0 in clone () from /lib64/libc.so.6
(gdb)
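Reading the backtrace bottom-up: the drain is executing in iothread1 (frame #13, iothread_run), via a bottom half (frame #9, bdrv_co_drain_bh_cb) that begins draining the mirror source node (bs=0x4bacd000, frame #8). bdrv_parent_drained_begin then recurses upward into the mirror's filter node mirror_top_bs (bs=0x4bad6c00, frames #6 to #4), which was left in the main loop's AioContext, and the cross-context wait trips the assertion. A condensed sketch of the violated invariant, paraphrased from the polling logic this QEMU version expands inside bdrv_drain_recurse (not verbatim source; bdrv_requests_pending_sketch is a hypothetical stand-in):

/* Waiting for a node's in-flight requests is legal either from the
 * thread that owns the node's AioContext, or from the main loop. */
void drain_poll_sketch(BlockDriverState *bs)
{
    AioContext *ctx = bdrv_get_aio_context(bs);

    if (ctx == qemu_get_current_aio_context()) {
        /* Same thread as the node's context: poll it directly. */
        while (bdrv_requests_pending_sketch(bs)) {
            aio_poll(ctx, true);
        }
    } else {
        /* Cross-context wait: only the main loop may do this. Here
         * mirror_top_bs still sits in the main context while the drain
         * runs in iothread1, so the assertion aborts. */
        assert(qemu_get_current_aio_context() == qemu_get_aio_context());
        /* ... main-loop-side polling elided ... */
    }
}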
Upstream fix for this in QEMU 2.9:

commit 19dd29e8a77cd820515de5289f566508e0ed4926
Author: Fam Zheng <famz@redhat.com>
Date:   Fri Apr 7 14:54:11 2017 +0800

    mirror: Fix aio context of mirror_top_bs

    It should be moved to the same context as source, before inserting to
    the graph.

    Reviewed-by: Eric Blake <eblake@redhat.com>
    Reviewed-by: Kevin Wolf <kwolf@redhat.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
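The shape of the fix, sketched from the commit message (function and helper names follow QEMU 2.9's block/mirror.c, but this is a paraphrase, not the verbatim patch hunk): move mirror_top_bs into the source's AioContext before bdrv_append() wires it into the graph, so later drains of the filter node happen in the context that owns it.

/* Inside mirror_start_job() in block/mirror.c (paraphrased sketch;
 * bs is the mirror source node). */
BlockDriverState *mirror_top_bs;

/* The filter node is created in the main AioContext by default. */
mirror_top_bs = bdrv_new_open_driver(&bdrv_mirror_top, filter_node_name,
                                     BDRV_O_RDWR, errp);
/* ... */

/* The fix: adopt the source's context (iothread1 for drive-1 above)
 * before the node becomes the source's parent. */
bdrv_set_aio_context(mirror_top_bs, bdrv_get_aio_context(bs));

/* Insert the filter above the source node. */
bdrv_append(mirror_top_bs, bs, &local_err);

With mirror_top_bs in the same context as the source, the drain in the backtrace above stays within iothread1's own context and the cross-context assertion no longer fires.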
Reproduced on qemu-kvm-rhev-2.8.0-5.el7.x86_64, and verified on qemu-kvm-rhev-2.9.0-1.el7.x86_64 with kernel-3.10.0-640.el7.x86_64.

Steps:
1. Launch a guest with data-plane:
/usr/libexec/qemu-kvm -name rhel7_4-9343 -m 1G -smp 2 \
    -object iothread,id=iothread0 \
    -drive file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 \
    -device virtio-blk-pci,drive=drive-0,id=virtio-blk-0,iothread=iothread0,bootindex=0 \
    -monitor stdio -qmp tcp:0:5555,server,nowait -vnc :3

2. Run "drive-mirror", wait for BLOCK_JOB_READY, then complete the job:
{ "execute": "drive-mirror", "arguments": { "device": "drive-0", "target": "/home/mirror1.qcow2", "sync": "full", "format": "qcow2" } }
{"return": {}}
{"timestamp": {"seconds": 1493187904, "microseconds": 180627}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-0", "len": 3761504256, "offset": 3761504256, "speed": 0, "type": "mirror"}}
{ "execute": "block-job-complete", "arguments": { "device": "drive-0" } }
{"return": {}}
{"timestamp": {"seconds": 1493187916, "microseconds": 974757}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-0", "len": 3761504256, "offset": 3761504256, "speed": 0, "type": "mirror"}}

Results:
- qemu-kvm-rhev-2.8.0-5.el7.x86_64: both qemu and the guest hang.
- qemu-kvm-rhev-2.9.0-1.el7.x86_64: both qemu and the guest work well.

Therefore, moving to VERIFIED.
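For scripted verification, the important difference from the crude reproducer sketch above is waiting for BLOCK_JOB_READY before sending block-job-complete, and for BLOCK_JOB_COMPLETED afterwards, instead of sleeping. A small self-contained helper for that follows (qmp_wait_event is our name, not a QEMU API; it relies on QMP emitting one JSON object per line and matches substrings rather than parsing JSON, which is fine for a test script but not for real tooling):

#include <stdio.h>
#include <string.h>

/* Block until the QMP stream delivers an event whose name appears on
 * the same line as an "event" key; returns 0 on success, -1 if the
 * stream closes first. */
static int qmp_wait_event(FILE *qmp, const char *name)
{
    char line[4096];

    while (fgets(line, sizeof(line), qmp)) {
        if (strstr(line, "\"event\"") && strstr(line, name)) {
            return 0;
        }
    }
    return -1;
}

Wrapping the reproducer's socket with fdopen(fd, "r+") gives the FILE * this expects; the verified flow above then becomes: send drive-mirror, qmp_wait_event(qmp, "BLOCK_JOB_READY"), send block-job-complete, qmp_wait_event(qmp, "BLOCK_JOB_COMPLETED").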
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392