Bug 1440677 - The guest exits abnormally with data-plane when running "blockdev-snapshot-sync" in QMP.
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Fam Zheng
QA Contact: Qianqian Zhu
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
Reported: 2017-04-10 09:10 UTC by Yongxue Hong
Modified: 2017-08-02 04:35 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.9.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-02 04:35:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2017:2392
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm-rhev security, bug fix, and enhancement update
Last Updated: 2017-08-01 20:04:36 UTC

Description Yongxue Hong 2017-04-10 09:10:02 UTC
Description of problem:
The guest exits abnormally when "blockdev-snapshot-sync" is run in QMP on a disk that uses data-plane.

Version-Release number of selected component (if applicable):
host:3.10.0-643.el7.ppc64le
guest:3.10.0-643.el7.ppc64le
qemu-kvm:version 2.8.92(qemu-kvm-rhev-2.9.0-0.el7.patchwork201703291116)

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with data-plane as follows:
e.g.:
/usr/libexec/qemu-kvm \
-name rhel7_4-85851 \
-M pseries-rhel7.4.0 \
-m 8G \
-nodefaults \
-smp 4,sockets=4,cores=1,threads=1 \
-boot menu=on,order=cd \
-device VGA,id=vga0 \
-device nec-usb-xhci,id=xhci \
-device usb-tablet,id=usb-tablet0 \
-device usb-kbd,id=usb-kbd0 \
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-device virtio-scsi-pci,id=scsi-pci-0 \
-drive file=/home/hyx/iso/RHEL-7.4-20170330.1-Server-ppc64le-dvd1.iso,if=none,media=cdrom,id=cd-0 \
-device scsi-cd,bus=scsi-pci-0.0,id=scsi-cd-0,drive=cd-0,channel=0,scsi-id=0,lun=0,bootindex=1 \
-drive file=/home/hyx/image/rhel-7_4-85851-20G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 \
-device virtio-blk-pci,bus=pci.0,addr=0x03,drive=drive-0,id=virtio-blk-0,iothread=iothread0,bootindex=0 \
-drive file=/home/hyx/image/rhel-7_4-85851-30G.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-1 \
-device virtio-blk-pci,bus=pci.0,addr=0x04,drive=drive-1,id=virtio-blk-1,iothread=iothread1 \
-netdev tap,id=hostnet0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=70:e2:84:14:0e:00 \
-monitor stdio \
-serial unix:./sock3,server,nowait \
-qmp tcp:0:3003,server,nowait \
-vnc :3


2. Run "blockdev-snapshot-sync" in QMP
e.g.:
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-1","snapshot-file": "/home/hyx/image/sn1.qcow2", "format": "qcow2", "mode": "absolute-paths" } }
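Step 2 can also be scripted. The following is a minimal sketch, assuming the QMP socket from step 1 (`-qmp tcp:0:3003,server,nowait`) is reachable on 127.0.0.1. The helper names `build_snapshot_command`, `qmp_command` and `take_snapshot` are hypothetical and not part of QEMU. QMP requires the `qmp_capabilities` handshake before any other command, so the sketch sends it first.

```python
import json
import socket


def build_snapshot_command(device, snapshot_file, fmt="qcow2", mode="absolute-paths"):
    """Build the blockdev-snapshot-sync request used in step 2 as a dict."""
    return {
        "execute": "blockdev-snapshot-sync",
        "arguments": {
            "device": device,
            "snapshot-file": snapshot_file,
            "format": fmt,
            "mode": mode,
        },
    }


def qmp_command(stream, command):
    """Send one QMP command and return the first reply that is not an event."""
    stream.write(json.dumps(command) + "\n")
    stream.flush()
    while True:
        line = stream.readline()
        if not line:
            # On an affected build, QEMU aborts and the socket closes here.
            raise ConnectionError("QMP connection closed")
        reply = json.loads(line)
        if "event" not in reply:  # asynchronous events are skipped
            return reply


def take_snapshot(host, port, device, snapshot_file):
    """Connect to QMP, negotiate capabilities, then request the snapshot."""
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # greeting banner sent by QEMU
        qmp_command(stream, {"execute": "qmp_capabilities"})
        return qmp_command(stream, build_snapshot_command(device, snapshot_file))


# Example call with the values from this report (needs the running guest):
# take_snapshot("127.0.0.1", 3003, "drive-1", "/home/hyx/image/sn1.qcow2")
```

On an affected build the call is expected to end in `ConnectionError`, because QEMU aborts before it can reply.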

Actual results:
The guest exits and HMP shows:
qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/include/block/aio.h:457: aio_enable_external: Assertion `ctx->external_disable_cnt > 0' failed.

Expected results:
The guest should run normally.

Additional info:
This also reproduces on x86_64.

Comment 2 Yongxue Hong 2017-04-10 10:57:06 UTC
qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/include/block/aio.h:457: aio_enable_external: Assertion `ctx->external_disable_cnt > 0' failed.
Program received signal SIGABRT, Aborted.
0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.1.3-3.el7.ppc64le bzip2-libs-1.0.6-13.el7.ppc64le cyrus-sasl-lib-2.1.26-21.el7.ppc64le dbus-libs-1.6.12-17.el7.ppc64le elfutils-libelf-0.168-5.el7.ppc64le elfutils-libs-0.168-5.el7.ppc64le flac-libs-1.3.0-5.el7_1.ppc64le glib2-2.50.3-2.el7.ppc64le glibc-2.17-189.el7.ppc64le gmp-6.0.0-15.el7.ppc64le gnutls-3.3.26-6.el7.ppc64le gperftools-libs-2.4-8.el7.ppc64le gsm-1.0.13-11.el7.ppc64le keyutils-libs-1.5.8-3.el7.ppc64le krb5-libs-1.15.1-5.el7.ppc64le libICE-1.0.9-5.el7.ppc64le libSM-1.2.2-2.el7.ppc64le libX11-1.6.4-4.el7.ppc64le libXau-1.0.8-2.1.el7.ppc64le libXext-1.3.3-3.el7.ppc64le libXi-1.7.9-1.el7.ppc64le libXtst-1.2.3-1.el7.ppc64le libaio-0.3.109-13.el7.ppc64le libasyncns-0.8-7.el7.ppc64le libattr-2.4.46-12.el7.ppc64le libcap-2.22-9.el7.ppc64le libcom_err-1.42.9-10.el7.ppc64le libcurl-7.29.0-40.el7.ppc64le libdb-5.3.21-20.el7.ppc64le libfdt-1.4.3-1.el7.ppc64le libffi-3.0.13-18.el7.ppc64le libgcc-4.8.5-14.el7.ppc64le libgcrypt-1.5.3-14.el7.ppc64le libgpg-error-1.12-3.el7.ppc64le libibverbs-13-2.el7.ppc64le libidn-1.28-4.el7.ppc64le libiscsi-1.9.0-7.el7.ppc64le libnl3-3.2.28-3.el7_3.ppc64le libogg-1.3.0-7.el7.ppc64le libpng-1.5.13-7.el7_2.ppc64le librdmacm-13-2.el7.ppc64le libseccomp-2.3.1-3.el7.ppc64le libselinux-2.5-11.el7.ppc64le libsndfile-1.0.25-10.el7.ppc64le libssh2-1.4.3-10.el7_2.1.ppc64le libstdc++-4.8.5-14.el7.ppc64le libtasn1-4.10-1.el7.ppc64le libusbx-1.0.20-1.el7.ppc64le libuuid-2.23.2-36.el7.ppc64le libvorbis-1.3.3-8.el7.ppc64le libxcb-1.12-1.el7.ppc64le lzo-2.06-8.el7.ppc64le nettle-2.7.1-8.el7.ppc64le nspr-4.13.1-1.0.el7.ppc64le nss-3.28.3-5.el7.ppc64le nss-softokn-freebl-3.28.3-4.el7.ppc64le nss-util-3.28.3-3.el7.ppc64le numactl-libs-2.0.9-6.el7_2.ppc64le openldap-2.4.44-3.el7.ppc64le openssl-libs-1.0.2k-5.el7.ppc64le p11-kit-0.23.5-1.el7.ppc64le pcre-8.32-17.el7.ppc64le pixman-0.34.0-1.el7.ppc64le pulseaudio-libs-10.0-3.el7.ppc64le snappy-1.1.0-3.el7.ppc64le 
systemd-libs-219-32.el7.ppc64le tcp_wrappers-libs-7.6-77.el7.ppc64le xz-libs-5.2.2-1.el7.ppc64le zlib-1.2.7-17.el7.ppc64le
(gdb) bt
#0  0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
#1  0x00003fffb6f30f4c in abort () from /lib64/libc.so.6
#2  0x00003fffb6f24b44 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00003fffb6f24c34 in __assert_fail () from /lib64/libc.so.6
#4  0x000000005a154a30 in aio_enable_external (ctx=<optimized out>) at /usr/src/debug/qemu-2.9.0/include/block/aio.h:457
#5  0x000000005a4cae88 in aio_enable_external (ctx=<optimized out>) at block/io.c:236
#6  bdrv_drained_end (bs=0x5b70d000) at block/io.c:242
#7  0x000000005a343d30 in external_snapshot_clean (common=0x5b490b90) at blockdev.c:1845
#8  0x000000005a34800c in qmp_transaction (dev_list=0x3fffffffcd08, has_props=false, props=<optimized out>, errp=0x3fffffffcdf0) at blockdev.c:2256
#9  0x000000005a34820c in blockdev_do_action (errp=0x3fffffffcdf0, action=0x3fffffffccf8) at blockdev.c:1242
#10 qmp_blockdev_snapshot_sync (has_device=<optimized out>, device=<optimized out>, has_node_name=<optimized out>, node_name=<optimized out>, snapshot_file=<optimized out>, 
    has_snapshot_node_name=<optimized out>, snapshot_node_name=<optimized out>, has_format=<optimized out>, format=0x5b42fff0 "qcow2", has_mode=true, mode=NEW_IMAGE_MODE_ABSOLUTE_PATHS, 
    errp=0x3fffffffcdf0) at blockdev.c:1270
#11 0x000000005a35b66c in qmp_marshal_blockdev_snapshot_sync (args=<optimized out>, ret=<optimized out>, errp=0x3fffffffced8) at qmp-marshal.c:1023
#12 0x000000005a55b9d4 in do_qmp_dispatch (errp=0x3fffffffced0, request=<optimized out>, cmds=0x5a7cd910 <qmp_commands>) at qapi/qmp-dispatch.c:104
#13 qmp_dispatch (cmds=0x5a7cd910 <qmp_commands>, request=<optimized out>) at qapi/qmp-dispatch.c:131
#14 0x000000005a1bf934 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.9.0/monitor.c:3729
#15 0x000000005a563be0 in json_message_process_token (lexer=0x5b4f3688, input=0x5b574500, type=<optimized out>, x=<optimized out>, y=<optimized out>) at qobject/json-streamer.c:105
#16 0x000000005a58c8f8 in json_lexer_feed_char (lexer=0x5b4f3688, ch=<optimized out>, flush=false) at qobject/json-lexer.c:319
#17 0x000000005a58ca34 in json_lexer_feed (lexer=0x5b4f3688, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369
#18 0x000000005a563d3c in json_message_parser_feed (parser=<error reading variable: value has been optimized out>, buffer=<optimized out>, size=<optimized out>)
    at qobject/json-streamer.c:124
#19 0x000000005a1bd9f4 in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-2.9.0/monitor.c:3772
#20 0x000000005a4ede1c in qemu_chr_be_write_impl (len=<optimized out>, buf=<optimized out>, s=<optimized out>) at chardev/char.c:284
#21 qemu_chr_be_write (s=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/char.c:296
#22 0x000000005a4f7868 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at chardev/char-socket.c:411
#23 0x000000005a50be44 in qio_channel_fd_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at io/channel-watch.c:84
#24 0x00003fffb7473ab0 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#25 0x000000005a56c224 in glib_pollfds_poll () at util/main-loop.c:213
#26 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:258
#27 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:506
#28 0x000000005a15bee8 in main_loop () at vl.c:1898
#29 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4720
(gdb)

Comment 3 Fam Zheng 2017-04-12 07:15:00 UTC
Upstream commit fixing this is:

commit c26a5ab71338a53340257233bd172bbe22c06b16
Author: Fam Zheng <famz@redhat.com>
Date:   Fri Apr 7 14:54:09 2017 +0800

    block: Fix unpaired aio_disable_external in external snapshot
    
    bdrv_replace_child_noperm tries to hand over the quiesce_counter state
    from old bs to the new one, but if they are not on the same aio context
    this causes unbalance.
    
    Fix this by setting the correct aio context before calling
    bdrv_append().
    
    Reported-by: Ed Swierk <eswierk@skyportsystems.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Signed-off-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
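
The imbalance described in the commit message can be illustrated with a toy model. This is a simplified sketch, not QEMU code: the `AioContext` and `Node` classes, the `external_snapshot` function, and the exact ordering of calls are assumptions made for illustration. It only shows why pairing a disable on one context with an enable on another trips the `external_disable_cnt > 0` check, and why setting the context first keeps the counters balanced.

```python
class AioContext:
    """Toy context that only tracks the external-disable counter."""

    def __init__(self):
        self.external_disable_cnt = 0

    def disable_external(self):
        self.external_disable_cnt += 1

    def enable_external(self):
        # Stand-in for the assertion quoted in the report.
        if self.external_disable_cnt <= 0:
            raise AssertionError("ctx->external_disable_cnt > 0")
        self.external_disable_cnt -= 1


class Node:
    """Toy block node that belongs to exactly one context at a time."""

    def __init__(self, ctx):
        self.ctx = ctx


def drained_begin(node):
    node.ctx.disable_external()


def drained_end(node):
    node.ctx.enable_external()


def external_snapshot(set_context_before_append):
    main_ctx = AioContext()
    iothread_ctx = AioContext()
    old_bs = Node(iothread_ctx)  # disk attached with iothread=...
    new_bs = Node(main_ctx)      # new overlay starts in the main context

    drained_begin(old_bs)
    if set_context_before_append:
        new_bs.ctx = old_bs.ctx  # ordering described by the fix
    drained_begin(new_bs)        # stand-in for the quiesce state handover
    new_bs.ctx = old_bs.ctx      # without the fix, the switch happens only here
    drained_end(new_bs)
    drained_end(old_bs)
    return main_ctx.external_disable_cnt, iothread_ctx.external_disable_cnt
```

In this model `external_snapshot(True)` leaves both counters at zero, while `external_snapshot(False)` raises `AssertionError` on the final `drained_end`, because one disable was recorded on the main context and both enables land on the iothread context.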

Comment 4 Qianqian Zhu 2017-04-26 06:54:43 UTC
Reproduced on qemu-img-rhev-2.9.0-0.el7.patchwork201703291116.x86_64, and verified on qemu-kvm-rhev-2.9.0-1.el7.x86_64 with kernel-3.10.0-640.el7.x86_64.

Steps:
1. Launch guest with data-plane:
/usr/libexec/qemu-kvm -name rhel7_4-9343 -m 1G -smp 2 -object iothread,id=iothread0 -drive file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2,format=qcow2,if=none,cache=none,media=disk,werror=stop,rerror=stop,id=drive-0 -device virtio-blk-pci,drive=drive-0,id=virtio-blk-0,iothread=iothread0,bootindex=0 -monitor stdio -qmp tcp:0:5555,server,nowait -vnc :3

2. Live snapshot:
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-0","snapshot-file": "/home/sn1", "format": "qcow2", "mode": "absolute-paths" } }

Result:
qemu-img-rhev-2.9.0-0.el7.patchwork201703291116.x86_64:
qemu core dump: 
(qemu) Formatting '/home/sn1', fmt=qcow2 size=21474836480 backing_file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2 backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/include/block/aio.h:457: aio_enable_external: Assertion `ctx->external_disable_cnt > 0' failed.
Aborted (core dumped)

qemu-kvm-rhev-2.9.0-1.el7.x86_64:
Both qemu and guest work well.

Moving to VERIFIED.

Comment 6 errata-xmlrpc 2017-08-02 04:35:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

