Bug 1478227 - [NBD] qemu-kvm hit Segmentation fault if guest is writing to the NBD data disk and meanwhile unexport this data disk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: Eric Blake
QA Contact: zixchen
URL:
Whiteboard:
Duplicates: 1672031
Depends On:
Blocks: 1672029
Reported: 2017-08-04 03:09 UTC by yilzhang
Modified: 2020-10-12 10:38 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-12 08:05:06 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description yilzhang 2017-08-04 03:09:52 UTC
Description of problem:
While the guest is writing to the NBD data disk, unexporting that disk on the NBD server side causes the qemu-kvm process to terminate with a segmentation fault shortly afterwards.

Version-Release number of selected component (if applicable):
host: 4.11.0-19.el7a.ppc64le
      qemu-kvm-2.9.0-19.el7a.ppc64le
      SLOF-20170303-4.git66d250e.el7.noarch
guest kernel: 4.11.0-14.el7a.ppc64le

How reproducible: 100%


Steps to Reproduce:
1. Create disk image on NBD server
# qemu-img create -f qcow2  -o preallocation=full   nbd_dataimage_0.qcow2  4G
2. Export image file on NBD server side
# qemu-nbd -f raw  /home/yilzhang/nbd_dataimage_0.qcow2  -p 9001 -t &
3. Boot up guest on NBD client, using the above NBD disk image as one data disk:
/usr/libexec/qemu-kvm \
-name yilzhang_virt8_guest \
 -smp 8,sockets=2,cores=4,threads=1 -m 8192 \
-serial unix:/tmp/nbd-serial.log,server,nowait \
-nodefaults \
 -rtc base=localtime,clock=host \
 -boot menu=on \
 -monitor stdio \
 -vnc :88 \
 -qmp tcp:0:9990,server,nowait \
\
-device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
 -device virtio-scsi-pci,bus=bridge1,addr=0x1,id=scsi0 \
-drive file=/home/yilzhang/rhel7.4-alt.qcow2,if=none,cache=none,id=drive_sysdisk,snapshot=off,aio=native,format=qcow2,werror=stop,rerror=stop \
-device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \
\
-drive file=nbd://10.0.1.20:9001,if=none,cache=none,id=drive_datadisk1,aio=native,format=qcow2,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive_datadisk1,bus=bridge1,addr=0x2,id=datadisk1 \
 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c3:e7:84 \

4. Start QMP on host: # telnet localhost 9990
                        {"execute": "qmp_capabilities"}
5. Log in to the guest and write data to the data disk exported from the NBD server
   [guest]# dd if=/dev/zero  of=/dev/vda  bs=1M count=2000 oflag=sync
6. While "dd" is still running, unexport the NBD disk image by killing the qemu-nbd process
[NBD server]# kill -9 6746
[5]+  Killed                  qemu-nbd -f raw /home/yilzhang/nbd_dataimage_0.qcow2 -p 9001 -t
7. QMP emits "BLOCK_IO_ERROR" event:
{"timestamp": {"seconds": 1494090476, "microseconds": 553531}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_datadisk1", "nospace": false, "__com.redhat_reason": "eio", "node-name": "#block349", "reason": "Input/output error", "operation": "write", "action": "stop"}}
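Steps 4 and 7 can also be scripted instead of using telnet. The following is only a minimal sketch, not part of the original reproducer: the helper names (`qmp_negotiate`, `parse_block_io_error`, `wait_for_block_io_error`) are hypothetical, and it assumes the reply to `qmp_capabilities` is the next line QEMU sends after the greeting. It reads the QMP greeting, leaves capabilities negotiation mode, and then waits for a BLOCK_IO_ERROR event like the one shown in step 7.

```python
import json


def qmp_negotiate(stream):
    """Read the QMP greeting, then send qmp_capabilities and read its reply.

    `stream` is a binary read/write file object, e.g. sock.makefile("rwb").
    """
    greeting = json.loads(stream.readline())
    stream.write(json.dumps({"execute": "qmp_capabilities"}).encode() + b"\r\n")
    stream.flush()
    reply = json.loads(stream.readline())
    return greeting, reply


def parse_block_io_error(line):
    """Return (device, operation, action) for a BLOCK_IO_ERROR line, else None."""
    msg = json.loads(line)
    if msg.get("event") != "BLOCK_IO_ERROR":
        return None
    data = msg["data"]
    return data["device"], data["operation"], data["action"]


def wait_for_block_io_error(stream):
    """Read QMP lines until a BLOCK_IO_ERROR event arrives; None at EOF."""
    for line in stream:
        if not line.strip():
            continue
        result = parse_block_io_error(line)
        if result is not None:
            return result
    return None
```

Against the command line in step 3, the stream would come from `socket.create_connection(("localhost", 9990)).makefile("rwb")`. For the event in step 7 the parsed result is the device `drive_datadisk1`, the operation `write`, and the action `stop`, which matches the `werror=stop` option on the data disk.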


Actual results:
After a short while, qemu-kvm terminates with a segmentation fault.

Expected results:
qemu-kvm should not crash.


Additional info:
1. The issue also reproduces on Power8 with qemu-kvm-rhev-2.9.0-14.el7.ppc64le and on the x86 platform
2. gdb  /usr/libexec/qemu-kvm  core.9638
warning: exec file is newer than core file.
[New LWP 9638]
[New LWP 9680]
[New LWP 9682]
[New LWP 9681]
[New LWP 9683]
[New LWP 9685]
[New LWP 9684]
[New LWP 9689]
[New LWP 9639]
[New LWP 9687]
[New LWP 9686]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name yilzhang_virt8_guest -smp 8,sockets=2,cores=4,threa'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000002cd67bb4 in aio_co_schedule ()
Missing separate debuginfos, use: debuginfo-install qemu-kvm-2.9.0-19.el7a.ppc64le
(gdb) bt
#0  0x000000002cd67bb4 in aio_co_schedule ()
#1  0x000000002ccd1c9c in nbd_client_attach_aio_context ()
#2  0x000000002cccfce8 in nbd_attach_aio_context ()
#3  0x000000002cc72ac0 in bdrv_attach_aio_context ()
#4  0x000000002cc72a8c in bdrv_attach_aio_context ()
#5  0x000000002cc72c38 in bdrv_set_aio_context ()
#6  0x000000002ccbaab4 in blk_set_aio_context ()
#7  0x000000002c9ff9d0 in virtio_blk_data_plane_stop ()
#8  0x000000002cbf7020 in virtio_bus_stop_ioeventfd ()
#9  0x000000002cbf2598 in virtio_pci_vmstate_change ()
#10 0x000000002ca2e91c in virtio_vmstate_change ()
#11 0x000000002cb5a6b4 in vm_state_notify ()
#12 0x000000002c9bd9c0 in vm_stop ()
#13 0x000000002c95af10 in main ()

Comment 2 yilzhang 2017-08-04 07:53:10 UTC
The issue also reproduces on Power8 with qemu-kvm-rhev-2.9.0-14.el7.ppc64le and on x86 with qemu-kvm-rhev-2.9.0-19.el7a.x86_64.

Comment 5 Longxiang Lyu 2017-10-12 08:37:22 UTC
Reproduced in qemu-img-rhev-2.10.0-1.el7.x86_64.

1. verify version info
# rpm -qa | grep ^qemu
qemu-kvm-tools-rhev-2.10.0-1.el7.x86_64
qemu-kvm-common-rhev-2.10.0-1.el7.x86_64
qemu-kvm-rhev-2.10.0-1.el7.x86_64
qemu-kvm-rhev-debuginfo-2.10.0-1.el7.x86_64
qemu-img-rhev-2.10.0-1.el7.x86_64
# uname -r
3.10.0-730.el7.x86_64

2. prepare data disk image and export as NBD server.
# qemu-img create -f qcow2 -o preallocation=full add.qcow2 5G
Formatting 'add.qcow2', fmt=qcow2 size=5368709120 encryption=off cluster_size=65536 preallocation=full lazy_refcounts=off refcount_bits=16

# qemu-nbd -f raw add.qcow2  -p 9001 -t

3. boot up guest
#!/bin/bash
/usr/libexec/qemu-kvm \
-name guest=test-virt \
-machine pc-i440fx-rhel7.4.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
-cpu SandyBridge \
-m 2G \
-smp 4,sockets=4,cores=1,threads=1 \
-boot strict=on \
-drive file=/home/test/nbd01/test.qcow2,if=none,format=qcow2,id=img0 \
-device virtio-blk-pci,bus=pci.0,drive=img0,id=virtio-disk0,bootindex=1 \
-drive file=nbd://10.66.11.1:9001,if=none,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native,id=img1 \
-device virtio-blk-pci,bus=pci.0,drive=img1,id=virtio-disk1 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:12:b3:20:61,bus=pci.0 \
-device qxl-vga \
-usbdevice tablet \
-vnc :2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \

4. connect to qmp server
# telnet 127.0.0.1 5555
{"execute": "qmp_capabilities"}

5. Run dd in the guest against the NBD data disk.
# dd if=/dev/urandom of=/dev/vdb bs=1M count=1024

6. Kill the NBD server before dd ends.

Result:
1. qemu aborted with a segmentation fault.
2. QMP output:
{"timestamp": {"seconds": 1507796895, "microseconds": 776193}, "event": "BLOCK_IO_ERROR", "data": {"device": "img1", "nospace": false, "__com.redhat_reason": "eio", "node-name": "#block386", "reason": "Input/output error", "operation": "read", "action": "stop"}}

3. gdb bt output of core file
# gdb core.1788 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
[New LWP 1788]
[New LWP 1805]
[New LWP 1802]
[New LWP 1808]
[New LWP 1999]
[New LWP 1803]
[New LWP 1789]
[New LWP 1807]
[New LWP 1804]
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name guest=test-virt -machine pc-i440fx-rhel7.4.0,accel='.
Program terminated with signal 11, Segmentation fault.
#0  0x000055ae1278a86a in aio_co_schedule (ctx=0x55ae15243980, co=0x0) at util/async.c:441
441	    QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-gssapi-2.1.26-21.el7.x86_64 cyrus-sasl-lib-2.1.26-21.el7.x86_64 cyrus-sasl-md5-2.1.26-21.el7.x86_64 cyrus-sasl-plain-2.1.26-21.el7.x86_64 cyrus-sasl-scram-2.1.26-21.el7.x86_64 elfutils-libelf-0.168-8.el7.x86_64 elfutils-libs-0.168-8.el7.x86_64 glib2-2.50.3-3.el7.x86_64 glibc-2.17-196.el7.x86_64 glusterfs-api-3.8.4-45.el7rhgs.x86_64 glusterfs-libs-3.8.4-45.el7rhgs.x86_64 gmp-6.0.0-15.el7.x86_64 gnutls-3.3.26-9.el7.x86_64 gperftools-libs-2.4-8.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libacl-2.2.51-12.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libblkid-2.23.2-43.el7.x86_64 libcacard-2.5.2-2.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libcurl-7.29.0-42.el7.x86_64 libdb-5.3.21-20.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-16.el7.x86_64 libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibverbs-13-7.el7.x86_64 libidn-1.28-4.el7.x86_64 libiscsi-1.9.0-7.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64 libmount-2.23.2-43.el7.x86_64 libnl3-3.2.28-4.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 librados2-12.2.0-2.el7cp.x86_64 librbd1-12.2.0-2.el7cp.x86_64 librdmacm-13-7.el7.x86_64 libseccomp-2.3.1-3.el7.x86_64 libselinux-2.5-11.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 libstdc++-4.8.5-16.el7.x86_64 libtasn1-4.10-1.el7.x86_64 libunwind-1.2-2.el7.x86_64 libusbx-1.0.20-1.el7.x86_64 libuuid-2.23.2-43.el7.x86_64 lttng-ust-2.4.1-4.el7.x86_64 lzo-2.06-8.el7.x86_64 nettle-2.7.1-8.el7.x86_64 nspr-4.13.1-1.0.el7_3.x86_64 nss-3.28.4-8.el7.x86_64 nss-softokn-freebl-3.28.3-6.el7.x86_64 nss-util-3.28.4-3.el7.x86_64 numactl-libs-2.0.9-6.el7_2.x86_64 openldap-2.4.44-5.el7.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 p11-kit-0.23.5-3.el7.x86_64 pcre-8.32-17.el7.x86_64 pixman-0.34.0-1.el7.x86_64 snappy-1.1.0-3.el7.x86_64 spice-server-0.12.8-2.el7.1.x86_64 
systemd-libs-219-42.el7.x86_64 usbredir-0.7.1-2.el7.x86_64 userspace-rcu-0.7.16-1.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x000055ae1278a86a in aio_co_schedule (ctx=0x55ae15243980, co=0x0) at util/async.c:441
#1  0x000055ae126ccdad in bdrv_attach_aio_context (bs=0x55ae15732000, 
    new_context=new_context@entry=0x55ae15243980) at block.c:4547
#2  0x000055ae126ccd8b in bdrv_attach_aio_context (bs=bs@entry=0x55ae153b4800, 
    new_context=new_context@entry=0x55ae15243980) at block.c:4544
#3  0x000055ae126cce89 in bdrv_set_aio_context (bs=bs@entry=0x55ae153b4800, 
    new_context=new_context@entry=0x55ae15243980) at block.c:4580
#4  0x000055ae1270a2ec in blk_set_aio_context (blk=0x55ae1523c780, new_context=0x55ae15243980)
    at block/block-backend.c:1769
#5  0x000055ae124d9f17 in virtio_blk_data_plane_stop (vdev=<optimized out>)
    at /usr/src/debug/qemu-2.10.0/hw/block/dataplane/virtio-blk.c:262
#6  0x000055ae1266cc75 in virtio_bus_stop_ioeventfd (bus=0x55ae1769c3a8) at hw/virtio/virtio-bus.c:246
#7  0x000055ae124fecf4 in virtio_vmstate_change (opaque=0x55ae1769c420, running=<optimized out>, 
    state=<optimized out>) at /usr/src/debug/qemu-2.10.0/hw/virtio/virtio.c:2230
#8  0x000055ae1258a952 in vm_state_notify (running=running@entry=0, 
    state=state@entry=RUN_STATE_IO_ERROR) at vl.c:1603
#9  0x000055ae124ad4ba in do_vm_stop (state=RUN_STATE_IO_ERROR) at /usr/src/debug/qemu-2.10.0/cpus.c:941
#10 vm_stop (state=RUN_STATE_IO_ERROR) at /usr/src/debug/qemu-2.10.0/cpus.c:1807
#11 0x000055ae12472924 in main_loop_should_exit () at vl.c:1903
#12 main_loop () at vl.c:1921
#13 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4804
(gdb) 

Core dump file link:
http://fileshare.englab.nay.redhat.com/pub/section2/coredump/bug_1478227/core.1788
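
Frame #0 of the trace above shows `aio_co_schedule` being called with `co=0x0`. When comparing several core dumps, a small script can flag such null pointer arguments automatically. This is only an illustrative sketch: the function name `null_pointer_args` and both regular expressions are my own, and a frame that gdb wrapped across two lines has to be joined into one line before it will match.

```python
import re

# Matches single-line gdb "bt" frames such as:
#   #0  0x... in aio_co_schedule (ctx=0x..., co=0x0) at util/async.c:441
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \((.*?)\)")
# Matches "name=0x..." and "name=name@entry=0x..." argument forms.
ARG_RE = re.compile(r"(\w+)=(?:\w+@entry=)?(0x[0-9a-f]+)")


def null_pointer_args(frame_line):
    """Return (function, [argument names whose value is 0x0]) or None."""
    match = FRAME_RE.match(frame_line)
    if match is None:
        return None
    func, args = match.group(2), match.group(3)
    nulls = [name for name, value in ARG_RE.findall(args) if int(value, 16) == 0]
    return func, nulls


frame0 = ("#0  0x000055ae1278a86a in aio_co_schedule "
          "(ctx=0x55ae15243980, co=0x0) at util/async.c:441")
print(null_pointer_args(frame0))  # ('aio_co_schedule', ['co'])
```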

Comment 6 Yongxue Hong 2017-12-08 07:31:34 UTC
It is also reproduced with 4.14.0-15.el7a.ppc64le on P9.

Comment 7 Tingting Mao 2018-12-29 08:41:37 UTC
Reproduced this issue in qemu-kvm-3.1.0-2.module+el8+2606+2c716ad7

Tested packages:
qemu-kvm-3.1.0-2.module+el8+2606+2c716ad7
kernel-4.18.0-57.el8

Steps:
1. Create a data disk file and export it over the network
# qemu-img create -f raw data.img 2G
# qemu-nbd -f raw -p 9000 data.img -t

2. Boot guest with data.img file as data disk
/usr/libexec/qemu-kvm \
        -name 'guest-rhel7.6' \
        -machine q35 \
        -nodefaults \
	-vga qxl \
	-blockdev driver=file,cache.direct=off,cache.no-flush=on,node-name=my_file,filename=base.qcow2 \
	-blockdev driver=qcow2,node-name=my,file=my_file \
	-device virtio-blk-pci,id=virtio_blk_pci0,drive=my \
       -blockdev driver=nbd,cache.direct=off,cache.no-flush=on,node-name=my_file1,server.host=localhost,server.port=9000,server.type=inet \
       -blockdev driver=raw,node-name=my1,file=my_file1 \
       -device virtio-blk-pci,id=virtio_blk_pci1,drive=my1 \
        -vnc :0 \
        -monitor stdio \
        -m 8192 \
        -smp 8 \
        -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0,addr=0x9  \
        -netdev tap,id=idxgXAlm \
        -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait \
        -mon chardev=qmp_id_qmpmonitor1,mode=control  \
        -device pcie-root-port,id=pcie.0-root-port-8,slot=8,chassis=8,addr=0x8,bus=pcie.0 \

3. Write data in guest with dd

4. Stop the 'qemu-nbd -f raw -p 9000 data.img -t' process with 'Ctrl+C' while the 'dd' command is still running in the guest.

5. Shutdown the guest


Result:

After the 4th step, there are I/O errors on the guest screen, but qemu keeps working normally.

After the 5th step, qemu dumped core:
(qemu) local_qemu.sh: line 21: 29646 Segmentation fault      (core dumped) /usr/libexec/qemu-kvm -name 'guest-rhel7.6' -machine q35 -nodefaults -vga qxl -blockdev driver=file,cache.direct=off,cache.no-flush=on,node-name=my_file,filename=$1 -blockdev driver=qcow2,node-name=my,file=my_file -device virtio-blk-pci,id=virtio_blk_pci0,drive=my -blockdev driver=nbd,cache.direct=off,cache.no-flush=on,node-name=my_file1,server.host=localhost,server.port=9000,server.type=inet -blockdev driver=raw,node-name=my1,file=my_file1 -device virtio-blk-pci,id=virtio_blk_pci1,drive=my1 -vnc :0 -monitor stdio -m 8192 -smp 8 -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0,addr=0x9 -netdev tap,id=idxgXAlm -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait -mon chardev=qmp_id_qmpmonitor1,mode=control -device pcie-root-port,id=pcie.0-root-port-8,slot=8,chassis=8,addr=0x8,bus=pcie.0


gdb log:
(gdb) bt
#0  0x0000556b4c37979e in aio_co_schedule (ctx=0x556b4e664be0, co=0x0) at util/async.c:453
#1  0x0000556b4c2bf629 in bdrv_attach_aio_context (bs=0x556b4e6ad8a0, new_context=new_context@entry=0x556b4e664be0) at block.c:5102
#2  0x0000556b4c2bf607 in bdrv_attach_aio_context (bs=bs@entry=0x556b4e6b5230, new_context=new_context@entry=0x556b4e664be0) at block.c:5099
#3  0x0000556b4c2bf731 in bdrv_set_aio_context (bs=0x556b4e6b5230, new_context=0x556b4e664be0) at block.c:5135
#4  0x0000556b4c2f1a3c in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>) at block/block-backend.c:1901
#5  0x0000556b4c0b7552 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-2.module+el8+2606+2c716ad7.x86_64/hw/block/dataplane/virtio-blk.c:285
#6  0x0000556b4c2505bf in virtio_bus_stop_ioeventfd (bus=0x556b4f77a578) at hw/virtio/virtio-bus.c:246
#7  0x0000556b4c0dd47e in virtio_vmstate_change (opaque=0x556b4f77a5f0, running=0, state=<optimized out>)
    at /usr/src/debug/qemu-kvm-3.1.0-2.module+el8+2606+2c716ad7.x86_64/hw/virtio/virtio.c:2242
#8  0x0000556b4c16ecef in vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) at vl.c:1578
#9  0x0000556b4c075b8a in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-2.module+el8+2606+2c716ad7.x86_64/cpus.c:1074
#10 0x0000556b4c02f3fe in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4686

Comment 8 Tingting Mao 2019-05-07 09:08:44 UTC
Reproduced this bug in 'qemu-kvm-3.1.0-24.module+el8.0.1+3117+9f83299e'.


Steps:
1. Export an image file with an installed OS over NBD
# qemu-nbd -f qcow2 tgt.qcow2 -p 9000 -t

2. Boot up a guest from the exported image
#/usr/libexec/qemu-kvm \
        -name 'guest' \
        -machine pc \
        -nodefaults \
        -vga qxl \
        -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=raw,file=nbd:localhost:9000 \
        -device virtio-blk-pci,id=virtio_blk_pci0,drive=drive_image1,bus=pci.0,addr=05,bootindex=0 \
        -vnc :0 \
        -monitor stdio \
        -m 8192 \
        -smp 8 \
        -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pci.0,addr=0x9  \
        -netdev tap,id=idxgXAlm \
        -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait \
        -mon chardev=qmp_id_qmpmonitor1,mode=control  \

3. Kill the connection from the server side
Ctrl + c

4. Quit QEMU from the HMP monitor; qemu-kvm dumps core:
QEMU 3.1.0 monitor - type 'help' for more information
(qemu) q
qemu.sh: line 18:  3179 Segmentation fault      (core dumped) /usr/libexec/qemu-kvm -name 'guest' -machine pc -nodefaults -vga qxl -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=raw,file=$1 -device virtio-blk-pci,id=virtio_blk_pci0,drive=drive_image1,bus=pci.0,addr=05,bootindex=0 -vnc :0 -monitor stdio -m 8192 -smp 8 -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pci.0,addr=0x9 -netdev tap,id=idxgXAlm -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/timao/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait -mon chardev=qmp_id_qmpmonitor1,mode=control


Gdb trace log:
(gdb) bt
#0  0x00005634528f2ace in aio_co_schedule (ctx=0x5634545a2dc0, co=0x0) at util/async.c:453
#1  0x00005634528386c9 in bdrv_attach_aio_context (bs=0x5634545d3fa0, new_context=new_context@entry=0x5634545a2dc0) at block.c:5109
#2  0x00005634528386a7 in bdrv_attach_aio_context (bs=bs@entry=0x5634545cd960, new_context=new_context@entry=0x5634545a2dc0) at block.c:5106
#3  0x00005634528387d1 in bdrv_set_aio_context (bs=0x5634545cd960, new_context=0x5634545a2dc0) at block.c:5142
#4  0x000056345286acbc in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>) at block/block-backend.c:1901
#5  0x000056345262fca2 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-24.module+el8.0.1+3117+9f83299e.x86_64/hw/block/dataplane/virtio-blk.c:285
#6  0x00005634527c908f in virtio_bus_stop_ioeventfd (bus=0x5634556f4728) at hw/virtio/virtio-bus.c:246
#7  0x0000563452655c4e in virtio_vmstate_change (opaque=0x5634556f47a0, running=0, state=<optimized out>)
    at /usr/src/debug/qemu-kvm-3.1.0-24.module+el8.0.1+3117+9f83299e.x86_64/hw/virtio/virtio.c:2242
#8  0x00005634526e74ef in vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) at vl.c:1578
#9  0x00005634525ee2da in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=<optimized out>) at /usr/src/debug/qemu-kvm-3.1.0-24.module+el8.0.1+3117+9f83299e.x86_64/cpus.c:1074
#10 0x00005634525a7b1e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4686

Comment 9 Tingting Mao 2019-05-13 06:22:19 UTC
*** Bug 1672031 has been marked as a duplicate of this bug. ***

Comment 10 Ademar Reis 2020-02-05 22:44:19 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 12 Alice Frosi 2020-10-01 13:20:32 UTC
I tried to reproduce this bug using qemu-nbd/qemu-system-x86 from upstream master and from the package available in the rhel:8.3 stream (qemu-kvm-5.1.0-10.module+el8.3.0+8254+568ca30d). I no longer get a core dump.
When I kill the qemu-nbd process, I get this error message from the dd command:

dd if=/dev/zero  of=/dev/vdb  bs=1M count=2000 oflag=sync
[  240.974002] blk_update_request: I/O error, dev vdb, sector 139264 op 0x1:(WRITE) flags 0x4800 phys_seg 254 prio class 0
[  240.974996] Buffer I/O error on dev vdb, logical block 17408, lost async page write
[  240.974996] Buffer I/O error on dev vdb, logical block 17409, lost async page write
[  240.974996] Buffer I/O error on dev vdb, logical block 17410, lost async page write
[  240.974996] Buffer I/O error on dev vdb, logical block 17411, lost async page write
[  240.974996] Buffer I/O error on dev vdb, logical block 17412, lost async page write
[  240.974996] Buffer I/O error on dev vdb, logical block 17413, lost async page write
[  241.020641] Buffer I/O error on dev vdb, logical block 17414, lost async page write
[  241.025624] Buffer I/O error on dev vdb, logical block 17415, lost async page write
[  241.030745] Buffer I/O error on dev vdb, logical block 17416, lost async page write
[  241.036654] Buffer I/O error on dev vdb, logical block 17417, lost async page write
[  241.042028] blk_update_request: I/O error, dev vdb, sector 141296 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
dd: error writing '/dev/vdb': Input/output error
69+0 records in
[  241.059231] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  241.068695] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
68+0 records out[  241.078653] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

[  241.088089] ldm_validate_partition_table(): Disk read failed.
[  241.094056] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
71303168 bytes ([  241.103901] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
71 MB, 68 MiB) copied, 1.86998 s, 38.1 MB/s[  241.114189] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  241.124722] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  241.133705] Dev vdb: unable to read RDB block 0
[  241.138055] blk_update_request: I/O error, dev vdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

The bug might have been solved in a more recent version of qemu-nbd/qemu.
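
To summarize console output like the above when retesting, the guest kernel errors can be counted per device and operation. This is a rough sketch under my own assumptions: the function name `count_blk_errors` is hypothetical, and the pattern only covers the `blk_update_request` message format seen in this log. It searches the whole text rather than anchoring on line starts, because the console interleaves dd output with kernel messages.

```python
import re
from collections import Counter

# Pattern for the guest kernel message format shown in the log above.
BLK_ERR_RE = re.compile(
    r"blk_update_request: I/O error, dev (\w+), sector (\d+) "
    r"op 0x[0-9a-f]+:\((\w+)\)"
)


def count_blk_errors(log_text):
    """Count blk_update_request I/O errors keyed by (device, operation)."""
    counts = Counter()
    for dev, _sector, op in BLK_ERR_RE.findall(log_text):
        counts[(dev, op)] += 1
    return counts


# Three lines copied from the console output above.
SAMPLE = (
    "[  240.974002] blk_update_request: I/O error, dev vdb, sector 139264 "
    "op 0x1:(WRITE) flags 0x4800 phys_seg 254 prio class 0\n"
    "[  241.042028] blk_update_request: I/O error, dev vdb, sector 141296 "
    "op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0\n"
    "68+0 records out[  241.078653] blk_update_request: I/O error, dev vdb, "
    "sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0\n"
)
for key, n in sorted(count_blk_errors(SAMPLE).items()):
    print(key, n)
```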

Comment 14 John Ferlan 2020-10-02 20:52:30 UTC
Given comment 12, can we officially retest this? Thanks!

Comment 15 zixchen 2020-10-12 07:26:02 UTC
Tested with qemu-kvm-5.1.0-0.module+el8.3.0+7648+42900458.x86_64; the segmentation fault no longer occurs.

Version:
kernel-4.18.0-240.el8.x86_64
qemu-kvm-5.1.0-0.module+el8.3.0+7648+42900458.x86_64

Test Steps:
1. Export an image file with an installed OS over NBD
# qemu-nbd -f qcow2 nbd_system.qcow2 -p 10809 -t

2. Boot up a guest from the exported image
#/usr/libexec/qemu-kvm \
        -name 'guest' \
        -machine pc \
        -nodefaults \
        -vga qxl \
        -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=raw,file=nbd:localhost:10809 \
        -device virtio-blk-pci,id=virtio_blk_pci0,drive=drive_image1,bus=pci.0,addr=05,bootindex=0 \
        -vnc :0 \
        -monitor stdio \
        -m 8192 \
        -smp 8 \
        -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pci.0,addr=0x9  \
        -netdev tap,id=idxgXAlm \
        -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/czx/monitor-qmpmonitor1-20180220-094308-h9I6hRsI,server,nowait \
        -mon chardev=qmp_id_qmpmonitor1,mode=control  \
3. dd in the guest to the nbd data disk.
# dd if=/dev/urandom of=/dev/sdb bs=1M count=2048

4. Kill the connection from the server side before dd ends.
Ctrl + c

5. Exit the guest in qemu
(qemu) q
qemu-kvm: Failed to flush the L2 table cache: Input/output error
qemu-kvm: Failed to flush the refcount block cache: Input/output error

Actual result:
After step 4, the dd command succeeds and qemu does not dump core.

Expected Result:
Same as actual result.



Comment 16 zixchen 2020-10-12 08:05:06 UTC
Also tested with the latest qemu version, qemu-kvm-5.1.0-13.module+el8.3.0+8382+afc3bbea.x86_64; the steps and results are the same as above. The issue is fixed, so I am closing the bug.

