Bug 1631227

Summary: Qemu Core dump when quit vm that's in status "paused(io-error)" with data plane enabled
Product: Red Hat Enterprise Linux 7
Reporter: aihua liang <aliang>
Component: qemu-kvm-rhev
Assignee: Kevin Wolf <kwolf>
Status: CLOSED ERRATA
QA Contact: yujie ma <yujma>
Severity: high
Priority: high
Version: 7.6
CC: aliang, chayang, coli, jomurphy, juzhang, lijin, ngu, phou, qzhang, timao, virt-maint, xuwei, yhong
Target Milestone: rc
Keywords: Regression
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qemu-kvm-rhev-2.12.0-28.el7
Clones: 1716347
Last Closed: 2019-08-22 09:18:53 UTC
Type: Bug
Bug Blocks: 1649160, 1651787, 1716347
Attachments: coredump (flags: none)

Description aihua liang 2018-09-20 09:12:10 UTC
Created attachment 1485067: coredump

Description of problem:
  QEMU dumps core when quitting a VM that is in status "paused (io-error)".

Version-Release number of selected component (if applicable):
 kernel version:3.10.0-945.el7.x86_64
 qemu-kvm-rhev version: qemu-kvm-rhev-2.12.0-17.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install a guest on a gluster backend.
2. Fill the gluster volume that stores the image until writes fail with "No space left on device".
3. Start the VM with the following QEMU command line:
    /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20180910-021412-u4bPHcZI,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idDQoT2q  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20180910-021412-u4bPHcZI,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20180910-021412-u4bPHcZI,path=/var/tmp/seabios-20180910-021412-u4bPHcZI,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180910-021412-u4bPHcZI,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -object iothread,id=iothread0 \
    -object secret,id=sec0,data=backing \
    -drive if=none,format=qcow2,id=drive_image1,cache=none,encrypt.key-secret=sec0,encrypt.format=luks,file=gluster://ibm-x3650m5-07.lab.eng.pek2.redhat.com/aliang/rhel76-64-virtio.qcow2.17 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=0x4,iothread=iothread0,scsi=off,serial=0xdafafafafafaf12121212121212121212121212121212121212121212aaafffffffff,physical_block_size=4096,logical_block_size=512,disable-modern=on,disable-legacy=off \
    -device virtio-net-pci,mac=9a:5b:5c:5d:5e:5f,id=idqHhiSX,vectors=4,netdev=idLUiEfL,bus=pci.0,addr=0x5  \
    -netdev tap,id=idLUiEfL,vhost=on \
    -m 11264  \
    -blockdev driver=raw,file.driver=file,node-name=drive_cd1,cache.no-flush=on,cache.direct=off,file.filename=/home/kvm_autotest_root/iso/linux/RHEL7.6-Server-x86_64.iso,read-only=on \
    -device ide-cd,id=cd1,drive=drive_cd1 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu Penryn \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=on,strict=off,order=cdn  \
    -enable-kvm \
    -monitor stdio \
    -blockdev driver=raw,file.driver=iscsi,node-name=drive_data2,cache.no-flush=on,cache.direct=off,file.transport=tcp,file.portal=10.73.224.153,file.target=iqn.2018-09.com.example:t1,file.lun=1 \
    -device virtio-blk-pci,id=data2,drive=drive_data2,bus=pci.0,addr=0x07 \
    -blockdev driver=qcow2,file.driver=file,node-name=drive_data1,cache.no-flush=on,cache.direct=off,file.filename=/home/data.qcow2 \
    -device virtio-blk-pci,id=data1,drive=drive_data1,bus=pci.0 \
    -blockdev driver=raw,node-name=drive_data0,cache.no-flush=on,cache.direct=off,file.filename=/dev/disk/by-path/ip-10.73.224.153:3260-iscsi-iqn.2018-09.com.example:t1-lun-2,file.driver=host_device \
    -device virtio-blk-pci,bus=pci.0,id=data0,drive=drive_data0,scsi=on,disable-modern=on \
    -qmp tcp:0:3000,server,nowait

4. Start some apps on guest, and check vm status by qmp monitor
   (qmp)#nc -U /var/tmp/monitor-qmpmonitor1-20180910-021412-u4bPHcZI
{"execute":"qmp_capabilities"}
{"timestamp": {"seconds": 1537433398, "microseconds": 190699}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block569", "reason": "No space left on device", "operation": "write", "action": "stop"}}
......
{"timestamp": {"seconds": 1537433398, "microseconds": 195533}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eother", "node-name": "#block569", "reason": "No medium found", "operation": "write", "action": "report"}}
{"timestamp": {"seconds": 1537433398, "microseconds": 195732}, "event": "STOP"}

5. After the "STOP" event arrives on the QMP monitor, check the VM status in HMP:
    (qemu) info status 
VM status: paused (io-error)

6. Quit the VM:
  (qemu)quit
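
When automating steps 4 and 5, the QMP event stream can be scanned for the pattern shown above: a BLOCK_IO_ERROR event with action "stop" followed by a STOP event. The following is a minimal sketch of that check, not part of the original test; the function name find_io_error_stop is hypothetical, and the sample lines are copied from the QMP output in step 4.

```python
import json


def find_io_error_stop(lines):
    """Return the data of the BLOCK_IO_ERROR event (action "stop") that is
    followed by a STOP event, or None if that sequence never appears."""
    pending = None
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and elisions such as "......"
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not complete JSON objects
        event = msg.get("event")
        data = msg.get("data", {})
        if event == "BLOCK_IO_ERROR" and data.get("action") == "stop":
            pending = data
        elif event == "STOP" and pending is not None:
            return pending
    return None


# Sample input taken from the QMP monitor output in this report.
sample = [
    '{"return": {}}',
    '{"timestamp": {"seconds": 1537433398, "microseconds": 190699}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block569", "reason": "No space left on device", "operation": "write", "action": "stop"}}',
    '......',
    '{"timestamp": {"seconds": 1537433398, "microseconds": 195533}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": false, "__com.redhat_reason": "eother", "node-name": "#block569", "reason": "No medium found", "operation": "write", "action": "report"}}',
    '{"timestamp": {"seconds": 1537433398, "microseconds": 195732}, "event": "STOP"}',
]

hit = find_io_error_stop(sample)
print(hit["device"], "-", hit["reason"])
```

Only the event with action "stop" is remembered, because events with action "report" do not pause the guest.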

Actual results:
 After step 6, QEMU aborts and dumps core:
  (qemu) quit 
qemu: qemu_mutex_unlock_impl: Operation not permitted
blockdev.txt: line 41:  7985 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' ... -qmp tcp:0:3000,server,nowait
(The arguments elided above are identical to the command line in step 3.)


 
[root@ibm-x3250m6-10 home]# gdb -c core.7985 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
[New LWP 7985]
[New LWP 8007]
[New LWP 7987]
[New LWP 8096]
[New LWP 8098]
[New LWP 8006]
[New LWP 7986]
[New LWP 8094]
[New LWP 8095]
Reading symbols from /usr/libexec/qemu-kvm...Reading symbols from /usr/lib/debug/usr/libexec/qemu-kvm.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name avocado-vt-vm1 -sandbox off -machine pc -nodefaults'.
Program terminated with signal 6, Aborted.
#0  0x00007f499243d207 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install boost-iostreams-1.53.0-27.el7.x86_64 boost-random-1.53.0-27.el7.x86_64 boost-system-1.53.0-27.el7.x86_64 boost-thread-1.53.0-27.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-gssapi-2.1.26-23.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 cyrus-sasl-md5-2.1.26-23.el7.x86_64 cyrus-sasl-plain-2.1.26-23.el7.x86_64 elfutils-libelf-0.172-2.el7.x86_64 elfutils-libs-0.172-2.el7.x86_64 expat-2.1.0-10.el7_3.x86_64 glib2-2.56.1-1.el7.x86_64 glibc-2.17-260.el7.x86_64 glusterfs-3.12.2-18.el7.x86_64 glusterfs-api-3.12.2-18.el7.x86_64 glusterfs-libs-3.12.2-18.el7.x86_64 gmp-6.0.0-15.el7.x86_64 gnutls-3.3.29-8.el7.x86_64 gperftools-libs-2.6.1-1.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64 libacl-2.2.51-14.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-59.el7.x86_64 libcacard-2.5.2-2.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libcurl-7.29.0-51.el7.x86_64 libdb-5.3.21-24.el7.x86_64 libdrm-2.4.91-3.el7.x86_64 libepoxy-1.5.2-1.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libibumad-17.2-3.el7.x86_64 libibverbs-17.2-3.el7.x86_64 libidn-1.28-4.el7.x86_64 libiscsi-1.9.0-7.el7.x86_64 libjpeg-turbo-1.2.90-6.el7.x86_64 libmount-2.23.2-59.el7.x86_64 libnl3-3.2.28-4.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 librados2-10.2.5-4.el7.x86_64 librbd1-10.2.5-4.el7.x86_64 librdmacm-17.2-3.el7.x86_64 libseccomp-2.3.1-3.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libssh2-1.4.3-12.el7.x86_64 libstdc++-4.8.5-36.el7.x86_64 libtasn1-4.10-1.el7.x86_64 libusbx-1.0.21-1.el7.x86_64 libuuid-2.23.2-59.el7.x86_64 libwayland-server-1.15.0-1.el7.x86_64 lz4-1.7.5-2.el7.x86_64 lzo-2.06-8.el7.x86_64 mesa-libgbm-18.0.5-3.el7.x86_64 nettle-2.7.1-8.el7.x86_64 nspr-4.19.0-1.el7_5.x86_64 nss-3.36.0-5.el7_5.x86_64 nss-softokn-freebl-3.36.0-5.el7_5.x86_64 
nss-util-3.36.0-1.el7_5.x86_64 numactl-libs-2.0.9-7.el7.x86_64 openldap-2.4.44-20.el7.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 opus-1.0.2-6.el7.x86_64 p11-kit-0.23.5-3.el7.x86_64 pcre-8.32-17.el7.x86_64 pixman-0.34.0-1.el7.x86_64 snappy-1.1.0-3.el7.x86_64 spice-server-0.14.0-6.el7.x86_64 sssd-client-1.16.2-13.el7.x86_64 systemd-libs-219-61.el7.x86_64 usbredir-0.7.1-3.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007f499243d207 in raise () at /lib64/libc.so.6
#1  0x00007f499243e8f8 in abort () at /lib64/libc.so.6
#2  0x000055a7cabf7e0f in error_exit (err=<optimized out>, msg=msg@entry=0x55a7cb0ef520 <__func__.18625> "qemu_mutex_unlock_impl")
    at util/qemu-thread-posix.c:36
#3  0x000055a7caf5b30f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x55a7ce209960, file=file@entry=0x55a7cb0eeaff "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
#4  0x000055a7caf56b05 in aio_context_release (ctx=ctx@entry=0x55a7ce209900) at util/async.c:507
#5  0x000055a7caed5528 in bdrv_prwv_co (child=child@entry=0x55a7ce56b220, offset=offset@entry=25770983424, qiov=qiov@entry=0x7ffcb8f74c90, is_write=is_write@entry=true, flags=flags@entry=0) at block/io.c:833
#6  0x000055a7caed58e9 in bdrv_pwrite (qiov=0x7ffcb8f74c90, offset=25770983424, child=0x55a7ce56b220) at block/io.c:969
#7  0x000055a7caed58e9 in bdrv_pwrite (child=0x55a7ce56b220, offset=25770983424, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
#8  0x000055a7caeb2595 in qcow2_cache_entry_flush (bs=bs@entry=0x55a7ce490800, c=c@entry=0x55a7ce2dfa80, i=i@entry=1) at block/qcow2-cache.c:227
#9  0x000055a7caeb26bd in qcow2_cache_write (bs=bs@entry=0x55a7ce490800, c=0x55a7ce2dfa80) at block/qcow2-cache.c:248
#10 0x000055a7caeb246e in qcow2_cache_flush (bs=bs@entry=0x55a7ce490800, c=<optimized out>) at block/qcow2-cache.c:259
#11 0x000055a7caeb251e in qcow2_cache_entry_flush (c=0x55a7ce2df900, c=0x55a7ce2df900, bs=0x55a7ce490800) at block/qcow2-cache.c:170
#12 0x000055a7caeb251e in qcow2_cache_entry_flush (bs=bs@entry=0x55a7ce490800, c=c@entry=0x55a7ce2df900, i=i@entry=6) at block/qcow2-cache.c:194
#13 0x000055a7caeb26bd in qcow2_cache_write (bs=bs@entry=0x55a7ce490800, c=0x55a7ce2df900) at block/qcow2-cache.c:248
#14 0x000055a7caeb246e in qcow2_cache_flush (bs=bs@entry=0x55a7ce490800, c=<optimized out>) at block/qcow2-cache.c:259
#15 0x000055a7caea396c in qcow2_inactivate (bs=bs@entry=0x55a7ce490800) at block/qcow2.c:2124
#16 0x000055a7caea3a3f in qcow2_close (bs=0x55a7ce490800) at block/qcow2.c:2153
#17 0x000055a7cae831c2 in bdrv_unref (bs=0x55a7ce490800) at block.c:3358
#18 0x000055a7cae831c2 in bdrv_unref (bs=0x55a7ce490800) at block.c:3542
#19 0x000055a7cae831c2 in bdrv_unref (bs=0x55a7ce490800) at block.c:4598
#20 0x000055a7caec4df1 in blk_remove_bs (blk=blk@entry=0x55a7ce316000) at block/block-backend.c:785
#21 0x000055a7caec4e4b in blk_remove_all_bs () at block/block-backend.c:483
#22 0x000055a7cae8075f in bdrv_close_all () at block.c:3412
#23 0x000055a7cabfb8db in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776
(gdb) 
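
The message "Operation not permitted" is the text for EPERM, which an ownership-checked mutex typically returns from unlock when the calling thread does not hold it. In the backtrace this happens in aio_context_release(), called from the main thread during bdrv_close_all(). The sketch below is only an analogy in Python, not QEMU code: threading.RLock stands in for the AioContext lock, and the helper names are hypothetical.

```python
import errno
import os
import threading


def try_release(lock):
    """Release `lock`; return False if the calling thread does not own it."""
    try:
        lock.release()
        return True
    except RuntimeError:
        # Python's counterpart of pthread_mutex_unlock() failing with EPERM.
        return False


def release_from_wrong_thread():
    """Let a worker thread hold the lock, then try to release it from the
    calling thread (analogy: main thread releasing a lock it never took)."""
    lock = threading.RLock()
    acquired = threading.Event()
    finish = threading.Event()

    def owner():
        with lock:
            acquired.set()
            finish.wait()

    worker = threading.Thread(target=owner)
    worker.start()
    acquired.wait()
    released = try_release(lock)  # caller is not the owner
    finish.set()
    worker.join()
    return released


print("release by non-owner succeeded:", release_from_wrong_thread())
print("EPERM text:", os.strerror(errno.EPERM))
```

Python raises an exception where QEMU's error_exit() calls abort(), but the underlying condition is the same: an unlock that is not paired with a lock taken by the same thread.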


Expected results:
 QEMU quits cleanly even when the VM is in "paused (io-error)" status.

Additional info:
  The issue exists since qemu-kvm-rhev-2.12.0-7.el7.x86_64 (inclusive).
  With qemu-kvm-rhev-2.12.0-6.el7.x86_64, QEMU quits successfully with the message: "qemu-kvm: Failed to flush the L2 table cache: No medium found".
  The core dump is attached.

Comment 5 Ademar Reis 2018-09-20 16:36:46 UTC
Looking at the list of commits introduced in 2.12.0-7, my initial suspicion would be this commit:

commit d53f85c0fbaf3b7936e7a4b0f019be986a39fe7a
Author: Kevin Wolf <kwolf>
Date:   Mon Jul 2 15:40:07 2018 +0200

    qcow2: Free allocated clusters on write error



Here's the full list of commits:

$ git shortlog -n qemu-kvm-rhev-2.12.0-6.el7..qemu-kvm-rhev-2.12.0-7.el7

Dr. David Alan Gilbert (18):
      migration: stop compressing page in migration thread
      migration: stop compression to allocate and free memory frequently
      migration: stop decompression to allocate and free memory frequently
      migration: detect compression and decompression errors
      migration: introduce control_save_page()
      migration: move some code to ram_save_host_page
      migration: move calling control_save_page to the common place
      migration: move calling save_zero_page to the common place
      migration: introduce save_normal_page()
      migration: remove ram_save_compressed_page()
      migration/block-dirty-bitmap: fix memory leak in dirty_bitmap_load_bits
      migration: fix saving normal page even if it's been compressed
      migration: update index field when delete or qsort RDMALocalBlock
      Migration+TLS: Fix crash due to double cleanup
      migration: introduce decompress-error-check
      migration: Don't activate block devices if using -S
      migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect
      migration/block-dirty-bitmap: fix dirty_bitmap_load

Fam Zheng (13):
      block: Introduce API for copy offloading
      raw: Check byte range uniformly
      raw: Implement copy offloading
      qcow2: Implement copy offloading
      file-posix: Implement bdrv_co_copy_range
      iscsi: Query and save device designator when opening
      iscsi: Create and use iscsi_co_wait_for_task
      iscsi: Implement copy offloading
      block-backend: Add blk_co_copy_range
      qemu-img: Convert with copy offloading
      qcow2: Fix src_offset in copy offloading
      iscsi: Don't blindly use designator length in response for memcpy
      file-posix: Fix EINTR handling

plai (10):
      vhost-user: add Net prefix to internal state structure
      virtio: support setting memory region based host notifier
      vhost-user: support receiving file descriptors in slave_read
      osdep: add wait.h compat macros
      vhost-user-bridge: support host notifier
      vhost: allow backends to filter memory sections
      vhost-user: allow slave to send fds via slave channel
      vhost-user: introduce shared vhost-user state
      vhost-user: support registering external host notifiers
      libvhost-user: support host notifier

Eduardo Habkost (4):
      i386: Define the Virt SSBD MSR and handling of it (CVE-2018-3639)
      i386: define the AMD 'virt-ssbd' CPUID feature bit (CVE-2018-3639)
      pc: Add rhel7.6.0 machine-types
      qemu-options: Add missing newline to -accel help text

Kevin Wolf (4):
      usb-storage: Add rerror/werror properties
      qemu-iotests: Update 026.out.nocache reference output
      qcow2: Free allocated clusters on write error
      qemu-iotests: Test qcow2 not leaking clusters on write error

Max Reitz (3):
      block/file-posix: Pass FD to locking helpers
      block/file-posix: File locking during creation
      iotests: Add creation test to 153

Gerd Hoffmann (2):
      Add qemu-keymap to qemu-kvm-tools
      usb-host: skip open on pending postload bh

Alex Williamson (1):
      vfio/pci: Default display option to "off"

Cornelia Huck (1):
      s390x/cpumodel: default enable bpb and ppa15 for z196 and later

Igor Mammedov (1):
      numa: clarify error message when node index is out of range in -numa dist, ...

Miroslav Rezanina (1):
      Update to qemu-kvm-ma-2.12.0-7.el7 / qemu-kvm-rhev-2.12.0-7.el7

Comment 13 Kevin Wolf 2019-04-17 16:08:04 UTC
Posted a patch upstream that should fix this: https://lists.gnu.org/archive/html/qemu-block/2019-04/msg00490.html

Comment 15 Miroslav Rezanina 2019-05-13 15:59:52 UTC
Fix included in qemu-kvm-rhev-2.12.0-28.el7
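
Combined with the regression range given in the description (first bad build 2.12.0-7.el7), this gives a simple window of affected builds. The sketch below shows one way to check a package name against that window. It assumes plain ".el7" builds of 2.12.0 with an integer release number; z-stream builds (for example "el7_N") are deliberately rejected because this report does not say which of them carry the fix. The function names are hypothetical.

```python
import re

FIRST_BAD = 7     # from the description: issue exists since 2.12.0-7.el7
FIRST_FIXED = 28  # from this comment: fix included in 2.12.0-28.el7

# Matches e.g. "qemu-kvm-rhev-2.12.0-17.el7.x86_64" or "qemu-kvm-rhev-2.12.0-28.el7",
# but not z-stream names such as "...-18.el7_6.4" (out of scope for this sketch).
_NVR = re.compile(r"qemu-kvm-rhev-2\.12\.0-(\d+)\.el7(?:\.|$)")


def release_number(nvr):
    """Extract the integer release from a qemu-kvm-rhev 2.12.0 el7 package name."""
    match = _NVR.search(nvr)
    if match is None:
        raise ValueError("unsupported package name: %r" % nvr)
    return int(match.group(1))


def is_affected(nvr):
    """True if the build lies in the window [FIRST_BAD, FIRST_FIXED)."""
    return FIRST_BAD <= release_number(nvr) < FIRST_FIXED


for name in ("qemu-kvm-rhev-2.12.0-6.el7.x86_64",
             "qemu-kvm-rhev-2.12.0-17.el7.x86_64",
             "qemu-kvm-rhev-2.12.0-29.el7"):
    print(name, "affected" if is_affected(name) else "not affected")
```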

Comment 17 yujie ma 2019-05-21 07:10:36 UTC
Reproduced this bug as follows:


Tested with:
kernel-3.10.0-1048.el7.x86_64
qemu-kvm-rhev-2.12.0-16.el7


Steps:
1. Mount the gluster volume on the local host and fill it completely;
# mount.glusterfs dhcp-8-206.nay.redhat.com:/vol1 /home/gluster/
# dd if=/dev/zero of=/home/gluster/test.bin bs=1M oflag=direct
dd: error writing ‘/home/gluster/test.bin’: No space left on device
21350+0 records in
21349+0 records out
22386311168 bytes (22 GB) copied, 2508.5 s, 8.9 MB/s

2. Boot the vm with the image stored on the gluster volume;
/usr/libexec/qemu-kvm \
    -name 'rhel7.7' \
    -machine q35 \
    -nodefaults \
    -vga qxl \
    -object   iothread,id=iothread0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -device   pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2 \
    -device virtio-scsi-pci,id=scsi0,iothread=iothread0,bus=pcie.0-root-port-2,addr=0x0 \
    -drive if=none,cache=none,format=qcow2,id=drive_image1,aio=native,file=gluster://gluster-virt-qe-01.lab.eng.pek2.redhat.com/vol1/rhel77-64-virtio.qcow2 \
    -device   pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device  virtio-blk-pci,id=image2,drive=drive_image1,write-cache=on,iothread=iothread0,bus=pcie.0-root-port-3,bootindex=0 \
    -blockdev driver=raw,cache.direct=off,cache.no-flush=on,file.filename=/home/IOtest/data.qcow2,node-name=data_disk1,file.driver=file \
    -device scsi-hd,drive=data_disk1,id=data1,bootindex=1 \
    -vnc :0  \
    -monitor stdio \
    -m 4096 \
    -smp 8 \
    -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0,addr=0x9  \
    -netdev tap,id=idxgXAlm \
    -qmp tcp:localhost:5902,server,nowait  \
    -device nec-usb-xhci,id=usb1,bus=pcie.0,addr=0x5 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1

3. Start some apps on guest, and check vm status by qmp monitor;
# telnet localhost 5902
{"execute":"qmp_capabilities"}
{"return": {}}

{"timestamp": {"seconds": 1558420428, "microseconds": 418941}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block247", "reason": "No space left on device", "operation": "read", "action": "report"}}
{"timestamp": {"seconds": 1558420429, "microseconds": 696417}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block247", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1558420429, "microseconds": 697998}, "event": "STOP"}

4. After we get "STOP" via qmp monitor, check vm status in hmp:
    (qemu) info status 
VM status: paused (io-error)

5. Quit vm;
  (qemu)quit
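
The HMP check in step 4 has a QMP counterpart, the query-status command, which is convenient when the test is scripted over the same TCP monitor. The sketch below only builds the request and interprets a reply; it does not open a socket. The reply string is an illustrative example of the command's return shape, not output captured in this report, and the function names are hypothetical.

```python
import json


def query_status_command():
    """JSON text to send on the QMP socket after qmp_capabilities."""
    return json.dumps({"execute": "query-status"})


def is_paused_on_io_error(reply_line):
    """True if a query-status reply reports a VM stopped in run state "io-error"."""
    reply = json.loads(reply_line)
    status = reply.get("return", {})
    return status.get("running") is False and status.get("status") == "io-error"


# Illustrative reply (assumption: not taken from the logs in this report).
example_reply = '{"return": {"status": "io-error", "singlestep": false, "running": false}}'

print(query_status_command())
print("paused on I/O error:", is_paused_on_io_error(example_reply))
```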


Actual results:
After step 5, QEMU aborts and dumps core:
(qemu) quit
qemu: qemu_mutex_unlock_impl: Operation not permitted
test.gluster.sh: line 23: 21985 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'rhel7.7' ... -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1
(The arguments elided above are identical to the command line in step 2.)

Checking the core dump with gdb gives the following backtrace:
(gdb) bt
#0  0x00007f6aba407377 in raise () at /lib64/libc.so.6
#1  0x00007f6aba408a68 in abort () at /lib64/libc.so.6
#2  0x0000561d26dbde0f in error_exit (err=<optimized out>, msg=msg@entry=0x561d272b5500 <__func__.18621> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3  0x0000561d271212df in qemu_mutex_unlock_impl (mutex=mutex@entry=0x561d29741960, file=file@entry=0x561d272b4adf "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
#4  0x0000561d2711cb05 in aio_context_release (ctx=ctx@entry=0x561d29741900) at util/async.c:507
#5  0x0000561d27098d18 in bdrv_flush (bs=<optimized out>) at block/io.c:2669
#6  0x0000561d27078483 in qcow2_cache_flush (bs=bs@entry=0x561d2981e800, c=<optimized out>)
    at block/qcow2-cache.c:262
#7  0x0000561d2706996c in qcow2_inactivate (bs=bs@entry=0x561d2981e800) at block/qcow2.c:2124
#8  0x0000561d27069a3f in qcow2_close (bs=0x561d2981e800) at block/qcow2.c:2153
#9  0x0000561d270491c2 in bdrv_unref (bs=0x561d2981e800) at block.c:3358
#10 0x0000561d270491c2 in bdrv_unref (bs=0x561d2981e800) at block.c:3542
#11 0x0000561d270491c2 in bdrv_unref (bs=0x561d2981e800) at block.c:4598
#12 0x0000561d2708adf1 in blk_remove_bs (blk=blk@entry=0x561d29830000)
    at block/block-backend.c:785
#13 0x0000561d2708ae4b in blk_remove_all_bs () at block/block-backend.c:483
#14 0x0000561d2704675f in bdrv_close_all () at block.c:3412
#15 0x0000561d26dc18db in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4776




Verified this bug as follows:


Tested with:
kernel-3.10.0-1040.el7.x86_64 
qemu-kvm-rhev-2.12.0-29.el7


Steps:
1. Mount the gluster volume on the local host and fill it completely;

# mount.glusterfs dhcp-8-206.nay.redhat.com:/vol1 /home/gluster/
# dd if=/dev/zero of=/home/gluster/test.bin bs=1M oflag=direct
dd: error writing ‘/home/gluster/test.bin’: No space left on device
21350+0 records in
21349+0 records out
22386311168 bytes (22 GB) copied, 2508.5 s, 8.9 MB/s

2. Boot the vm with the image stored on the gluster volume;
/usr/libexec/qemu-kvm \
    -name 'rhel7.7' \
    -machine q35 \
    -nodefaults \
    -vga qxl \
    -object   iothread,id=iothread0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -device   pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2 \
    -device virtio-scsi-pci,id=scsi0,iothread=iothread0,bus=pcie.0-root-port-2,addr=0x0 \
    -drive if=none,cache=none,format=qcow2,id=drive_image1,aio=native,file=gluster://gluster-virt-qe-01.lab.eng.pek2.redhat.com/vol1/rhel77-64-virtio.qcow2 \
    -device   pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device  virtio-blk-pci,id=image2,drive=drive_image1,write-cache=on,iothread=iothread0,bus=pcie.0-root-port-3,bootindex=0 \
    -blockdev driver=raw,cache.direct=off,cache.no-flush=on,file.filename=/home/IOtest/data.qcow2,node-name=data_disk1,file.driver=file \
    -device scsi-hd,drive=data_disk1,id=data1,bootindex=1 \
    -vnc :0  \
    -monitor stdio \
    -m 4096 \
    -smp 8 \
    -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0,addr=0x9  \
    -netdev tap,id=idxgXAlm \
    -qmp tcp:localhost:5902,server,nowait  \
    -device nec-usb-xhci,id=usb1,bus=pcie.0,addr=0x5 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1

3. Start some apps on guest, and check vm status by qmp monitor;
# telnet localhost 5902
{"execute":"qmp_capabilities"}
{"return": {}}

{"timestamp": {"seconds": 1558421625, "microseconds": 697088}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block212", "reason": "No space left on device", "operation": "read", "action": "report"}}
{"timestamp": {"seconds": 1558421626, "microseconds": 74598}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive_image1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block212", "reason": "No space left on device", "operation": "write", "action": "stop"}}
{"timestamp": {"seconds": 1558421626, "microseconds": 76967}, "event": "STOP"}

4. After we get "STOP" via qmp monitor, check vm status in hmp:
    (qemu) info status 
VM status: paused (io-error)

5. Quit vm;
  (qemu)quit


Actual result:
No core dump; QEMU quits successfully even though the VM is in "paused (io-error)" status.
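
If this verification is scripted, the two outcomes can be told apart by the exit status of the QEMU process: the failing builds die from signal 6 (SIGABRT), while a fixed build exits normally. The sketch below classifies a return code as Python's subprocess module reports it on POSIX, where a negative value -N means the child was terminated by signal N. The function name describe_exit is hypothetical.

```python
import signal


def describe_exit(returncode):
    """Describe a subprocess-style return code (negative means killed by a signal)."""
    if returncode == 0:
        return "exited cleanly"
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


# The failing runs in this report ended with "Aborted (core dumped)", i.e. SIGABRT.
print(describe_exit(-int(signal.SIGABRT)))
print(describe_exit(0))
```

In a test harness the argument would typically come from subprocess.run(...).returncode for the qemu-kvm command line used above.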

Comment 19 errata-xmlrpc 2019-08-22 09:18:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553