Bug 1673397

Summary: [RHEL.7] qemu-kvm core dumped after hotplug the deleted disk with iothread parameter
Product: Red Hat Enterprise Linux 7 Reporter: Markus Armbruster <armbru>
Component: qemu-kvm-rhev Assignee: Markus Armbruster <armbru>
Status: CLOSED ERRATA QA Contact: yujie ma <yujma>
Severity: high Docs Contact:
Priority: high    
Version: 7.5 CC: chayang, coli, juzhang, ngu, virt-maint, xuwei
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-29.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1656276 Environment:
Last Closed: 2019-08-22 09:19:59 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1656276    
Bug Blocks: 1673396, 1718992, 1722710    

Comment 4 Miroslav Rezanina 2019-05-17 15:47:20 UTC
Fix included in qemu-kvm-rhev-2.12.0-29.el7

Comment 6 yujie ma 2019-05-20 06:59:01 UTC
update:
1. Reproduced with kernel-4.18.0-45.el8.x86_64 + qemu-kvm-3.1.0-0.module+el8+2266+616cf026.next.candidate.x86_64

1) Boot guest with the following command line:
/usr/libexec/qemu-kvm \
    -name 'rhel7.7' \
    -machine q35 \
    -nodefaults \
    -vga qxl \
    -object   iothread,id=iothread0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -device   pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2 \
    -device virtio-scsi-pci,id=scsi0,iothread=iothread0,bus=pcie.0-root-port-2,addr=0x0 \
    -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/test/rhel77-64-virtio.qcow2,node-name=my_file1  \
    -blockdev driver=qcow2,node-name=file_image1,file=my_file1 \
    -device   pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device  virtio-blk-pci,id=image2,drive=file_image1,write-cache=on,iothread=iothread0,bus=pcie.0-root-port-3,bootindex=0 \
    -blockdev driver=raw,cache.direct=off,cache.no-flush=on,file.filename=/home/test/data.qcow2,node-name=data_disk1,file.driver=file \
    -device scsi-hd,drive=data_disk1,id=data1,bootindex=1 \
    -vnc :0  \
    -monitor stdio \
    -m 4096 \
    -smp 8 \
    -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0,addr=0x9  \
    -netdev tap,id=idxgXAlm \
    -qmp tcp:localhost:5902,server,nowait  \
    -device nec-usb-xhci,id=usb1,bus=pcie.0,addr=0x5 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \

2) Unplug the data disk:
{"execute":"device_del","arguments":{"id":"data1"}}

{"timestamp": {"seconds": 1558332129, "microseconds": 411530}, "event": "DEVICE_DELETED", "data": {"device": "data1", "path": "/machine/peripheral/data1"}}
{"return": {}}

3) Hotplug the deleted disk again; the command fails and QEMU exits:
{ 'execute':'device_add','arguments':{'driver':'scsi-hd','drive':'data_disk1','id':'data1'}}

Connection closed by foreign host.

4) Check the core dump file with gdb; it shows the following backtrace:
(gdb) bt
#0  0x00007f9b47147337 in raise () at /lib64/libc.so.6
#1  0x00007f9b47148a28 in abort () at /lib64/libc.so.6
#2  0x000055da028e66ef in error_exit (err=<optimized out>, msg=msg@entry=0x55da02de0a60 <__func__.18625> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3  0x000055da02c4c1bf in qemu_mutex_unlock_impl (mutex=mutex@entry=0x55da05a2d960, file=file@entry=0x55da02de003f "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
#4  0x000055da02c479a5 in aio_context_release (ctx=ctx@entry=0x55da05a2d900) at util/async.c:507
#5  0x000055da02bb1b78 in blk_prw (blk=blk@entry=0x55da06b14dc0, offset=offset@entry=0, buf=buf@entry=0x7fffe21e7b90 "QF", <incomplete sequence \373>, bytes=bytes@entry=512, co_entry=co_entry@entry=0x55da02bb30e0 <blk_read_entry>, flags=flags@entry=0) at block/block-backend.c:1263
#6  0x000055da02bb323a in blk_pread_unthrottled (count=512, buf=0x7fffe21e7b90, offset=0, blk=0x55da06b14dc0) at block/block-backend.c:1433
#7  0x000055da02bb323a in blk_pread_unthrottled (blk=blk@entry=0x55da06b14dc0, offset=offset@entry=0, buf=buf@entry=0x7fffe21e7b90 "QF", <incomplete sequence \373>, count=count@entry=512)
    at block/block-backend.c:1280
...
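
The backtrace shows the abort coming from qemu_mutex_unlock_impl, called by aio_context_release, so the unlock of the AioContext lock itself failed. One common way for an unlock to fail is a thread releasing a lock it does not hold. The backtrace alone does not prove that this is what happened here. The sketch below is only an analogy using Python's threading.RLock, not QEMU code, and the function name is hypothetical:

```python
import threading


def unlock_from_wrong_thread():
    """Return True if releasing a lock from a non-owner thread is rejected."""
    lock = threading.RLock()
    lock.acquire()  # the calling thread now owns the lock
    errors = []

    def worker():
        try:
            lock.release()  # this thread never acquired the lock
        except RuntimeError as exc:
            errors.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    lock.release()  # the real owner can still release it
    return len(errors) == 1
```

Python reports the misuse as an exception. In the backtrace above, the equivalent failure path goes through error_exit and abort, which is why the process dumps core instead of returning an error to the QMP client.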


2. Verified with 3.10.0-1040.el7.x86_64 + qemu-kvm-rhev-2.12.0-29.el7:
no core dump, and the disk is hotplugged successfully.

1) Boot guest with the following command line:
/usr/libexec/qemu-kvm \
    -name 'rhel7.7' \
    -machine q35 \
    -nodefaults \
    -vga qxl \
    -object   iothread,id=iothread0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -device   pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2 \
    -device virtio-scsi-pci,id=scsi0,iothread=iothread0,bus=pcie.0-root-port-2,addr=0x0 \
    -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/test/rhel77-64-virtio.qcow2,node-name=my_file1  \
    -blockdev driver=qcow2,node-name=file_image1,file=my_file1 \
    -device   pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device  virtio-blk-pci,id=image2,drive=file_image1,write-cache=on,iothread=iothread0,bus=pcie.0-root-port-3,bootindex=0 \
    -blockdev driver=raw,cache.direct=off,cache.no-flush=on,file.filename=/home/test/data.qcow2,node-name=data_disk1,file.driver=file \
    -device scsi-hd,drive=data_disk1,id=data1,bootindex=1 \
    -vnc :0  \
    -monitor stdio \
    -m 4096 \
    -smp 8 \
    -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b3,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0,addr=0x9  \
    -netdev tap,id=idxgXAlm \
    -qmp tcp:localhost:5902,server,nowait  \
    -device nec-usb-xhci,id=usb1,bus=pcie.0,addr=0x5 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \

2) Unplug the data disk:
{"execute":"device_del","arguments":{"id":"data1"}}

{"timestamp": {"seconds": 1558334175, "microseconds": 85720}, "event": "DEVICE_DELETED", "data": {"device": "data1", "path": "/machine/peripheral/data1"}}
{"return": {}}

3) Hotplug the deleted disk again; the command succeeds:
{ 'execute':'device_add','arguments':{'driver':'scsi-hd','drive':'data_disk1','id':'data1'}}
{"return": {}}

4) Run an I/O test on the data disk in the guest; it works normally:
# lsblk
sda                                 8:0    0 192.5K  0 disk 
# dd if=/dev/zero of=/dev/sda bs=1M count=1000 oflag=direct
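
The unplug and re-plug steps in this comment can also be scripted against the QMP socket opened with -qmp tcp:localhost:5902. The following is a minimal sketch, not the procedure used for verification. It assumes QMP sends one JSON message per line, and the helper names (qmp_command, is_device_deleted, replug) are hypothetical:

```python
import json
import socket


def qmp_command(name, **arguments):
    """Serialize one QMP command as a single line of JSON."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return json.dumps(message)


def is_device_deleted(message, device_id):
    """Return True for the DEVICE_DELETED event of the given device."""
    data = message.get("data", {})
    return message.get("event") == "DEVICE_DELETED" and data.get("device") == device_id


def replug(host="localhost", port=5902, device_id="data1", drive="data_disk1"):
    """Unplug a scsi-hd device, wait for DEVICE_DELETED, then add it back."""
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8")

        def send(name, **arguments):
            stream.write(qmp_command(name, **arguments) + "\n")
            stream.flush()

        def receive():
            line = stream.readline()
            if not line:
                # Seen by the reproducer as "Connection closed by foreign host."
                raise ConnectionError("QMP connection closed")
            message = json.loads(line)
            if "error" in message:
                raise RuntimeError(message["error"])
            return message

        def wait_for_return():
            while "return" not in receive():
                pass  # skip asynchronous events

        receive()                 # greeting sent by the server
        send("qmp_capabilities")  # required before any other command
        wait_for_return()

        # The command reply and the event may arrive in either order.
        send("device_del", id=device_id)
        acked = deleted = False
        while not (acked and deleted):
            message = receive()
            acked = acked or "return" in message
            deleted = deleted or is_device_deleted(message, device_id)

        send("device_add", driver="scsi-hd", drive=drive, id=device_id)
        wait_for_return()
```

Calling replug() against the guest started above would be expected to raise ConnectionError on an affected build, since QEMU aborts during device_add, and to return normally on a fixed build.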

Comment 8 errata-xmlrpc 2019-08-22 09:19:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553