Bug 2185688

Summary: [qemu-kvm] no response with QMP command block_resize
Product: Red Hat Enterprise Linux 9
Component: qemu-kvm
qemu-kvm sub component: virtio-blk,scsi
Version: 9.3
Keywords: CustomerScenariosInitiative, Regression, Triaged
Status: CLOSED ERRATA
Severity: high
Priority: high
Hardware: x86_64
OS: All
Target Milestone: rc
Fixed In Version: qemu-kvm-8.0.0-4.el9
Reporter: qing.wang <qinwang>
Assignee: Kevin Wolf <kwolf>
QA Contact: qing.wang <qinwang>
CC: aliang, chayang, coli, hreitz, jinzhao, juzhang, kwolf, lijin, meili, mrezanin, qizhu, vgoyal, virt-maint, xuwei, zhenyzha
Last Closed: 2023-11-07 08:27:12 UTC

Description qing.wang 2023-04-11 01:29:34 UTC
Description of problem:
No response is returned after executing the QMP command "block_resize".

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux release 9.2 Beta (Plow)
5.14.0-289.el9.x86_64
qemu-kvm-8.0.0-0.rc1.el9.candidate.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Create image file
/usr/bin/qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg.qcow2 10G

2. Boot VM
/usr/libexec/qemu-kvm \
     -name 'avocado-vt-vm1'  \
     -sandbox on  \
     -machine q35,memory-backend=mem-machine_mem \
     -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
     -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}'  \
     -nodefaults \
     -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
     -m 12288 \
     -object '{"size": 12884901888, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
     -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2  \
     -cpu host,+kvm_pv_unhalt \
     \
     -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
     -device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
     -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
     -object '{"qom-type": "iothread", "id": "iothread0"}' \
     -object '{"qom-type": "iothread", "id": "iothread1"}' \
     -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
     -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0", "iothread": "iothread0"}' \
     -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel920-64-virtio-scsi.qcow2", "cache": {"direct": true, "no-flush": false}}' \
     -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
     -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
     -blockdev '{"node-name": "file_stg", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/stg.qcow2", "cache": {"direct": true, "no-flush": false}}' \
     -blockdev '{"node-name": "drive_stg", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_stg"}' \
     -device '{"driver": "scsi-hd", "id": "stg", "drive": "drive_stg", "write-cache": "on", "serial": "TARGET_DISK0"}' \
     -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
     -device '{"driver": "virtio-net-pci", "mac": "9a:dd:5d:44:6b:fb", "id": "idPy6FFL", "netdev": "idTvtSK1", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
     -netdev tap,id=idTvtSK1,vhost=on  \
     -vnc :5  \
     -monitor stdio \
     -qmp tcp:0:5955,server=on,wait=off \
     -rtc base=utc,clock=host,driftfix=slew  \
     -boot menu=off,order=cdn,once=c,strict=off \
     -enable-kvm \
     -device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}' \
     -chardev socket,id=socket-serial,path=/var/tmp/socket-serial,logfile=/var/tmp/file-serial.log,mux=on,server=on,wait=off \
     -serial chardev:socket-serial \
    -chardev file,path=/var/tmp/file-bios.log,id=file-bios \
    -device isa-debugcon,chardev=file-bios,iobase=0x402 \
    \

3. Log in to the guest and format the data disk (optional step)
disk=`lsblk -nd |grep 10G|awk '{print $1}'`
echo "$disk"

yes|mkfs.ext4 "/dev/${disk}"
rm -rf /mnt/${disk}; mkdir /mnt/${disk}
mount -t ext4 /dev/${disk} /mnt/${disk}
lsblk 
mount |grep ${disk}


4. Execute the QMP command
{'execute': 'block_resize', 'arguments': {'node-name': 'drive_stg', 'size': 16106127360}, 'id': 'qswyaxRs'}
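
The VM above exposes QMP on TCP port 5955 (-qmp tcp:0:5955,server=on,wait=off). As a reference, the command can be sent with a minimal shell sketch like the following; it assumes nc is available on the host, and note that QMP requires the qmp_capabilities handshake before any other command:

{
    printf '{"execute": "qmp_capabilities"}\n'
    sleep 1
    printf '{"execute": "block_resize", "arguments": {"node-name": "drive_stg", "size": 16106127360}, "id": "qswyaxRs"}\n'
    sleep 5    # keep the connection open for the reply; on the affected build no reply ever arrives
} | nc localhost 5955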

Actual results:
No response is returned for the block_resize command.


Expected results:
A response is returned for the block_resize command.
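
On a working build this is an ordinary QMP success reply echoing the request id, along these lines (exact formatting may vary):

{"return": {}, "id": "qswyaxRs"}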

Additional info:

This issue does not occur on:
Red Hat Enterprise Linux release 9.2 Beta (Plow)
5.14.0-284.2.1.el9_2.x86_64
qemu-kvm-7.2.0-14.el9_2.x86_64
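
For reference, the version information above can be gathered on the host with standard commands, for example:

cat /etc/redhat-release   # distribution release string
uname -r                  # running kernel
rpm -q qemu-kvm           # installed qemu-kvm build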

Comment 4 Hanna Czenczek 2023-04-20 09:28:02 UTC
I think this is a different bug; here’s the stack trace of what seems to hang:

(gdb) bt
#0  0x00007fd863a909a0 in  () at /usr/lib/libc.so.6
#1  0x00007fd863a96efa in pthread_mutex_lock () at /usr/lib/libc.so.6
#2  0x000055dd05f770c3 in qemu_mutex_lock_impl (mutex=0x55dd09405ac0, file=0x55dd061a60c7 "../util/async.c", line=697) at ../util/qemu-thread-posix.c:94
#3  0x000055dd05e64900 in bdrv_co_drain_bh_cb (opaque=0x7fd85e6cfc90) at ../block/io.c:278
#4  0x000055dd05f88345 in aio_bh_call (bh=0x55dd09708d20) at ../util/async.c:155
#5  aio_bh_poll (ctx=ctx@entry=0x55dd09405a60) at ../util/async.c:184
#6  0x000055dd05f73e6b in aio_poll (ctx=0x55dd09405a60, blocking=blocking@entry=true) at ../util/aio-posix.c:721
#7  0x000055dd05e2ad66 in iothread_run (opaque=opaque@entry=0x55dd090e9090) at ../iothread.c:63
#8  0x000055dd05f76ce8 in qemu_thread_start (args=0x55dd094060a0) at ../util/qemu-thread-posix.c:541

Notably, as written in comment 0, I/O is completely optional.  Even when you start an empty VM (no guest) with -S, block_resize still won’t return.

(When you do have a guest, as far as I can see, QEMU doesn’t fully hang, but I believe that, starting from block_resize, all I/O going through the resized node’s iothread hangs, which effectively makes the guest hang.)

Something seems to keep the AioContext acquired and isn’t releasing it, but I don’t know what yet.  Sounds a bit like the secondary bdrv_drain_all_end() thing from bug 2186725.
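
For reference, a backtrace like the one above can typically be captured by attaching gdb to the hung QEMU process and dumping all threads; a sketch, assuming gdb and pgrep are available on the host:

gdb -p "$(pgrep -f qemu-kvm | head -n 1)" -batch -ex 'thread apply all bt'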

Comment 5 Hanna Czenczek 2023-04-20 10:23:58 UTC
Some debugging later, I’m not entirely sure what the problem is, but here’s my best guess:

1. qmp_block_resize(), near its end, calls bdrv_co_lock(bs), locking the AioContext.
2. We go down this chain: blk_unref(blk) -> blk_delete() -> blk_remove_bs() -> bdrv_root_unref_child() -> ... -> bdrv_graph_wrlock() -> bdrv_drain_all_begin_nopoll()
3. Iterating over all nodes, we lock the AioContext again[1].
4. bdrv_do_drained_begin() -> bdrv_co_yield_to_drain()
5. We release the AioContext so that we can schedule a BH in it and have it actually run; but the context is still locked once by the bdrv_co_lock() call from qmp_block_resize()
6. The hang occurs with the stack trace shown in comment 4: iothread_run() -> aio_poll() -> aio_bh_call() wants to run the BH, but can’t, because the context is still locked from bdrv_co_lock()

[1] I haven’t quite understood at this point whether nested aio_context_acquire() is possible; AFAIR it was allowed in the past, but the implementation is just a mutex, so it doesn’t look like it should still be possible. In any case, this nested lock is not where we actually hang, so apparently it is possible after all.

So I think what has caused this bug to appear is that bdrv_graph_wrlock() runs bdrv_drain_all(), which apparently cannot work while any AioContext is acquired. In the end, just like in bug 2186725, bdrv_graph_wrlock() mustn’t be called with AioContexts acquired; there is just an additional reason here why it doesn’t work (not only that read lock owners in any AioContext will deadlock).

Comment 7 qing.wang 2023-04-26 06:09:03 UTC
Hit the same issue on:
Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-303.el9.x86_64
qemu-kvm-8.0.0-1.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-2.el9.noarch
libvirt-9.0.0-10.el9_2.x86_64
virtio-win-prewhql-0.1-235.iso

Comment 8 Kevin Wolf 2023-05-05 12:05:15 UTC
I actually already sent an upstream patch that fixes this. I just didn't make the connection with this BZ.

https://patchew.org/QEMU/20230504115750.54437-1-kwolf@redhat.com/20230504115750.54437-5-kwolf@redhat.com/

Comment 11 Yanan Fu 2023-05-24 09:20:13 UTC
QE bot (pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 14 qing.wang 2023-05-26 03:03:36 UTC
Passed the test with the steps from comment 0 on:

Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-316.el9.x86_64
qemu-kvm-8.0.0-4.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-4.el9.noarch
libvirt-9.3.0-2.el9.x86_64
virtio-win-prewhql-0.1-237.iso

Comment 16 errata-xmlrpc 2023-11-07 08:27:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368
