Bug 1879437
Summary: Qemu coredump when refreshing block limits on an actively used iothread block device [rhel.9]

Product: Red Hat Enterprise Linux 9
Component: qemu-kvm
Sub component: Storage
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: medium
Priority: low
Keywords: Triaged
Target Milestone: rc
Fixed In Version: qemu-kvm-7.0.0-1.el9
Last Closed: 2022-11-15 09:53:23 UTC
Type: Bug

Reporter: aihua liang <aliang>
Assignee: Hanna Czenczek <hreitz>
QA Contact: qing.wang <qinwang>
CC: coli, hreitz, jinzhao, juzhang, kkiwi, mrezanin, ngu, qinwang, qzhang, virt-maint, yfu, zhencliu
Clone(s): 2072932 (view as bug list)
Bug Depends On: 2064757
Bug Blocks: 2072932
Description (aihua liang, 2020-09-16 10:01:55 UTC)
Coredump file located at: 10.73.194.27:/vol/s2coredump/bz1879437/core.qemu-kvm.0.25d3d2d17fce4fcc9f5305deee6423d2.104317.1600249352000000.lz4

I cannot reproduce this issue on:
4.18.0-236.el8.x86_64
qemu-kvm-common-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64
seabios-1.14.0-1.module+el8.3.0+7638+07cf13d2.x86_64
edk2-ovmf-20200602gitca407c7246bf-3.el8.noarch

Steps:

1. boot the VM:

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine q35 \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1 \
    -device pvpanic,ioport=0x505,id=idZcGD6F \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
    -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-4,addr=0x0 \
    -object iothread,id=iothread0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0 \
    -blockdev node-name=data_image1,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/kvm_autotest_root/images/stg1.qcow2,aio=threads \
    -blockdev node-name=data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=data_image1 \
    -device virtio-blk-pci,id=disk1,drive=data1,write-cache=on,bus=pcie.0-root-port-7,iothread=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:55:56:57:58:59,id=id18Xcuo,netdev=idGRsMas,bus=pcie.0-root-port-5,addr=0x0 \
    -netdev tap,id=idGRsMas,vhost=on \
    -m 4096 \
    -smp 24,maxcpus=24,cores=12,threads=1,sockets=2 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :5 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -monitor stdio \
    -qmp tcp:0:5955,server,nowait

2. add a non-existent image (is this step necessary?):

{'execute':'qmp_capabilities'}
{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch.img"},"backing":"drive_image1"}}

3. create the image:

qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/scratch.img

4. re-add the image:

{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch.img"},"backing":"drive_image1"}}

It passed.

(In reply to qing.wang from comment #2)
> I can not reproduce this issue on
> 4.18.0-236.el8.x86_64
> qemu-kvm-common-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64
> seabios-1.14.0-1.module+el8.3.0+7638+07cf13d2.x86_64
> edk2-ovmf-20200602gitca407c7246bf-3.el8.noarch

This is the same QEMU version as reported in the original description (comment #0), so I'll assume this is just difficult to reproduce.

Reproduced this issue with the following steps:

1. rm -rf /home/kvm_autotest_root/images/scratch.img
2. 
boot vm /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -machine q35 \ -nodefaults \ -device VGA,bus=pcie.0,addr=0x1 \ -device pvpanic,ioport=0x505,id=idZcGD6F \ -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \ -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \ -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \ -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \ -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-4,addr=0x0 \ \ -object iothread,id=iothread0 \ -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0 \ \ -blockdev node-name=data_image1,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/kvm_autotest_root/images/stg1.qcow2,aio=threads \ -blockdev node-name=data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=data_image1 \ -device virtio-blk-pci,id=disk1,drive=data1,write-cache=on,bus=pcie.0-root-port-7,iothread=iothread0 \ -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \ -device virtio-net-pci,mac=9a:55:56:57:58:59,id=id18Xcuo,netdev=idGRsMas,bus=pcie.0-root-port-5,addr=0x0 \ -netdev tap,id=idGRsMas,vhost=on \ -m 4096 \ -smp 24,maxcpus=24,cores=12,threads=1,sockets=2 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -vnc :5 \ -rtc base=localtime,clock=host,driftfix=slew \ -boot order=cdn,once=c,menu=off,strict=off \ -enable-kvm \ -device pcie-root-port,id=pcie_extra_root_port_0,slot=6,chassis=6,addr=0x6,bus=pcie.0 \ -monitor stdio \ -qmp tcp:0:5955,server,nowait \ 3 create backing image when vm 
is booting:

qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/scratch.img

4. hotplug the node:

{'execute':'qmp_capabilities'}
{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch.img"},"backing":"drive_image1"}}

It crashed in about 20% of the runs. So this issue is unrelated to re-adding an image that previously failed to add with "No such file or directory". The backtrace is the same as in comment 0.

(gdb) bt
#0  0x00007f309a67c7ff in raise () at /lib64/libc.so.6
#1  0x00007f309a666c35 in abort () at /lib64/libc.so.6
#2  0x00007f309a666b09 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f309a674de6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x000055f8aff16a5f in bdrv_aligned_preadv (child=child@entry=0x55f8b27a9e00, req=req@entry=0x7f2f141da990, offset=<optimized out>, bytes=0, align=<optimized out>, qiov=0x7f2f141daa20, qiov_offset=0, flags=0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/io.c:1464
#5  0x000055f8aff17021 in bdrv_co_preadv_part (child=0x55f8b27a9e00, offset=<optimized out>, bytes=<optimized out>, bytes@entry=4096, qiov=<optimized out>, qiov@entry=0x7f30840146a8, qiov_offset=<optimized out>, qiov_offset@entry=0, flags=flags@entry=0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/io.c:1757
#6  0x000055f8afee2ec9 in qcow2_co_preadv_task (qiov_offset=0, qiov=0x7f30840146a8, bytes=4096, offset=<optimized out>, file_cluster_offset=<optimized out>, cluster_type=<optimized out>, bs=0x55f8b285f000) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/qcow2.c:2254
#7  0x000055f8afee2ec9 in qcow2_co_preadv_task_entry (task=<optimized out>) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/qcow2.c:2271
#8  0x000055f8afee0fd5 in qcow2_add_task
(bs=bs@entry=0x55f8b285f000, pool=pool@entry=0x0, func=func@entry=0x55f8afee2df0 <qcow2_co_preadv_task_entry>, cluster_type=cluster_type@entry=QCOW2_CLUSTER_NORMAL, file_cluster_offset=2072444928, offset=offset@entry=17403441152, bytes=4096, qiov=0x7f30840146a8, qiov_offset=0, l2meta=0x0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/qcow2.c:2211
#9  0x000055f8afee270d in qcow2_co_preadv_part (bs=0x55f8b285f000, offset=17403441152, bytes=4096, qiov=0x7f30840146a8, qiov_offset=0, flags=<optimized out>) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/qcow2.c:2310
#10 0x000055f8aff131c8 in bdrv_driver_preadv (bs=bs@entry=0x55f8b285f000, offset=offset@entry=17403441152, bytes=bytes@entry=4096, qiov=qiov@entry=0x7f30840146a8, qiov_offset=qiov_offset@entry=0, flags=0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/io.c:1129
#11 0x000055f8aff16a86 in bdrv_aligned_preadv (child=child@entry=0x55f8b324e380, req=req@entry=0x7f2f141dae60, offset=17403441152, bytes=4096, align=1, qiov=0x7f30840146a8, qiov_offset=0, flags=0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/io.c:1515

(In reply to aihua liang from comment #0)

I wonder how we went from bytes=4096 in frame #5:

#5 0x000055d1bc2d3021 in bdrv_co_preadv_part (child=0x55d1bdd4e380, offset=<optimized out>, bytes=<optimized out>, bytes@entry=4096, qiov=<optimized out>, qiov@entry=0x7f63a001bc68, qiov_offset=<optimized out>, qiov_offset@entry=0, flags=flags@entry=0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/io.c:1757

to bytes=0 in frame #4:

#4 0x000055d1bc2d2a5f in bdrv_aligned_preadv (child=child@entry=0x55d1bdd4e380, req=req@entry=0x7f60d5bff990, offset=<optimized out>, bytes=0, align=<optimized out>, qiov=0x7f60d5bffa20, qiov_offset=0, flags=0) at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/block/io.c:1464

The only call updating &bytes is bdrv_pad_request():

    *bytes += pad->head + pad->tail;

Unlikely, but could *bytes wrap?

FTR, subsequent changes in bdrv_pad_request():
98ca45494fc ("block/io: bdrv_pad_request(): support qemu_iovec_init_extended failure")
87ab8802524 ("block: Fix in_flight leak in request padding error path")

Move RHEL-AV bugs to RHEL9. If it is necessary to resolve this in RHEL8, clone to the current RHEL8 release.

Hanna, it looks like the last real update to this bug was in Aug/2021 - still, it's a crash and we need to start converging on what to do for RHEL9. Can you take this one and advise on next steps?

I can’t access the core dump linked in comment 1 (ssh-ing there yields “Invalid key length”), and I suspect the core dump may have been deleted from the server in the meantime anyway. Just from the backtrace, I can’t find anything. I was wondering whether this really has anything to do with re-adding an image after ENOENT, but comment 4 has already confirmed that the previous ENOENT has nothing to do with the assertion failure.

What Philippe noted in comment 7 is indeed interesting, but without a core dump to find out e.g. what the BDS’s alignment is (child->bs->bl.request_alignment in frame 4), I can’t know in what way the alignment is not a power of two[1], whether Philippe’s guess that @bytes overflowed is correct (it does seem likely), and if so, why it overflows to exactly 0, and what the connection to the reportedly non-power-of-two alignment is.

As for the next steps: I can and will try reproducing it, too, but I’m not expecting to find anything. A core dump on a more recent version could help, I believe, and perhaps some more information about the image that’s added with blockdev-add (what filesystem it is on, the `qemu-img info` output on it, and the `qemu-img info` output on the backing image).

[1] The assertion failure comes from the protocol BDS (driven by the file-posix block driver) having a non-power-of-two alignment, i.e. it could be 0 or a positive non-power-of-two.
It’s set during BDS creation, and when refreshed, block/file-posix.c:raw_probe_alignment() ensures that it isn’t 0, so it’s more likely to be a positive non-power-of-two. Then again, this is asserted when the BDS is created (bdrv_open_driver() has an is_power_of_2(bs->bl.request_alignment) assertion), so it’s not clear how the alignment would change to no longer be a power of two afterwards.

To my surprise, I was able to reproduce this, but only after I added a sleep():

diff --git a/block/file-posix.c b/block/file-posix.c
index 9a00d4190a..37d628cb7d 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -365,6 +365,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     }
 
     bs->bl.request_alignment = 0;
+    sleep(1);
     s->buf_align = 0;
     /* Let's try to use the logical blocksize for the alignment. */
     if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) {

So the problem is that raw_probe_alignment() (and in fact bdrv_refresh_limits() as a whole) isn’t an atomic operation, so while the limits are refreshed, they may be (and are) invalid for some time, and if the block device is in an iothread, concurrent requests may stumble over these invalid limits.

This also kind of reproduces upstream, although in a weaker form: Commits 4c002cef0e9 and 98ca45494fc (the latter already noted by Philippe) have made it so that the `alignment == 0` case leads to an error (passed to the guest) before we reach the assertion.

I believe to fix this, we need to drain the BDS in bdrv_refresh_limits(). (Which makes sense, actually, because it seems natural to drain all requests before changing the limits.)

I do not hit this issue on:
Red Hat Enterprise Linux release 9.0 Beta (Plow)
5.14.0-56.el9.x86_64
qemu-kvm-6.2.0-7.el9.x86_64
seabios-bin-1.15.0-1.el9.noarch

1. rm -rf /home/kvm_autotest_root/images/scratch*.img
2. 
boot vm /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -machine q35 \ -nodefaults \ -device VGA,bus=pcie.0,addr=0x1 \ -device pvpanic,ioport=0x505,id=idZcGD6F \ -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \ -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \ -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \ -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \ -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-4,addr=0x0 \ \ -object iothread,id=iothread0 \ -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0 \ \ -blockdev node-name=data_image1,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/kvm_autotest_root/images/stg1.qcow2,aio=threads \ -blockdev node-name=data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=data_image1 \ -device virtio-blk-pci,id=disk1,drive=data1,write-cache=on,bus=pcie.0-root-port-7,iothread=iothread0 \ -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \ -device virtio-net-pci,mac=9a:55:56:57:58:59,id=id18Xcuo,netdev=idGRsMas,bus=pcie.0-root-port-5,addr=0x0 \ -netdev tap,id=idGRsMas,vhost=on \ -m 4096 \ -smp 24,maxcpus=24,cores=12,threads=1,sockets=2 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -vnc :5 \ -rtc base=localtime,clock=host,driftfix=slew \ -boot order=cdn,once=c,menu=off,strict=off \ -enable-kvm \ -device pcie-root-port,id=pcie_extra_root_port_0,slot=6,chassis=6,addr=0x6,bus=pcie.0 \ -monitor stdio \ -qmp tcp:0:5955,server,nowait \ 3 create backing images during 
vm booting, one per second:

qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/scratch0.img
sleep 1
...
qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/scratch19.img

4. hotplug the nodes:

{'execute':'qmp_capabilities'}
{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp0","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch0.img"},"backing":"drive_image1"}}
...
{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp19","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch19.img"},"backing":"drive_image1"}}

Hi Hanna, could you please try qemu 6.2?

Yes, I hit the same issue in RHEL 9’s 6.2 and upstream (usually manifests as a floating-point exception). I’ll attach my reproducer script.

Created attachment 1861167 [details]
Reproducer
I executed the script from comment #17; it crashed in QSD. It gets a different crash, please refer to http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qbugs/1879437/2022-02-16/ - but this issue was reported against qemu, and the backtrace is not the same as in comment #5, so is it the same issue?

I also changed my test steps; they hit an I/O error, but no crash - is that the same issue?

1. create the backing file:

rm /home/kvm_autotest_root/images/scratch.img -rf
qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2 -F qcow2 /home/kvm_autotest_root/images/scratch.img

2. boot the VM:

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine q35 \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1 \
    -device pvpanic,ioport=0x505,id=idZcGD6F \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
    -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-4,addr=0x0 \
    -object iothread,id=iothread0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0,werror=stop,rerror=stop \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:55:56:57:58:59,id=id18Xcuo,netdev=idGRsMas,bus=pcie.0-root-port-5,addr=0x0 \
    -netdev tap,id=idGRsMas,vhost=on \
    -m 4096 \
    -smp 24,maxcpus=24,cores=12,threads=1,sockets=2 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :5 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -monitor stdio \
    -qmp tcp:0:5955,server,nowait

3. repeatedly execute blockdev-add/del of the backing node during VM boot:

{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch.img"},"backing":"drive_image1"}}
{"return": {}}
{"execute": "blockdev-del", "arguments": {"node-name": "tmp"}}
{"return": {}}

4. the guest enters paused state and QMP reports BLOCK_IO_ERROR:

{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch.img"},"backing":"drive_image1"}}
{"timestamp": {"seconds": 1644982200, "microseconds": 117805}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "drive_image1", "reason": "Invalid argument", "operation": "read", "action": "stop"}}
{"execute": "blockdev-del", "arguments": {"node-name": "tmp"}}
{"timestamp": {"seconds": 1644982200, "microseconds": 117828}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "drive_image1", "reason": "Invalid argument", "operation": "read", "action": "stop"}}
{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch.img"},"backing":"drive_image1"}}
{"timestamp": {"seconds": 1644982200, "microseconds": 118230}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "drive_image1", "reason": "Invalid argument", "operation": "read", "action": "stop"}}
{"execute": "blockdev-del", "arguments": {"node-name": "tmp"}}
{"return": {}}

I believe it’s the same issue. The original issue was that the request alignment is not a power of two, and so an assertion failed. 
I believe the broader issue is that bdrv_refresh_limits() makes the request alignment invalid for a short time, and so concurrent I/O requests can see these invalid values. That can manifest in different ways, for example in said assertion failure, or in another assertion, or in a floating-point exception. It’s absolutely possible that these BLOCK_IO_ERRORs happen because of this, too.

As for QSD vs. QEMU, the block layer is the same between the two. What reproduced the bug for me was someone performing I/O requests on a block node, and then repeatedly adding/removing an overlay to/from that node (with blockdev-add/-del). It didn’t matter whether that I/O came from a guest or from an NBD server, so without the need for a guest, I used QSD for testing.

I’ve created a downstream build with a patch that I hope fixes the bdrv_refresh_limits() issue, could you please test whether that resolves the BLOCK_IO_ERRORs?
It’s here: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=43104517
The yum repo for the built RPMs is here: http://brew-task-repos.usersys.redhat.com/repos/scratch/hreitz/qemu-kvm/6.2.0/8.el9.hreitz202202160951/

Thanks!

Setting ITR to 9.1.0, because I don’t think this is critical (very low chance of hitting this when doing blockdev-add on top of an existing node in an I/O thread, which is actively serving I/O), and because it’d be very tough getting this in in time for DTM/ITM 26.

(In reply to Hanna Reitz from comment #19)
> I’ve created a downstream build with a patch that I hope fixes the
> bdrv_refresh_limits() issue, could you please test whether that resolves the
> BLOCK_IO_ERRORs?

No issue found on:
Red Hat Enterprise Linux release 9.0 Beta (Plow)
5.14.0-56.el9.x86_64
qemu-kvm-6.2.0-8.el9.hreitz202202160951.x86_64
seabios-bin-1.15.0-1.el9.noarch

Test scenario 1: run the script from comment #17.

Test scenario 2:

1. create the backing file:

qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg1.qcow2 2G
qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/stg1.qcow2 -F qcow2 /home/kvm_autotest_root/images/scratch1.img

2. 
boot vm /usr/libexec/qemu-kvm \ -name 'avocado-vt-vm1' \ -machine q35 \ -nodefaults \ -device VGA,bus=pcie.0,addr=0x1 \ -device pvpanic,ioport=0x505,id=idZcGD6F \ -object iothread,id=iothread0 \ -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \ -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \ -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \ -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \ -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-4,addr=0x0,iothread=iothread0 \ \ -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel860-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \ -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \ -device virtio-blk-pci,id=image1,drive=drive_image1,write-cache=on,bus=pcie.0-root-port-3,iothread=iothread0,bootindex=0,werror=stop,rerror=stop \ \ -blockdev node-name=data_image1,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/kvm_autotest_root/images/stg1.qcow2,aio=threads \ -blockdev node-name=data1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=data_image1 \ -device virtio-blk-pci,id=disk1,drive=data1,write-cache=on,bus=pcie.0-root-port-7,iothread=iothread0,werror=stop,rerror=stop \ \ -blockdev node-name=data_image2,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/kvm_autotest_root/images/stg2.qcow2,aio=threads \ -blockdev node-name=data2,driver=qcow2,cache.direct=on,cache.no-flush=off,file=data_image2 \ -device scsi-hd,id=disk2,drive=data2,write-cache=on,bus=scsi0.0,werror=stop,rerror=stop \ \ -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \ -device virtio-net-pci,mac=9a:55:56:57:58:59,id=id18Xcuo,netdev=idGRsMas,bus=pcie.0-root-port-5,addr=0x0 \ -netdev 
tap,id=idGRsMas,vhost=on \
    -m 4096 \
    -smp 24,maxcpus=24,cores=12,threads=1,sockets=2 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :5 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -monitor stdio \
    -qmp tcp:0:5955,server,nowait

3. run fio on the data disk:

fio --direct=1 --name=x --filename=/dev/vdb --size=2g --rw=randrw

4. execute blockdev-add/del of the backing node repeatedly:

{"execute":"blockdev-add","arguments":{"driver":"qcow2","node-name":"tmp","file":{"driver":"file","filename":"/home/kvm_autotest_root/images/scratch1.img"},"backing":"drive_image1"}}
{"return": {}}
{"execute": "blockdev-del", "arguments": {"node-name": "tmp"}}
{"return": {}}

The patch that hopefully fixes this issue has been merged upstream (4d378bbd831bdd2f6e6adcd4ea5b77b6effaa627, “block: Make bdrv_refresh_limits() non-recursive”) and so will be in the qemu 7.0 release.

QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Passed on:
Red Hat Enterprise Linux release 9.1 Beta (Plow)
5.14.0-80.el9.x86_64
qemu-kvm-7.0.0-1.el9.x86_64
seabios-bin-1.16.0-1.el9.noarch
edk2-ovmf-20220221gitb24306f15d-1.el9.noarch
virtio-win-prewhql-0.1-215.iso

Test script:

#!/bin/sh
QEMU_IMG=qemu-img
QSD=qemu-storage-daemon
TMPD=/tmp/p291748

mkdir -p $TMPD

if ! which $QSD; then echo "$QSD does not exist"; exit 1; fi

rm -f ${TMPD}/qsd.pid

"$QEMU_IMG" create -f qcow2 -F raw -b null-co:// ${TMPD}/top.qcow2

(echo '{"execute": "qmp_capabilities"}'
 sleep 1
 while true; do
     echo '{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "tmp", "backing": "node0", "file": {"driver": "file", "filename": "/tmp/p291748/top.qcow2"}}}'
     echo '{"execute": "blockdev-del", "arguments": {"node-name": "tmp"}}'
 done) | \
"$QSD" \
    --chardev stdio,id=stdio \
    --monitor mon0,chardev=stdio \
    --object iothread,id=iothread0 \
    --blockdev null-co,node-name=node0,read-zeroes=true \
    --nbd-server addr.type=unix,addr.path=${TMPD}/nbd.sock \
    --export nbd,id=exp0,node-name=node0,iothread=iothread0,fixed-iothread=true,writable=true \
    --pidfile ${TMPD}/qsd.pid \
    &

while [ ! -f ${TMPD}/qsd.pid ]; do true; done

"$QEMU_IMG" bench -f raw -c 4000000 nbd+unix:///node0\?socket=${TMPD}/nbd.sock
ret=$?

kill %1

rm -rf ${TMPD}
exit $ret

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7967