Bug 2234374
| Summary: | Qemu Core Dumped When Writing Larger Size Than The Size of A Data Disk | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Tingting Mao <timao> |
| Component: | qemu-kvm | Assignee: | Hanna Czenczek <hreitz> |
| qemu-kvm sub component: | virtio-blk,scsi | QA Contact: | qing.wang <qinwang> |
| Status: | CLOSED MIGRATED | Docs Contact: | |
| Severity: | high | ||
| Priority: | high | CC: | aliang, chayang, coli, hreitz, jinzhao, juzhang, kwolf, mrezanin, qinwang, vgoyal, virt-maint, yfu, ymankad |
| Version: | 9.4 | Keywords: | CustomerScenariosInitiative, MigratedToJIRA, Regression, TestBlocker, Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-09-22 16:31:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Tried with qemu-kvm-8.0.0-12.el9: QEMU does not dump core there, and the guest only reports a 'No space left' error, so I am marking this bug as a regression. Tested with qemu-kvm-8.1.0-0.el9.preview as well: it hits the same issue, with the same core dump information.
Message: Process 30795 (qemu-kvm) of user 0 dumped core.
Stack trace of thread 30795:
#0 0x00005635ba703909 get_zones_wp (qemu-kvm + 0x86b909)
#1 0x00005635ba703e27 raw_co_prw (qemu-kvm + 0x86be27)
#2 0x00005635ba6a9c14 bdrv_driver_pwritev (qemu-kvm + 0x811c14)
#3 0x00005635ba6a457d bdrv_aligned_pwritev (qemu-kvm + 0x80c57d)
#4 0x00005635ba6a3bd3 bdrv_co_pwritev_part (qemu-kvm + 0x80bbd3)
#5 0x00005635ba6e1689 qcow2_co_pwritev_task (qemu-kvm + 0x849689)
#6 0x00005635ba6e1163 qcow2_co_pwritev_task_entry (qemu-kvm + 0x849163)
#7 0x00005635ba6e093a qcow2_add_task (qemu-kvm + 0x84893a)
#8 0x00005635ba6d8a8c qcow2_co_pwritev_part (qemu-kvm + 0x840a8c)
#9 0x00005635ba6a9aea bdrv_driver_pwritev (qemu-kvm + 0x811aea)
#10 0x00005635ba6a457d bdrv_aligned_pwritev (qemu-kvm + 0x80c57d)
#11 0x00005635ba6a3bd3 bdrv_co_pwritev_part (qemu-kvm + 0x80bbd3)
#12 0x00005635ba68eba6 blk_co_do_pwritev_part.llvm.8165632186031058405 (qemu-kvm + 0x7f6ba6)
#13 0x00005635ba68f4d2 blk_aio_write_entry.llvm.8165632186031058405 (qemu-kvm + 0x7f74d2)
#14 0x00005635ba87f306 coroutine_trampoline.llvm.6566130761695863925 (qemu-kvm + 0x9e7306)
#15 0x00007fcbe6c2a360 n/a (libc.so.6 + 0x2a360)
#16 0x0000000000000000 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64
The problem is that get_zones_wp() and, by extension, update_zones_wp() expect zoning information to be present, but raw_co_prw()’s error path does not check whether it is before calling that function.
There is an upstream patch for this problem, but I’m not sure what its status is (https://lists.nongnu.org/archive/html/qemu-devel/2023-06/msg01742.html). It reads like the author intended to send a different version, but I don’t think there ever was one (compare https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg05163.html).
I think I’ll send a separate version myself to get things going again, not least because there are other issues here:
There are various ways in which the presence of zoning information is checked (bs->wps != NULL and/or bs->bl.zone_size != 0), but judging from how raw_refresh_zoned_limits() is constructed, the only flag that is reliable is whether bs->bl.zoned != BLK_Z_NONE, so that should be changed everywhere, too. (raw_refresh_zoned_limits() never clears bs->wps or bs->bl.zone_size if there is no zoning information on refresh, but it does set bs->bl.zoned to BLK_Z_NONE.)
Also, raw_refresh_zoned_limits() never clears anything on error, which does not seem right. It should at least reset bs->bl.zoned to BLK_Z_NONE.
Update: the issue also exists with the official downstream build, qemu-kvm-8.1.0-1.el9. As this scenario is part of the qemu-kvm component gating test, I am adding 'TestBlocker' accordingly. Thanks!
An easier way to reproduce the issue:
Tested with:
qemu-kvm-8.1.0-1.el9
kernel-5.14.0-362.el9
Steps:
# qemu-img create -f raw test.img 400M
# losetup /dev/loop0 test.img
# pvcreate /dev/loop0
# vgcreate test /dev/loop0
# lvcreate -n top --size 128M test
# qemu-img create -f qcow2 /dev/test/top 128M
# qemu-img create -f qcow2 top.img -F qcow2 -b /dev/test/top
# qemu-io -f qcow2 top.img -c "write 0 128M"
# qemu-img commit -f qcow2 -t none -b /dev/test/top -d -p top.img
Floating point exception (core dumped)
Stack trace of thread 903888:
#0 0x0000557af5b28aa9 get_zones_wp (qemu-img + 0x117aa9)
#1 0x0000557af5b28fc7 raw_co_prw (qemu-img + 0x117fc7)
#2 0x0000557af5acf484 bdrv_driver_pwritev (qemu-img + 0xbe484)
#3 0x0000557af5ac9dad bdrv_aligned_pwritev (qemu-img + 0xb8dad)
#4 0x0000557af5ac9403 bdrv_co_pwritev_part (qemu-img + 0xb8403)
#5 0x0000557af5b06e29 qcow2_co_pwritev_task (qemu-img + 0xf5e29)
#6 0x0000557af5b06903 qcow2_co_pwritev_task_entry (qemu-img + 0xf5903)
#7 0x0000557af5b06107 qcow2_add_task (qemu-img + 0xf5107)
#8 0x0000557af5afe22c qcow2_co_pwritev_part (qemu-img + 0xed22c)
#9 0x0000557af5acf35a bdrv_driver_pwritev (qemu-img + 0xbe35a)
#10 0x0000557af5ac9dad bdrv_aligned_pwritev (qemu-img + 0xb8dad)
#11 0x0000557af5ac9403 bdrv_co_pwritev_part (qemu-img + 0xb8403)
#12 0x0000557af5ab4aa6 blk_co_do_pwritev_part.llvm.8165632186031058405 (qemu-img + 0xa3aa6)
#13 0x0000557af5ad5e59 mirror_read_complete (qemu-img + 0xc4e59)
#14 0x0000557af5ad57ae mirror_co_read (qemu-img + 0xc47ae)
#15 0x0000557af5bdcd16 coroutine_trampoline.llvm.6566130761695863925 (qemu-img + 0x1cbd16)
#16 0x00007f012242a360 n/a (libc.so.6 + 0x2a360)
#17 0x00007f0100000002 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64
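The failure mode described in the analysis comment above can be modeled in isolation. What follows is a minimal sketch under stated assumptions, not QEMU's actual code: `Limits`, `ZoneModel`, `zone_index()`, `update_wp_on_error()`, and `refresh_zoned_limits()` are hypothetical stand-ins, and only the ideas behind `zoned`, `zone_size`, and `BLK_Z_NONE` come from the comment. It also assumes that the crash is an integer division by a zero zone size, which Linux on x86-64 reports as "Floating point exception"; the report does not state this explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the zone model values; the report names only BLK_Z_NONE. */
typedef enum { ZONED_NONE = 0, ZONED_HOST_MANAGED } ZoneModel;

/* Hypothetical, reduced version of the limits discussed in the report. */
typedef struct {
    ZoneModel zoned;     /* the only flag the analysis treats as reliable */
    uint64_t zone_size;  /* may be 0 or stale when zoned == ZONED_NONE */
} Limits;

/* Requires zoning information: dividing by a zero zone_size would trap. */
static uint64_t zone_index(const Limits *bl, uint64_t offset)
{
    assert(bl->zone_size != 0);
    return offset / bl->zone_size;
}

/*
 * Error-path helper: only touch zone state when the device is zoned.
 * Returns true and stores the zone index if an update was performed,
 * false (leaving *idx untouched) if there is no zoning information.
 */
static bool update_wp_on_error(const Limits *bl, uint64_t offset, uint64_t *idx)
{
    if (bl->zoned == ZONED_NONE) {
        return false;
    }
    *idx = zone_index(bl, offset);
    return true;
}

/*
 * Refresh helper: probe_ok and probed_zone_size model the outcome of
 * querying the host device. On failure, or when no zoning information is
 * found, everything is reset so the guard above stays trustworthy.
 */
static void refresh_zoned_limits(Limits *bl, bool probe_ok,
                                 uint64_t probed_zone_size)
{
    if (!probe_ok || probed_zone_size == 0) {
        bl->zoned = ZONED_NONE;
        bl->zone_size = 0;
        return;
    }
    bl->zoned = ZONED_HOST_MANAGED;
    bl->zone_size = probed_zone_size;
}
```

With a non-zoned `Limits` value (as would be expected for the LVM volume in the reproducer), `update_wp_on_error()` returns false without ever reaching the division. After a failed refresh the same holds, because `zoned` is reset instead of being left at its previous value.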
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.
Description of problem:
As subject.
Version-Release number of selected component (if applicable):
qemu-kvm-8.1.0-0.rc4.el9.preview
kernel-5.14.0-340.el9.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Prepare an LV block device
# qemu-img create -f raw /home/lvm.img 60G
# losetup /dev/loop1 /home/lvm.img
# pvcreate /dev/loop1 --metadatasize=1m --metadatacopies=2 --metadataignore=y
# vgcreate vg /dev/loop1 --physicalextentsize=1m
# lvcreate --autobackup n --contiguous n --size 1024M -n lv1 vg
# qemu-img create -f qcow2 /dev/vg/lv1 60G
2. Boot up a guest with the LV as a data disk
# /usr/libexec/qemu-kvm \
    -S \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
    -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
    -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio-scsi-ovmf_qcow2_filesystem_VARS.raw", "auto-read-only": true, "discard": "unmap"}' \
    -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
    -machine q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars,memory-backend=mem-machine_mem \
    -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
    -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' \
    -nodefaults \
    -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
    -m 30720 \
    -object '{"size": 32212254720, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' \
    -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2 \
    -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
    -chardev socket,wait=off,server=on,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_r9ni1_l3/monitor-qmpmonitor1-20230824-030703-TgoNPdRk \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,wait=off,server=on,id=qmp_id_catch_monitor,path=/var/tmp/avocado_r9ni1_l3/monitor-catch_monitor-20230824-030703-TgoNPdRk \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device '{"ioport": 1285, "driver": "pvpanic", "id": "idSpXmJe"}' \
    -chardev socket,wait=off,server=on,id=chardev_serial0,path=/var/tmp/avocado_r9ni1_l3/serial-serial0-20230824-030703-TgoNPdRk \
    -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' \
    -chardev socket,id=seabioslog_id_20230824-030703-TgoNPdRk,path=/var/tmp/avocado_r9ni1_l3/seabios-20230824-030703-TgoNPdRk,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20230824-030703-TgoNPdRk,iobase=0x402 \
    -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
    -device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
    -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
    -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
    -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
    -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' \
    -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
    -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
    -blockdev '{"node-name": "file_stg1", "driver": "host_device", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/dev/vg/lv1", "cache": {"direct": true, "no-flush": false}}' \
    -blockdev '{"node-name": "drive_stg1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_stg1"}' \
    -device '{"driver": "scsi-hd", "id": "stg1", "drive": "drive_stg1", "write-cache": "on", "rerror": "stop", "werror": "stop", "serial": "TARGET_DISK0"}' \
    -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
    -device '{"driver": "virtio-net-pci", "mac": "9a:c4:b4:6c:7c:ec", "id": "id9xdoY8", "netdev": "idXoIpNF", "bus": "pcie-root-port-3", "addr": "0x0"}' \
    -netdev tap,id=idXoIpNF,vhost=on \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio \
    -device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}'
3. Write data to the data disk in the guest
(guest)# dd if=/dev/urandom of=/dev/sdb bs=1M count=50000 oflag=direct
Actual results:
After several seconds to about a minute, QEMU dumps core:
(qemu) qemu.sh: line 46: 20707 Floating point exception(core dumped) /usr/libexec/qemu-kvm -S -name 'avocado-vt-vm1' -sandbox on -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio-scsi-ovmf_qcow2_filesystem_VARS.raw", "auto-read-only": true, "discard": "unmap"}' -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' -machine q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars,memory-backend=mem-machine_mem -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' -nodefaults -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' -m 30720 -object '{"size": 32212254720, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2 -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt -chardev socket,wait=off,server=on,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_r9ni1_l3/monitor-qmpmonitor1-20230824-030703-TgoNPdRk -mon chardev=qmp_id_qmpmonitor1,mode=control -chardev socket,wait=off,server=on,id=qmp_id_catch_monitor,path=/var/tmp/avocado_r9ni1_l3/monitor-catch_monitor-20230824-030703-TgoNPdRk -mon chardev=qmp_id_catch_monitor,mode=control -device '{"ioport": 1285, "driver": "pvpanic", "id": "idSpXmJe"}' -chardev socket,wait=off,server=on,id=chardev_serial0,path=/var/tmp/avocado_r9ni1_l3/serial-serial0-20230824-030703-TgoNPdRk -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' -chardev socket,id=seabioslog_id_20230824-030703-TgoNPdRk,path=/var/tmp/avocado_r9ni1_l3/seabios-20230824-030703-TgoNPdRk,server=on,wait=off -device isa-debugcon,chardev=seabioslog_id_20230824-030703-TgoNPdRk,iobase=0x402 -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' -device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' -blockdev '{"node-name": "file_stg1", "driver": "host_device", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/dev/vg/lv1", "cache": {"direct": true, "no-flush": false}}' -blockdev '{"node-name": "drive_stg1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_stg1"}' -device '{"driver": "scsi-hd", "id": "stg1", "drive": "drive_stg1", "write-cache": "on", "rerror": "stop", "werror": "stop", "serial": "TARGET_DISK0"}' -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' -device '{"driver": "virtio-net-pci", "mac": "9a:c4:b4:6c:7c:ec", "id": "id9xdoY8", "netdev": "idXoIpNF", "bus": "pcie-root-port-3", "addr": "0x0"}' -netdev tap,id=idXoIpNF,vhost=on -vnc :0 -rtc base=utc,clock=host,driftfix=slew -boot menu=off,order=cdn,once=c,strict=off -enable-kvm -monitor stdio -device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}'
Expected results:
No core dumped.
Additional info:
Stack trace of thread 20707:
#0 0x0000557f25964609 get_zones_wp (qemu-kvm + 0x86b609)
#1 0x0000557f25964b27 raw_co_prw (qemu-kvm + 0x86bb27)
#2 0x0000557f2590a914 bdrv_driver_pwritev (qemu-kvm + 0x811914)
#3 0x0000557f2590527d bdrv_aligned_pwritev (qemu-kvm + 0x80c27d)
#4 0x0000557f259048d3 bdrv_co_pwritev_part (qemu-kvm + 0x80b8d3)
#5 0x0000557f25942389 qcow2_co_pwritev_task (qemu-kvm + 0x849389)
#6 0x0000557f25941e63 qcow2_co_pwritev_task_entry (qemu-kvm + 0x848e63)
#7 0x0000557f2594163a qcow2_add_task (qemu-kvm + 0x84863a)
#8 0x0000557f2593978c qcow2_co_pwritev_part (qemu-kvm + 0x84078c)
#9 0x0000557f2590a7ea bdrv_driver_pwritev (qemu-kvm + 0x8117ea)
#10 0x0000557f2590527d bdrv_aligned_pwritev (qemu-kvm + 0x80c27d)
#11 0x0000557f259048d3 bdrv_co_pwritev_part (qemu-kvm + 0x80b8d3)
#12 0x0000557f258ef8a6 blk_co_do_pwritev_part.llvm.4925601650272942953 (qemu-kvm + 0x7f68a6)
#13 0x0000557f258f01d2 blk_aio_write_entry.llvm.4925601650272942953 (qemu-kvm + 0x7f71d2)
#14 0x0000557f25ae0006 coroutine_trampoline.llvm.17812621179749689021 (qemu-kvm + 0x9e7006)
#15 0x00007f4aa1e2a360 n/a (libc.so.6 + 0x2a360)
#16 0x40a47d7e8f1e8300 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64