This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2234374 - Qemu Core Dumped When Writing Larger Size Than The Size of A Data Disk
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Hanna Czenczek
QA Contact: qing.wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-24 08:14 UTC by Tingting Mao
Modified: 2023-09-22 16:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-22 16:31:58 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                 ID               Private  Priority  Status    Summary  Last Updated
Red Hat Issue Tracker  RHEL-7360        0        None      Migrated  None     2023-09-22 16:36:10 UTC
Red Hat Issue Tracker  RHELPLAN-166362  0        None      None      None     2023-08-24 08:20:03 UTC

Description Tingting Mao 2023-08-24 08:14:29 UTC
Description of problem:
As subject.


Version-Release number of selected component (if applicable):
qemu-kvm-8.1.0-0.rc4.el9.preview
kernel-5.14.0-340.el9.x86_64


How reproducible:
100%


Steps to Reproduce:
1. Prepare an LV block device
# qemu-img create -f raw /home/lvm.img 60G
# losetup /dev/loop1 /home/lvm.img
# pvcreate /dev/loop1 --metadatasize=1m --metadatacopies=2 --metadataignore=y
# vgcreate vg /dev/loop1 --physicalextentsize=1m
# lvcreate --autobackup n --contiguous n  --size 1024M -n lv1 vg
# qemu-img create -f qcow2 /dev/vg/lv1 60G

2. Boot up a guest with the LV as a data disk
# /usr/libexec/qemu-kvm \
-S  \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
-blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio-scsi-ovmf_qcow2_filesystem_VARS.raw", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
-machine q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars,memory-backend=mem-machine_mem \
-device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
-device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}'  \
-nodefaults \
-device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
-m 30720 \
-object '{"size": 32212254720, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
-smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
-chardev socket,wait=off,server=on,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_r9ni1_l3/monitor-qmpmonitor1-20230824-030703-TgoNPdRk  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,wait=off,server=on,id=qmp_id_catch_monitor,path=/var/tmp/avocado_r9ni1_l3/monitor-catch_monitor-20230824-030703-TgoNPdRk  \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device '{"ioport": 1285, "driver": "pvpanic", "id": "idSpXmJe"}' \
-chardev socket,wait=off,server=on,id=chardev_serial0,path=/var/tmp/avocado_r9ni1_l3/serial-serial0-20230824-030703-TgoNPdRk \
-device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}'  \
-chardev socket,id=seabioslog_id_20230824-030703-TgoNPdRk,path=/var/tmp/avocado_r9ni1_l3/seabios-20230824-030703-TgoNPdRk,server=on,wait=off \
-device isa-debugcon,chardev=seabioslog_id_20230824-030703-TgoNPdRk,iobase=0x402 \
-device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
-device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
-device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
-device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
-device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
-blockdev '{"node-name": "file_stg1", "driver": "host_device", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/dev/vg/lv1", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_stg1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_stg1"}' \
-device '{"driver": "scsi-hd", "id": "stg1", "drive": "drive_stg1", "write-cache": "on", "rerror": "stop", "werror": "stop", "serial": "TARGET_DISK0"}' \
-device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
-device '{"driver": "virtio-net-pci", "mac": "9a:c4:b4:6c:7c:ec", "id": "id9xdoY8", "netdev": "idXoIpNF", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
-netdev tap,id=idXoIpNF,vhost=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-monitor stdio \
-device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}'

3. Write data to the data disk in the guest
(guest)# dd if=/dev/urandom of=/dev/sdb bs=1M count=50000 oflag=direct
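
The sizes used in the steps above show why the write must eventually run out of space: the qcow2 image advertises a 60G virtual disk, while the LV backing it holds only 1024M. A minimal Python sketch of that arithmetic follows. It assumes binary units (1M = 2^20 bytes) and ignores qcow2 metadata overhead; the variable names are illustrative only and do not come from the report.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

lv_capacity = 1024 * MIB   # lvcreate --size 1024M
virtual_size = 60 * GIB    # qemu-img create -f qcow2 /dev/vg/lv1 60G
dd_request = 50000 * MIB   # dd bs=1M count=50000

# The guest sees the virtual size, so the dd request fits the guest-visible disk.
fits_guest_disk = dd_request <= virtual_size

# qcow2 allocates clusters on the LV as data arrives, and the LV cannot hold
# more than its own capacity (even less once metadata is counted).
fits_backing_lv = dd_request <= lv_capacity

print(f"guest-visible disk: {virtual_size // MIB} MiB")
print(f"backing LV:         {lv_capacity // MIB} MiB")
print(f"dd request:         {dd_request // MIB} MiB")
print(f"fits guest disk: {fits_guest_disk}, fits backing LV: {fits_backing_lv}")
```

Under these assumptions the request fits the guest-visible disk but not the LV, so the host-side write is expected to fail partway through, which is the error path the crash is later traced to.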


Actual results:
After anywhere from a few seconds to about a minute, QEMU dumps core:
(qemu) qemu.sh: line 46: 20707 Floating point exception(core dumped) /usr/libexec/qemu-kvm -S -name 'avocado-vt-vm1' -sandbox on [remaining arguments identical to the command line in step 2]

                
Expected results:
No core dump.


Additional info:
Stack trace of thread 20707:
                #0  0x0000557f25964609 get_zones_wp (qemu-kvm + 0x86b609)
                #1  0x0000557f25964b27 raw_co_prw (qemu-kvm + 0x86bb27)
                #2  0x0000557f2590a914 bdrv_driver_pwritev (qemu-kvm + 0x811914)
                #3  0x0000557f2590527d bdrv_aligned_pwritev (qemu-kvm + 0x80c27d)
                #4  0x0000557f259048d3 bdrv_co_pwritev_part (qemu-kvm + 0x80b8d3)
                #5  0x0000557f25942389 qcow2_co_pwritev_task (qemu-kvm + 0x849389)
                #6  0x0000557f25941e63 qcow2_co_pwritev_task_entry (qemu-kvm + 0x848e63)
                #7  0x0000557f2594163a qcow2_add_task (qemu-kvm + 0x84863a)
                #8  0x0000557f2593978c qcow2_co_pwritev_part (qemu-kvm + 0x84078c)
                #9  0x0000557f2590a7ea bdrv_driver_pwritev (qemu-kvm + 0x8117ea)
                #10 0x0000557f2590527d bdrv_aligned_pwritev (qemu-kvm + 0x80c27d)
                #11 0x0000557f259048d3 bdrv_co_pwritev_part (qemu-kvm + 0x80b8d3)
                #12 0x0000557f258ef8a6 blk_co_do_pwritev_part.llvm.4925601650272942953 (qemu-kvm + 0x7f68a6)
                #13 0x0000557f258f01d2 blk_aio_write_entry.llvm.4925601650272942953 (qemu-kvm + 0x7f71d2)
                #14 0x0000557f25ae0006 coroutine_trampoline.llvm.17812621179749689021 (qemu-kvm + 0x9e7006)
                #15 0x00007f4aa1e2a360 n/a (libc.so.6 + 0x2a360)
                #16 0x40a47d7e8f1e8300 n/a (n/a + 0x0)
                ELF object binary architecture: AMD x86-64

Comment 1 Tingting Mao 2023-08-24 08:18:47 UTC
Tried with qemu-kvm-8.0.0-12.el9: QEMU does not dump core there, and the guest only reports a 'No space left' error.
So I am marking this bug as a regression.

Comment 2 aihua liang 2023-08-24 09:54:59 UTC
Tested with qemu-kvm-8.1.0-0.el9.preview and hit the same issue, with the same coredump info.
  Message: Process 30795 (qemu-kvm) of user 0 dumped core.
                
                Stack trace of thread 30795:
                #0  0x00005635ba703909 get_zones_wp (qemu-kvm + 0x86b909)
                #1  0x00005635ba703e27 raw_co_prw (qemu-kvm + 0x86be27)
                #2  0x00005635ba6a9c14 bdrv_driver_pwritev (qemu-kvm + 0x811c14)
                #3  0x00005635ba6a457d bdrv_aligned_pwritev (qemu-kvm + 0x80c57d)
                #4  0x00005635ba6a3bd3 bdrv_co_pwritev_part (qemu-kvm + 0x80bbd3)
                #5  0x00005635ba6e1689 qcow2_co_pwritev_task (qemu-kvm + 0x849689)
                #6  0x00005635ba6e1163 qcow2_co_pwritev_task_entry (qemu-kvm + 0x849163)
                #7  0x00005635ba6e093a qcow2_add_task (qemu-kvm + 0x84893a)
                #8  0x00005635ba6d8a8c qcow2_co_pwritev_part (qemu-kvm + 0x840a8c)
                #9  0x00005635ba6a9aea bdrv_driver_pwritev (qemu-kvm + 0x811aea)
                #10 0x00005635ba6a457d bdrv_aligned_pwritev (qemu-kvm + 0x80c57d)
                #11 0x00005635ba6a3bd3 bdrv_co_pwritev_part (qemu-kvm + 0x80bbd3)
                #12 0x00005635ba68eba6 blk_co_do_pwritev_part.llvm.8165632186031058405 (qemu-kvm + 0x7f6ba6)
                #13 0x00005635ba68f4d2 blk_aio_write_entry.llvm.8165632186031058405 (qemu-kvm + 0x7f74d2)
                #14 0x00005635ba87f306 coroutine_trampoline.llvm.6566130761695863925 (qemu-kvm + 0x9e7306)
                #15 0x00007fcbe6c2a360 n/a (libc.so.6 + 0x2a360)
                #16 0x0000000000000000 n/a (n/a + 0x0)
                ELF object binary architecture: AMD x86-64
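
One way to check the "same coredump info" claim is to compare symbol names frame by frame, since absolute addresses and offsets differ between builds. The Python sketch below assumes the frame layout seen in these traces (`#N  0xADDR symbol (module + 0xOFF)`). The helper name `frame_symbols` is made up for illustration, and only the first three frames of each trace are embedded.

```python
import re

# Matches "#N  0xADDR symbol (" and captures the symbol name.
FRAME_RE = re.compile(r"#\d+\s+0x[0-9a-f]+\s+(\S+)\s+\(")


def frame_symbols(trace: str) -> list[str]:
    """Return the symbol names of all frames, in order."""
    symbols = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Drop compiler-generated suffixes such as ".llvm.4925601650272942953".
            symbols.append(match.group(1).split(".llvm.")[0])
    return symbols


trace_description = """
#0  0x0000557f25964609 get_zones_wp (qemu-kvm + 0x86b609)
#1  0x0000557f25964b27 raw_co_prw (qemu-kvm + 0x86bb27)
#2  0x0000557f2590a914 bdrv_driver_pwritev (qemu-kvm + 0x811914)
"""

trace_comment2 = """
#0  0x00005635ba703909 get_zones_wp (qemu-kvm + 0x86b909)
#1  0x00005635ba703e27 raw_co_prw (qemu-kvm + 0x86be27)
#2  0x00005635ba6a9c14 bdrv_driver_pwritev (qemu-kvm + 0x811c14)
"""

same_chain = frame_symbols(trace_description) == frame_symbols(trace_comment2)
print(frame_symbols(trace_description))
print(same_chain)
```

Comparing names rather than addresses keeps the check stable across rebuilds, where code offsets shift slightly but the call chain stays the same.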

Comment 3 Hanna Czenczek 2023-08-24 12:26:06 UTC
The problem is that get_zones_wp(), and by extension update_zones_wp(), expects zoning information to be present, but raw_co_prw()’s error path does not check whether it is present before calling that function.

There is an upstream patch for this problem, but I’m not sure what its status is: https://lists.nongnu.org/archive/html/qemu-devel/2023-06/msg01742.html – it reads like the author intended to send a different version, but I don’t think there ever was one (compare https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg05163.html).  I think I’ll send a separate version myself to get things going again, not least because there are other issues here:

There are various ways in which the presence of zoning information is checked (bs->wps != NULL and/or bs->bl.zone_size != 0), but judging from how raw_refresh_zoned_limits() is constructed, the only flag that is reliable is whether bs->bl.zoned != BLK_Z_NONE, so that should be changed everywhere, too.  (raw_refresh_zoned_limits() never clears bs->wps or bs->bl.zone_size if there is no zoning information on refresh, but it does set bs->bl.zoned to BLK_Z_NONE.)

Also, raw_refresh_zoned_limits() never clears anything on error, which does not seem right.  It should at least reset bs->bl.zoned to BLK_Z_NONE.
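
The failure mode described in this comment can be modeled in a few lines. The Python sketch below is not QEMU code: `Limits`, `Zoned`, `zone_index`, and the two `on_write_error_*` functions are hypothetical stand-ins. It also assumes that the crash is an integer division by a zero zone size, which would be consistent with the "Floating point exception" in the report but is not stated there. It illustrates why the `zoned` flag is a safer guard than stale `wps` or `zone_size` values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Zoned(Enum):
    NONE = 0          # stands in for BLK_Z_NONE
    HOST_MANAGED = 1  # hypothetical name for "some zoned model"


@dataclass
class Limits:
    zoned: Zoned = Zoned.NONE
    zone_size: int = 0          # stands in for bs->bl.zone_size
    wps: Optional[list] = None  # stands in for bs->wps


def zone_index(limits: Limits, offset: int) -> int:
    # Assumed shape of the failing computation: divides by the zone size.
    return offset // limits.zone_size


def on_write_error_unguarded(limits: Limits, offset: int) -> int:
    # Models the reported behavior: zoning data is used without any check.
    return zone_index(limits, offset)


def on_write_error_guarded(limits: Limits, offset: int) -> Optional[int]:
    # Guard on the one flag the comment identifies as reliable.
    if limits.zoned is Zoned.NONE:
        return None
    return zone_index(limits, offset)


plain_device = Limits()  # e.g. an LV: no zoning information at all

try:
    on_write_error_unguarded(plain_device, 4096)
    crashed = False
except ZeroDivisionError:
    crashed = True

# Fields left over from an earlier refresh must not count as "zoned".
stale = Limits(zoned=Zoned.NONE, zone_size=1 << 20, wps=[0])

print(crashed)
print(on_write_error_guarded(plain_device, 4096))
print(on_write_error_guarded(stale, 4096))
```

In this model, a check on `wps` or `zone_size` alone would treat the `stale` case as zoned, whereas the `zoned` check skips the write-pointer update for both non-zoned cases.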

Comment 4 Hanna Czenczek 2023-08-25 09:06:32 UTC
Sent https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg04283.html upstream

Comment 9 Yanan Fu 2023-09-05 05:40:12 UTC
Update:
The issue also exists in the official downstream build, qemu-kvm-8.1.0-1.el9.
As this scenario is part of the qemu-kvm component gating test, I am adding 'TestBlocker' accordingly.
Thanks!

Comment 14 Tingting Mao 2023-09-14 03:19:55 UTC
An easier way to reproduce the issue:

Tested with:
qemu-kvm-8.1.0-1.el9
kernel-5.14.0-362.el9


Steps:
# qemu-img create -f raw test.img 400M
# losetup /dev/loop0 test.img
# pvcreate /dev/loop0
# vgcreate test /dev/loop0
# lvcreate -n top --size 128M test
# qemu-img create -f qcow2 /dev/test/top 128M
# qemu-img create -f qcow2 top.img -F qcow2 -b /dev/test/top
# qemu-io -f qcow2 top.img -c "write 0 128M"
# qemu-img commit -f qcow2 -t none -b /dev/test/top -d -p top.img
Floating point exception (core dumped)


Stack trace of thread 903888:
                #0  0x0000557af5b28aa9 get_zones_wp (qemu-img + 0x117aa9)
                #1  0x0000557af5b28fc7 raw_co_prw (qemu-img + 0x117fc7)
                #2  0x0000557af5acf484 bdrv_driver_pwritev (qemu-img + 0xbe484)
                #3  0x0000557af5ac9dad bdrv_aligned_pwritev (qemu-img + 0xb8dad)
                #4  0x0000557af5ac9403 bdrv_co_pwritev_part (qemu-img + 0xb8403)
                #5  0x0000557af5b06e29 qcow2_co_pwritev_task (qemu-img + 0xf5e29)
                #6  0x0000557af5b06903 qcow2_co_pwritev_task_entry (qemu-img + 0xf5903)
                #7  0x0000557af5b06107 qcow2_add_task (qemu-img + 0xf5107)
                #8  0x0000557af5afe22c qcow2_co_pwritev_part (qemu-img + 0xed22c)
                #9  0x0000557af5acf35a bdrv_driver_pwritev (qemu-img + 0xbe35a)
                #10 0x0000557af5ac9dad bdrv_aligned_pwritev (qemu-img + 0xb8dad)
                #11 0x0000557af5ac9403 bdrv_co_pwritev_part (qemu-img + 0xb8403)
                #12 0x0000557af5ab4aa6 blk_co_do_pwritev_part.llvm.8165632186031058405 (qemu-img + 0xa3aa6)
                #13 0x0000557af5ad5e59 mirror_read_complete (qemu-img + 0xc4e59)
                #14 0x0000557af5ad57ae mirror_co_read (qemu-img + 0xc47ae)
                #15 0x0000557af5bdcd16 coroutine_trampoline.llvm.6566130761695863925 (qemu-img + 0x1cbd16)
                #16 0x00007f012242a360 n/a (libc.so.6 + 0x2a360)
                #17 0x00007f0100000002 n/a (n/a + 0x0)
                ELF object binary architecture: AMD x86-64

Comment 16 RHEL Program Management 2023-09-22 16:29:05 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

