Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2203512

Summary: [qemu-kvm] detect_zeros=unmap causes postgres to crash
Product: Red Hat Enterprise Linux 9
Reporter: Cory Bolar <cory.bolar>
Component: qemu-kvm
Assignee: Stefan Hajnoczi <stefanha>
qemu-kvm sub component: virtio-blk,scsi
QA Contact: qing.wang <qinwang>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: medium
CC: aliang, coli, jinzhao, juzhang, kwolf, mrezanin, qinwang, vgoyal, virt-maint, xuwei, yfu
Version: 9.2
Keywords: CustomerScenariosInitiative, Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-8.0.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-07 08:27:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Cory Bolar 2023-05-13 13:22:00 UTC
Description of problem:

detect-zeroes=unmap with QEMU 7.2 breaks PostgreSQL applications whose data is stored on QEMU volumes.

Maybe these commits did not make it into 9.3?

https://lore.kernel.org/qemu-devel/20230126201401.348845-1-stefanha@redhat.com/T/#t

Version-Release number of selected component (if applicable):

QEMU 7.2
RHEL 9.3

How reproducible:
Run PostgreSQL with its data stored on a QEMU disk that uses detect-zeroes=unmap.
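
For orientation, the option combination that triggers the problem looks roughly like the following QEMU arguments. This is only an illustrative sketch with a hypothetical image path and node/serial names; the complete QE reproducer is in comment 2 below.

  -blockdev driver=qcow2,discard=unmap,detect-zeroes=unmap,file.driver=file,file.filename=/path/to/pgdata.qcow2,node-name=pgdata \
  -device virtio-blk-pci,drive=pgdata,serial=PGDATA_DISK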

Comment 1 qing.wang 2023-05-17 06:39:58 UTC
Could you please show your VM command line or VM configuration?

Comment 2 qing.wang 2023-05-17 09:44:41 UTC
Reproduced this issue on

Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-311.el9.x86_64
qemu-kvm-7.2.0-14.el9_2.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-3.el9.noarch
virtio-win-prewhql-0.1-236.iso

The issue does not reproduce on

qemu-kvm-7.1.0-7.el9.x86_64
qemu-kvm-8.0.0-2.el9.x86_64

Steps:
1. Create the data image files
qemu-img create -f qcow2 /home/mstg1.qcow2 1G
...
qemu-img create -f qcow2 /home/mstg6.qcow2 1G
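
The elided commands create mstg2.qcow2 through mstg5.qcow2 in the same way; equivalently, as a convenience sketch (not from the original comment):

  for i in 1 2 3 4 5 6; do qemu-img create -f qcow2 /home/mstg$i.qcow2 1G; done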

2. Boot the VM

/usr/libexec/qemu-kvm \
  -name testvm \
  -machine q35 \
  -m  6G \
  -smp 2 \
  -cpu host,+kvm_pv_unhalt \
  -device ich9-usb-ehci1,id=usb1 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
   \
   \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x3,chassis=1 \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x3.0x1,bus=pcie.0,chassis=2 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x3.0x2,bus=pcie.0,chassis=3 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x3.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x3.0x4,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x3.0x5,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x3.0x6,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie-root-port-7,port=0x7,addr=0x3.0x7,bus=pcie.0,chassis=8 \
  -device pcie-root-port,id=pcie_extra_root_port_0,bus=pcie.0,addr=0x4  \
  -device virtio-scsi-pci,id=scsi0,bus=pcie-root-port-0 \
  -blockdev driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/rhel930-64-virtio-scsi.qcow2,node-name=drive_image1,file.aio=threads   \
  -device scsi-hd,id=os,drive=drive_image1,bus=scsi0.0,bootindex=0,serial=OS_DISK   \
  \
  -blockdev driver=qcow2,discard=ignore,detect-zeroes=off,file.driver=file,file.filename=/home/mstg1.qcow2,node-name=data_image1   \
  -device virtio-blk-pci,id=data1,drive=data_image1,bus=pcie-root-port-1,bootindex=1,serial=DATA_DISK1   \
  -blockdev driver=qcow2,discard=ignore,detect-zeroes=on,file.driver=file,file.filename=/home/mstg2.qcow2,node-name=data_image2   \
  -device virtio-blk-pci,id=data2,drive=data_image2,bus=pcie-root-port-2,bootindex=2,serial=DATA_DISK2   \
  \
  -blockdev driver=qcow2,discard=unmap,detect-zeroes=off,file.driver=file,file.filename=/home/mstg4.qcow2,node-name=data_image4   \
  -device virtio-blk-pci,id=data4,drive=data_image4,bus=pcie-root-port-4,bootindex=4,serial=DATA_DISK4   \
  -blockdev driver=qcow2,discard=unmap,detect-zeroes=on,file.driver=file,file.filename=/home/mstg5.qcow2,node-name=data_image5   \
  -device virtio-blk-pci,id=data5,drive=data_image5,bus=pcie-root-port-5,bootindex=5,serial=DATA_DISK5   \
  -blockdev driver=qcow2,discard=unmap,detect-zeroes=unmap,file.driver=file,file.filename=/home/mstg6.qcow2,node-name=data_image6   \
  -device virtio-blk-pci,id=data6,drive=data_image6,bus=pcie-root-port-6,bootindex=6,serial=DATA_DISK6   \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server=on,wait=off \
  -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b7,id=nic1,netdev=nicpci,bus=pcie-root-port-7 \
  -netdev tap,id=nicpci \
  -boot menu=on,reboot-timeout=1000,strict=off \
  \
  -chardev socket,id=socket-serial,path=/var/tmp/socket-serial,logfile=/var/tmp/file-serial.log,mux=on,server=on,wait=off \
  -serial chardev:socket-serial \
  -chardev file,path=/var/tmp/file-bios.log,id=file-bios \
  -device isa-debugcon,chardev=file-bios,iobase=0x402 \
  \
  -chardev socket,id=socket-qmp,path=/var/tmp/socket-qmp,logfile=/var/tmp/file-qmp.log,mux=on,server=on,wait=off \
  -mon chardev=socket-qmp,mode=control \
  -chardev socket,id=socket-hmp,path=/var/tmp/socket-hmp,logfile=/var/tmp/file-hmp.log,mux=on,server=on,wait=off \
  -mon chardev=socket-hmp,mode=readline \

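Optionally, the detect-zeroes setting of each drive can be confirmed through the QMP socket opened on port 5955 above. This is a hedged convenience check, not part of the original steps; query-block is a standard QMP command, and nc is used purely for illustration:

  ( echo '{"execute":"qmp_capabilities"}'; sleep 1; echo '{"execute":"query-block"}'; sleep 1 ) | nc localhost 5955
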

3. Log in to the guest and check the disks
lsblk
The data disks show up as vda through vde.
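
A hedged convenience step (not in the original comment): the guest disks can be matched to the serials from the QEMU command line with:

  lsblk -o NAME,SIZE,SERIAL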

4. Run the format script on each target disk
./format.sh vda

(content of format.sh)
#!/bin/bash
# Format the given device, fill a file on it with zeroes using O_DIRECT
# writes, then attach that file to a loop device and create a filesystem
# on the loop device.
dev=$1
devfs=/tmp/$dev
mkdir -p "$devfs"

# Create an XFS filesystem on the raw device and mount it.
mkfs.xfs -f "/dev/$dev"
mount "/dev/$dev" "$devfs"
mount | grep "$dev"

# Write 512 MiB of zeroes with O_DIRECT; with detect-zeroes=on/unmap the
# QEMU block layer converts these zero writes into write-zeroes (or, for
# unmap, discard) requests.
dd if=/dev/zero of="$devfs/x.img" bs=1M count=512 oflag=direct
ls "$devfs/x.img"

# Attach the zero-filled file to a loop device and try to format it.
lpdev=$(losetup -f --show "$devfs/x.img")
echo "lpdev:$lpdev"
mkfs.xfs -f "$lpdev"
ret=$?
losetup | grep "$dev"
mount | grep "$dev"
if [ "$ret" -ne 0 ]; then
	echo "error"
fi
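
To cover every data disk in one pass, the script can be looped over vda through vde (a convenience sketch, not part of the original steps):

  for dev in vda vdb vdc vdd vde; do ./format.sh $dev; done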

The error occurs on vdb, vdd, and vde (the disks with detect-zeroes=on or detect-zeroes=unmap):

mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on (unknown) bno 0xbfff00/0x100, err=5
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on (unknown) bno 0x0/0x100, err=5
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: Lost a write to the data device!
/dev/vde on /tmp/vde type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
dd: error writing '/tmp/vde/x.img': Input/output error

Comment 5 Cory Bolar 2023-05-23 03:14:06 UTC
The reproducer you have is accurate.  Is there anything else you need from me?

Comment 6 Kevin Wolf 2023-05-24 09:59:12 UTC
According to comment 2, this was already fixed by the rebase to QEMU 8.0, so I don't think there is anything left to do from the development side.

Mirek, can this be moved straight to ON_QA or do you still need to do something else to document that it was fixed in a build?

Comment 8 Yanan Fu 2023-05-30 03:01:07 UTC
QE bot (pre-verify): set 'Verified:Tested,SanityOnly' as the gating/tier1 tests pass

Comment 11 qing.wang 2023-05-31 05:56:40 UTC
Passed the test using the steps from comment 2.

Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-316.el9.x86_64
qemu-kvm-8.0.0-4.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch
libvirt-9.3.0-2.el9.x86_64

Comment 13 errata-xmlrpc 2023-11-07 08:27:35 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368