Bug 2203512

Summary: [qemu-kvm] detect_zeros=unmap causes postgres to crash
Product: Red Hat Enterprise Linux 9 Reporter: Cory Bolar <cory.bolar>
Component: qemu-kvm Assignee: Stefan Hajnoczi <stefanha>
qemu-kvm sub component: virtio-blk,scsi QA Contact: qing.wang <qinwang>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: medium CC: aliang, coli, jinzhao, juzhang, kwolf, mrezanin, qinwang, vgoyal, virt-maint, xuwei, yfu
Version: 9.2 Keywords: CustomerScenariosInitiative, Triaged
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-8.0.0-1.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-07 08:27:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Cory Bolar 2023-05-13 13:22:00 UTC
Description of problem:

With QEMU 7.2, detect-zeroes=unmap breaks PostgreSQL applications whose data is stored on QEMU volumes.

Maybe these commits didn't make it in for 9.3?

https://lore.kernel.org/qemu-devel/20230126201401.348845-1-stefanha@redhat.com/T/#t

Version-Release number of selected component (if applicable):

QEMU 7.2
RHEL 9.3

How reproducible:
Run PostgreSQL with its data stored on a QEMU disk configured with detect-zeroes=unmap.

Comment 1 qing.wang 2023-05-17 06:39:58 UTC
Could you please show your VM command line or VM configuration?

Comment 2 qing.wang 2023-05-17 09:44:41 UTC
Reproduced this issue on:

Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-311.el9.x86_64
qemu-kvm-7.2.0-14.el9_2.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-3.el9.noarch
virtio-win-prewhql-0.1-236.iso

The issue does not reproduce on:

qemu-kvm-7.1.0-7.el9.x86_64
qemu-kvm-8.0.0-2.el9.x86_64

Steps:
1. Create the data image files
qemu-img create -f qcow2 /home/mstg1.qcow2 1G
...
qemu-img create -f qcow2 /home/mstg6.qcow2 1G
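
The image creation in step 1 could be scripted roughly as follows. This is only a sketch: it assumes the elided commands create mstg2 through mstg5 in the same way as the first and last ones shown, and the names IMG_DIR, QEMU_IMG and create_data_images are illustrative, not part of the original report.

```shell
# Sketch: create the six 1G qcow2 data images named in step 1.
# IMG_DIR and QEMU_IMG are hypothetical variables added for illustration.
IMG_DIR=${IMG_DIR:-/home}
QEMU_IMG=${QEMU_IMG:-qemu-img}

create_data_images() {
    for i in 1 2 3 4 5 6; do
        # Same invocation as in step 1, with the image number substituted
        "$QEMU_IMG" create -f qcow2 "$IMG_DIR/mstg$i.qcow2" 1G || return 1
    done
}
```

Calling create_data_images on the host then creates /home/mstg1.qcow2 through /home/mstg6.qcow2. Note that the command line in step 2 does not attach mstg3.qcow2.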

2. Boot the VM

/usr/libexec/qemu-kvm \
  -name testvm \
  -machine q35 \
  -m  6G \
  -smp 2 \
  -cpu host,+kvm_pv_unhalt \
  -device ich9-usb-ehci1,id=usb1 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
   \
   \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x3,chassis=1 \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x3.0x1,bus=pcie.0,chassis=2 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x3.0x2,bus=pcie.0,chassis=3 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x3.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x3.0x4,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x3.0x5,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x3.0x6,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie-root-port-7,port=0x7,addr=0x3.0x7,bus=pcie.0,chassis=8 \
  -device pcie-root-port,id=pcie_extra_root_port_0,bus=pcie.0,addr=0x4  \
  -device virtio-scsi-pci,id=scsi0,bus=pcie-root-port-0 \
  -blockdev driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/rhel930-64-virtio-scsi.qcow2,node-name=drive_image1,file.aio=threads   \
  -device scsi-hd,id=os,drive=drive_image1,bus=scsi0.0,bootindex=0,serial=OS_DISK   \
  \
  -blockdev driver=qcow2,discard=ignore,detect-zeroes=off,file.driver=file,file.filename=/home/mstg1.qcow2,node-name=data_image1   \
  -device virtio-blk-pci,id=data1,drive=data_image1,bus=pcie-root-port-1,bootindex=1,serial=DATA_DISK1   \
  -blockdev driver=qcow2,discard=ignore,detect-zeroes=on,file.driver=file,file.filename=/home/mstg2.qcow2,node-name=data_image2   \
  -device virtio-blk-pci,id=data2,drive=data_image2,bus=pcie-root-port-2,bootindex=2,serial=DATA_DISK2   \
  \
  -blockdev driver=qcow2,discard=unmap,detect-zeroes=off,file.driver=file,file.filename=/home/mstg4.qcow2,node-name=data_image4   \
  -device virtio-blk-pci,id=data4,drive=data_image4,bus=pcie-root-port-4,bootindex=4,serial=DATA_DISK4   \
  -blockdev driver=qcow2,discard=unmap,detect-zeroes=on,file.driver=file,file.filename=/home/mstg5.qcow2,node-name=data_image5   \
  -device virtio-blk-pci,id=data5,drive=data_image5,bus=pcie-root-port-5,bootindex=5,serial=DATA_DISK5   \
  -blockdev driver=qcow2,discard=unmap,detect-zeroes=unmap,file.driver=file,file.filename=/home/mstg6.qcow2,node-name=data_image6   \
  -device virtio-blk-pci,id=data6,drive=data_image6,bus=pcie-root-port-6,bootindex=6,serial=DATA_DISK6   \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server=on,wait=off \
  -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b7,id=nic1,netdev=nicpci,bus=pcie-root-port-7 \
  -netdev tap,id=nicpci \
  -boot menu=on,reboot-timeout=1000,strict=off \
  \
  -chardev socket,id=socket-serial,path=/var/tmp/socket-serial,logfile=/var/tmp/file-serial.log,mux=on,server=on,wait=off \
  -serial chardev:socket-serial \
  -chardev file,path=/var/tmp/file-bios.log,id=file-bios \
  -device isa-debugcon,chardev=file-bios,iobase=0x402 \
  \
  -chardev socket,id=socket-qmp,path=/var/tmp/socket-qmp,logfile=/var/tmp/file-qmp.log,mux=on,server=on,wait=off \
  -mon chardev=socket-qmp,mode=control \
  -chardev socket,id=socket-hmp,path=/var/tmp/socket-hmp,logfile=/var/tmp/file-hmp.log,mux=on,server=on,wait=off \
  -mon chardev=socket-hmp,mode=readline
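
The five data disks in the command line above differ only in their discard and detect-zeroes settings. As a rough sketch of that test matrix, the helper below prints the -blockdev argument for each combination. The function name blockdev_arg is hypothetical; the option strings are copied from the command line in this comment.

```shell
# Sketch: print the -blockdev argument for data image N with the given
# discard and detect-zeroes settings (blockdev_arg is an illustrative name).
blockdev_arg() {
    n=$1
    discard=$2
    detect=$3
    printf '%s\n' "-blockdev driver=qcow2,discard=$discard,detect-zeroes=$detect,file.driver=file,file.filename=/home/mstg$n.qcow2,node-name=data_image$n"
}

# The five combinations attached in step 2 (image 3 is not attached there)
blockdev_arg 1 ignore off
blockdev_arg 2 ignore on
blockdev_arg 4 unmap  off
blockdev_arg 5 unmap  on
blockdev_arg 6 unmap  unmap
```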


3. Log in to the guest and check the disks
lsblk
The data disks appear as vda-vde.

4. Run the format script on the target disk
./format.sh vda

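(contents of format.sh)
# Usage: ./format.sh <disk>, e.g. ./format.sh vda
dev=$1
devfs=/tmp/$dev
mkdir -p "$devfs"

# Create an XFS filesystem on the virtio disk and mount it
mkfs.xfs -f "/dev/$dev"
mount "/dev/$dev" "$devfs"
mount | grep "$dev"
# Write 512M of zeroes with O_DIRECT, bypassing the guest page cache
dd if=/dev/zero of="$devfs/x.img" bs=1M count=512 oflag=direct
ls "$devfs/x.img"
# Attach the zero-filled file to a loop device and format it
lpdev=$(losetup -f --show "$devfs/x.img")
echo "lpdev:$lpdev"
mkfs.xfs -f "$lpdev"
ret=$?
losetup | grep "$dev"
mount | grep "$dev"
if [ $ret -ne 0 ]; then
	echo "error"
fi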

The script hits errors on vdb, vdd, and vde (the disks with detect-zeroes set to on or unmap):

mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on (unknown) bno 0xbfff00/0x100, err=5
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on (unknown) bno 0x0/0x100, err=5
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: Lost a write to the data device!
/dev/vde on /tmp/vde type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
dd: error writing '/tmp/vde/x.img': Input/output error
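
To run step 4 against every data disk in one pass, a small wrapper along these lines could be used in the guest. It is a sketch under stated assumptions: FORMAT_SCRIPT and failing_disks are hypothetical names, and the wrapper relies only on format.sh printing a line reading "error" when the mkfs.xfs on the loop device fails.

```shell
# Sketch: run the reproducer on each given disk and list the ones that fail.
# FORMAT_SCRIPT is an illustrative variable; it defaults to the format.sh above.
FORMAT_SCRIPT=${FORMAT_SCRIPT:-./format.sh}

failing_disks() {
    failed=""
    for dev in "$@"; do
        # format.sh prints "error" on its own line when mkfs.xfs on the
        # loop device returns a non-zero status
        if "$FORMAT_SCRIPT" "$dev" 2>&1 | grep -qx "error"; then
            failed="$failed $dev"
        fi
    done
    echo "failed:$failed"
}
```

For example, failing_disks vda vdb vdc vdd vde runs the script on all five data disks. Based on the results reported above, the affected build would be expected to list vdb, vdd, and vde.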

Comment 5 Cory Bolar 2023-05-23 03:14:06 UTC
The reproducer you have is accurate. Is there anything else you need from me?

Comment 6 Kevin Wolf 2023-05-24 09:59:12 UTC
According to comment 2, this was already fixed by the rebase to QEMU 8.0, so I don't think there is anything left to do from the development side.

Mirek, can this be moved straight to ON_QA or do you still need to do something else to document that it was fixed in a build?

Comment 8 Yanan Fu 2023-05-30 03:01:07 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass

Comment 11 qing.wang 2023-05-31 05:56:40 UTC
Test passed using the steps from comment #2 on:

Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-316.el9.x86_64
qemu-kvm-8.0.0-4.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch
libvirt-9.3.0-2.el9.x86_64

Comment 13 errata-xmlrpc 2023-11-07 08:27:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368