Bug 1744207

Summary: qemu, qemu-img fail to detect alignment with XFS and Gluster/XFS on 4k block device
Product: Red Hat Enterprise Linux 8 Reporter: Ademar Reis <areis>
Component: qemu-kvm Assignee: Hanna Czenczek <hreitz>
Status: CLOSED DUPLICATE QA Contact: Xueqiang Wei <xuwei>
Severity: medium Docs Contact:
Priority: medium    
Version: --- CC: aliang, coli, jinzhao, juzhang, rbalakri, timao, virt-maint, xuwei
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-09-02 12:46:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1743360, 1743365, 1749134    
Bug Blocks:    

Description Ademar Reis 2019-08-21 14:24:08 UTC
Clone for base RHEL8. Not critical, but worth fixing, probably reusing the backport from qemu-kvm-rhev in RHEL-7.7.z.

This bug was initially created as a copy of Bug #1743360

Description of problem:

When using storage with a sector size of 4k, qemu and qemu-img fail to probe
the alignment requirement for direct I/O, and fail with EINVAL when accessing
the storage.

Several flows may fail:
- Provisioning a VM on 4k storage fails when the installer tries to create
  filesystems
- Copying a disk image from 4k storage fails when reading from the source image
- Copying a disk to 4k storage fails when writing to the target disk

I reproduced the failures with:
- xfs on loop devices using 4k sector size
- gluster backed by xfs, on vdo device (exposing 4k sector size)

The root cause for both issues is alignment probing. The issue was fixed
upstream in this commit:
https://github.com/qemu/qemu/commit/a6b257a08e3d72219f03e461a52152672fec0612
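
To illustrate what "alignment probing" means here, the following Python
sketch tries direct reads of increasing size and falls back to a conservative
value when the probe is inconclusive. It is not QEMU's code (QEMU is written
in C; see the commit above for the real fix). The names probe_alignment,
try_read and make_device, the candidate sizes, and the 4096 fallback are all
assumptions made for illustration.

```python
import errno

# Assumed conservative fallback, matching the 4k sector size in this report.
SAFE_ALIGNMENT = 4096
# Assumed candidate sizes; 1 is included to spot storage that rejects nothing.
CANDIDATES = (1, 512, 1024, 2048, 4096)


def probe_alignment(try_read, candidates=CANDIDATES, fallback=SAFE_ALIGNMENT):
    """Return the smallest candidate size that try_read accepts.

    try_read(size) is a hypothetical callable that attempts one direct read
    of `size` bytes at offset 0 and raises OSError(EINVAL) when the storage
    rejects the request.
    """
    for size in candidates:
        try:
            try_read(size)
        except OSError as e:
            if e.errno != errno.EINVAL:
                raise  # unrelated I/O error, do not hide it
            continue   # rejected as misaligned, try the next size
        if size == 1:
            # Even a 1-byte read succeeded, so this read did not exercise
            # the real requirement. Treat the result as undetectable.
            return fallback
        return size
    return fallback


def make_device(sector_size):
    """Simulated storage that rejects reads not aligned to sector_size."""
    def try_read(size):
        if size % sector_size:
            raise OSError(errno.EINVAL, "Invalid argument")
    return try_read
```

With the simulated storage, probe_alignment(make_device(4096)) detects 4096,
while a try_read that never fails yields the fallback rather than an
alignment of 1.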

This is the RHEL version of these Fedora bugs:
- Bug 1737256 - Provisioning VM on 4k gluster storage fails with "Invalid argument" - qemu fail to detect block size
- Bug 1738657 - qemu-img convert fail to read with "Invalid argument" on gluster storage with 4k sector size

I merged both bugs for RHEL since we now understand that both issues are the
same.

Version-Release number of selected component (if applicable):
Tested with qemu/qemu-img 4.1 rc2 on Fedora 29
Tested with qemu-rhev/qemu-img-rhev on CentOS 7.6

How reproducible:
Always
Note: copying a disk depends on the disk content; not all disks fail.


Steps to Reproduce - provisioning - xfs on loop device:

1. Create a loop device with 4k sector size:

    losetup -f backing-file --show --sector-size=4096

2. Create xfs file system

    mkfs -t xfs /dev/loop0

3. Mount 

    mkdir /tmp/loop0
    mount /dev/loop0 /tmp/loop0

4. Create new disk

   qemu-img create -f raw /tmp/loop0/disk.img

5. Start a VM:

   qemu-system-x86_64 -accel kvm -m 2048 -smp 2 \
   -drive file=/tmp/loop0/disk.img,format=raw,cache=none \
   -cdrom Fedora-Server-dvd-x86_64-29-1.2.iso

6. Try to install with default options.

The installer fails within a few seconds when trying to create a filesystem
on the root logical volume.
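
Before starting the VM, it can be worth confirming that the loop device
really exposes a 4k logical block size, since the failure does not reproduce
on 512-byte storage. A minimal Python sketch, reading the Linux sysfs
attribute queue/logical_block_size; the helper name and the sysfs_root
parameter are illustrative, not part of any tool mentioned in this report.

```python
from pathlib import Path


def logical_block_size(device, sysfs_root="/sys/block"):
    """Return the logical block size in bytes that the kernel reports.

    `device` is a block device name without the /dev/ prefix, e.g. "loop0".
    `sysfs_root` is a parameter only so the helper can be pointed at another
    directory tree when testing.
    """
    path = Path(sysfs_root) / device / "queue" / "logical_block_size"
    return int(path.read_text().strip())
```

For the loop device created in step 1, logical_block_size("loop0") is
expected to return 4096.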


Steps to Reproduce - copying disk from 4k storage - xfs on loop device:

1. Create a new image on 4k storage. One way is to use virt-builder:

    virt-builder fedora-29 -o /tmp/loop0/disk.img

2. Copy the disk to target image elsewhere:

    qemu-img convert -f raw -O raw -t none -T none /tmp/loop0/disk.img \
        disk-clone.img

This fails with EINVAL when reading from the image:
qemu-img: error while reading sector XXX: Invalid argument


Steps to Reproduce - copying disk to 4k storage - xfs on loop device:

1. Create a new image outside the 4k storage. One way is to use virt-builder:

    virt-builder fedora-29 -o disk.img

2. Copy the disk to target image on the 4k storage:

    qemu-img convert -f raw -O raw -t none -T none disk.img \
       /tmp/loop0/disk-clone.img

This fails with EINVAL when writing to the target image:
qemu-img: error while writing sector XXX: Invalid argument


Steps to reproduce - gluster/xfs/vdo storage

Creating this storage is more complex.
I reproduced this using 3 VMs, deployed using these scripts:
- https://github.com/oVirt/vdsm/blob/master/contrib/deploy-gluster.sh
- https://github.com/oVirt/vdsm/blob/master/contrib/create-vdo-brick.sh
- https://github.com/oVirt/vdsm/blob/master/contrib/create-gluster-volume.sh

You also need to set this gluster volume option:

    gluster volume set gv0 performance.strict-o-direct on

Once all gluster nodes are up, mount the storage:

    mkdir /tmp/gv0
    mount -t glusterfs gluster1:/gv0 /tmp/gv0

Now you can reproduce using the same flows explained above for the loop
device, replacing /tmp/loop0 with /tmp/gv0.

Comment 1 Ademar Reis 2019-08-21 19:18:17 UTC
Max: we need a backport in base RHEL as well. It's too late for RHEL-8.1 at this point, so the target is RHEL-8.2. The backport should be similar to RHEL-7.7.z (qemu-2.12).

Comment 2 Hanna Czenczek 2019-09-02 12:46:09 UTC

*** This bug has been marked as a duplicate of bug 1738839 ***