Bug 1834646
| Summary: | qemu-img convert abort when converting image with unaligned size | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Xueqiang Wei <xuwei> |
| Component: | qemu-kvm | Assignee: | Kevin Wolf <kwolf> |
| qemu-kvm sub component: | Storage | QA Contact: | Xueqiang Wei <xuwei> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | high | CC: | aliang, chayang, coli, ddepaula, hreitz, jinzhao, juzhang, nsoffer, songyi2k, virt-maint |
| Version: | 8.3 | Keywords: | Regression |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.3 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1834281 | Environment: | |
| Last Closed: | 2020-11-17 17:48:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1834281 | ||
|
Description
Xueqiang Wei
2020-05-12 06:47:17 UTC
Hit it on rhel8.3 fast train

Versions:
kernel-4.18.0-194.el8.x86_64
qemu-kvm-5.0.0-0.scrmod+el8.3.0+6495+1936fa11.wrb200506

    # truncate -s 11136 test.img
    # qemu-io -c 'write -P 1 0 10K' test.img -f raw
    wrote 10240/10240 bytes at offset 0
    10 KiB, 1 ops; 00.04 sec (247.384 KiB/sec and 24.7384 ops/sec)
    # qemu-img convert -f raw -O raw -p -t none -T none test.img tgt.img -o preallocation=falloc
    qemu-img: /builddir/build/BUILD/qemu-5.0.0/block/io.c:1887: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
    Aborted (core dumped)

Also hit it on rhel8.2.1 fast train - qemu-kvm-4.2.0-21.module+el8.2.1+6586+8b7713b9

    # truncate -s 11136 test.img
    # qemu-io -c 'write -P 1 0 10K' test.img -f raw
    wrote 10240/10240 bytes at offset 0
    10 KiB, 1 ops; 00.04 sec (266.294 KiB/sec and 26.6294 ops/sec)
    # qemu-img convert -f raw -O raw -p -t none -T none test.img tgt.img -o preallocation=falloc
    qemu-img: block/io.c:1871: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
    Aborted (core dumped)

Adjusting dependency order - fix goes into RHEL AV first, then RHEL.

Assigned to Ademar for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

Not sure if this would be Kevin or Max. If fix goes into RHEL AV 8.2.1, then bug 1834281 would pick up change when rebase occurs

Also hit it on the latest package.

Versions:
kernel-4.18.0-200.el8.x86_64
qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420

This regression seems to be related to upstream commit a6b257a08e3 ('file-posix: Handle undetectable alignment') by Nir.
It seems that the algorithm used for detecting the required O_DIRECT alignment fails on NFS (because NFS is fine with byte alignment) and therefore chooses the safe default of 4k. Creating the target image rounds the image size up to full 512 bytes, but accessing the image in 4k granularity means that we'll still write past the end of the created image file.
We need to find a way to deal with this situation: Either automatically round the target image size up to a multiple of the request alignment (though then we wouldn't be creating an exact copy any more!) or error out. For NFS specifically, it would be good to find a way to figure out that byte alignment is actually what is needed. Without a kernel interface that just tells us the right alignment, I'm afraid our probing code can never be 100% reliable.
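The size arithmetic behind the failed assertion can be sketched as follows. This is only an illustration of the numbers from the reproducer, assuming the write that covers the tail of the image is padded to the 4k request alignment as described above; it does not reproduce QEMU's actual code path, and the names `round_up`, `SECTOR_SIZE`, and `request_alignment` are chosen here for illustration.

```python
SECTOR_SIZE = 512  # image sizes in the assertion are counted in 512-byte sectors


def round_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


source_size = 11136        # size passed to `truncate -s` in the reproducer
request_alignment = 4096   # fallback alignment chosen when probing fails

# Creating the target rounds the size up to full 512-byte sectors.
target_size = round_up(source_size, SECTOR_SIZE)
total_sectors = target_size // SECTOR_SIZE

# A request touching the tail is padded to the request alignment (assumption).
padded_end = round_up(source_size, request_alignment)
end_sector = padded_end // SECTOR_SIZE

print("target size:", target_size, "bytes,", total_sectors, "sectors")
print("padded end: ", padded_end, "bytes, sector", end_sector)
print("end_sector <= total_sectors:", end_sector <= total_sectors)

# Rounding the target size up to the request alignment instead removes the gap.
aligned_target_sectors = round_up(source_size, request_alignment) // SECTOR_SIZE
print("with aligned target:", end_sector <= aligned_target_sectors)
```

With an 11136-byte source, the padded request ends two sectors past the 512-byte-rounded target, which is the condition the assertion rejects when the resize permission is not held. Rounding the target size to the request alignment closes the gap, at the cost of the target no longer being a byte-exact copy.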
(In reply to Kevin Wolf from comment #6)

In oVirt this is not an issue, since we enforce 4k alignment in all volumes. There is no way to create a volume which is not aligned to 4k, regardless of the storage.

> # truncate -s 11136 test.img

What is the use case for creating a volume with size < 4k?

This change was added to allow 4k storage in RHHI. The chance that we will have a kernel interface reporting the required alignment, and that it will be exposed via fuse/gluster (or other storage), is low, and it will take years to get there.

I think rounding the image up to request_alignment is the way to go, making such errors impossible.

Tested with qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901 and did not hit this issue, so setting status to VERIFIED.

Versions:
kernel-4.18.0-232.el8.x86_64
qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901

1. # mount -t nfs -o soft,vers=4.2 10.66.61.132:/home/nfs_server/ /home/nfs_test
2. # cd /home/nfs_test
3. # truncate -s 11136 test.img
4. # qemu-io -c 'write -P 1 0 10K' test.img -f raw
   wrote 10240/10240 bytes at offset 0
   10 KiB, 1 ops; 00.05 sec (207.017 KiB/sec and 20.7017 ops/sec)
5. # qemu-img convert -f raw -O raw -p -t none -T none test.img tgt.img -o preallocation=falloc
   (100.00/100%)

Automation result:
(1/1) Host_RHEL.m8.u3.product_av.raw.virtio_blk.up.virtio_net.Guest.RHEL.8.3.0.x86_64.io-github-autotest-qemu.qemu_img_convert_image_with_unaligned_size.q35: PASS (2.89 s)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5137

Hi,

Does this bug impact the KVM run-time disk access method? As in https://access.redhat.com/support/cases/#/case/03068601, we have a raw file on NFS as below, and KVM opens the file with the O_DIRECT flag.
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/var/lib/nova/mnt/ef05afb0863beae8492ff11d98795fc7/volume-fd0f5fd7-f115-4863-9b72-de4b766ac0b3'/>
      <target dev='vdc' bus='virtio'/>
      <shareable/>
      <serial>fd0f5fd7-f115-4863-9b72-de4b766ac0b3</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

We found that when running a fio sequential 5k block write test in the VM, the host sees extra read I/O, equal to or a little more than the write IOPS.

    [root@overcloud-novacompute-20 ~]# nfsiostat 10 /var/lib/nova/mnt/ef05afb0863beae8492ff11d98795fc7

    xxx.xx.xx.xx:/vol1_dedup mounted on /var/lib/nova/mnt/ef05afb0863beae8492ff11d98795fc7:

    ops/s       rpc bklog
    2166.800    0.000

    read:   ops/s      kB/s       kB/op   retrans    avg RTT (ms)   avg exe (ms)   avg queue (ms)   errors
            1177.400   7643.735   6.492   0 (0.0%)   0.176          0.232          0.039            0 (0.0%)
    write:  ops/s      kB/s       kB/op   retrans    avg RTT (ms)   avg exe (ms)   avg queue (ms)   errors
            979.000    8504.083   8.686   0 (0.0%)   0.227          0.273          0.032            0 (0.0%)

As this bug describes, QEMU sets a 4k alignment for an O_DIRECT file on NFS. Then, when the VM is not writing in 4k blocks, QEMU has to read the 4k block back from the NFS server, modify it, and write the 4k block back, right?

Thanks

Yes, this is precisely what happens when QEMU incorrectly infers a 4k alignment for NFS shares. I expect that this problem goes away when you update to a more recent version that contains the fix for this bug report.

Thanks for the confirmation. We hit the problem on qemu-kvm-4.2.0-29.module+el8.2.1+11280+70ae3d73.8.x86_64, and it is gone after upgrading to qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.x86_64 |
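The read-modify-write effect confirmed above can be illustrated with a short sketch. It only models the offset arithmetic for sequential 5k writes under an assumed 4k request alignment; the function name `padded_request` and its return shape are hypothetical and do not correspond to any QEMU interface.

```python
ALIGNMENT = 4096  # alignment assumed for the O_DIRECT file in this report


def padded_request(offset: int, length: int, alignment: int = ALIGNMENT):
    """Return (aligned_start, aligned_end, head_read, tail_read) for one write.

    head_read / tail_read say whether the first / last aligned block is only
    partially covered by the write, so it has to be read before writing.
    """
    start = offset - offset % alignment
    end = -(-(offset + length) // alignment) * alignment
    head_read = offset != start
    tail_read = (offset + length) != end
    return start, end, head_read, tail_read


block = 5 * 1024  # 5k sequential writes, as in the fio test
for i in range(4):
    off = i * block
    start, end, head, tail = padded_request(off, block)
    reads = int(head) + int(tail)
    print(f"write [{off}, {off + block}) -> I/O [{start}, {end}), extra reads: {reads}")
```

In this model nearly every 5k write needs at least one extra read of a partially covered 4k block, which is in line with the read rate on the host being about equal to or slightly above the write rate. A write that is already 4k-aligned, such as `padded_request(4096, 4096)`, needs no extra reads.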