Bug 1142331
| Summary: | qemu-img convert intermittently corrupts output images |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Reporter: | Stefan Hajnoczi <stefanha> |
| Component: | qemu-kvm-rhev |
| Assignee: | Hanna Czenczek <hreitz> |
| Status: | CLOSED ERRATA |
| QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high |
| Priority: | high |
| Version: | 7.1 |
| CC: | hhuang, hreitz, huding, juli, juzhang, knoel, kwolf, pbonzini, pbrady, sgordon, sluo, stefanha, tdosek, virt-maint, xfu |
| Target Milestone: | beta |
| Target Release: | --- |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | qemu-kvm-rhev-2.1.2-11.el7 |
| Doc Type: | Bug Fix |
| Story Points: | --- |
| Clones: | 1160237 (view as bug list) |
| Last Closed: | 2015-03-05 09:55:36 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| Category: | --- |
| oVirt Team: | --- |
| Cloudforms Team: | --- |
| Bug Blocks: | 1160237 |
Description
Stefan Hajnoczi
2014-09-16 15:12:01 UTC
Is this something we need an async release for?

---

Bumping up priority since this is a data corrupter. My suggested fix is now upstream:

- http://git.qemu.org/?p=qemu.git;a=commit;h=38c4d0ae
- http://git.qemu.org/?p=qemu.git;a=commit;h=7c159037

---

From a RHEL-OSP point of view I would say yes.

---

POST as of October 24. Fix included in qemu-kvm-rhev-2.1.2-11.el7.

---

Hi Stefan,

Could you give a method for reproducing this issue? QE tried the following:

Version of components: qemu-kvm-rhev-2.1.2-8.el7.x86_64

```sh
# cat test.sh
#!/bin/sh
SRC_PATH=/mnt/RHEL-Server-7.0-64-virtio.qcow2
TMP_PATH=/mnt/test.qcow2
DST_PATH=/mnt/test.raw
QEMU_IMG_PATH=qemu-img
cat $SRC_PATH > $TMP_PATH &&
$QEMU_IMG_PATH convert -O raw $TMP_PATH $DST_PATH &&
cksum $DST_PATH
```

Steps:
1. Mount an ext4 filesystem on a block device at /mnt.
2. Run `sh test.sh`.

After step 2 the bug did not reproduce. Could you give some suggestions? Thanks.

Best Regards,
Jun Li

---

This is awkward to reproduce. The main thing is that the source file must be sparse.
The following case was known to trigger the issue on ext4 since at least Linux 2.6, though I'm not seeing it on my 3.17 kernel here on either ext4 or xfs.
The issue now depends on the kernel generating unwritten extents, so these days you may need cache pressure or other activity on the file system to trigger it. The upstream OpenStack trigger involved other OpenStack services hitting the file system at the same time.
```sh
QEMU_IMG_PATH=qemu-img
for i in $(seq 1 2 21); do
  for j in 1 2 31 100; do
    perl -e '$n = '$i' * 1024; *F = *STDOUT;' \
         -e 'for (1..'$j') { sysseek (*F, $n, 1)' \
         -e '&& syswrite (*F, chr($_)x$n) or die "$!"}' > f1
    $QEMU_IMG_PATH convert -O raw f1 f2
    cmp f1 f2 || { echo "data loss i=$i j=$j" >&2; exit 1; }
  done
done
```
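The reproducer above hinges on the kernel's hole reporting: qemu-img convert skips ranges that the raw-posix driver reports as holes, and the fix patches listed in this bug move that detection from FIEMAP to lseek(2) with SEEK_HOLE/SEEK_DATA. As a minimal sketch (not part of this thread; `map_extents` is a hypothetical helper name), here is how a sparse file's data and hole runs can be probed that way on Linux, assuming the filesystem supports SEEK_HOLE:

```python
import os

def map_extents(path):
    """Return (offset, length, kind) runs for a file, probed via SEEK_DATA/SEEK_HOLE."""
    runs = []
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        pos = 0
        while pos < size:
            try:
                data = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError:
                # ENXIO: no data past pos, so the remainder is one big hole
                runs.append((pos, size - pos, "hole"))
                break
            if data > pos:
                runs.append((pos, data - pos, "hole"))
            hole = os.lseek(fd, data, os.SEEK_HOLE)
            runs.append((data, hole - data, "data"))
            pos = hole
    finally:
        os.close(fd)
    return runs

# Build a sparse file: roughly a 1 MiB hole followed by 4 KiB of data,
# similar in spirit to what the perl reproducer creates.
with open("f1", "wb") as f:
    f.seek(1024 * 1024)
    f.write(b"\x01" * 4096)

print(map_extents("f1"))
```

On ext4 or xfs this typically prints one hole run followed by one data run; on a filesystem without SEEK_HOLE support, lseek falls back to reporting the whole file as data, which is the safe direction. As I understand the fix, the danger in the old FIEMAP-based code was the opposite error: freshly written but not-yet-flushed extents could be reported as unallocated, so convert zero-filled them in the output, producing the corruption reported here.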
(In reply to Pádraig Brady from comment #9)
> This is awkward to reproduce. The main thing is the source file must be sparse.
> The following case was known to trigger the issue on ext4 from linux 2.6 time at least, though I'm not seeing the issue on my 3.17 kernel here on ext4 or xfs.

I also could not reproduce this issue; I tried your script with cache pressure disabled:

```sh
# cat /proc/sys/vm/swappiness
10
# cat /proc/sys/vm/vfs_cache_pressure
100
# echo 0 > /proc/sys/vm/vfs_cache_pressure
# echo 0 > /proc/sys/vm/swappiness
# cat /proc/sys/vm/swappiness
0
# cat /proc/sys/vm/vfs_cache_pressure
0
```

> Now the issue is dependent on the kernel generating unwritten extents, so you may need cache pressure or other activity on the file system etc. to trigger this these days.

Could you show me in detail how to generate cache pressure or other activity on the file system, or provide a method to verify this issue? Thanks in advance.

---

This one may be difficult to reproduce. Please just verify that the patch is included in the RPM.

---

(In reply to Stefan Hajnoczi from comment #11)
> This one may be difficult to reproduce. Please just verify that the patch is included in the RPM.

Thanks for the information. Verified this issue on qemu-kvm-rhev-2.1.2-14.el7.x86_64.
Host info:

```sh
# uname -r && rpm -q qemu-kvm-rhev
3.10.0-205.el7.x86_64
qemu-kvm-rhev-2.1.2-14.el7.x86_64
# rpm -ql --changelog qemu-kvm-rhev-2.1.2-14.el7.x86_64 | grep 1142331
- kvm-block-raw-posix-Fix-disk-corruption-in-try_fiemap.patch [bz#1142331]
- kvm-block-raw-posix-use-seek_hole-ahead-of-fiemap.patch [bz#1142331]
- kvm-raw-posix-Fix-raw_co_get_block_status-after-EOF.patch [bz#1142331]
- kvm-raw-posix-raw_co_get_block_status-return-value.patch [bz#1142331]
- kvm-raw-posix-SEEK_HOLE-suffices-get-rid-of-FIEMAP.patch [bz#1142331]
- kvm-raw-posix-The-SEEK_HOLE-code-is-flawed-rewrite-it.patch [bz#1142331]
- Resolves: bz#1142331
```

Based on the above, the fix patches are included in the RPM build. Moving to VERIFIED status; please correct me if I have made any mistake, thanks.

Best Regards,
sluo

---

One additional question: why is one of the fix patches missing, namely kvm-block-raw-posix-Try-both-FIEMAP-and-SEEK_HOLE.patch? Moving back to ON_QA for now.

---

Hi Sluo,

this patch is missing from the RHEV backport because that commit was already included in the original upstream 2.1.2 release (which qemu-kvm-rhev-2.1.2 is based on). Therefore, it was unnecessary.

Max

---

(In reply to Max Reitz from comment #14)
> this patch is missing from the RHEV backport because that commit was already included in the original upstream 2.1.2 (which qemu-kvm-rhev-2.1.2 is based on). Therefore, it was unnecessary.

OK, thanks for the kind explanation; moving back to VERIFIED status. Please correct me if I have made any mistake, thanks.

Best Regards,
sluo

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html