Bug 688146
Summary: | qcow2: Some paths fail to handle I/O errors | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Kevin Wolf <kwolf> |
Component: | qemu-kvm | Assignee: | Kevin Wolf <kwolf> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.1 | CC: | ehabkost, juzhang, mjenner, mkenneth, tburke, virt-maint |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-0.12.1.2-2.151.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause: bugs in the handling of errors in the qcow2 code.
Consequence: some error cases were being ignored, and could cause image corruption.
Fix: backport of error handling fixes on qcow2 code.
Result: safer error handling and qcow2 image corruption avoided on error cases.
|
Last Closed: | 2011-05-19 11:21:17 UTC | Type: | --- |
Description
Kevin Wolf
2011-03-16 13:18:29 UTC
(In reply to comment #0)
> Upstream has some fixes for error handling that need to be backported:
>
> - Immediate I/O error for reading from the backing file were ignored

I tried both the fixed version (qemu-kvm-0.12.1.2-2.151.el6) and the unfixed version (qemu-kvm-0.12.1.2-2.149.el6); both failed. It seems the immediate error is still ignored. If I made a mistake, please correct me.

1. Wrote a blkdebug configuration file:
cat > blkdebug.cfg <<EOF
[inject-error]
event = ""
errno = "5"
immediately = "off"
EOF

2. Created a qcow2 image:
qemu-img create -f qcow2 test.qcow2 2G

3. Read from it:
qemu-io blkdebug:blkdebug.cfg:test.qcow2
qemu-io> read 0 1G
read 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0.0000 sec (6.575 GiB/sec and 6.5751 ops/sec)

> - I/O errors in reading compressed clusters were ignored

I can't reproduce this issue. Could you please provide an effective method?

1. Create a qcow2 image:
qemu-img create -f "qcow2" zhang.qcow2 6G

2. Convert it to a compressed image:
qemu-img convert -f qcow2 zhang.qcow2 -O qcow2 -c zhangconvert1.qcow2

3. Put the compressed image on an NFS mount.

4. Boot the guest with the compressed image as a second disk and rerror=stop:
-drive file=/root/nfs/zhangconvert1.qcow2,if=none,id=test1,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test1

5. In the guest, keep reading from the compressed image:
while true;do dd if=/dev/vdb of=/dev/null;done

6. Disconnect the NFS server.

Results: the VM is still running.

> - COW of L2 tables with internal snapshots used an unsafe order so that I/O
> errors or crashes in the middle could cause image corruption

I still cannot find a way to reproduce this. Could you please provide an effective method?

(In reply to comment #6)
> (In reply to comment #0)
> > Upstream has some fixes for error handling that need to be backported:
> >
> > - Immediate I/O error for reading from the backing file were ignored
> I tried both the fixed version (qemu-kvm-0.12.1.2-2.151.el6) and the unfixed
> version (qemu-kvm-0.12.1.2-2.149.el6); both failed. It seems the immediate
> error is still ignored. If I made a mistake, please correct me.
> 1. Wrote a blkdebug configuration file:
> cat > blkdebug.cfg <<EOF
> [inject-error]
> event = ""
> errno = "5"
> immediately = "off"
> EOF

A rule with an empty event name is never triggered. You may use event = "aio_read".

Also, please note that this is about failed reads from the backing file. So you need a backing file and an overlay, like this:

qemu-img create -f qcow2 base.qcow2 2G
qemu-img create -f qcow2 -b blkdebug:blkdebug.cfg:base.qcow2 snap1.qcow2

And then try to read from snap1.qcow2 (without having written to it before, so that it tries to access the backing file).

> > - I/O errors in reading compressed clusters were ignored
> Can't reproduce this issue. Could you please provide an effective method?
> [...]
> Results:
> the VM is still running.

The expected result with rerror=stop is that the VM stops, so in fact you have reproduced the bug.

> > - COW of L2 tables with internal snapshots used an unsafe order so that I/O
> > errors or crashes in the middle could cause image corruption
> I still cannot find a way to reproduce this. Could you please provide an
> effective method?

Use blkdebug to let it fail for the event "l2_alloc.cow_read". Create an internal snapshot and try to write to it. After the I/O has failed, use qemu-img check.

After discussing this with kwolf, we think functional testing can cover this issue. We ran two functional test runs and found no new bugs and no regressions.
https://tcms.engineering.redhat.com/run/19552/
https://tcms.engineering.redhat.com/run/19551/

I also checked that these patches have been applied in qemu-kvm-0.12.1.2-2.158.el6.x86_64:

# rpm -qa --changelog qemu-kvm | grep 688146
- kvm-qcow2-Fix-error-handling-for-immediate-backing-file-.patch [bz#688146]
- kvm-qcow2-Fix-error-handling-for-reading-compressed-clus.patch [bz#688146]
- kvm-qcow2-Fix-order-in-L2-table-COW.patch [bz#688146]

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: bugs in the handling of errors in the qcow2 code.
Consquence: some error cases were being ignored, and could cause image corruption.
Fix: backport of error handling fixes on qcow2 code.
Result: safer error handling and qcow2 image corruption avoided on error cases.

Technical note updated.

Diffed Contents:
@@ -1,6 +1,6 @@
 Cause: bugs in the handling of errors in the qcow2 code.
-Consquence: some error cases were being ignored, and could cause image corruption.
+Consequence: some error cases were being ignored, and could cause image corruption.
 Fix: backport of error handling fixes on qcow2 code.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
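For anyone retracing the backing-file case discussed in the comments, the steps from Kevin Wolf's reply could be put together roughly as follows. This is only a sketch under assumptions: the event name is copied from that reply and may not match the spelling your QEMU build accepts, the 64k read size is arbitrary, and the RUN_QEMU switch is a hypothetical guard added here so that the script does nothing beyond writing the config file unless explicitly asked to.

```shell
#!/bin/sh
# Sketch of the backing-file reproduction described in the comments.
# EVENT is taken from the reply in this report; the event names that blkdebug
# accepts depend on the QEMU build, so treat the default as an assumption.
EVENT="${EVENT:-aio_read}"
# "off" is copied unchanged from the tester's config. Since the bug concerns
# immediate errors, "on" may be the value needed (an assumption, not stated
# in the comments).
IMMEDIATELY="${IMMEDIATELY:-off}"

# Rule that fails the chosen event with EIO (errno 5).
cat > blkdebug.cfg <<EOF
[inject-error]
event = "$EVENT"
errno = "5"
immediately = "$IMMEDIATELY"
EOF

# RUN_QEMU is a hypothetical switch for this sketch, not a QEMU feature.
if [ "${RUN_QEMU:-0}" = 1 ] && command -v qemu-img >/dev/null 2>&1; then
    # Base image, then an overlay whose backing file is opened through blkdebug.
    qemu-img create -f qcow2 base.qcow2 2G
    qemu-img create -f qcow2 -b blkdebug:blkdebug.cfg:base.qcow2 snap1.qcow2
    # Read from the never-written overlay so the request reaches the backing file.
    qemu-io -c 'read 0 64k' snap1.qcow2
else
    echo "qemu tools not run; wrote blkdebug.cfg only"
fi
```

Run it as `RUN_QEMU=1 sh repro.sh` on a test machine (repro.sh being whatever name you saved it under). With the fix applied, the read from snap1.qcow2 would be expected to report an I/O error rather than succeed. Newer qemu-img releases may also want the backing format given explicitly with -F.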