Bug 688146 - qcow2: Some paths fail to handle I/O errors
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Kevin Wolf
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2011-03-16 09:18 EDT by Kevin Wolf
Modified: 2013-01-09 18:39 EST

See Also:
Fixed In Version: qemu-kvm-0.12.1.2-2.151.el6
Doc Type: Bug Fix
Doc Text:
Cause: bugs in the handling of errors in the qcow2 code. Consequence: some error cases were being ignored, and could cause image corruption. Fix: backport of error handling fixes on qcow2 code. Result: safer error handling and qcow2 image corruption avoided on error cases.
Last Closed: 2011-05-19 07:21:17 EDT


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:0534
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm security, bug fix, and enhancement update
Last Updated: 2011-05-19 07:20:36 EDT

Description Kevin Wolf 2011-03-16 09:18:29 EDT
Upstream has some fixes for error handling that need to be backported:

- Immediate I/O errors when reading from the backing file were ignored
- I/O errors in reading compressed clusters were ignored
- COW of L2 tables with internal snapshots used an unsafe order so that I/O errors or crashes in the middle could cause image corruption
Comment 6 juzhang 2011-03-21 01:43:40 EDT
(In reply to comment #0)
> Upstream has some fixes for error handling that need to be backported:
> 
> - Immediate I/O error for reading from the backing file were ignored
I tried both the fixed version (qemu-kvm-0.12.1.2-2.151.el6) and the unfixed version (qemu-kvm-0.12.1.2-2.149.el6), and both failed: the immediate error still seems to be ignored. If I made a mistake, please correct me.
1. Wrote a blkdebug configuration file:
cat > blkdebug.cfg <<EOF
[inject-error]
event = ""
errno = "5"
immediately = "off"
EOF
2. Created a qcow2 image:
qemu-img create -f qcow2 test.qcow2 2G
3. Read from it through blkdebug:
qemu-io blkdebug:blkdebug.cfg:test.qcow2
qemu-io> read 0 1G
read 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0.0000 sec (6.575 GiB/sec and 6.5751 ops/sec)

> - I/O errors in reading compressed clusters were ignored
I can't reproduce this issue. Could you please provide an effective method?
1. Create a qcow2 image.
qemu-img create -f qcow2 zhang.qcow2 6G

2. Convert it to a compressed image.
qemu-img convert -f qcow2 zhang.qcow2 -O qcow2 -c zhangconvert1.qcow2

3. Place the compressed image on an NFS mount.

4. Boot the guest with the compressed image as a second disk and rerror=stop.
-drive file=/root/nfs/zhangconvert1.qcow2,if=none,id=test1,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test1

5. In the guest, keep reading from the compressed image.
while true;do dd if=/dev/vdb of=/dev/null;done
6. Disconnect the NFS server.

Results:
The VM is still running.

> - COW of L2 tables with internal snapshots used an unsafe order so that I/O
> errors or crashes in the middle could cause image corruption
I still cannot find a way to reproduce this. Could you please provide an effective method?
Comment 7 Kevin Wolf 2011-04-15 05:26:05 EDT
(In reply to comment #6)
> (In reply to comment #0)
> > Upstream has some fixes for error handling that need to be backported:
> > 
> > - Immediate I/O error for reading from the backing file were ignored
> I both tried fixed version(qemu-kvm-0.12.1.2-2.151.el6) and unfixed
> version(qemu-kvm-0.12.1.2-2.149.el6),both failed.seems immediate error still be
> ignored.any mistake,please fix me.
> 1.wrote blkdebug configuration file
> cat > blkdebug.cfg <<EOF
> [inject-error]
> event = ""
> errno = "5"
> immediately = "off"
> EOF

A rule with an empty event name is never triggered. You may use event = "aio_read".

Also, please note that this is about failed reads from the backing file. So you need a backing file and an overlay, like this:

qemu-img create -f qcow2 base.qcow2 2G
qemu-img create -f qcow2 -b blkdebug:blkdebug.cfg:base.qcow2 snap1.qcow2

And then try to read from snap1.qcow2 (without having written to it before, so that it tries to access the backing file).
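
The two corrections above can be combined into one sketch of the backing-file test. This is only a minimal sketch under stated assumptions, not a verified reproducer. write_blkdebug_cfg is a hypothetical helper. The event name is copied from this comment; as far as I know, upstream blkdebug spells the asynchronous read event "read_aio", so check which spelling your build accepts. "immediately" is left at "off" as in comment #6, although "on" may be what actually exercises the immediate-failure path. The qemu commands are skipped when the tools are not installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of QEMU): write a one-rule blkdebug config.
# $1 = output path, $2 = blkdebug event name, $3 = "on" or "off"
write_blkdebug_cfg() {
    cat > "$1" <<EOF
[inject-error]
event = "$2"
errno = "5"
immediately = "$3"
EOF
}

# Event name as given in the comment; adjust if your build rejects it.
write_blkdebug_cfg blkdebug.cfg aio_read off

if command -v qemu-img >/dev/null 2>&1 && command -v qemu-io >/dev/null 2>&1; then
    # Backing file, then an overlay that reaches it through blkdebug.
    qemu-img create -f qcow2 base.qcow2 2G
    qemu-img create -f qcow2 -b blkdebug:blkdebug.cfg:base.qcow2 snap1.qcow2
    # Nothing has been written to snap1.qcow2, so this read must go to
    # the backing file, where the injected error should be seen.
    qemu-io -c 'read 0 1M' snap1.qcow2 || true
fi
```

With a fixed build the read should report an I/O error rather than succeed silently, since the bug was that such errors were ignored.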

> > - I/O errors in reading compressed clusters were ignored
> Can't reproduce this issue.would you please provide effectively methods?
> [...] 
> Results:
> vm still running.

Expected result with rerror=stop is that the VM stops, so in fact you have reproduced the bug.

> > - COW of L2 tables with internal snapshots used an unsafe order so that I/O
> > errors or crashes in the middle could cause image corruption
> I still can not find reproduce methods,would you please provide effectively
> methods?

Use blkdebug to let it fail for the event "l2_alloc.cow_read". Create an internal snapshot and try to write to it. After the I/O has failed, use qemu-img check.
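
The steps for the L2 table COW case might look roughly like the following. Again this is a sketch under assumptions rather than a tested reproducer: the file names and the snapshot name snap0 are made up for illustration, and the initial write is my assumption about how to make sure an L2 table exists before the snapshot is taken, so that the later write has something to copy.

```shell
#!/bin/sh
# blkdebug rule that fails the read done while copying an L2 table.
cat > blkdebug-l2.cfg <<EOF
[inject-error]
event = "l2_alloc.cow_read"
errno = "5"
EOF

if command -v qemu-img >/dev/null 2>&1 && command -v qemu-io >/dev/null 2>&1; then
    qemu-img create -f qcow2 l2test.qcow2 2G
    # Allocate a cluster first so that an L2 table exists (assumption).
    qemu-io -c 'write 0 64k' l2test.qcow2
    # The internal snapshot shares that L2 table with the active image.
    qemu-img snapshot -c snap0 l2test.qcow2
    # This write goes through blkdebug and should hit the injected error.
    qemu-io -c 'write 0 64k' blkdebug:blkdebug-l2.cfg:l2test.qcow2 || true
    # After the failed I/O, check the image for corruption.
    qemu-img check l2test.qcow2 || echo "qemu-img check reported problems"
fi
```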
Comment 8 juzhang 2011-04-19 04:15:05 EDT
After discussing with kwolf, we think functional testing can cover this issue.

We ran two functional test runs and found no new bugs and no regressions.

https://tcms.engineering.redhat.com/run/19552/
https://tcms.engineering.redhat.com/run/19551/

I also checked that these patches have been applied in qemu-kvm-0.12.1.2-2.158.el6.x86_64.

# rpm -qa --changelog qemu-kvm | grep 688146
- kvm-qcow2-Fix-error-handling-for-immediate-backing-file-.patch [bz#688146]
- kvm-qcow2-Fix-error-handling-for-reading-compressed-clus.patch [bz#688146]
- kvm-qcow2-Fix-order-in-L2-table-COW.patch [bz#688146]
Comment 9 juzhang 2011-04-19 04:15:46 EDT
According to comment #8, setting this issue as verified.
Comment 10 Eduardo Habkost 2011-05-03 15:14:27 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: bugs in the handling of errors in the qcow2 code.

Consquence: some error cases were being ignored, and could cause image corruption.

Fix: backport of error handling fixes on qcow2 code.

Result: safer error handling and qcow2 image corruption avoided on error cases.
Comment 11 Eduardo Habkost 2011-05-03 16:56:23 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,6 +1,6 @@
 Cause: bugs in the handling of errors in the qcow2 code.
 
-Consquence: some error cases were being ignored, and could cause image corruption.
+Consequence: some error cases were being ignored, and could cause image corruption.
 
 Fix: backport of error handling fixes on qcow2 code.
Comment 12 errata-xmlrpc 2011-05-19 07:21:17 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
Comment 13 errata-xmlrpc 2011-05-19 09:02:19 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
