Bug 688146 - qcow2: Some paths fail to handle I/O errors
Summary: qcow2: Some paths fail to handle I/O errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-16 13:18 UTC by Kevin Wolf
Modified: 2013-01-09 23:39 UTC (History)

Fixed In Version: qemu-kvm-0.12.1.2-2.151.el6
Doc Type: Bug Fix
Doc Text:
Cause: bugs in the handling of errors in the qcow2 code. Consequence: some error cases were being ignored, and could cause image corruption. Fix: backport of error handling fixes on qcow2 code. Result: safer error handling and qcow2 image corruption avoided on error cases.
Clone Of:
Environment:
Last Closed: 2011-05-19 11:21:17 UTC
Target Upstream Version:
Embargoed:

Links
System: Red Hat Product Errata
ID: RHSA-2011:0534
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm security, bug fix, and enhancement update
Last Updated: 2011-05-19 11:20:36 UTC

Description Kevin Wolf 2011-03-16 13:18:29 UTC
Upstream has some fixes for error handling that need to be backported:

- Immediate I/O errors when reading from the backing file were ignored
- I/O errors in reading compressed clusters were ignored
- COW of L2 tables with internal snapshots used an unsafe order so that I/O errors or crashes in the middle could cause image corruption

Comment 6 juzhang 2011-03-21 05:43:40 UTC
(In reply to comment #0)
> Upstream has some fixes for error handling that need to be backported:
> 
> - Immediate I/O errors when reading from the backing file were ignored
I tried both the fixed version (qemu-kvm-0.12.1.2-2.151.el6) and the unfixed version (qemu-kvm-0.12.1.2-2.149.el6), and both failed: the immediate error still seems to be ignored. If I made a mistake, please correct me.
1. Write a blkdebug configuration file:
cat > blkdebug.cfg <<EOF
[inject-error]
event = ""
errno = "5"
immediately = "off"
EOF
2. Create a qcow2 image:
qemu-img create -f qcow2 test.qcow2 2G
3. Read from the image through blkdebug:
qemu-io blkdebug:blkdebug.cfg:test.qcow2
qemu-io> read 0 1G
read 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0.0000 sec (6.575 GiB/sec and 6.5751 ops/sec)

> - I/O errors in reading compressed clusters were ignored
I can't reproduce this issue. Could you please provide an effective method?
1. Create a qcow2 image:
qemu-img create -f qcow2 zhang.qcow2 6G

2. Convert it to a compressed image:
qemu-img convert -f qcow2 zhang.qcow2 -O qcow2 -c zhangconvert1.qcow2

3. Put the compressed image on an NFS mount.

4. Boot the guest with the compressed image as a second disk and rerror=stop:
-drive file=/root/nfs/zhangconvert1.qcow2,if=none,id=test1,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test1

5. In the guest, keep reading from the compressed image:
while true;do dd if=/dev/vdb of=/dev/null;done
6. Disconnect the NFS server.

Results:
The VM is still running.

> - COW of L2 tables with internal snapshots used an unsafe order so that I/O
> errors or crashes in the middle could cause image corruption
I still can't find a way to reproduce this. Could you please provide an effective method?

Comment 7 Kevin Wolf 2011-04-15 09:26:05 UTC
(In reply to comment #6)
> (In reply to comment #0)
> > Upstream has some fixes for error handling that need to be backported:
> > 
> > - Immediate I/O errors when reading from the backing file were ignored
> I tried both the fixed version (qemu-kvm-0.12.1.2-2.151.el6) and the unfixed
> version (qemu-kvm-0.12.1.2-2.149.el6), and both failed: the immediate error
> still seems to be ignored. If I made a mistake, please correct me.
> 1. Write a blkdebug configuration file:
> cat > blkdebug.cfg <<EOF
> [inject-error]
> event = ""
> errno = "5"
> immediately = "off"
> EOF

A rule with an empty event name is never triggered. You may use event = "aio_read".

Also, please note that this is about failed reads from the backing file. So you need a backing file and an overlay, like this:

qemu-img create -f qcow2 base.qcow2 2G
qemu-img create -f qcow2 -b blkdebug:blkdebug.cfg:base.qcow2 snap1.qcow2

And then try to read from snap1.qcow2 (without having written to it before, so that it tries to access the backing file).
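
The setup above can be put together as a small script. This is a minimal sketch, not a verified test case: it assumes qemu-img and qemu-io are in PATH, the helper function names are made up for this example, and the event name is simply the one suggested above. The thread does not say which value of "immediately" exercises the bug, so it is left as a parameter that defaults to the "off" used in comment 6.

```shell
#!/bin/sh
# Sketch only. Assumes qemu-img and qemu-io are in PATH. The helper names
# are hypothetical; file names and the event name come from the comments above.

# Write a blkdebug rule that injects EIO (errno 5) on one event.
# $1 = config path, $2 = event name, $3 = value for "immediately" (default "off")
write_blkdebug_cfg() {
    cat > "$1" <<EOF
[inject-error]
event = "$2"
errno = "5"
immediately = "${3:-off}"
EOF
}

reproduce_backing_read_error() {
    write_blkdebug_cfg blkdebug.cfg aio_read
    qemu-img create -f qcow2 base.qcow2 2G
    # The overlay's backing file is opened through blkdebug, so the injected
    # error hits reads that fall through to the backing file.
    qemu-img create -f qcow2 -b blkdebug:blkdebug.cfg:base.qcow2 snap1.qcow2
    # snap1.qcow2 has never been written, so this read goes to the backing file.
    qemu-io -c "read 0 64k" snap1.qcow2
}
```

Calling reproduce_backing_read_error in an empty directory runs the whole sequence. With the fix applied, the read would be expected to report an I/O error instead of success.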

> > - I/O errors in reading compressed clusters were ignored
> I can't reproduce this issue. Could you please provide an effective method?
> [...] 
> Results:
> vm still running.

Expected result with rerror=stop is that the VM stops, so in fact you have reproduced the bug.

> > - COW of L2 tables with internal snapshots used an unsafe order so that I/O
> > errors or crashes in the middle could cause image corruption
> I still can't find a way to reproduce this. Could you please provide an
> effective method?

Use blkdebug to let it fail for the event "l2_alloc.cow_read". Create an internal snapshot and try to write to it. After the I/O has failed, use qemu-img check.
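
As a rough sketch of those steps, assuming qemu-img and qemu-io are in PATH: the function name, the image, snapshot and config file names, and the DRY_RUN switch are all made up for this example. The first write is there only so that an L2 table exists before the snapshot is taken.

```shell
#!/bin/sh
# Sketch only. Assumes qemu-img and qemu-io are in PATH; all names below
# other than the event "l2_alloc.cow_read" are hypothetical.

# Print commands instead of running them when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# $1 = image path (default test.qcow2), $2 = blkdebug config path
test_l2_cow_error() {
    img=${1:-test.qcow2}
    cfg=${2:-blkdebug-l2.cfg}

    cat > "$cfg" <<EOF
[inject-error]
event = "l2_alloc.cow_read"
errno = "5"
immediately = "off"
EOF

    run qemu-img create -f qcow2 "$img" 2G
    # Allocate a cluster first so that an L2 table exists before the snapshot.
    run qemu-io -c "write 0 64k" "$img"
    run qemu-img snapshot -c snap0 "$img"
    # This write has to copy the L2 table now shared with snap0;
    # blkdebug makes the read part of that copy fail.
    run qemu-io -c "write 0 64k" "blkdebug:$cfg:$img"
    # Check image consistency after the failed write.
    run qemu-img check "$img"
}
```

Running it with DRY_RUN=1 only prints the commands, which is handy for reviewing the sequence before touching any image.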

Comment 8 juzhang 2011-04-19 08:15:05 UTC
After discussing with kwolf, we think functional testing can cover this issue.

We ran two functional test runs and didn't find any new bugs or regressions.

https://tcms.engineering.redhat.com/run/19552/
https://tcms.engineering.redhat.com/run/19551/




I also checked that these patches have been applied in qemu-kvm-0.12.1.2-2.158.el6.x86_64.

# rpm -qa --changelog qemu-kvm | grep 688146
- kvm-qcow2-Fix-error-handling-for-immediate-backing-file-.patch [bz#688146]
- kvm-qcow2-Fix-error-handling-for-reading-compressed-clus.patch [bz#688146]
- kvm-qcow2-Fix-order-in-L2-table-COW.patch [bz#688146]
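
The same check could be scripted so that a missing patch is reported explicitly. This is only a sketch: the function name is made up, and it assumes the changelog text is piped in on stdin with the patch file names exactly as listed above.

```shell
#!/bin/sh
# Sketch only: check that all three bz#688146 patches appear in a changelog.
# Reads changelog text on stdin, prints each missing patch, and returns
# non-zero if any is missing. The function name is hypothetical.
check_bz688146_patches() {
    changelog=$(cat)
    missing=0
    for patch in \
        kvm-qcow2-Fix-error-handling-for-immediate-backing-file-.patch \
        kvm-qcow2-Fix-error-handling-for-reading-compressed-clus.patch \
        kvm-qcow2-Fix-order-in-L2-table-COW.patch
    do
        # -F matches the name literally; -- guards against a leading dash.
        if ! printf '%s\n' "$changelog" | grep -qF -- "$patch"; then
            echo "missing: $patch"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: rpm -qa --changelog qemu-kvm | check_bz688146_patches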

Comment 9 juzhang 2011-04-19 08:15:46 UTC
According to comment 8, setting this issue as verified.

Comment 10 Eduardo Habkost 2011-05-03 19:14:27 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: bugs in the handling of errors in the qcow2 code.

Consquence: some error cases were being ignored, and could cause image corruption.

Fix: backport of error handling fixes on qcow2 code.

Result: safer error handling and qcow2 image corruption avoided on error cases.

Comment 11 Eduardo Habkost 2011-05-03 20:56:23 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,6 +1,6 @@
 Cause: bugs in the handling of errors in the qcow2 code.
 
-Consquence: some error cases were being ignored, and could cause image corruption.
+Consequence: some error cases were being ignored, and could cause image corruption.
 
 Fix: backport of error handling fixes on qcow2 code.

Comment 12 errata-xmlrpc 2011-05-19 11:21:17 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html

Comment 13 errata-xmlrpc 2011-05-19 13:02:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
