Bug 586747

Summary: QEMU should report error other than EIO when facing a corrupted QCOW2
Product: Red Hat Enterprise Linux 5 Reporter: Yaniv Kaul <ykaul>
Component: kvm Assignee: Kevin Wolf <kwolf>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 5.5.z CC: jkt, lcapitulino, virt-maint, ykaul
Target Milestone: rc Keywords: Improvement
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-07-28 14:47:19 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 580946    

Description Yaniv Kaul 2010-04-28 10:04:43 UTC
Description of problem:
Regretfully, qcow2 images sometimes get corrupted. In some cases this is propagated back up as EIO. Example from block-qcow2.c, qcow_aio_read_cb():
    if ((acb->cluster_offset & 511) != 0) {
        ret = -EIO;
        goto fail;
    }

This causes management (VDSM) to PAUSE the VM (as we do in EIO cases). However, this is not going to help:
1. The VM cannot be successfully resumed with 'cont'.
2. The user would not understand the issue.

Please change from EIO to something else. We'll have to map it to something related, as we don't have some E-QCOW2-CORRUPTED... 
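To illustrate the ambiguity described above, here is a minimal sketch, not the actual block-qcow2.c code. The function names check_cluster_offset() and failed_host_read() are hypothetical stand-ins; only the 512-byte alignment test is taken from the snippet quoted in this report.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the check quoted above: the cluster offset
 * is expected to be a multiple of 512 bytes, so any of the low nine
 * bits being set means the image metadata is inconsistent. */
int check_cluster_offset(uint64_t cluster_offset)
{
    if ((cluster_offset & 511) != 0) {
        return -EIO;    /* corruption, reported as a plain I/O error */
    }
    return 0;
}

/* Hypothetical stand-in for a genuine read failure on the host. */
int failed_host_read(void)
{
    return -EIO;
}
```

With these stand-ins, check_cluster_offset(0x10001) and failed_host_read() return the same value, so a caller that only sees the return code has no way to tell a corrupted image from a host I/O error.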

Version-Release number of selected component (if applicable):
kvm-83-164.el5_5.6

Comment 1 Kevin Wolf 2010-05-05 09:24:44 UTC
I don't think changing the error code to something different (either choosing something completely unrelated or inventing new codes) is the right way to do this. Moreover, it looks like you won't get error codes at all in upstream/RHEL6, so this would be a solution for RHEL5 only.

We should probably expose this in a different way. In upstream/RHEL6 this would probably be a QMP event and possibly some query-stop-reason command if it's going to be introduced. In RHEL5 we would only use the stop reason to indicate this.

Would this work for VDSM?

Comment 2 Luiz Capitulino 2010-05-06 18:40:20 UTC
We already have the I/O error event in upstream/RHEL6. However, besides not providing the I/O error reason (which is a relevant limitation), what Yaniv is asking for seems to be more complex: he wants us to be able to say what an error is in a high-level way, e.g. 'this qcow2 image got corrupted'.

Can the block layer provide such information?

Comment 3 Kevin Wolf 2010-05-07 08:47:10 UTC
Currently it can't, but we surely can add this. I was thinking of some bdrv_image_corrupted() call that block drivers can use and that sets some status in the BlockDriverState. The monitor could then look at that status and if it's set to an error condition, it could append that information to the I/O error event.

Basically, once we have decided what information to provide and in what format, the actual implementation should be easy enough.
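A rough sketch of the idea from this comment, under stated assumptions: SketchBlockState is a heavily simplified, hypothetical stand-in for BlockDriverState, bdrv_image_corrupted() is only the call proposed here and not an existing function, and the event text produced by format_io_error_event() is an invented format, not the real QMP event.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status field; not part of any real QEMU structure. */
typedef enum {
    IMAGE_STATUS_OK = 0,
    IMAGE_STATUS_CORRUPTED
} ImageStatus;

/* Hypothetical, simplified stand-in for BlockDriverState. */
typedef struct {
    const char *device_name;
    ImageStatus image_status;
} SketchBlockState;

/* Called by a block driver when it finds inconsistent image metadata. */
void bdrv_image_corrupted(SketchBlockState *bs)
{
    bs->image_status = IMAGE_STATUS_CORRUPTED;
}

/* Monitor side: build the I/O error event text and append a reason
 * only if the driver recorded one. Returns what snprintf() returns,
 * i.e. the length the full text needs. */
int format_io_error_event(const SketchBlockState *bs, char *buf, size_t len)
{
    if (bs->image_status == IMAGE_STATUS_CORRUPTED) {
        return snprintf(buf, len,
                        "{\"event\": \"io-error\", \"device\": \"%s\", "
                        "\"reason\": \"image-corrupted\"}",
                        bs->device_name);
    }
    return snprintf(buf, len,
                    "{\"event\": \"io-error\", \"device\": \"%s\"}",
                    bs->device_name);
}
```

The driver would still return -EIO as it does today; the extra status travels separately, so existing callers are unaffected and only the monitor needs to learn about the new field.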

Comment 4 Luiz Capitulino 2010-05-07 13:18:21 UTC
Seems excellent to me.

Comment 7 RHEL Program Management 2011-01-11 20:27:52 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 8 RHEL Program Management 2011-01-11 22:51:10 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 10 Kevin Wolf 2011-07-28 14:47:19 UTC
It's in the long-term upstream wishlist, but certainly isn't going to appear in RHEL 5.