Bug 586747 - QEMU should report error other than EIO when facing a corrupted QCOW2
Summary: QEMU should report error other than EIO when facing a corrupted QCOW2
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5.z
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: Rhel5KvmTier3
Reported: 2010-04-28 10:04 UTC by Yaniv Kaul
Modified: 2013-07-04 01:50 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-28 14:47:19 UTC
Target Upstream Version:
Embargoed:



Description Yaniv Kaul 2010-04-28 10:04:43 UTC
Description of problem:
Regretfully, qcow2 images sometimes get corrupted. In some cases this is propagated back up as EIO. Example from block-qcow2.c, qcow_aio_read_cb():
        if ((acb->cluster_offset & 511) != 0) {
            ret = -EIO;
            goto fail;
        }

This causes management (VDSM) to PAUSE the VM (as we do in EIO cases). However, this is not going to help:
1. The VM cannot be successfully resumed with 'cont'.
2. The user would not understand the issue.

Please change from EIO to something else. We'll have to map it to something related, as we don't have some E-QCOW2-CORRUPTED... 
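
To make the ambiguity concrete, here is a minimal sketch, not QEMU code. The alignment test from the excerpt above is wrapped in a hypothetical helper, next to a hypothetical stand-in for a read that fails on the host side. Both hand the caller the same -EIO, so a management layer that only sees the error code cannot tell a corrupted image from an ordinary I/O failure. The names check_cluster_offset and simulated_device_read are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Sector size implied by the 511 mask in the quoted check. */
#define SECTOR_SIZE 512u

/*
 * Hypothetical helper mirroring the quoted check. A misaligned cluster
 * offset means the image metadata is inconsistent, yet the caller gets
 * only -EIO.
 */
static int check_cluster_offset(uint64_t cluster_offset)
{
    if ((cluster_offset & (SECTOR_SIZE - 1)) != 0) {
        return -EIO;
    }
    return 0;
}

/*
 * Hypothetical stand-in for a read that fails on the host side, for
 * example because the underlying storage is unreachable.
 */
static int simulated_device_read(int device_ok)
{
    return device_ok ? 0 : -EIO;
}
```

Comparing the two return values for a misaligned offset and a failed device shows they are identical, which is why pausing the VM on EIO does not help in the corruption case.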

Version-Release number of selected component (if applicable):
kvm-83-164.el5_5.6


Comment 1 Kevin Wolf 2010-05-05 09:24:44 UTC
I don't think changing the error code to something different (either choosing something completely unrelated or inventing new codes) is the right way to do this. Moreover, it looks like you won't get error codes at all in upstream/RHEL6, so this would be a solution for RHEL5 only.

We should probably expose this in a different way. In upstream/RHEL6 this would probably be a QMP event and possibly some query-stop-reason command if it's going to be introduced. In RHEL5 we would only use the stop reason to indicate this.

Would this work for VDSM?

Comment 2 Luiz Capitulino 2010-05-06 18:40:20 UTC
We already have the I/O error event in upstream/rhel6, but besides not providing the I/O error reason (which is a relevant limitation), what Yaniv is asking for seems to be more complex: he wants us to be able to say what an error is in a high-level way, e.g. 'this qcow2 image got corrupted'.

Can the block layer provide such information?

Comment 3 Kevin Wolf 2010-05-07 08:47:10 UTC
Currently it can't, but we surely can add this. I was thinking of some bdrv_image_corrupted() call that block drivers can use and that sets some status in the BlockDriverState. The monitor could then look at that status and if it's set to an error condition, it could append that information to the I/O error event.

Basically, once we have decided what information to provide and in what format, the actual implementation should be easy enough.
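
A minimal sketch of that idea, under stated assumptions: the BlockDriverState below is a simplified stand-in with only two fields, the signature of bdrv_image_corrupted() is a guess since none was specified, and format_io_error_event() with its event text is purely hypothetical and does not follow any real QMP schema. It only shows the flow described above: the driver sets a status, and the monitor appends it to the I/O error event.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified stand-in for QEMU's BlockDriverState, holding only the
 * fields this sketch needs. image_corrupted is the assumed new status.
 */
typedef struct BlockDriverState {
    const char *device_name;
    bool image_corrupted;
} BlockDriverState;

/*
 * Proposed hook: a block driver calls this when it detects inconsistent
 * image metadata. The signature is an assumption.
 */
static void bdrv_image_corrupted(BlockDriverState *bs)
{
    bs->image_corrupted = true;
}

/*
 * Hypothetical monitor-side helper: formats an I/O error event and
 * appends the corruption status when it is set. Returns the value
 * reported by snprintf.
 */
static int format_io_error_event(const BlockDriverState *bs,
                                 char *buf, size_t len)
{
    return snprintf(buf, len, "io-error device=%s%s",
                    bs->device_name,
                    bs->image_corrupted ? " reason=image-corrupted" : "");
}
```

In this sketch a driver would call bdrv_image_corrupted(bs) at the point where it currently just returns -EIO, so the error code stays unchanged and the extra detail travels through the event instead.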

Comment 4 Luiz Capitulino 2010-05-07 13:18:21 UTC
Seems excellent to me.

Comment 7 RHEL Program Management 2011-01-11 20:27:52 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 8 RHEL Program Management 2011-01-11 22:51:10 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 10 Kevin Wolf 2011-07-28 14:47:19 UTC
It's in the long-term upstream wishlist, but certainly isn't going to appear in RHEL 5.

