Description of problem: A guest VM with the thin provisioning allocation policy got paused. As a result, the QCOW2 disk could not be recovered and the data is now corrupt.
Version-Release number of selected component (if applicable):
RHEV Hypervisor: 6.4 - 20131016.0.el6
How reproducible: Undetermined
Steps to Reproduce:
1. Create a guest VM with a thin-provisioned disk
2. Wait for the VM to flip its state to paused with the message: "VM has been paused due to a storage IO error"
Actual results: The VM got paused
Expected results: The VM shouldn't be paused.
Additional info: Modifications to the vdsm.conf file were applied to work around the issue, without positive results (KCS: https://access.redhat.com/site/solutions/385283)
In RHEV, VMs automatically move to paused if they receive EIO; this has nothing to do with thin provisioning.
Resuming the VM would make it retry the same I/O. If the original storage problem (that caused the EIO) persists then it will pause again.
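To illustrate the pause-and-retry behaviour described above, here is a toy model in Python. It is only a sketch: the class `ToyVM`, its methods and the `storage` dictionary are hypothetical names made up for this example, not vdsm, libvirt or QEMU code.

```python
class ToyVM:
    """Toy model of a guest that pauses on an I/O error and retries on resume."""

    def __init__(self, storage_ok):
        # storage_ok: callable returning True while the backing storage works
        self.storage_ok = storage_ok
        self.state = "running"
        self.pending_io = None

    def submit_io(self, request):
        if self.storage_ok():
            self.pending_io = None
            return True
        # Storage reported an error (EIO): keep the request and pause the guest
        self.pending_io = request
        self.state = "paused"
        return False

    def resume(self):
        # Resuming retries the same request that caused the pause
        self.state = "running"
        if self.pending_io is not None:
            return self.submit_io(self.pending_io)
        return True


storage = {"healthy": False}
vm = ToyVM(lambda: storage["healthy"])

vm.submit_io("write block 42")
first_state = vm.state    # storage is broken, so the guest pauses

vm.resume()
second_state = vm.state   # storage is still broken, so it pauses again

storage["healthy"] = True
vm.resume()
final_state = vm.state    # the retry succeeds once storage is fixed
```

The point of the sketch is that resuming alone changes nothing; the guest only stays running once the underlying storage problem is gone.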
1. What makes you conclude that there is qcow2 corruption here?
2. Please attach vdsm and libvirt logs.
Nir, please take a look.
Please attach engine, vdsm, libvirt and qemu logs.
Please also attach the output of "qemu-img info /path/to/disk" for all disks on this VM.
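If it helps when collecting this for several disks, the human-readable "key: value" output of qemu-img info can be turned into a dictionary with a few lines of Python. This is a rough sketch: the function name `parse_qemu_img_info` is made up, and the sample text below is illustrative only, not output taken from this VM.

```python
def parse_qemu_img_info(text):
    """Parse the 'key: value' lines printed by 'qemu-img info' into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


# Illustrative sample, not real output from the affected image
SAMPLE = """image: disk.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 196K
cluster_size: 65536
"""

parsed = parse_qemu_img_info(SAMPLE)
print(parsed["file format"], parsed["virtual size"])
```

The raw command output is still what should be attached to the bug; the parsed form is only a convenience for comparing disks.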
What is the oldest qemu-kvm version that was used with this specific image?
I am trying to make sure that this is not a duplicate of bug 974617 (copied as bug 996151
for 6.4.z), which was fixed in qemu-kvm-0.12.1.2-2.355.el6_4.7.
This is for Juan to answer. I am not technically driving this bug.
We cannot make progress without further info. Please reopen once it is available.