Description of problem:
A guest VM with a thin provision allocation policy got paused. Because of that there was no way to recover the QCOW2 disk, and the data is now corrupt.

Version-Release number of selected component (if applicable):
VDSM: vdsm-4.10.2-27.0.el6ev
RHEV Hypervisor: 6.4 - 20131016.0.el6

How reproducible:
Undetermined

Steps to Reproduce:
1. Create a guest VM with a thin provisioning disk
2. Wait for the VM to flip its state to paused with the message: "VM has been paused due to a storage IO error"

Actual results:
The VM got paused

Expected results:
The VM shouldn't be paused.

Additional info:
Modifications to the vdsm.conf file were applied to work around the issue, without positive results (KCS: https://access.redhat.com/site/solutions/385283)
VMs automatically move to paused when they receive EIO in RHEV; this has nothing to do with thin provisioning. Resuming the VM would make it retry the same I/O, and if the original storage problem (that caused the EIO) persists, it will pause again.

1. What makes you determine that there is qcow2 corruption here?
2. Please attach vdsm and libvirt logs.

Thanks.
Nir, please take a look.
Please attach engine, vdsm, libvirt and qemu logs:

engine: /path/to/ovirt-engine/var/log/engine.log
vdsm: /var/log/vdsm/vdsm.log
libvirt: /var/log/libvirtd.log
qemu: /var/log/libvirt/qemu/vmname.log
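A minimal sketch of collecting the hypervisor-side logs into one archive, assuming the default paths listed above (the engine log lives on the engine host and its `/path/to/...` prefix is site-specific, so it is not included here; missing files are simply skipped):

```shell
# Collect vdsm, libvirt, and per-VM qemu logs from the hypervisor.
# Paths assume the RHEV 3.x / EL6 defaults quoted in this comment.
OUT=/tmp/bz-logs
mkdir -p "$OUT"
for f in /var/log/vdsm/vdsm.log /var/log/libvirtd.log /var/log/libvirt/qemu/*.log; do
    if [ -f "$f" ]; then
        cp "$f" "$OUT/"
    fi
done
# Bundle everything into a single archive to attach to the bug.
tar czf /tmp/bz-logs.tar.gz -C /tmp bz-logs
echo "Created /tmp/bz-logs.tar.gz"
```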
Please also attach the output of "qemu-img info /path/to/disk" for all disks on this VM.
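One way to gather that output for every image at once, assuming the disks sit under the default /rhev/data-center layout on the hypervisor (the glob pattern is an assumption and may need adjusting for your storage domain):

```shell
# Run "qemu-img info" over every image file found under the assumed
# /rhev/data-center layout and collect the results in one file.
OUT=/tmp/qemu-img-info.txt
: > "$OUT"
if command -v qemu-img >/dev/null 2>&1; then
    for disk in /rhev/data-center/*/*/images/*/*; do
        [ -f "$disk" ] || continue
        { echo "=== $disk ==="; qemu-img info "$disk"; } >> "$OUT"
    done
else
    # Not on the hypervisor: qemu-img is unavailable, note it and move on.
    echo "qemu-img not found; run this on the hypervisor" >> "$OUT"
fi
```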
What is the oldest qemu-kvm version that was used with this specific image? Trying to make sure that it's not a duplicate of bug 974617 (copied as bug 996151 for 6.4.z), which was fixed in qemu-kvm-0.12.1.2-2.355.el6_4.7.
This is for Juan to answer. I am not technically driving this bug.
We cannot make progress without further info. Please reopen once it is available.