Created attachment 1268683 [details] Engine Log
Created attachment 1268684 [details] io during failing live merge
Created attachment 1268685 [details] VDSM Log
Created attachment 1268690 [details] vdsm log 2nd error
Created attachment 1268700 [details] Task State
The fix for this BZ will be based on handling libvirt events when the pivot is ready.
*** Bug 1441941 has been marked as a duplicate of this bug. ***
The attached patch (http://gerrit.ovirt.org/75954) does not fix the slow merge issue. I don't think we can fix the case where the VM is doing so much I/O that the merge never converges. Maybe this issue should be handled in qemu.

The patch does fix the issue of detecting when a block job is ready. Previously we thought that the only way to detect this was using libvirt events, but with this patch, using libvirt events becomes an optimization that we should consider for a future version; for 4.1 we can use XML detection.

With this patch we can resolve this bug in 4.1.3.
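For readers unfamiliar with the XML detection mentioned here, the following is a minimal sketch of the general idea, not the code from the patch. It assumes that libvirt marks a disk as ready to pivot by setting ready="yes" on the disk's <mirror> element in the live domain XML. The function name is_block_job_ready, the sample XML, and the file paths are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down domain XML. On a real host this string would
# come from the libvirt domain object, e.g. dom.XMLDesc(0).
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/path/to/top-volume'/>
      <mirror type='file' job='active-commit' ready='yes'>
        <source file='/path/to/base-volume'/>
      </mirror>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/path/to/other-top-volume'/>
      <mirror type='file' job='active-commit'>
        <source file='/path/to/other-base-volume'/>
      </mirror>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def is_block_job_ready(domain_xml, target_dev):
    """Return True if the disk named target_dev has a <mirror> element
    with ready="yes", i.e. the block job can be pivoted.

    Assumption: the ready attribute is absent (or not "yes") while the
    job is still copying data.
    """
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != target_dev:
            continue
        mirror = disk.find("mirror")
        return mirror is not None and mirror.get("ready") == "yes"
    # No disk with this target name was found.
    return False


if __name__ == "__main__":
    for dev in ("vda", "vdb"):
        print(dev, is_block_job_ready(SAMPLE_XML, dev))
```

A caller would poll this check before requesting the pivot. As noted above, this does not help when the VM writes faster than the merge can copy, because the job never becomes ready in that case.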
Removed patches that were copied when the patch was cloned; they are not relevant to this bug.
(In reply to Nir Soffer from comment #8)
> The attached patch (http://gerrit.ovirt.org/75954) does not fix the issue of
> slow merge, I don't think we can fix the case when the vm is doing lot of io
> so the merge never converge. Maybe this issue should be handled in qemu.
>
> The patch does fix the issue of detecting when a block job is ready.
> Previously we thought that the only way to detect this is using libvirt
> events, but with this patch using libvirt events is an optimization that we
> should consider for future version, but for 4.1 we can use xml detection.
>
> With this patch we can resolve this bug in 4.1.3.

Ack
Nir, Ala, are the reproduction steps similar to https://bugzilla.redhat.com/show_bug.cgi?id=1376580#c30 ?
The bugs are related to pivot behavior; however, they are different. This one fixes the logic used to determine when a disk is ready for pivot. The other one gracefully handles the case where we try to pivot while the disk isn't ready for pivot. Actually, with this fix, we shouldn't encounter the previous bug. Reproducing the "disk is not ready for pivot" condition isn't trivial and, somehow, we were never able to reproduce it. I believe that if you simply run live merge, the merge will complete successfully.
Thanks Ala.

We're constantly executing the live merge test plan [1]. The latest execution with [2] ended with 100% success.

[1] https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/wiki/Storage/3_5_Storage_Live_Merge
[2] vdsm-4.19.17-1.el7ev.x86_64
    libvirt-daemon-2.0.0-10.el7_3.9.x86_64
    rhevm-4.1.3.1-0.1.el7.noarch