Bug 1438850
| Summary: | LiveMerge fails with libvirtError: Block copy still active. Disk not ready for pivot |
|---|---|
| Product: | [oVirt] vdsm |
| Reporter: | Ala Hino <ahino> |
| Component: | General |
| Assignee: | Ala Hino <ahino> |
| Status: | CLOSED CURRENTRELEASE |
| QA Contact: | Elad <ebenahar> |
| Severity: | high |
| Docs Contact: | |
| Priority: | unspecified |
| Version: | 4.17.28 |
| CC: | ahino, alitke, amureini, bugs, creatmbox, eedri, jspanko, kgoldbla, mkalinin, mst, nsoffer, rabraham, stirabos, tnisan, ylavi |
| Target Milestone: | ovirt-4.1.3 |
| Keywords: | ZStream |
| Target Release: | 4.19.16 |
| Flags: | rule-engine: ovirt-4.1+ |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Story Points: | --- |
| Clone Of: | 1376580 |
| Environment: | |
| Last Closed: | 2017-07-06 13:31:50 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | Storage |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Bug Depends On: | 1376580 |
| Bug Blocks: | 1427184, 1447437 |
| Attachments: | |

Doc Text:
Cause:
Completion of a live merge operation was tested incorrectly: the check used the live merge progress value available through the libvirt API, which does not indicate the status of the live merge operation.
Consequence:
Live merge was detected as completed before the operation had actually completed. Attempts to finalize the merge operation failed repeatedly until the operation had actually completed, logging multiple errors during the process.
Fix:
Detect live merge completion using the libvirt XML.
Result:
The live merge operation completes successfully without logging errors.
Created attachment 1268684: io during failing live merge
Created attachment 1268685: VDSM Log
Created attachment 1268690: vdsm log 2nd error
Created attachment 1268700: Task State
Fix of this BZ will be based on handling libvirt events when pivot is ready.

*** Bug 1441941 has been marked as a duplicate of this bug. ***

The attached patch (http://gerrit.ovirt.org/75954) does not fix the issue of slow merge. I don't think we can fix the case where the VM is doing so much I/O that the merge never converges. Maybe this issue should be handled in qemu.

The patch does fix the issue of detecting when a block job is ready. Previously we thought that the only way to detect this was using libvirt events, but with this patch, using libvirt events is an optimization that we should consider for a future version; for 4.1 we can use XML detection.

With this patch we can resolve this bug in 4.1.3.

Removed patches copied when the patch was cloned; they are not relevant to this bug.

(In reply to Nir Soffer from comment #8)
> The attached patch (http://gerrit.ovirt.org/75954) does not fix the issue of
> slow merge. I don't think we can fix the case where the VM is doing so much
> I/O that the merge never converges. Maybe this issue should be handled in qemu.
>
> The patch does fix the issue of detecting when a block job is ready.
> Previously we thought that the only way to detect this was using libvirt
> events, but with this patch, using libvirt events is an optimization that we
> should consider for a future version; for 4.1 we can use XML detection.
>
> With this patch we can resolve this bug in 4.1.3.

Ack

Nir, Ala, are the reproduction steps similar to https://bugzilla.redhat.com/show_bug.cgi?id=1376580#c30 ?

The bugs are related to pivot behavior; however, they are different. This one fixes the logic used to determine when a disk is ready for pivot. The other one gracefully handles the case where we try to pivot while the disk isn't ready for pivot. Actually, with this fix, we shouldn't encounter the previous bug. Reproducing the "disk is not ready for pivot" error isn't trivial and, somehow, we were never able to reproduce it.
I believe that if you simply run live merge, the merge will complete successfully.

Thanks Ala. We're constantly executing the live merge test plan [1]. The latest execution with [2] ended with 100% success.

[1] https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/wiki/Storage/3_5_Storage_Live_Merge
[2] vdsm-4.19.17-1.el7ev.x86_64
libvirt-daemon-2.0.0-10.el7_3.9.x86_64
rhevm-4.1.3.1-0.1.el7.noarch
Created attachment 1268683: Engine Log
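
For readers following the "XML detection" discussed in the comments, here is a minimal sketch of what such a check could look like. It is not the code from the gerrit patch. It assumes the libvirt domain XML format, in which a disk with an active block job carries a `<mirror>` element whose `ready` attribute is `yes` once the job can be pivoted. The function name `is_ready_for_pivot`, the `SAMPLE_XML` document, and the file paths are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML fragment for illustration only.
# vda: active commit job that libvirt reports as ready for pivot.
# vdb: active commit job still in progress (no ready attribute).
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/path/to/top.qcow2'/>
      <mirror type='file' job='active-commit' ready='yes'>
        <source file='/path/to/base.qcow2'/>
      </mirror>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/path/to/other-top.qcow2'/>
      <mirror type='file' job='active-commit'>
        <source file='/path/to/other-base.qcow2'/>
      </mirror>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def is_ready_for_pivot(domain_xml, drive_name):
    """Return True if the disk named drive_name has a <mirror> element
    marked ready='yes' in the given domain XML, False otherwise."""
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != drive_name:
            continue
        mirror = disk.find("mirror")
        # No <mirror> element means no block job is mirroring this disk;
        # a <mirror> without ready='yes' means the job is not ready yet.
        return mirror is not None and mirror.get("ready") == "yes"
    # Disk not found in the XML at all.
    return False


if __name__ == "__main__":
    for dev in ("vda", "vdb", "vdc"):
        print(dev, is_ready_for_pivot(SAMPLE_XML, dev))
```

With libvirt-python, the XML string would presumably come from the running domain (for example via `virDomain.XMLDesc()`), and the pivot would be attempted only after this check returns True, instead of relying on the progress value of the block job.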