Bug 1438850 - LiveMerge fails with libvirtError: Block copy still active. Disk not ready for pivot
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.17.28
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.3
Target Release: 4.19.16
Assignee: Ala Hino
QA Contact: Elad
URL:
Whiteboard:
Depends On: 1376580
Blocks: 1427184 1447437
 
Reported: 2017-04-04 14:37 UTC by Ala Hino
Modified: 2017-11-12 13:23 UTC (History)

Fixed In Version:
Clone Of: 1376580
Environment:
Last Closed: 2017-07-06 13:31:50 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+


Attachments
Engine Log (564.93 KB, text/plain)
2017-04-04 14:40 UTC, Ala Hino
io during failing live merge (50.18 KB, image/png)
2017-04-04 14:42 UTC, Ala Hino
VDSM Log (121.68 KB, text/plain)
2017-04-04 14:43 UTC, Ala Hino
vdsm log 2nd error (239.33 KB, application/zip)
2017-04-04 14:43 UTC, Ala Hino
Task State (17.47 KB, image/png)
2017-04-04 14:44 UTC, Ala Hino


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1438575 0 unspecified CLOSED Live merge fails after preview of in-chain snapshot 2021-02-22 00:41:40 UTC
oVirt gerrit 75954 0 master MERGED vm: Detect when a block job is ready 2017-05-09 10:11:25 UTC
oVirt gerrit 76645 0 ovirt-4.1 MERGED vm: Detect when a block job is ready 2017-05-23 08:44:05 UTC

Internal Links: 1438575

Comment 1 Ala Hino 2017-04-04 14:40:54 UTC
Created attachment 1268683 [details]
Engine Log

Comment 2 Ala Hino 2017-04-04 14:42:05 UTC
Created attachment 1268684 [details]
io during failing live merge

Comment 3 Ala Hino 2017-04-04 14:43:15 UTC
Created attachment 1268685 [details]
VDSM Log

Comment 4 Ala Hino 2017-04-04 14:43:52 UTC
Created attachment 1268690 [details]
vdsm log 2nd error

Comment 5 Ala Hino 2017-04-04 14:44:20 UTC
Created attachment 1268700 [details]
Task State

Comment 6 Ala Hino 2017-04-04 14:48:57 UTC
The fix for this BZ will be based on handling the libvirt events that signal when the disk is ready for pivot.

Comment 7 Ala Hino 2017-04-19 09:45:10 UTC
*** Bug 1441941 has been marked as a duplicate of this bug. ***

Comment 8 Nir Soffer 2017-05-09 10:10:44 UTC
The attached patch (http://gerrit.ovirt.org/75954) does not fix the issue of slow
merges. I don't think we can fix the case where the VM is doing so much I/O that
the merge never converges. Maybe this issue should be handled in qemu.

The patch does fix the issue of detecting when a block job is ready. Previously we
thought that the only way to detect this was using libvirt events, but with this
patch, using libvirt events becomes an optimization that we should consider for a
future version; for 4.1 we can use XML detection.

With this patch we can resolve this bug in 4.1.3.
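
For illustration only, here is a minimal sketch of what XML-based detection could look like. It is not the code from the patch. It assumes the readiness signal is the `ready='yes'` attribute that libvirt sets on a disk's `<mirror>` element in the live domain XML. The function name, sample XML, and device names are hypothetical; in a real setup the XML string would come from the domain object's `XMLDesc()` call in libvirt-python.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML, trimmed to the parts relevant to this sketch.
# vda has a mirror marked ready; vdb has a mirror that is still syncing.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <source dev='/dev/vg/top'/>
      <mirror type='block' job='active-commit' ready='yes'>
        <source dev='/dev/vg/base'/>
      </mirror>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vg/other'/>
      <mirror type='block' job='active-commit'>
        <source dev='/dev/vg/other-base'/>
      </mirror>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def is_ready_for_pivot(domain_xml, target_dev):
    """Return True if the disk named target_dev has a mirror marked ready.

    Assumption: readiness is reported as ready='yes' on the <mirror>
    element of the matching <disk> in the live domain XML.
    """
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != target_dev:
            continue
        mirror = disk.find("mirror")
        # No mirror element means no active block job on this disk.
        return mirror is not None and mirror.get("ready") == "yes"
    # Disk not found in the XML at all.
    return False


if __name__ == "__main__":
    for dev in ("vda", "vdb"):
        print(dev, is_ready_for_pivot(SAMPLE_DOMAIN_XML, dev))
```

A caller would poll this check and attempt the pivot only once it returns True, rather than assuming the job is ready as soon as it reports no remaining data to copy.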

Comment 9 Nir Soffer 2017-05-09 22:35:13 UTC
Removed the patches that were copied when the patch was cloned; they are not
relevant to this bug.

Comment 10 Ala Hino 2017-05-16 08:19:59 UTC
(In reply to Nir Soffer from comment #8)
> The attached patch (http://gerrit.ovirt.org/75954) does not fix the issue of
> slow merge, I don't think we can fix the case when the vm is doing lot of io
> so the merge never converge. Maybe this issue should be handled in qemu.
> 
> The patch does fix the issue of detecting when a block job is ready.
> Previously we thought that the only way to detect this is using libvirt
> events, but with this patch using libvirt events is an optimization that we
> should consider for future version, but for 4.1 we can use xml detection.
> 
> With this patch we can resolve this bug in 4.1.3.

Ack

Comment 13 Elad 2017-06-05 10:51:46 UTC
Nir, Ala, are the reproduction steps similar to https://bugzilla.redhat.com/show_bug.cgi?id=1376580#c30 ?

Comment 14 Ala Hino 2017-06-05 11:12:17 UTC
Both bugs are related to pivot behavior, but they are different.
This one fixes the logic used to determine when the disk is ready for pivot.
The other one gracefully handles the case where we try to pivot while the disk isn't ready for pivot.
In fact, with this fix, we shouldn't encounter the previous bug.

Reproducing the "disk is not ready for pivot" error isn't trivial and, somehow, we were never able to reproduce it.

I believe that if you simply run live merge, the merge will complete successfully.

Comment 15 Elad 2017-06-05 11:49:50 UTC
Thanks Ala.

We run the live merge test plan [1] continuously. The latest execution, with [2], ended with 100% success.


[1]
https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/wiki/Storage/3_5_Storage_Live_Merge

[2]
vdsm-4.19.17-1.el7ev.x86_64
libvirt-daemon-2.0.0-10.el7_3.9.x86_64
rhevm-4.1.3.1-0.1.el7.noarch

