Bug 1544853 - Detect and fix broken volume leases
Summary: Detect and fix broken volume leases
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.2.3
Target Release: ---
Assignee: Ala Hino
QA Contact: Yosi Ben Shimon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-13 15:56 UTC by Ala Hino
Modified: 2018-05-10 06:30 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Running legacy cold merge (cluster level <= 4.0) leaves broken volume leases, because the volumes are renamed but their corresponding leases are not.
Consequence: After upgrading to oVirt >= 4.1, users cannot perform storage operations that acquire volume leases, e.g. new cold merge, copy disk, etc.
Fix: Run the check-volume-leases tool to detect and fix broken leases.
Result: Broken leases are fixed.
Clone Of:
Environment:
Last Closed: 2018-05-10 06:30:28 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: exception+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1538472 | 0 | high | CLOSED | After cold merge in version earlier than 4.1, cold merge will fail in versions greater or equal 4.1 | 2021-02-22 00:41:40 UTC
Red Hat Knowledge Base (Solution) | 3351331 | 0 | None | None | None | 2018-03-04 23:30:25 UTC
oVirt gerrit | 88057 | 0 | master | MERGED | doc: Add documentation to check-volume-leases command | 2021-02-17 15:54:55 UTC
oVirt gerrit | 88058 | 0 | master | ABANDONED | [WIP]api: Introduce StorageDomain.check_volumes_lease API | 2021-02-17 15:54:54 UTC
oVirt gerrit | 88710 | 0 | master | MERGED | vol-info: Add lease path and offset to volume info | 2021-02-17 15:54:55 UTC
oVirt gerrit | 88776 | 0 | master | MERGED | tool: Add check-volume-leases command | 2021-02-17 15:54:55 UTC
oVirt gerrit | 89210 | 0 | master | MERGED | tool: Extract the confirm helper into a separate module | 2021-02-17 15:54:55 UTC
oVirt gerrit | 89671 | 0 | ovirt-4.2 | MERGED | doc: Add documentation to check-volume-leases command | 2021-02-17 15:54:55 UTC
oVirt gerrit | 89672 | 0 | ovirt-4.2 | MERGED | vol-info: Add lease path and offset to volume info | 2021-02-17 15:54:55 UTC
oVirt gerrit | 89673 | 0 | ovirt-4.2 | MERGED | tool: Extract the confirm helper into a separate module | 2021-02-17 15:54:55 UTC
oVirt gerrit | 89674 | 0 | ovirt-4.2 | MERGED | tool: Add check-volume-leases command | 2021-02-17 15:54:55 UTC

Internal Links: 1538472

Description Ala Hino 2018-02-13 15:56:37 UTC
Description of problem:
Some storage operations, e.g. old (legacy) cold merge (DC <= 4.0), break the volume leases, because those operations rename the volumes but not their corresponding leases.

Broken leases will prevent performing storage jobs after the environment is upgraded.

We want to provide an option (Vdsm API) to detect and fix existing broken leases.

This bug is related to bug 1538472. However, while the fix of bug 1538472 does rename leases after renaming volumes, that fix doesn't affect this one as this one is about invalid *existing* leases.
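
A minimal sketch of the detection idea described above, not the actual Vdsm implementation: a lease is treated as broken when the name recorded in the lease no longer matches the (renamed) volume. The VolumeLease record, the find_broken_leases function, the read_lease_owner callback, and the sample paths and offsets are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class VolumeLease:
    """Hypothetical record: where a volume's lease lives on storage."""
    volume_id: str     # current volume UUID (after any rename)
    lease_path: str    # path of the lease file or leases volume
    lease_offset: int  # byte offset of the lease inside lease_path


def find_broken_leases(
    volumes: Iterable[VolumeLease],
    read_lease_owner: Callable[[str, int], str],
) -> List[VolumeLease]:
    """Return volumes whose lease is recorded under a different name.

    read_lease_owner(path, offset) is an assumed callback that returns
    the resource name stored in the lease at that location.
    """
    return [
        vol for vol in volumes
        if read_lease_owner(vol.lease_path, vol.lease_offset) != vol.volume_id
    ]


# Illustrative in-memory "storage": (path, offset) -> stored resource name.
# "vol-b" was renamed from "vol-b-old", but its lease was not updated.
stored = {
    ("/dev/sd-1/leases", 1048576): "vol-a",
    ("/dev/sd-1/leases", 2097152): "vol-b-old",
}
volumes = [
    VolumeLease("vol-a", "/dev/sd-1/leases", 1048576),
    VolumeLease("vol-b", "/dev/sd-1/leases", 2097152),
]
broken = find_broken_leases(volumes, lambda path, offset: stored[(path, offset)])
print([vol.volume_id for vol in broken])  # prints ['vol-b']
```

Passing the lease reader in as a callback keeps the comparison logic testable without touching real storage; a real check would read the lease from the storage domain instead of a dictionary.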

Comment 1 Nir Soffer 2018-02-13 17:57:18 UTC
Note that this issue is very old, probably since cold merge was introduced. The
issue was discovered in 4.1, since we started to use volume leases for storage
operations.

Since 4.1, volume leases are not affected by cold merge, since we no longer
rename volumes during cold merge.

Comment 2 Germano Veit Michel 2018-02-22 01:44:38 UTC
Thank you Ala!

Just one question: are there any plans to automate fixing this by the engine making use of this new vdsm api call?

Because otherwise we need a how-to KCS Solution/Article, I can write it.

Comment 3 Ala Hino 2018-02-22 09:28:48 UTC
Hi Germano,

For now, there is no plan to add support in the engine for calling this new API.
I believe that we will have such support in the future.

I suggest writing a KCS Solution/Article for this.

Comment 4 Germano Veit Michel 2018-02-23 01:39:56 UTC
(In reply to Ala Hino from comment #3)
> Hi Germano,
> 
> For now, there is no plan to add a support in the engine to call this new
> api.
> I believe that we will have such support in the future.
> 
> Suggest to write a KCS Solution/Article for this.

Hi Ala,

Does this look good to you?
https://access.redhat.com/solutions/3362171

I tried to add some steps for the user to find out the Storage Domain IDs without going through the DB.

Thank you

Comment 5 Ala Hino 2018-02-28 09:03:10 UTC
(In reply to Germano Veit Michel from comment #4)
> (In reply to Ala Hino from comment #3)
> > Hi Germano,
> > 
> > For now, there is no plan to add a support in the engine to call this new
> > api.
> > I believe that we will have such support in the future.
> > 
> > Suggest to write a KCS Solution/Article for this.
> 
> Hi Ala,
> 
> Does this look good to you?
> https://access.redhat.com/solutions/3362171

Looks very good.

> 
> I tried to add some steps for the user to find out the Storage Domain IDs
> without going through the DB.
> 
> Thank you

Please note that I am currently checking the option of providing a standalone script, i.e. one that can be executed without vdsm-client.
This may be simpler to execute and maintain.

I will update ASAP about the direction we choose to go with.

Comment 6 Germano Veit Michel 2018-03-04 23:31:49 UTC
(In reply to Ala Hino from comment #5)
> Please note that currently I am checking the option to provide a standalone
> script, i.e. can be executed without vdsm-client.
> This may be simpler to execute and maintain.

This would be nice. Please raise needinfo on me once we can modify the KCS solution with simpler steps, if possible.

Thanks

Comment 7 Yosi Ben Shimon 2018-04-17 15:09:26 UTC
Hi Ala,
What are the steps to reproduce?
According to https://bugzilla.redhat.com/show_bug.cgi?id=1538472#c3
The bug is already verified for 4.1.
I can verify it on 4.2 using the same steps, but starting from which DC version: 4.0 or 4.1?

Thank you

Comment 8 Ala Hino 2018-04-17 19:52:12 UTC
Hi Yosi,

Exactly the same steps. You have to upgrade from DC 4.0. Broken leases exist only in DC <= 4.0.

Comment 9 Yosi Ben Shimon 2018-04-22 09:05:50 UTC
Verified using the steps from https://bugzilla.redhat.com/show_bug.cgi?id=1538472#c3.
DC upgraded from 4.0 to 4.2.

Works fine.

Moving to VERIFIED

Comment 10 Sandro Bonazzola 2018-05-10 06:30:28 UTC
This bugzilla is included in oVirt 4.2.3 release, published on May 4th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

