Bug 1716951 - [downstream clone - 4.3.5] Highly Available (HA) VMs with a VM lease failed to start after a 4.1 to 4.2 upgrade.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.3.5
Target Release: 4.3.5
Assignee: Steven Rosenberg
QA Contact: Liran Rotenberg
URL:
Whiteboard:
Depends On: 1659574
Blocks:
 
Reported: 2019-06-04 12:58 UTC by RHV bug bot
Modified: 2022-03-13 17:05 UTC
CC: 17 users

Fixed In Version: ovirt-engine-4.3.5
Doc Type: Bug Fix
Doc Text:
Previously, when lease data was moved from the VM Static DB table to the VM Dynamic DB table, the upgrade path from 4.1 to later versions was not taken into account: for VMs that already had a lease storage domain ID configured, the lease data was left empty. Validation then failed when such a VM was launched, and the VM could not run until the user reset the lease storage domain ID. As a result, HA VMs with a lease storage domain ID failed to start. With this fix, that validation is no longer performed when the VM runs, and the lease data is regenerated automatically when the lease storage domain ID is set, giving the VM the information it needs to run. HA VMs with a lease storage domain ID now start normally after upgrading from 4.1 to later versions.
Clone Of: 1659574
Environment:
Last Closed: 2019-08-12 11:53:49 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:
lrotenbe: testing_plan_complete+




Links
Red Hat Issue Tracker RHV-43599 (last updated 2021-09-09 15:35:45 UTC)
Red Hat Knowledge Base (Solution) 3487811: RHV HA VMs fail to start, reporting "Invalid VM lease. Please note that it may take few minutes to create the lease." (last updated 2019-06-04 12:59:38 UTC)
Red Hat Product Errata RHBA-2019:2432 (last updated 2019-08-12 11:53:57 UTC)
oVirt gerrit 100409, master, MERGED: engine: Upgrade HA VM lease failure during Launch (last updated 2019-06-04 12:59:38 UTC)
oVirt gerrit 100549, ovirt-engine-4.3, MERGED: engine: Upgrade HA VM lease failure during Launch (last updated 2019-06-04 17:16:23 UTC)

Description RHV bug bot 2019-06-04 12:58:58 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1659574 +++
======================================================================

Description of problem:

After an upgrade from 4.1 to 4.2, RHV HA VMs fail to start, reporting "Invalid VM lease. Please note that it may take few minutes to create the lease."

Version-Release number of selected component (if applicable):

RHV 4.2.7

How reproducible:

100% at this customer site

Steps to Reproduce:
1. Have HA VMs running with a VM Lease
2. Update from RHV 4.1 to 4.2
3. Try to reboot the VMs

Actual results:

"VMNAME: Cannot run VM. Invalid VM lease. Please note that it may take few minutes to create the lease."


Expected results:

VMs should start just fine after an upgrade of RHV.


Additional info:



An HA VM that has a VM lease should normally have a non-null lease_sd_id field in the vm_static table and a non-null lease_info field in the vm_dynamic table in the RHV database.

In this case, the lease_info field in the vm_dynamic table in the RHV database was empty.
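For illustration, a query along these lines (a sketch, assuming only the vm_static/vm_dynamic columns described above) should list the VMs in this state:

-- sketch: HA VMs whose lease storage domain is set but whose lease data is missing
select s.vm_guid, s.vm_name, s.lease_sd_id
  from vm_static s
  join vm_dynamic d on d.vm_guid = s.vm_guid
 where s.lease_sd_id is not null
   and (d.lease_info is null or d.lease_info = '');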

(Originally by Frank DeLorey)

Comment 3 RHV bug bot 2019-06-04 12:59:04 UTC
logs?

(Originally by michal.skrivanek)

Comment 4 RHV bug bot 2019-06-04 12:59:06 UTC
I am one of the affected customers; ovirt-log-collector and a few sos reports are attached to case 02272652.

(Originally by klaas)

Comment 5 RHV bug bot 2019-06-04 12:59:07 UTC
A simple workaround, until a proper fix is introduced, is to change the storage domain on which the lease is configured for all the HA VMs with a lease that have not been restarted since the upgrade to 4.2 (i.e., those with a pending configuration change).
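Once the lease storage domain has been changed and the pending configuration is applied, lease_info should be repopulated. A check along these lines (again a sketch, assuming the same vm_static/vm_dynamic columns) can confirm that no HA VM with a lease is left without lease data:

-- sketch: any row with an empty lease_info here still needs the workaround
select s.vm_name, s.lease_sd_id, d.lease_info
  from vm_static s
  join vm_dynamic d on d.vm_guid = s.vm_guid
 where s.lease_sd_id is not null;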

(Originally by Arik Hadas)

Comment 6 RHV bug bot 2019-06-04 12:59:10 UTC
If you have only one storage domain, you can also just disable the lease and re-enable it after the first configuration change has finished.

For the devs: If any additional logs are needed feel free to reach out to me directly.

(Originally by klaas)

Comment 7 RHV bug bot 2019-06-04 12:59:12 UTC
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both

(Originally by Ryan Barry)

Comment 11 RHV bug bot 2019-06-04 12:59:19 UTC
probably related to the change early in 4.2 - https://gerrit.ovirt.org/#/c/86504/

(Originally by michal.skrivanek)

Comment 12 RHV bug bot 2019-06-04 12:59:20 UTC
(In reply to Michal Skrivanek from comment #11)
> probably related to the change early in 4.2 -
> https://gerrit.ovirt.org/#/c/86504/

I'll still be the one to blame, yet I think it is most likely a consequence of https://gerrit.ovirt.org/#/c/79226/.
In 4.1, the lease_info was not stored on the engine side.
In 4.2 and above, we expect the lease_info to be stored in the database; VMs with a lease that were created in 4.1 will lack it, so we should probably either recreate the lease or fetch the lease_info from the host.

(Originally by Arik Hadas)

Comment 13 RHV bug bot 2019-06-04 12:59:22 UTC
*** Bug 1697313 has been marked as a duplicate of this bug. ***

(Originally by Ryan Barry)

Comment 19 Liran Rotenberg 2019-06-26 12:57:20 UTC
Verified on:
ovirt-engine-4.3.5.1-0.1.el7.noarch

Steps:
1. Install RHV 4.1.
2. Create a VM with lease.
3. Upgrade to 4.2, then 4.3.
4. Reboot the VM.

Results:
Before the upgrade, the engine DB shows:
engine=# select vm_guid,vm_name,lease_sd_id from vm_static;
               vm_guid                |      vm_name      |             lease_sd_id              
--------------------------------------+-------------------+--------------------------------------
 00000003-0003-0003-0003-0000000000be | Tiny              | 
 00000005-0005-0005-0005-0000000002e6 | Small             | 
 00000009-0009-0009-0009-0000000000f1 | Large             | 
 0000000b-000b-000b-000b-00000000021f | XLarge            | 
 00000007-0007-0007-0007-00000000010a | Medium            | 
 00000000-0000-0000-0000-000000000000 | Blank             | 
 3b8e7df0-1e33-4f71-96df-c117e67ca499 | HostedEngine      | 
 0f08e1cc-8f0d-45e2-9ad3-9cf71ed9d41e | VM-UP-HA-LEASE    | 4f27d1b6-cc9e-46f4-9cdd-3d000221f960
 1a657e28-aa6f-40e2-bc68-88591d9e1da8 | VM-HA             | 
 58032b43-f0d4-4d54-83e7-f2c492a0f405 | VM-44-HA-LEASE-UP | 4f27d1b6-cc9e-46f4-9cdd-3d000221f960
 9f4fcc08-c46f-4fc2-b561-0fbb1aea83fd | VM-44-HA          | 
 ae7a426c-7ec0-45e9-9a54-3826352c7953 | VM-44-HA-LEASE    | 4f27d1b6-cc9e-46f4-9cdd-3d000221f960
 bbc407da-98ee-44d3-8633-654d98d940f4 | VM-HA-LEASE       | 4f27d1b6-cc9e-46f4-9cdd-3d000221f960
 ffb38ace-62c0-4b0e-87e4-4d31c2ce0e81 | el76_guest_ga     | 
(14 rows)

engine=# select vm_guid, lease_info from vm_dynamic;
               vm_guid                | lease_info 
--------------------------------------+------------
 58032b43-f0d4-4d54-83e7-f2c492a0f405 | 
 9f4fcc08-c46f-4fc2-b561-0fbb1aea83fd | 
 0f08e1cc-8f0d-45e2-9ad3-9cf71ed9d41e | 
 ae7a426c-7ec0-45e9-9a54-3826352c7953 | 
 bbc407da-98ee-44d3-8633-654d98d940f4 | 
 1a657e28-aa6f-40e2-bc68-88591d9e1da8 | 
 3b8e7df0-1e33-4f71-96df-c117e67ca499 | 
(7 rows)


After the upgrade, the VMs rebooted successfully.
DB shows:
engine=# select vm_guid, lease_info from vm_dynamic;
               vm_guid                |                                                                           lease_info                                                                           
--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------
 58032b43-f0d4-4d54-83e7-f2c492a0f405 | 
 9f4fcc08-c46f-4fc2-b561-0fbb1aea83fd | 
 ae7a426c-7ec0-45e9-9a54-3826352c7953 | 
 3b8e7df0-1e33-4f71-96df-c117e67ca499 | 
 bbc407da-98ee-44d3-8633-654d98d940f4 | {                                                                                                                                                             +
                                      |   "path" : "/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Compute__NFS_GE_compute-he-6_nfs__0/4f27d1b6-cc9e-46f4-9cdd-3d000221f960/dom_md/xleases",+
                                      |   "offset" : "3145728"                                                                                                                                        +
                                      | }
 0f08e1cc-8f0d-45e2-9ad3-9cf71ed9d41e | {                                                                                                                                                             +
                                      |   "path" : "/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Compute__NFS_GE_compute-he-6_nfs__0/4f27d1b6-cc9e-46f4-9cdd-3d000221f960/dom_md/xleases",+
                                      |   "offset" : "4194304"                                                                                                                                        +
                                      | }
 1a657e28-aa6f-40e2-bc68-88591d9e1da8 | 
(7 rows)

As expected.

Comment 20 RHV bug bot 2019-06-27 11:39:54 UTC
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Project 'ovirt-engine'/Component 'vdsm' mismatch]

For more info please contact: rhv-devops

Comment 21 RHV bug bot 2019-06-27 11:48:43 UTC
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Project 'ovirt-engine'/Component 'vdsm' mismatch]

For more info please contact: rhv-devops

Comment 23 errata-xmlrpc 2019-08-12 11:53:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2432

