Bug 1802650 - [Docs][RFE][mvp-4.4] Update Upgrade Guide for 4.4 GA
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-4.4.1-1
Target Release: 4.4.1
Assignee: Steve Goodman
QA Contact: Petr Matyáš
URL:
Whiteboard:
Duplicates: 1841527 1866788
Depends On:
Blocks:
Reported: 2020-02-13 15:57 UTC by Steve Goodman
Modified: 2023-10-06 19:11 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-20 15:25:00 UTC
oVirt Team: Integration
Target Upstream Version:
Embargoed:


Description Steve Goodman 2020-02-13 15:57:55 UTC
Update RHV Upgrade Guide for RHV 4.4 GA

Comment 4 Steve Goodman 2020-04-01 07:41:17 UTC
This message from the 4.0 upgrade guide is relevant for the 4.4 upgrade as well:

[Important]
====
Backups can only be restored to environments of the same major release as that of the backup. For example, a backup of a Red Hat Virtualization version 4.0 environment can only be restored to another Red Hat Virtualization version 4.0 environment. To view the version of Red Hat Virtualization contained in a backup file, unpack the backup file and read the value in the version file located in the root directory of the unpacked files.
====

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/html-single/administration_guide/index#sect-Backing_Up_and_Restoring_the_Red_Hat_Enterprise_Virtualization_Manager
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/html/upgrade_guide/upgrading_to_red_hat_virtualization_manager_4.0

Comment 5 Steve Goodman 2020-04-06 16:55:11 UTC
I understand that the 4.3 to 4.4 upgrade process is similar to the process of upgrading from 3.6 to 4.0 [1].

So I will use this as the basis for a rough draft.


[1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/html-single/upgrade_guide/index#Upgrading_to_Red_Hat_Virtualization_Manager_4.0

Comment 6 Steve Goodman 2020-04-07 11:08:03 UTC
After speaking with Lukas, this is the outline of the procedure to migrate from RHV 4.3 to 4.4:

1. Migrate the engine from 4.3 to 4.4 using the procedure from RHEV 3.6 to RHV 4.0, as detailed in the 4.0 Upgrade Guide
2. Regarding hosts/VMs:
3. Pick a host to upgrade. Live migrate all VMs to another host in the same 4.3 cluster.
   a. Install RHVH 4.4 or EL 8 + host enabling repos etc, on the host you want to upgrade.
   b. Add this host to the Manager.
   c. At this point you can migrate VMs onto this host.
4. Repeat step 3 to migrate VMs and upgrade hosts for the rest of the hosts in the same cluster, one by one, until all are running 4.4 vdsm. Repeat for all other clusters in the environment.
5. Upgrade compatibility level in hosts to 4.4.
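
The ordering in the outline above can be sketched as a small simulation. This is only an illustration under stated assumptions: `Host` and `upgrade_cluster` are hypothetical names, nothing here calls the Manager or any oVirt API, and "reinstalling" a host is reduced to changing a version string. What it shows is that each host is emptied before it is reinstalled and that the compatibility level step comes only after every host runs 4.4.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a host in one cluster."""
    name: str
    version: str = "4.3"
    vms: list = field(default_factory=list)


def upgrade_cluster(hosts):
    """Upgrade hosts one by one, keeping every VM on some running host."""
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to live migrate VMs")
    log = []
    for host in hosts:
        # Step 3: live migrate all VMs to another host in the same cluster.
        target = next(h for h in hosts if h is not host)
        target.vms.extend(host.vms)
        host.vms.clear()
        log.append(f"migrated VMs from {host.name} to {target.name}")
        # Steps 3a and 3b: reinstall the emptied host and add it back.
        host.version = "4.4"
        log.append(f"reinstalled {host.name} as 4.4 and re-added it")
    # Step 5 is reached only once every host runs 4.4.
    if any(h.version != "4.4" for h in hosts):
        raise RuntimeError("some hosts were not upgraded")
    log.append("raise compatibility level to 4.4")
    return log
```

In this sketch a VM is only ever moved off a host that is still on 4.3, so it never has to migrate from an upgraded host back to an older one.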

Comment 7 Steve Goodman 2020-04-07 11:11:49 UTC
Lukas,

(In reply to comment #4)

> [Important]
> ====
> Backups can only be restored to environments of the same major release as
> that of the backup. For example, a backup of a Red Hat Virtualization
> version 4.0 environment can only be restored to another Red Hat
> Virtualization version 4.0 environment. To view the version of Red Hat
> Virtualization contained in a backup file, unpack the backup file and read
> the value in the version file located in the root directory of the unpacked
> files.
> ====
> 

1. Does this mean that I can or cannot restore a backup from a RHV 4.3 environment to a RHV 4.4 environment?
2. What is the name of the "version file located in the root directory of the unpacked files"?

Comment 8 Lukas Svaty 2020-04-07 13:55:47 UTC
1. You cannot, but you should be able to (there is a bug to enable it: https://bugzilla.redhat.com/show_bug.cgi?id=1812906). I would fail it, as this log message should be removed.
2. What? Sorry, no idea :) I guess another bug as this message should be more clear about it.

Both comments are actually on Sandro's team.

Comment 9 Steve Goodman 2020-04-12 07:23:43 UTC
After it's updated, I need to add a cross reference to the upgrade guide in common/admin/proc_Configuring_cluster_to_use_Q35_or_UEFI.adoc

Comment 10 Steve Goodman 2020-04-13 11:52:49 UTC
(In reply to Lukas Svaty from comment #8)
> 1. You cannot, but you should (there is bug to enable it -
> https://bugzilla.redhat.com/show_bug.cgi?id=1812906) I would fail it as this
> log msg should be removed

I see that bug 1812906 is already verified, so from the point of view of documentation, you can restore a backup from a RHV 4.3 environment to a RHV 4.4 environment.

Comment 11 Sandro Bonazzola 2020-04-14 12:32:46 UTC
(In reply to Lukas Svaty from comment #8)
> 1. You cannot, but you should (there is bug to enable it -
> https://bugzilla.redhat.com/show_bug.cgi?id=1812906) I would fail it as this
> log msg should be removed

The bug is verified; you should be able to restore a 4.3 backup on 4.4.

> 2. What? Sorry, no idea :) I guess another bug as this message should be
> more clear about it.

The file is named "version" and is located in the root directory of the unpacked files.



> 
> Both comments are actually on Sandro's team.
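
As a rough sketch of the check discussed in this thread (unpack the backup and read the version file), the following reads that file without unpacking the whole archive. It assumes the backup is a tar archive that Python's `tarfile` module can open, and that the file is stored as `version` or `./version`. `backup_version` is a hypothetical helper name, not part of engine-backup.

```python
import os
import tarfile


def backup_version(path):
    """Return the text of the top-level "version" file in a backup archive."""
    # "r:*" lets tarfile detect the compression (none, gzip, bzip2, xz) itself.
    with tarfile.open(path, "r:*") as archive:
        for member in archive:
            # Accept both "version" and "./version" as the member name.
            if member.isfile() and os.path.normpath(member.name) == "version":
                return archive.extractfile(member).read().decode().strip()
    raise FileNotFoundError(f"no top-level 'version' file in {path}")
```

The function returns whatever text the file holds; how that text maps to a release number is not specified in this thread, so the sketch does not interpret it.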

Comment 12 Steve Goodman 2020-04-16 19:53:39 UTC
Sandro, Martin, Yuval,

Please take a look at the draft [1] based on the 4.0 upgrade guide.

There are several open questions.

[1] https://docs.google.com/document/d/1hU2yPyaGyM_eYVl9jThGVG8HjaB3-a8NbJZMvAFC4bw/edit#

Comment 15 Steve Goodman 2020-05-11 14:05:54 UTC
From bug 1828931#c6 (https://bugzilla.redhat.com/show_bug.cgi?id=1828931#c6)

Martin Perina 2020-05-11 13:20:20 UTC

(In reply to Eli Mesika from comment #5)
> This paragraph should be added to the release doc:
> 
> 
> For remote vacuuming, if you get errors like 'permission denied for schema
> pg_temp_XX', do the following:
>      1) Log in to the remote database machine.
>      2) Run:
>            psql -U <db-admin-role> -Atc "select 'drop schema if exists ' || nspname || ' cascade;' from (select distinct nspname from pg_class join pg_namespace on (relnamespace=pg_namespace.oid) where pg_is_other_temp_schema(relnamespace)) as foo" <engine database name> > <temporary file>
>      3) Run:
>            psql -U <db-admin-role> -f <temporary file>
>      4) Try to run engine-vacuum again.

Steven, we should also add this as a step in the remote database chapters of the 4.4 upgrade guide, probably after restoring the 4.3 database backup and upgrading the database from 10.6 to 12, and before running engine-setup.
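
The two psql steps quoted above can be sketched as argument lists, which avoids the shell quoting problems visible in the quoted text. This is an illustration, not a tested procedure. It reads step 2 as "write the generated DROP statements to a temporary file" and uses psql's `-o` option for that instead of a shell redirect. `cleanup_commands` and its parameter names are hypothetical.

```python
# Query from step 2: emits one "drop schema ... cascade;" line for each
# temporary schema that belongs to another session.
GENERATE_DROPS_SQL = (
    "select 'drop schema if exists ' || nspname || ' cascade;' "
    "from (select distinct nspname from pg_class "
    "join pg_namespace on (relnamespace = pg_namespace.oid) "
    "where pg_is_other_temp_schema(relnamespace)) as foo"
)


def cleanup_commands(db_admin_role, engine_db, drop_file):
    """Return the psql invocations for steps 2 and 3 as argument lists."""
    # Step 2: -A (unaligned) and -t (tuples only) keep the output to bare
    # SQL statements; -o writes them to drop_file.
    generate_cmd = ["psql", "-U", db_admin_role, "-Atc", GENERATE_DROPS_SQL,
                    "-o", drop_file, engine_db]
    # Step 3: run the generated statements, exactly as quoted (no database
    # name is given there, so psql falls back to its default database).
    apply_cmd = ["psql", "-U", db_admin_role, "-f", drop_file]
    return generate_cmd, apply_cmd
```

Each list can be passed to `subprocess.run(..., check=True)` on the remote database machine. Because step 3 as quoted names no database, a reader may need to append the engine database name to the second command.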

Comment 18 Lukas Svaty 2020-05-25 15:21:28 UTC
clearing needinfo

Comment 19 Michal Skrivanek 2020-05-28 10:14:23 UTC
lgtm

Comment 20 Martin Tessun 2020-05-29 08:32:31 UTC
Added some comments for Part I - still working on the rest.

Anyway, I would suggest that QE run this documented procedure step by step to ensure there are no issues.
Adding needinfo on Lukas for that.

Comment 21 Sandro Bonazzola 2020-06-09 11:57:47 UTC
looks good to me

Comment 22 Lukas Svaty 2020-06-19 10:00:02 UTC
For that, we have QE assigned as QE contact :) The steps would be automated for regression testing.

Comment 23 Steve Goodman 2020-06-29 14:51:44 UTC
*** Bug 1841527 has been marked as a duplicate of this bug. ***

Comment 24 Sandro Bonazzola 2020-07-09 07:24:39 UTC
Let's be sure to add a note about the issue in https://bugzilla.redhat.com/show_bug.cgi?id=1853225: once a host is upgraded to 4.4, global maintenance must be handled from that 4.4 host; otherwise the 4.3 Hosted Engine will be started again.

Comment 25 Steve Goodman 2020-07-09 12:58:30 UTC
(In reply to Sandro Bonazzola from comment #24)
> Let's be sure to add a note about
> https://bugzilla.redhat.com/show_bug.cgi?id=1853225 issue: global
> maintenance must be handled selecting the 4.4 host once upgraded to 4.4
> otherwise the 4.3 Hosted Engine will be started again.

In the Upgrade Guide we tell users to use the CLI to enable/disable global maintenance mode:  
----
1. Log in to the Manager virtual machine and shut it down.

2. Log in to one of the self-hosted engine nodes and disable global maintenance mode:

   # hosted-engine --set-maintenance --mode=none

   When you exit global maintenance mode, ovirt-ha-agent starts the Manager virtual machine, and then the Manager automatically starts. It can take up to ten minutes for the Manager to start. 
----


I propose changing step 2 like so:

----
2. Log in to a self-hosted engine node --> that has the 4.4 engine running on it <-- and disable global maintenance mode:

   # hosted-engine --set-maintenance --mode=none

   When you exit global maintenance mode, ovirt-ha-agent starts the Manager virtual machine, and then the Manager automatically starts. It can take up to ten minutes for the Manager to start. 
 
  [NOTE]
  ====
  Make sure you log in to the self-hosted engine node that has the 4.4 engine
  running on it. If you disable global maintenance mode from a node that has
  the 4.3 engine running on it, the 4.3 hosted engine starts again.
  ====

 ----
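
The rule in the proposed note can be expressed as a guard around the documented command. This is a minimal sketch: `disable_global_maintenance_command` is a hypothetical helper, and how a script learns which engine version a node is running is left to the caller, since this thread does not say.

```python
def disable_global_maintenance_command(node_engine_version):
    """Return the CLI command, but only for a node running the 4.4 engine."""
    # Compare major.minor only, so "4.4.1" is accepted and "4.3.10" is not.
    if node_engine_version.split(".")[:2] != ["4", "4"]:
        raise ValueError(
            "disable global maintenance from the node with the 4.4 engine; "
            f"this node reports {node_engine_version}"
        )
    return ["hosted-engine", "--set-maintenance", "--mode=none"]
```

Refusing to build the command on a 4.3 node mirrors the note: it is running the command from the wrong node that restarts the 4.3 hosted engine.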

Comment 26 Sandro Bonazzola 2020-07-09 13:21:00 UTC
Looks good to me

Comment 27 Steve Goodman 2020-07-13 10:42:41 UTC
Pavol,

This needs QE testing, especially the part about the SHE upgrade.

Comment 28 Nikolai Sednev 2020-07-13 13:14:11 UTC
(In reply to Steve Goodman from comment #25)
> (In reply to Sandro Bonazzola from comment #24)
> > Let's be sure to add a note about
> > https://bugzilla.redhat.com/show_bug.cgi?id=1853225 issue: global
> > maintenance must be handled selecting the 4.4 host once upgraded to 4.4
> > otherwise the 4.3 Hosted Engine will be started again.
> 
> In the Upgrade Guide we tell users to use the CLI to enable/disable global
> maintenance mode:  
> ----
> 1. Log in to the Manager virtual machine and shut it down.
> 
> 2. Log in to one of the self-hosted engine nodes and disable global
> maintenance mode:
> 
>    # hosted-engine --set-maintenance --mode=none
> 
>    When you exit global maintenance mode, ovirt-ha-agent starts the Manager
> virtual machine, and then the Manager automatically starts. It can take up
> to ten minutes for the Manager to start. 
> ----
> 
> 
> I propose changing step 2 like so:
> 
> ----
> 2. Log in to a self-hosted engine node --> that has the 4.4 engine running
> on it <-- and disable global maintenance mode:
> 
>    # hosted-engine --set-maintenance --mode=none
> 
>    When you exit global maintenance mode, ovirt-ha-agent starts the Manager
> virtual machine, and then the Manager automatically starts. It can take up
> to ten minutes for the Manager to start. 
>  
>   [NOTE]
>   ====
>   Make sure you log into the self-hosted engine node with the 4.4 engine.
>   If you are logged into a self-hosted engine node with the 4.3 engine
> running
>   on it when you disable global maintenance mode, the 4.3 hosted engine
> starts again.
>   ====
> 
>  ----

I don't think that we should limit global maintenance to the CLI. It's entirely possible to enable global maintenance in 4.3 from the UI. The flow is then:
1. Stop the engine service on the engine's VM (HE-VM).
2. Back up the engine.
3. Copy the backup to a safe place.
4. Reprovision the host on which the HE-VM was running.
5. Install all HE packages on the host reprovisioned to RHEL8.2.
6. Copy the backup file to the RHEL8.2 host.
7. Run the restore from the CLI on the RHEL8.2 host.
8. Once completed, disable global maintenance, either from the CLI or the UI, ONLY from the RHEL8.2 ha-host. From the UI, the customer has to click ONLY and SPECIFICALLY on the RHEL8.2 ha-host to get the option of disabling global maintenance; the customer must not use 4.3 ha-hosts for disabling global maintenance from the UI under any circumstances.

Another option is to finish upgrading all other hosts from RHEL7.8 to RHEL8.2 and reattach them to the engine while global maintenance is still enabled, and to remove global maintenance only when all ha-hosts have been upgraded to RHEL8.2.

At the moment I can't verify the flow, as 4.3 deployment is blocked by a broken lvm2 dependency.

Comment 29 Steve Goodman 2020-07-14 10:19:09 UTC
(In reply to Nikolai Sednev from comment #28)
> I don't think that we should cast global maintenance from CLI, it's totally
> possible to use UI and from there to enable global maintenance in 4.3. 
> Then to stop the engine service on engine's VM (HE-VM).

Do you suggest that this be done from the CLI, as currently documented?

> Run restore from CLI on RHEL8.2 host.
> Once completed, either from CLI or UI, disable global maintenance ONLY from
> RHEL8.2 ha-host (from UI customer has to click ONLY and SPECIFICALLY on
> RHEL8.2 ha-host to get option of disabling global maintenance, customer must
> not use 4.3 ha-hosts for disabling global maintenance from UI in no
> circumstance),

Let me make sure I understand:
Once the restore is complete, the engine automatically starts. You then log in to the Admin Portal on the RHEL8.2 host, i.e. the host that you just reprovisioned, and disable global maintenance from the Admin Portal.

> another option is to finish upgrading all other hosts from
> RHEL7.8 to RHEL8.2 and reattaching them back to the engine, while there is
> still global maintenance, so only when all ha-hosts were upgraded to
> RHEL8.2, remove global maintenance.  

Would the second option require more downtime for the VMs running in the environment on all those other hosts? If so, then I think the first option is better.

Do you recommend one option over the other?

Comment 30 Nikolai Sednev 2020-07-14 11:15:36 UTC
(In reply to Steve Goodman from comment #29)
> (In reply to Nikolai Sednev from comment #28)
> > I don't think that we should cast global maintenance from CLI, it's totally
> > possible to use UI and from there to enable global maintenance in 4.3. 
> > Then to stop the engine service on engine's VM (HE-VM).
> 
> Do you suggest that this be done from the CLI, as currently documented?
I'd add both ways as they're both legit and working.
> 
> > Run restore from CLI on RHEL8.2 host.
> > Once completed, either from CLI or UI, disable global maintenance ONLY from
> > RHEL8.2 ha-host (from UI customer has to click ONLY and SPECIFICALLY on
> > RHEL8.2 ha-host to get option of disabling global maintenance, customer must
> > not use 4.3 ha-hosts for disabling global maintenance from UI in no
> > circumstance),
> 
> Let me make sure I understand:
> Once the restore is complete, the engine automatically starts. You then log
> in to the Admin Portal on the RHEL8.2 host, i.e. the host that you just
> reprovisioned, and disable global maintenance from the Admin Portal.
> 
Your rephrasing is correct. Note that you have to disable GM ONLY from the RHEL8.2 ha-host, whether from the UI or the CLI; otherwise you'll hit https://bugzilla.redhat.com/show_bug.cgi?id=1853225.
> > another option is to finish upgrading all other hosts from
> > RHEL7.8 to RHEL8.2 and reattaching them back to the engine, while there is
> > still global maintenance, so only when all ha-hosts were upgraded to
> > RHEL8.2, remove global maintenance.  
> 
> Would the second option require more downtime for the VMs running in the
> environment on all those other hosts? If so, then I think the first option
> is better.
When done properly, no downtime is required for guest VMs; they should all be manually migrated to 4.3 ha-hosts before reprovisioning the ha-host running the 4.3 engine.
> 
> Do you recommend one option over the other?
I'd recommend the second option, as there is less chance of making changes that lead to undesirable results.
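
The condition behind the second option (keep global maintenance until every ha-host is upgraded) can be sketched as a simple check. The names `TARGET_OS` and `hosts_blocking_exit`, and the mapping of host names to OS labels, are hypothetical and only illustrate the rule.

```python
TARGET_OS = "RHEL8.2"


def hosts_blocking_exit(ha_host_os):
    """Return the ha-hosts that are not yet on the target OS, sorted by name."""
    return sorted(
        name for name, release in ha_host_os.items() if release != TARGET_OS
    )
```

Under this option, global maintenance is removed only when the returned list is empty, which leaves no 4.3 ha-host from which it could be disabled by mistake.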

Comment 31 Petr Matyáš 2020-07-16 10:01:13 UTC
Posted comments in the merge request. Mostly it looks good.

Comment 32 Petr Matyáš 2020-07-29 10:22:23 UTC
Documentation looks good.

Just noticed some of my comments were for commented out parts and the other ones are actually just small nitpicks, so I'll just verify.

Comment 33 Steve Goodman 2020-07-29 14:50:00 UTC
> Martin Perina 2020-05-11 13:20:20 UTC
> 
> (In reply to Eli Mesika from comment #5)
> > This paragraph should be added to the release doc:
> > 
> > 
> > For remote vacuuming, if you get errors like 'permission denied for schema
> > pg_temp_XX', do the following:
> >      1) Log in to the remote database machine.
> >      2) Run:
> >            psql -U <db-admin-role> -Atc "select 'drop schema if exists ' || nspname || ' cascade;' from (select distinct nspname from pg_class join pg_namespace on (relnamespace=pg_namespace.oid) where pg_is_other_temp_schema(relnamespace)) as foo" <engine database name> > <temporary file>
> >      3) Run:
> >            psql -U <db-admin-role> -f <temporary file>
> >      4) Try to run engine-vacuum again.
Martin,

1. When do you run engine-vacuum during the upgrade process? We don't discuss this for the 4.4 upgrade.
2. How does the user know the value for all these variables?

I find this very confusing in the context of the 4.4 upgrade. Can you please clarify?

Comment 34 Steve Goodman 2020-07-29 14:50:39 UTC
Moving to Modified for peer review.

Comment 35 Sandro Bonazzola 2020-08-12 12:39:04 UTC
Can this be moved to ON_QA?

Comment 36 Steve Goodman 2020-08-16 07:09:22 UTC
Peer review complete, comments implemented.

Implemented Nikolai's comments. Nikolai, please review in this new merge request:

https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1750

There is a link to the preview of the Upgrade Guide at the bottom of the merge request.

Comment 37 Steve Goodman 2020-08-16 07:13:45 UTC
*** Bug 1866788 has been marked as a duplicate of this bug. ***

Comment 38 Nikolai Sednev 2020-08-20 05:00:19 UTC
(In reply to Steve Goodman from comment #36)
> Peer review complete, comments implemented.
> 
> Implemented Nikolai's comments. Nikolai, please review in this new merge
> request:
> 
> https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1750
> 
> There is a link to the preview of the Upgrade Guide at the bottom of the
> merge request.

Ack.

Comment 39 Steve Goodman 2020-08-20 15:22:41 UTC
Merged.

