Update RHV Upgrade Guide for RHV 4.4 GA
This message from the 4.0 upgrade guide is relevant for the 4.4 upgrade as well:

[Important]
====
Backups can only be restored to environments of the same major release as that of the backup. For example, a backup of a Red Hat Virtualization version 4.0 environment can only be restored to another Red Hat Virtualization version 4.0 environment. To view the version of Red Hat Virtualization contained in a backup file, unpack the backup file and read the value in the version file located in the root directory of the unpacked files.
====

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/html-single/administration_guide/index#sect-Backing_Up_and_Restoring_the_Red_Hat_Enterprise_Virtualization_Manager
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/html/upgrade_guide/upgrading_to_red_hat_virtualization_manager_4.0
I understand that the 4.3 to 4.4 upgrade process is similar to the process of upgrading from 3.6 to 4.0 [1]. So I will use this as the basis for a rough draft. [1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.0/html-single/upgrade_guide/index#Upgrading_to_Red_Hat_Virtualization_Manager_4.0
After speaking with Lukas, this is the outline of the procedure to migrate from RHV 4.3 to 4.4:

1. Migrate the engine from 4.3 to 4.4 using the procedure from RHEV 3.6 to RHV 4.0, as detailed in the 4.0 Upgrade Guide.
2. Regarding hosts/VMs:
3. Pick a host to upgrade. Live migrate all VMs to another host in the same 4.3 cluster.
   a. Install RHVH 4.4 or EL 8 + host enabling repos etc. on the host you want to upgrade.
   b. Add this host to the Manager.
   c. At this point you can migrate VMs onto this host.
4. Repeat step 3 to migrate VMs and upgrade hosts for the rest of the hosts in the same cluster, one by one, until all are running 4.4 vdsm. Repeat for all other clusters in the environment.
5. Upgrade compatibility level in hosts to 4.4.
Lukas,

(In reply to comment #4)
> [Important]
> ====
> Backups can only be restored to environments of the same major release as
> that of the backup. For example, a backup of a Red Hat Virtualization
> version 4.0 environment can only be restored to another Red Hat
> Virtualization version 4.0 environment. To view the version of Red Hat
> Virtualization contained in a backup file, unpack the backup file and read
> the value in the version file located in the root directory of the unpacked
> files.
> ====

1. Does this mean that I can or cannot restore a backup from a RHV 4.3 environment to a RHV 4.4 environment?
2. What is the name of the "version file located in the root directory of the unpacked files"?
1. You cannot, but you should be able to (there is a bug to enable it - https://bugzilla.redhat.com/show_bug.cgi?id=1812906). I would fail it, as this log msg should be removed.
2. What? Sorry, no idea :) I guess another bug, as this message should be clearer about it.

Both comments are actually on Sandro's team.
After it's updated, I need to add a cross reference to the upgrade guide in common/admin/proc_Configuring_cluster_to_use_Q35_or_UEFI.adoc
(In reply to Lukas Svaty from comment #8)
> 1. You cannot, but you should be able to (there is a bug to enable it -
> https://bugzilla.redhat.com/show_bug.cgi?id=1812906). I would fail it, as
> this log msg should be removed.

I see that bug 1812906 is already verified, so from the point of view of documentation, you can restore a backup from a RHV 4.3 environment to a RHV 4.4 environment.
(In reply to Lukas Svaty from comment #8)
> 1. You cannot, but you should be able to (there is a bug to enable it -
> https://bugzilla.redhat.com/show_bug.cgi?id=1812906). I would fail it, as
> this log msg should be removed.

The bug is verified; you should be able to restore a 4.3 backup on 4.4.

> 2. What? Sorry, no idea :) I guess another bug, as this message should be
> clearer about it.

The name of the "version file located in the root directory of the unpacked files" is "version", and it is located in the root directory of the unpacked files.

> Both comments are actually on Sandro's team.
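The version check described above can be sketched as a small shell helper. This is only an illustration: it assumes the backup file is a tar archive that `tar` can read directly, which this thread does not confirm, and the function name `backup_version` and the file name `engine-backup.tar` are hypothetical.

```shell
# Print the version recorded in a backup archive.
# Assumption: the backup is a tar archive with a file named "version" in its root.
backup_version() {
    local archive="$1"
    local workdir
    workdir="$(mktemp -d)" || return 1
    # Unpack into a scratch directory so nothing on the host is overwritten.
    if ! tar -xf "$archive" -C "$workdir"; then
        rm -rf "$workdir"
        return 1
    fi
    if [ ! -f "$workdir/version" ]; then
        echo "no version file found in $archive" >&2
        rm -rf "$workdir"
        return 1
    fi
    cat "$workdir/version"
    rm -rf "$workdir"
}
```

For example, `backup_version engine-backup.tar` would print the release the backup was taken from, which can then be compared with the release of the target environment before attempting a restore.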
Sandro, Martin, Yuval, Please take a look at the draft [1] based on the 4.0 upgrade guide. There are several open questions. [1] https://docs.google.com/document/d/1hU2yPyaGyM_eYVl9jThGVG8HjaB3-a8NbJZMvAFC4bw/edit#
From bug 1828931#c6 (https://bugzilla.redhat.com/show_bug.cgi?id=1828931#c6)

Martin Perina 2020-05-11 13:20:20 UTC

(In reply to Eli Mesika from comment #5)
> This paragraph should be added to the release doc:
>
> For remote vacuuming, if you get errors like 'permission denied for schema
> pg_temp_XX', please do the following:
> 1) Log in to the remote database machine.
> 2) Run:
>    psql -U <db-admin-role> -Atc "select 'drop schema if exists ' || nspname || ' cascade;' from (select distinct nspname from pg_class join pg_namespace on (relnamespace=pg_namespace.oid) where pg_is_other_temp_schema(relnamespace)) as foo" <engine database name> > <temporary file>
> 3) Run:
>    psql -U <db-admin-role> -f <temporary file>
> 4) Try to run engine-vacuum again.

Steven, we should also add this as a step in the 4.4 upgrade guide, in the chapters on upgrading a remote database. Probably after restoring the 4.3 database backup and upgrading this database from 10.6 to 12, and before running engine-setup.
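Steps 2 and 3 quoted above could be wrapped in a small script so that the temporary file is created and cleaned up automatically. This is a rough sketch, not a tested procedure: the function names are hypothetical, the role and database name are passed in by the caller, and passing the database name to the second `psql` call is an assumption (the quoted step 3 omits it).

```shell
# Query from step 2: emits one "drop schema" statement per leftover temp schema.
temp_schema_drop_query() {
    cat <<'EOF'
select 'drop schema if exists ' || nspname || ' cascade;'
from (select distinct nspname from pg_class join pg_namespace on (relnamespace=pg_namespace.oid)
      where pg_is_other_temp_schema(relnamespace)) as foo
EOF
}

# Steps 2 and 3: generate the statements into a temporary file, then run them.
# $1 = database admin role, $2 = engine database name (both supplied by the caller).
drop_temp_schemas() {
    local role="$1"
    local db="$2"
    local tmpfile
    local rc
    tmpfile="$(mktemp)" || return 1
    # -A (unaligned) and -t (tuples only) keep the output to bare SQL statements.
    if ! psql -U "$role" -Atc "$(temp_schema_drop_query)" "$db" > "$tmpfile"; then
        rm -f "$tmpfile"
        return 1
    fi
    # Assumption: the database name is passed here too; the quoted step 3 omits it.
    psql -U "$role" -f "$tmpfile" "$db"
    rc=$?
    rm -f "$tmpfile"
    return "$rc"
}
```

After `drop_temp_schemas <db-admin-role> <engine database name>` succeeds on the remote database machine, engine-vacuum can be retried as in step 4.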
clearing needinfo
lgtm
Added some comments for Part I - still working on the rest. Anyway, I would suggest that QE run this documented procedure step by step to ensure there are no issues. Adding needinfo on Lukas for that.
looks good to me
For that, we have QE assigned as QE contact :) The steps would be automated for regression testing.
*** Bug 1841527 has been marked as a duplicate of this bug. ***
Let's be sure to add a note about the https://bugzilla.redhat.com/show_bug.cgi?id=1853225 issue: global maintenance must be handled by selecting the 4.4 host once it is upgraded to 4.4; otherwise the 4.3 Hosted Engine will be started again.
(In reply to Sandro Bonazzola from comment #24)
> Let's be sure to add a note about the
> https://bugzilla.redhat.com/show_bug.cgi?id=1853225 issue: global
> maintenance must be handled by selecting the 4.4 host once it is upgraded
> to 4.4; otherwise the 4.3 Hosted Engine will be started again.

In the Upgrade Guide we tell users to use the CLI to enable/disable global maintenance mode:

----
1. Log in to the Manager virtual machine and shut it down.

2. Log in to one of the self-hosted engine nodes and disable global maintenance mode:

# hosted-engine --set-maintenance --mode=none

When you exit global maintenance mode, ovirt-ha-agent starts the Manager virtual machine, and then the Manager automatically starts. It can take up to ten minutes for the Manager to start.
----

I propose changing step 2 like so:

----
2. Log in to a self-hosted engine node --> that has the 4.4 engine running on it <-- and disable global maintenance mode:

# hosted-engine --set-maintenance --mode=none

When you exit global maintenance mode, ovirt-ha-agent starts the Manager virtual machine, and then the Manager automatically starts. It can take up to ten minutes for the Manager to start.

[NOTE]
====
Make sure you log in to the self-hosted engine node with the 4.4 engine. If you are logged in to a self-hosted engine node with the 4.3 engine running on it when you disable global maintenance mode, the 4.3 hosted engine starts again.
====
----
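To make the "only from the 4.4 host" rule harder to get wrong, the command could be wrapped in a guard. The following is a minimal sketch, assuming that a host upgraded to 4.4 can be recognized by a RHEL 8 `VERSION_ID` in its os-release file (4.3 hosts in this thread run RHEL 7.8); the function names are hypothetical, and only the `hosted-engine` command itself comes from the Upgrade Guide text above.

```shell
# Succeed only if the given os-release file reports major version 8.
# Assumption: an upgraded (4.4) host is identified by its RHEL 8 operating system.
is_el8_host() {
    local os_release="${1:-/etc/os-release}"
    local version_id
    # The line looks like: VERSION_ID="8.2"
    version_id="$(sed -n 's/^VERSION_ID=//p' "$os_release" | tr -d '"')"
    case "$version_id" in
        8|8.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Disable global maintenance only when running on an upgraded host.
disable_global_maintenance() {
    if is_el8_host "$@"; then
        hosted-engine --set-maintenance --mode=none
    else
        echo "refusing: this host does not appear to be a RHEL 8 (4.4) host" >&2
        return 1
    fi
}
```

Running `disable_global_maintenance` on a host that still runs RHEL 7 prints a refusal and leaves global maintenance in place, instead of letting the 4.3 hosted engine start again.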
Looks good to me
Pavol, This needs QE testing. Especially the part about SHE upgrade.
(In reply to Steve Goodman from comment #25)
> (In reply to Sandro Bonazzola from comment #24)
> > Let's be sure to add a note about the
> > https://bugzilla.redhat.com/show_bug.cgi?id=1853225 issue: global
> > maintenance must be handled by selecting the 4.4 host once it is upgraded
> > to 4.4; otherwise the 4.3 Hosted Engine will be started again.
>
> In the Upgrade Guide we tell users to use the CLI to enable/disable global
> maintenance mode:
> ----
> 1. Log in to the Manager virtual machine and shut it down.
>
> 2. Log in to one of the self-hosted engine nodes and disable global
> maintenance mode:
>
> # hosted-engine --set-maintenance --mode=none
>
> When you exit global maintenance mode, ovirt-ha-agent starts the Manager
> virtual machine, and then the Manager automatically starts. It can take up
> to ten minutes for the Manager to start.
> ----
>
> I propose changing step 2 like so:
>
> ----
> 2. Log in to a self-hosted engine node --> that has the 4.4 engine running
> on it <-- and disable global maintenance mode:
>
> # hosted-engine --set-maintenance --mode=none
>
> When you exit global maintenance mode, ovirt-ha-agent starts the Manager
> virtual machine, and then the Manager automatically starts. It can take up
> to ten minutes for the Manager to start.
>
> [NOTE]
> ====
> Make sure you log in to the self-hosted engine node with the 4.4 engine.
> If you are logged in to a self-hosted engine node with the 4.3 engine
> running on it when you disable global maintenance mode, the 4.3 hosted
> engine starts again.
> ====
> ----

I don't think that we should restrict global maintenance to the CLI; it's totally possible to use the UI and enable global maintenance in 4.3 from there.
Then stop the engine service on the engine's VM (HE-VM).
Then back up.
Then copy the backup to a safe place.
Then reprovision the host on which the HE-VM was running.
Install all HE packages on the host reprovisioned to RHEL8.2.
Copy the backup file to the RHEL8.2 host.
Run the restore from the CLI on the RHEL8.2 host.
Once completed, either from CLI or UI, disable global maintenance ONLY from the RHEL8.2 ha-host (from the UI, the customer has to click ONLY and SPECIFICALLY on the RHEL8.2 ha-host to get the option of disabling global maintenance; the customer must not use 4.3 ha-hosts for disabling global maintenance from the UI under any circumstances). Another option is to finish upgrading all other hosts from RHEL7.8 to RHEL8.2 and reattaching them to the engine while global maintenance is still enabled, and to remove global maintenance only when all ha-hosts have been upgraded to RHEL8.2.

At the moment I can't verify the flow, as 4.3 deployment is blocked by a broken lvm2 dependency.
(In reply to Nikolai Sednev from comment #28)
> I don't think that we should restrict global maintenance to the CLI; it's
> totally possible to use the UI and enable global maintenance in 4.3 from
> there.
> Then stop the engine service on the engine's VM (HE-VM).

Do you suggest that this be done from the CLI, as currently documented?

> Run the restore from the CLI on the RHEL8.2 host.
> Once completed, either from CLI or UI, disable global maintenance ONLY from
> the RHEL8.2 ha-host (from the UI, the customer has to click ONLY and
> SPECIFICALLY on the RHEL8.2 ha-host to get the option of disabling global
> maintenance; the customer must not use 4.3 ha-hosts for disabling global
> maintenance from the UI under any circumstances).

Let me make sure I understand:
Once the restore is complete, the engine automatically starts. You then log in to the Admin Portal on the RHEL8.2 host, i.e. the host that you just reprovisioned, and disable global maintenance from the Admin Portal.

> Another option is to finish upgrading all other hosts from RHEL7.8 to
> RHEL8.2 and reattaching them to the engine while global maintenance is
> still enabled, and to remove global maintenance only when all ha-hosts have
> been upgraded to RHEL8.2.

Would the second option require more downtime for the VMs running in the environment on all those other hosts? If so, then I think the first option is better.

Do you recommend one option over the other?
(In reply to Steve Goodman from comment #29)
> (In reply to Nikolai Sednev from comment #28)
> > I don't think that we should restrict global maintenance to the CLI; it's
> > totally possible to use the UI and enable global maintenance in 4.3 from
> > there.
> > Then stop the engine service on the engine's VM (HE-VM).
>
> Do you suggest that this be done from the CLI, as currently documented?

I'd add both ways, as they're both legit and working.

> > Run the restore from the CLI on the RHEL8.2 host.
> > Once completed, either from CLI or UI, disable global maintenance ONLY from
> > the RHEL8.2 ha-host (from the UI, the customer has to click ONLY and
> > SPECIFICALLY on the RHEL8.2 ha-host to get the option of disabling global
> > maintenance; the customer must not use 4.3 ha-hosts for disabling global
> > maintenance from the UI under any circumstances).
>
> Let me make sure I understand:
> Once the restore is complete, the engine automatically starts. You then log
> in to the Admin Portal on the RHEL8.2 host, i.e. the host that you just
> reprovisioned, and disable global maintenance from the Admin Portal.

Your rephrasing is correct. Please note that you have to disable GM ONLY by selecting the RHEL8.2 ha-host, from UI or CLI; otherwise you'll hit https://bugzilla.redhat.com/show_bug.cgi?id=1853225.

> > Another option is to finish upgrading all other hosts from RHEL7.8 to
> > RHEL8.2 and reattaching them to the engine while global maintenance is
> > still enabled, and to remove global maintenance only when all ha-hosts have
> > been upgraded to RHEL8.2.
>
> Would the second option require more downtime for the VMs running in the
> environment on all those other hosts? If so, then I think the first option
> is better.

When done properly, no downtime is required for guest VMs; they should all be manually migrated to 4.3 ha-hosts prior to reprovisioning the ha-host running the 4.3 engine.

> Do you recommend one option over the other?

I'd recommend the second option, as there are fewer chances of making changes that lead to undesirable results.
Posted comments in the merge request. Mostly it looks good.
Documentation looks good. Just noticed some of my comments were for commented out parts and the other ones are actually just small nitpicks, so I'll just verify.
> Martin Perina 2020-05-11 13:20:20 UTC
>
> (In reply to Eli Mesika from comment #5)
> > This paragraph should be added to the release doc:
> >
> > For remote vacuuming, if you get errors like 'permission denied for schema
> > pg_temp_XX', please do the following:
> > 1) Log in to the remote database machine.
> > 2) Run:
> >    psql -U <db-admin-role> -Atc "select 'drop schema if exists ' || nspname || ' cascade;' from (select distinct nspname from pg_class join pg_namespace on (relnamespace=pg_namespace.oid) where pg_is_other_temp_schema(relnamespace)) as foo" <engine database name> > <temporary file>
> > 3) Run:
> >    psql -U <db-admin-role> -f <temporary file>
> > 4) Try to run engine-vacuum again.

Martin,

1. When do you run engine-vacuum during the upgrade process? We don't discuss this for the 4.4 upgrade.
2. How does the user know the value for all these variables?

I find this very confusing in the context of the 4.4 upgrade. Can you please clarify?
Moving to Modified for peer review.
Can this be moved to ON_QA?
Peer review complete, comments implemented. Implemented Nikolai's comments. Nikolai, please review in this new merge request: https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1750 There is a link to the preview of the Upgrade Guide at the bottom of the merge request.
*** Bug 1866788 has been marked as a duplicate of this bug. ***
(In reply to Steve Goodman from comment #36)
> Peer review complete, comments implemented.
>
> Implemented Nikolai's comments. Nikolai, please review in this new merge
> request:
>
> https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1750
>
> There is a link to the preview of the Upgrade Guide at the bottom of the
> merge request.

Ack.
Merged.
Published. https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html-single/upgrade_guide/index