Bug 1946944 - Document machine type requirements for live migration in mixed RHEL setups
Summary: Document machine type requirements for live migration in mixed RHEL setups
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Irina
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-07 10:14 UTC by Kashyap Chamarthy
Modified: 2022-02-02 16:26 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-02 16:26:21 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID        Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   OSP-1903  0        None      None    None     2022-01-12 10:57:53 UTC

Description Kashyap Chamarthy 2021-04-07 10:14:13 UTC
Summary: 

Assume an OSP-17 environment based on RHEL-9, with two compute nodes
running different RHEL minor releases: RHEL-9.1 and RHEL-9.2.  Now, if
you want to live-migrate an instance in *both* directions (i.e. from
RHEL-9.1 to RHEL-9.2 and vice versa), then you need to "pin" the machine
type version that matches the older RHEL release (i.e. RHEL-9.1).

                      * * *

Here's some draft text from an old upstream document[1] that applies to OSP-16; it needs to be modified for OSP-17 and included in the official docs:


Live migration and versioned machine types
-----------------------------------------

Today, TripleO explicitly specifies[1] versioned machine types -- to
ensure migration compatibility during upgrade windows.  E.g. to quote
TripleO's config for machine types:

    ...
    NovaHWMachineType:
      description: >
        To specify a default machine type per host architecture.
      default: 'x86_64=pc-i440fx-rhel7.6.0,aarch64=virt-rhel7.6.0,ppc64=pseries-rhel7.6.0,ppc64le=pseries-rhel7.6.0'
      type: string
    ...
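
To make the format of that default value concrete, here is a minimal
Python sketch that splits the comma-separated "arch=machine_type" string
into a per-architecture mapping.  The helper name parse_machine_types is
made up for illustration and is not part of TripleO or Nova.

```python
def parse_machine_types(value):
    """Split 'arch=machine_type,...' into a dict keyed by architecture."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Each entry is expected to look like 'x86_64=pc-i440fx-rhel7.6.0'.
        arch, sep, machine_type = entry.partition("=")
        if not sep or not arch or not machine_type:
            raise ValueError(f"malformed entry: {entry!r}")
        mapping[arch] = machine_type
    return mapping


# The default value quoted from the TripleO parameter above.
DEFAULT = (
    "x86_64=pc-i440fx-rhel7.6.0,aarch64=virt-rhel7.6.0,"
    "ppc64=pseries-rhel7.6.0,ppc64le=pseries-rhel7.6.0"
)

machine_types = parse_machine_types(DEFAULT)
print(machine_types["x86_64"])
```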

Why?
----

Let's take a real-world example to make the point.  (I'm using the Red
Hat-based OpenStack, RHOS, as an example for ease of explanation.  I'm
sticking with "RHEL" here, otherwise the terminology gets muddied — the
versioned machine types reported by CentOS's QEMU binary have 'rhel' in
their names, because CentOS ships RHEL's QEMU binary as-is, minus the
branding.  You could replace "RHEL" with "CentOS", and the following
example would be true.) 

Back to the example:

  - Assume that a RHOS deployment is running on RHEL 7.6 hosts.  The
    Nova instances will be given a RHEL 7.6 machine type
    (pc-i440fx-rhel7.6.0).

  - And if the same deployment gets additional compute nodes added in
    the future, which use a newer RHEL (e.g. RHEL-7.7), the guests
    launched on those new compute nodes will get a newer machine type
    (i.e. 'pc-i440fx-rhel7.7.0').

Here is the problem:

    It will now be _impossible_ to migrate a guest from a RHEL 7.7-based
    compute node to a RHEL 7.6-based one, because RHEL 7.6 doesn't know
    about RHEL 7.7's machine type.
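
The asymmetry can be modelled with a small Python sketch.  The host
names and the per-host inventory below are hypothetical and simplified;
the only assumption is that a newer host still knows the older
versioned machine type, while the older host does not know the newer
one.

```python
# Hypothetical inventory: the machine types each host's QEMU knows about.
HOST_MACHINE_TYPES = {
    "compute-rhel76": {"pc-i440fx-rhel7.6.0"},
    "compute-rhel77": {"pc-i440fx-rhel7.6.0", "pc-i440fx-rhel7.7.0"},
}


def can_migrate(guest_machine_type, dest_host):
    """A guest can only land on a host that knows its machine type."""
    return guest_machine_type in HOST_MACHINE_TYPES[dest_host]


# A guest started with the 7.7 machine type cannot move to the 7.6 host.
print(can_migrate("pc-i440fx-rhel7.7.0", "compute-rhel76"))
# A guest pinned to the 7.6 machine type can move in either direction.
print(can_migrate("pc-i440fx-rhel7.6.0", "compute-rhel76"))
print(can_migrate("pc-i440fx-rhel7.6.0", "compute-rhel77"))
```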

To deal with this, you need to be able to explicitly set the same
machine type across all compute nodes in a deployment (as shown above).
The machine type should match the latest machine type available at the
time each RHOS version is released.  I.e. since RHOS 14 was initially
released against a RHEL-7.6 base OS, Nova is configured to always use the
RHEL 7.6 machine type (e.g. 'pc-i440fx-rhel7.6.0' on x86_64), regardless
of whether the compute node has been upgraded to RHEL 7.7 or 7.8.
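
One way to think about which machine type to pin is "the newest
versioned type that every compute node knows about".  The Python sketch
below illustrates that idea only; the inventory, host names, and helper
names are hypothetical, and this is not how TripleO or Nova actually
selects the value.

```python
import re


def rhel_version(machine_type):
    """Extract (major, minor, patch) from e.g. 'pc-i440fx-rhel7.6.0'."""
    match = re.search(r"rhel(\d+)\.(\d+)\.(\d+)$", machine_type)
    if match is None:
        raise ValueError(f"not a versioned RHEL machine type: {machine_type!r}")
    return tuple(int(part) for part in match.groups())


def pick_pinned_type(host_types):
    """Return the newest machine type shared by all hosts."""
    if not host_types:
        raise ValueError("no hosts given")
    common = set.intersection(*host_types.values())
    if not common:
        raise ValueError("no machine type is shared by all hosts")
    return max(common, key=rhel_version)


# Hypothetical, simplified inventory for one architecture (x86_64).
HOSTS = {
    "compute-rhel76": {"pc-i440fx-rhel7.5.0", "pc-i440fx-rhel7.6.0"},
    "compute-rhel77": {
        "pc-i440fx-rhel7.5.0",
        "pc-i440fx-rhel7.6.0",
        "pc-i440fx-rhel7.7.0",
    },
}

pinned = pick_pinned_type(HOSTS)
print(pinned)
```

With this inventory the sketch picks the RHEL 7.6 type, which matches
the older host and so keeps migration possible in both directions.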

This is done in ``nova.conf`` by setting the
``[libvirt]hw_machine_type`` option.  There needs to be one machine type
listed per architecture, to cover all supported architectures on KVM.
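
As a rough illustration, the following Python sketch renders such a
``[libvirt]`` section from a per-architecture mapping using the standard
library's configparser.  The machine type values are the RHEL 7.6 ones
quoted earlier; in a real deployment the file is managed by the
deployment tooling rather than generated like this.

```python
import configparser
import io

# One machine type per architecture, as in the TripleO default above.
machine_types = {
    "x86_64": "pc-i440fx-rhel7.6.0",
    "aarch64": "virt-rhel7.6.0",
    "ppc64": "pseries-rhel7.6.0",
    "ppc64le": "pseries-rhel7.6.0",
}

# Join into the comma-separated 'arch=machine_type' list.
value = ",".join(f"{arch}={mtype}" for arch, mtype in machine_types.items())

config = configparser.ConfigParser()
config["libvirt"] = {"hw_machine_type": value}

buffer = io.StringIO()
config.write(buffer)
nova_conf_fragment = buffer.getvalue()
print(nova_conf_fragment)
```
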
===============================================================================

[1] https://kashyapc.fedorapeople.org/versioned-machine-types-and-live-migration-gotcha.txt

