Bug 1886450 - Keepalived router id check not documented for RHV/VMware IPI
Summary: Keepalived router id check not documented for RHV/VMware IPI
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Donna DaCosta
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-08 13:35 UTC by Andrew Downs
Modified: 2022-03-10 16:02 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:02:33 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:02:50 UTC

Description Andrew Downs 2020-10-08 13:35:03 UTC
Document URL: 

N/A

Section Number and Name: 

The RHV and VMware docs are listed below; the list may not be exhaustive.


https://docs.openshift.com/container-platform/4.5/installing/installing_rhv/installing-rhv-default.html
https://docs.openshift.com/container-platform/4.5/installing/installing_rhv/installing-rhv-customizations.html
https://docs.openshift.com/container-platform/4.5/installing/installing_vsphere/installing-vsphere-installer-provisioned.html
https://docs.openshift.com/container-platform/4.5/installing/installing_vsphere/installing-vsphere-installer-provisioned-customizations.html
https://docs.openshift.com/container-platform/4.5/installing/installing_vsphere/installing-vsphere-installer-provisioned-network-customizations.html

Describe the issue: 

BZ[1] has a workaround that lets customers work out whether the assigned virtual router IDs will clash. This is documented only for bare metal, but it also affects RHV and VMware IPI installations.

Not knowing that this can happen leads to clusters failing to install for a non-obvious reason.

Suggestions for improvement: 

Add the matching docs from the bare metal install[2] so customers can run the check.

Additional information: 



[1] https://bugzilla.redhat.com/show_bug.cgi?id=1821667
[2] https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md

Comment 1 Gal Zaidman 2021-03-31 09:16:16 UTC
As far as I can see, this is documented only in the developer docs [1], not in the OCP bare metal docs.
Do you think we need to add this to our docs?
If so, does it make sense to reference the tool quay.io/openshift/origin-baremetal-runtimecfg in the official docs?

[1] https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md

Comment 2 Andrew Downs 2021-03-31 09:52:05 UTC
I had assumed that these changes would flow into the main docs, so yes, I think they should be added.

I think it is a pretty key bit of information for any customer installing with the IPI methods. A clash is difficult to debug if you don't know about it. Fundamentally, we are asking customers to work out a way to manage these IDs across all of their clusters, so we should tell them how to gather the information and how it could affect them. Arguably, the installer should present this information as well.

Comment 3 Steve Goodman 2021-05-10 12:45:11 UTC
(In reply to Andrew Downs from comment #0)
 
> BZ[1] has a workaround that lets customers work out whether the assigned
> virtual router IDs will clash. This is documented only for bare metal, but
> it also affects RHV and VMware IPI installations.
> 
> Not knowing that this can happen leads to clusters failing to install for a
> non-obvious reason.
> 
> Suggestions for improvement: 
> 
> Add the matching docs from the bare metal install[2] so customers can
> run the check.

> [2] https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md

Is this the content that you're talking about?:

----

When the Virtual IPs are managed using multicast (VRRPv2 or VRRPv3), there is a limit of 255 unique virtual routers per multicast domain. If you have pre-existing virtual routers using the standard IPv4 or IPv6 multicast groups, you can learn the virtual router IDs the installation will choose by running the following command:

$ podman run quay.io/openshift/origin-baremetal-runtimecfg:TAG vr-ids cnf10
APIVirtualRouterID: 147
DNSVirtualRouterID: 158
IngressVirtualRouterID: 2

Where TAG is the release you are going to install, e.g., 4.5. Let's see another example:

$ podman run quay.io/openshift/origin-baremetal-runtimecfg:TAG vr-ids cnf11
APIVirtualRouterID: 228
DNSVirtualRouterID: 239
IngressVirtualRouterID: 147

In the example output above you can see that installing two clusters in the same multicast domain with names cnf10 and cnf11 would lead to a conflict. You should also take care that none of those are taken by other independent VRRP virtual routers running in the same broadcast domain.

----
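
For illustration only, the comparison described in that text could be scripted roughly as follows. This is a minimal sketch, not part of the installer or of the runtimecfg image: the function name find_vrid_conflicts is made up here, and it assumes the vr-ids output keeps the "<Name>VirtualRouterID: <number>" format shown in the examples above. The sample data is copied from those two example outputs.

```shell
#!/bin/sh
# Sketch: list virtual router IDs that appear in the vr-ids output of two clusters.
# Assumes each input file holds lines of the form "<Name>VirtualRouterID: <number>".

# find_vrid_conflicts FILE1 FILE2  (hypothetical helper)
# Prints one line per clash: "<id> <field in FILE1> <field in FILE2>".
find_vrid_conflicts() {
    awk -F': *' '
        /VirtualRouterID:/ {
            id = $2 + 0                        # normalize the ID to a number
            if (FNR == NR) {
                seen[id] = $1                  # first file: remember which field uses the ID
            } else if (id in seen) {
                print id, seen[id], $1         # second file: ID already used by the first cluster
            }
        }' "$1" "$2"
}

# Demo with the two example outputs quoted above (cnf10 and cnf11).
cnf10=$(mktemp)
cnf11=$(mktemp)
cat > "$cnf10" <<'EOF'
APIVirtualRouterID: 147
DNSVirtualRouterID: 158
IngressVirtualRouterID: 2
EOF
cat > "$cnf11" <<'EOF'
APIVirtualRouterID: 228
DNSVirtualRouterID: 239
IngressVirtualRouterID: 147
EOF

find_vrid_conflicts "$cnf10" "$cnf11"
# prints: 147 APIVirtualRouterID IngressVirtualRouterID

rm -f "$cnf10" "$cnf11"
```

With real clusters, the two input files would presumably be produced by redirecting the output of the podman command above for each cluster name. The sketch only compares two clusters with each other; it does not detect IDs already taken by unrelated VRRP virtual routers on the same network, which still have to be checked separately.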

Comment 6 Andrew Downs 2021-05-11 15:57:36 UTC
(In reply to Steve Goodman from comment #3)

Yep, that is the info. Depending on where it ends up in the RHV/VMware sections, though, I think "When the Virtual IPs are managed" should not be a "When" but something more like "With IPI installation, the Virtual IPs are managed".

Comment 14 Gal Zaidman 2021-08-08 14:54:48 UTC
I commented on the PR; let's move the discussion there.

Comment 26 errata-xmlrpc 2022-03-10 16:02:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

