Is there a reason this was filed as an RHCOS BZ? Once any of the nodes are booted into the OS and containers have started, RHCOS is mostly out of the picture. > My guess is that the VIPs moves from the bootstrap to the masters before the control plane is completely ready. If this is the case, this is not something that RHCOS controls, but I think it would be handled by the installer or maybe the API server itself?
Higher priority work has prevented this issue from being solved; adding the UpcomingSprint keyword
Moving to the installer team as this does seem to be related to installation flow.
At first glance, this may be an issue with the keepalived logic. Moving to the kni team, as they maintain that.
*** Bug 1932464 has been marked as a duplicate of this bug. ***
Just checking in... We are still seeing this issue with OKD installs. I'm not sure if there is any additional information I can provide.
I did this while I was investigating. It does look like there was some work done regarding API VIP failover fairly recently: https://github.com/openshift/machine-config-operator/pull/2107
*** Bug 1963161 has been marked as a duplicate of this bug. ***
I've verified that when the fixes in the following are applied I am able to get a successful install on VMware via IPI: https://github.com/openshift/machine-config-operator/pull/2586 https://github.com/openshift/installer/pull/4972
I believe this problem was fixed by https://github.com/openshift/installer/pull/4973. Duplicating to that bug. *** This bug has been marked as a duplicate of bug 1966862 ***
(In reply to Ben Nemec from comment #13) > I believe this problem was fixed by > https://github.com/openshift/installer/pull/4973. Duplicating to that bug. > > *** This bug has been marked as a duplicate of bug 1966862 *** In my understanding, bug 1966862 is a different issue that affects only the vSphere platform. The issue reported in this bug affects all platforms and was fixed by https://github.com/openshift/machine-config-operator/pull/2586. They should be treated as separate issues. Marking them as duplicates prevents the backport of https://github.com/openshift/machine-config-operator/pull/2586 from merging in 4.7. Can we sort this out?
Shoot, you're right. This didn't get closed properly because of the other patch attached to it, not because this fix didn't merge.
Verified on 4.8.0-0.nightly-2021-06-29-033219. Successful deployment of IPI vSphere.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438