Created attachment 1777133
installer-gather

Description of problem:
When installing 4.8 with the assisted installer (regardless of IPv4 or IPv6, and regardless of OpenShiftSDN or OVNKubernetes), the installation fails because the master nodes fail to connect to the API.

Version-Release number of selected component (if applicable):
registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-04-26-100514

How reproducible:
100%

Steps to Reproduce:
1. Happens in Prow

Actual results:
The installation fails because the master nodes fail to join the cluster.

Expected results:
Successful installation

Additional info:
keepalived on the bootstrap keeps creating and deleting the API and Ingress interfaces:
```
Thu Apr 29 08:05:16 2021: Interface api added
Thu Apr 29 08:05:16 2021: Interface api deleted
Thu Apr 29 08:05:16 2021: Interface ingress added
Thu Apr 29 08:05:16 2021: Interface ingress deleted
Thu Apr 29 08:06:16 2021: Interface api added
Thu Apr 29 08:06:16 2021: Interface api deleted
Thu Apr 29 08:06:16 2021: Interface ingress added
Thu Apr 29 08:06:16 2021: Interface ingress deleted
Thu Apr 29 08:07:16 2021: Interface api added
Thu Apr 29 08:07:16 2021: Interface api deleted
Thu Apr 29 08:07:16 2021: Interface ingress added
Thu Apr 29 08:07:16 2021: Interface ingress deleted
Thu Apr 29 08:08:16 2021: Interface api added
Thu Apr 29 08:08:16 2021: Interface api deleted
Thu Apr 29 08:08:16 2021: Interface ingress added
Thu Apr 29 08:08:16 2021: Interface ingress deleted
```
The kube-apiserver log on the bootstrap seems OK.
The interface changes in the keepalived logs are not something keepalived is doing; keepalived is just reporting that interfaces are appearing and disappearing. My guess is that they are related to the DHCP VIP assignment feature.

I suspect this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1957708, where the VIP shows up on multiple nodes at once, which prevents the masters from talking to the bootstrap correctly.

Unfortunately, this bug was opened before the installer captured the network details. If this is still reproducing, can you re-run with a recent release to get new logs so we can also see the networking configuration on each node? Thanks.
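For anyone triaging similar logs, here is a minimal sketch of how the add/delete churn could be summarized per interface. It assumes the log lines have exactly the format shown in the description (`<timestamp>: Interface <name> added|deleted`); the function name `summarize_flaps` and the output layout are hypothetical and not part of keepalived or the installer.

```python
import re
from datetime import datetime

# Assumed line format, taken from the excerpt in the description:
#   "Thu Apr 29 08:05:16 2021: Interface api added"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}): "
    r"Interface (?P<name>\S+) (?P<event>added|deleted)$"
)
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def summarize_flaps(lines):
    """Count 'added' and 'deleted' events per interface.

    Returns a dict mapping interface name to a dict with the number of
    'added' and 'deleted' events and the time span (in whole seconds)
    between the first and last event seen for that interface.
    Lines that do not match the assumed format are ignored.
    """
    stats = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        ts = datetime.strptime(match.group("ts"), TS_FORMAT)
        entry = stats.setdefault(
            match.group("name"),
            {"added": 0, "deleted": 0, "first": ts, "last": ts},
        )
        entry[match.group("event")] += 1
        entry["first"] = min(entry["first"], ts)
        entry["last"] = max(entry["last"], ts)

    return {
        name: {
            "added": e["added"],
            "deleted": e["deleted"],
            "span_seconds": int((e["last"] - e["first"]).total_seconds()),
        }
        for name, e in stats.items()
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize_flaps.py < keepalived.log
    for name, info in summarize_flaps(sys.stdin).items():
        print(name, info)
```

An interface whose added and deleted counts keep growing together is appearing and disappearing repeatedly. The script only summarizes what keepalived reported; it does not identify what is creating or removing the interfaces.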
@ercohen Do you still hit this bug in your CI? We recently fixed BZ [1], and this bug looks like a duplicate of it.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1957708
We hit it once. If I see it again, I'll reopen this issue.
Sorry, this issue happened multiple times in our CI. I haven't seen it lately.
OK, I'll close it. If it happens again, please reopen it. Thanks.

*** This bug has been marked as a duplicate of bug 1957708 ***