Bug 1907786

Summary: Cannot install OpenShift successfully using Assisted Installer on 4.7 using OVN Kubernetes with IPv6
Product: OpenShift Container Platform Reporter: Ori Amizur <oamizur>
Component: Machine Config Operator Assignee: Ben Howard <behoward>
Status: CLOSED WONTFIX QA Contact: Michael Nguyen <mnguyen>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 4.7 CC: bbreard, behoward, imcleod, jligon, kgarriso, miabbott, mkrejci, nstielau, rfreiman
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-12-17 17:35:36 UTC Type: Bug
Attachments:
Description Flags
Log bundle from bootstrap none
Directory /etc/mcs/bootstrap/machine-configs in bootstrap none

Description Ori Amizur 2020-12-15 08:50:51 UTC
Created attachment 1739280 [details]
Log bundle from bootstrap

Description of problem:

When installing OpenShift with the Assisted Installer using OVNKubernetes with IPv6, the installation fails with MCO/MCS problems.
This is a regression from 4.6, where the same installation succeeds.

In the Assisted Installer flow, once all nodes other than the bootstrap have been installed, the bootstrap reboots and becomes the third master. Above 4.7 the installation is stuck as this node cannot pull ignition.

The claim is that we are using a different bootstrap path (since the bootstrap becomes a master), which leaves that node unable to pull Ignition.
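
For anyone reproducing this, a minimal sketch of the URL such a node would request is shown below. It assumes the Machine Config Server listens on port 22623 and serves per-role configs under /config/<role>. The function name and the example hosts are hypothetical. The main point is that an IPv6 literal has to be bracketed in the URL.

```python
import ipaddress


def mcs_config_url(host: str, role: str = "master", port: int = 22623) -> str:
    """Build the Ignition URL a node of `role` would fetch.

    Assumption: the MCS serves configs at https://<host>:<port>/config/<role>.
    """
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal, so treat it as a DNS name.
        is_v6 = False
    # IPv6 literals must be wrapped in brackets inside a URL.
    netloc = f"[{host}]" if is_v6 else host
    return f"https://{netloc}:{port}/config/{role}"
```

Calling mcs_config_url("fd00::1") under these assumptions yields https://[fd00::1]:22623/config/master, which can then be requested from the stuck node to see whether the server answers at all.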

Version-Release number of selected component (if applicable):

4.7

How reproducible:

Install OpenShift 4.7 with IPv6 using the Assisted Installer


Actual results:
Failed installation.


Expected results:
Successful installation.

Additional info:

Discussion on the issue: https://coreos.slack.com/archives/C999USB0D/p1607514976110900

Comment 1 Ori Amizur 2020-12-15 08:53:37 UTC
Created attachment 1739281 [details]
Directory /etc/mcs/bootstrap/machine-configs in bootstrap

Comment 2 Ori Amizur 2020-12-15 13:58:18 UTC
Additional information:

The rendered artifact differs between the bootstrap and the (later) master. Therefore, the master (that was the bootstrap) cannot join.
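
One way to check this is to compare the attached /etc/mcs/bootstrap/machine-configs directory against the machine configs exported from the running cluster. The sketch below is only illustrative: both directory arguments are placeholders, and it assumes each config is a separate file, matched by file name and compared by content hash.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def _digests(directory: str) -> dict[str, str]:
    """Map each file name in `directory` to the SHA-256 of its contents."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(directory).iterdir()
        if p.is_file()
    }


def compare_config_dirs(bootstrap_dir: str, cluster_dir: str) -> dict[str, list[str]]:
    """Report files present on one side only, or present on both with different content."""
    a, b = _digests(bootstrap_dir), _digests(cluster_dir)
    return {
        "only_in_bootstrap": sorted(a.keys() - b.keys()),
        "only_in_cluster": sorted(b.keys() - a.keys()),
        "content_differs": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }
```

A rendered config name that shows up under only_in_bootstrap would be consistent with the mismatch described above.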

Comment 3 Micah Abbott 2020-12-15 15:16:25 UTC
> Above 4.7 the installation is stuck as this node cannot pull ignition.

If this is indeed the case, the log bundle provided does not appear to contain any logs that show any Ignition failures.

We would need the full journal from a failed node showing it booting and attempting to fetch Ignition before we would be able to do triage for this issue.
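
Once such a journal has been saved to a text file (for example from journalctl -b on the failed node), a first pass could simply pull out the lines that mention Ignition. The sketch below assumes nothing about the log format beyond one message per line. The helper name is hypothetical.

```python
from typing import Iterable, Iterator


def ignition_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the lines that mention Ignition, case-insensitively."""
    for line in lines:
        if "ignition" in line.lower():
            yield line.rstrip("\n")
```

Passing an open file object for the saved journal to ignition_lines() gives the subset of messages to read first. The full journal is still needed for context around them.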

Comment 4 Colin Walters 2020-12-15 15:58:20 UTC
It looks like the gather process couldn't resolve host names and failed to gather logs.

Can you clarify: in this model, the bootstrap node is running `coreos-installer` and entirely deleting the previous contents of the machine, right?  (There's no iBIP etc.)

Comment 5 Kirsten Garrison 2020-12-15 20:58:21 UTC
I'm a bit confused as the referenced slack thread states that the assisted installer is using a flow that the MCO does not support... so, if correct, seems like whoever owns the assisted installer should be looking at this?

Comment 6 Ben Howard 2020-12-17 17:35:36 UTC
The Assisted Installer found a fix:
https://github.com/openshift/assisted-installer/pull/147/files

Comment 7 Ben Howard 2020-12-17 17:43:03 UTC
Correct, the Assisted Installer did find a fix, but the path is not supported (as stated by Kirsten).

Comment 8 Red Hat Bugzilla 2023-09-14 06:09:54 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days