Bug 2072202

Summary: AWS IPI install fails opaquely if dhcp-options-set misconfigured
Product: OpenShift Container Platform Reporter: Christoph Blecker <cblecker>
Component: Installer Assignee: sdasu
Installer sub component: openshift-installer QA Contact: Shaowen Zhang <shaozhan>
Status: CLOSED DEFERRED Docs Contact:
Severity: medium    
Priority: medium CC: bbarbach, rdossant
Version: 4.9 Keywords: ServiceDeliveryImpact
Target Milestone: ---   
Target Release: 4.13.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2023-03-09 01:16:45 UTC Type: Bug

Description Christoph Blecker 2022-04-05 19:03:51 UTC
Version:

$ openshift-install version
4.9.25

Platform:
AWS IPI

What happened?
When provisioning a cluster using pre-provisioned VPC/subnets, if the dhcp-options-set is misconfigured for the subnet, installation fails with no detailed error beyond "bootstrap failed".

For example, if you use the following dhcp-options-set that lacks DNS server configuration, installation will not complete:
aws ec2 create-dhcp-options --dhcp-configurations '[{"Key":"domain-name","Values":["ec2.internal"]}]'

What did you expect to happen?
The installer should either do pre-flight verification that basic settings like DNS work, or provide guidance as to which part of the installation failed, so that the user can diagnose and correct it.
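
A pre-flight check of the kind described here might be sketched as follows. This is only an illustration, not what the installer does: the function names are hypothetical, and it assumes that a DHCP options set without a domain-name-servers key is the misconfiguration to catch. The aws ec2 describe-dhcp-options call needs the AWS CLI and credentials.

```shell
# Hypothetical helper: succeed if the whitespace-separated list of
# DHCP option keys in $1 contains domain-name-servers.
dhcp_keys_include_dns() {
  for key in $1; do
    if [ "$key" = "domain-name-servers" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: print the option keys of the DHCP options set
# whose ID is given in $1 (for example a dopt-... ID).
fetch_dhcp_option_keys() {
  aws ec2 describe-dhcp-options \
    --dhcp-options-ids "$1" \
    --query 'DhcpOptions[0].DhcpConfigurations[].Key' \
    --output text
}

# Combine both: report a clear error up front instead of an opaque
# bootstrap failure later. Returns 2 if the lookup itself fails.
check_dhcp_dns() {
  keys="$(fetch_dhcp_option_keys "$1")" || return 2
  if ! dhcp_keys_include_dns "$keys"; then
    echo "DHCP options set $1 has no domain-name-servers entry" >&2
    return 1
  fi
  return 0
}
```

Running check_dhcp_dns with the ID of the options set created above should report the missing entry before any install is attempted.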

How to reproduce it (as minimally and precisely as possible)?

Create a VPC and subnets with default settings, but associate the faulty DHCP options set shown above with the VPC.
Pass the subnets to openshift-install, and proceed with an IPI install with default settings.

Comment 1 Brent Barbachem 2022-06-16 18:39:49 UTC
Linking https://bugzilla.redhat.com/show_bug.cgi?id=2072195 here

Comment 2 Brent Barbachem 2022-06-21 14:25:26 UTC
This should be solved via https://github.com/openshift/installer/pull/5816

Comment 7 Shaowen Zhang 2023-02-01 11:50:08 UTC
Verified on 4.13.0-0.nightly-2023-01-27-051537
Reproduced the issue on 4.9.25-x86_64.
On 4.13.0-0.nightly-2023-01-27-051537, the installer successfully provides guidance for the failed part of the installation.

Comment 8 Shaowen Zhang 2023-02-01 11:53:56 UTC
However, while verifying the bug, I noticed that the error output of the release image under verification is the same as that of the previous release image, which does not contain the fix.
Release image under verification: 4.13.0-0.nightly-2023-01-27-051537
Release image without the fix: 4.13.0-0.nightly-2023-01-24-061922

My verification steps are as follows:
1. Use the following command to create a dhcp-options-set that lacks DNS server configuration
aws ec2 create-dhcp-options --dhcp-configurations '[{"Key":"domain-name","Values":["ec2.internal"]}]'

2. Create a VPC
aws cloudformation create-stack --stack-name shaozhanteststack --region us-east-2 --template-body file://01_vpc.yaml

3. Change the dhcp-options-set associated with the VPC
Modify the dhcp-options-set of my VPC in the Amazon VPC console.

4. Create an installation configuration file
./openshift-install create install-config --dir cluster

5. Customize the installation configuration file to specify the subnets created above
vi cluster/install-config.yaml

6. Deploy the cluster
./openshift-install create cluster --dir cluster
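
Step 3 above was done in the console. For a scripted reproduction, a CLI equivalent might look like the sketch below. The wrapper function and its ID checks are hypothetical additions, and the IDs passed to it are placeholders to be replaced with the values from steps 1 and 2. The underlying call is aws ec2 associate-dhcp-options.

```shell
# Hypothetical wrapper: associate a DHCP options set ($1) with a VPC ($2).
# Rejects obviously malformed IDs before calling the AWS CLI.
associate_dhcp_options() {
  case "$1" in
    dopt-?*|default) ;;
    *) echo "expected a dopt-... ID or 'default', got '$1'" >&2; return 2 ;;
  esac
  case "$2" in
    vpc-?*) ;;
    *) echo "expected a vpc-... ID, got '$2'" >&2; return 2 ;;
  esac
  # Requires the AWS CLI and credentials for the target account.
  aws ec2 associate-dhcp-options --dhcp-options-id "$1" --vpc-id "$2"
}
```

It would be called with the options set ID returned by step 1 and the ID of the VPC created in step 2.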

Comment 9 Shaowen Zhang 2023-02-02 13:39:48 UTC
A dhcp-options-set that lacks DNS server configuration causes the bootstrap instance to fail to fetch its ignition file, so it looks like PR 6611 is not a solution for Bug 2072202.

Comment 10 Shiftzilla 2023-03-09 01:16:45 UTC
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-9207