Bug 1997059 - Failed to create cluster in AWS us-east-1 region due to a local zone is used
Summary: Failed to create cluster in AWS us-east-1 region due to a local zone is used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: aos-install
QA Contact: Yunfei Jiang
URL:
Whiteboard:
Depends On: 1981941
Blocks: 2052307
 
Reported: 2021-08-24 10:42 UTC by Yunfei Jiang
Modified: 2022-03-12 04:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The installer considered a local zone to be a valid zone to install into.
Consequence: Installation failed in any region that had a local zone enabled.
Fix: The installer now considers only availability zones and ignores local zones.
Result: Installation to a region with local zones enabled uses only the availability zones in that region, never the local zones.
Clone Of:
Environment:
Last Closed: 2022-03-12 04:37:30 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 5392 0 None open Bug 1997059: aws: Filter out local zones when generating a default list of zones 2021-11-17 02:00:29 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-12 04:37:52 UTC

Description Yunfei Jiang 2021-08-24 10:42:16 UTC
Tried to create an IPI cluster in the us-east-1 region, but one of the masters was created in a local zone:
yunjiang-eeaa1-5nr72-master-0 us-east-1-mia-1a
yunjiang-eeaa1-5nr72-master-1 us-east-1a
yunjiang-eeaa1-5nr72-master-2 us-east-1b

This causes the error `Error creating network Load Balancer: ValidationError: You cannot have any Local Zone subnets for load balancers of type 'network'`

> log
<--snip-->
level=error msg=Error: Error creating network Load Balancer: ValidationError: You cannot have any Local Zone subnets for load balancers of type 'network'
level=error msg=    status code: 400, request id: 5acf0aee-89da-4258-9331-09fd13b3676c
level=error
level=error msg=  on ../../../../../tmp/openshift-install-cluster-069882361/vpc/master-elb.tf line 1, in resource "aws_lb" "api_internal":
level=error msg=   1: resource "aws_lb" "api_internal" {
level=error
level=error
level=error
level=error msg=Error: Error creating network Load Balancer: ValidationError: You cannot have any Local Zone subnets for load balancers of type 'network'
level=error msg=    status code: 400, request id: e2b39384-4a1c-48ae-9108-c4ec180e0dda
level=error
level=error msg=  on ../../../../../tmp/openshift-install-cluster-069882361/vpc/master-elb.tf line 22, in resource "aws_lb" "api_external":
level=error msg=  22: resource "aws_lb" "api_external" {
level=error
level=error
level=error
level=error msg=Error: Error creating NAT Gateway: NotAvailableInZone: Nat Gateway is not available in this availability zone
level=error msg=    status code: 400, request id: 9ecd069f-b4e5-43f1-ba27-adc26f21bc13
level=error
level=error msg=  on ../../../../../tmp/openshift-install-cluster-069882361/vpc/vpc-public.tf line 85, in resource "aws_nat_gateway" "nat_gw":
level=error msg=  85: resource "aws_nat_gateway" "nat_gw" {
level=error
level=error
level=fatal msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change


There is also a warning, `no instance type found for the zone constraint`; it is not clear whether the local zone issue causes it:
> log
<--snip-->
level=warning msg=failed to find default instance type: no instance type found for the zone constraint
level=warning msg=failed to find default instance type: no instance type found for the zone constraint


How reproducible:
2/2

Version:
4.9.0-fc.0

Platform:
AWS

How to reproduce it (as minimally and precisely as possible)?
Create an IPI cluster in the AWS us-east-1 region.

Comment 1 Jeremiah Stuever 2021-08-31 20:41:03 UTC
These local zones are not enabled by default; a customer must opt in to enable them. As such, this shouldn't block current releases. We still need to fix this to enable those customers who have opted in to local zones. They can currently work around this by specifying the normal zones in the install-config.

Comment 4 Caleb Boylan 2021-10-25 18:30:10 UTC
This is blocked currently as our version of aws-sdk-go (v1.32.3) doesn't provide the ZoneType field on AvailabilityZones which is needed to filter out the local zone subnets. Once we upgrade our version of terraform ( https://bugzilla.redhat.com/show_bug.cgi?id=1981941 ) we should be on a new enough version of aws-sdk-go to fix this.
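
For illustration only, the filtering described above could look roughly like the following sketch. It is not the installer's actual code: the `zone` struct and `filterAvailabilityZones` are hypothetical stand-ins for the SDK's availability zone record and whatever helper the fix introduces. The `ZoneType` values `availability-zone` and `local-zone` are the ones EC2 reports for zones, and the zone names are taken from the description of this bug.

```go
package main

import "fmt"

// zone is a hypothetical stand-in for the SDK's availability zone record.
// Only the two fields needed for filtering are modeled here.
type zone struct {
	Name     string
	ZoneType string
}

// zoneTypeAvailabilityZone is the ZoneType value EC2 reports for
// regular availability zones (local zones report "local-zone").
const zoneTypeAvailabilityZone = "availability-zone"

// filterAvailabilityZones returns the names of the zones whose type is
// "availability-zone", dropping local zones and any other zone type.
func filterAvailabilityZones(zones []zone) []string {
	names := make([]string, 0, len(zones))
	for _, z := range zones {
		if z.ZoneType == zoneTypeAvailabilityZone {
			names = append(names, z.Name)
		}
	}
	return names
}

func main() {
	// Zones as seen in an account that has opted in to the Miami local zone.
	zones := []zone{
		{Name: "us-east-1-mia-1a", ZoneType: "local-zone"},
		{Name: "us-east-1a", ZoneType: "availability-zone"},
		{Name: "us-east-1b", ZoneType: "availability-zone"},
	}
	fmt.Println(filterAvailabilityZones(zones)) // prints [us-east-1a us-east-1b]
}
```

With a default zone list built this way, us-east-1-mia-1a would never be chosen for a master, so the network load balancer and NAT gateway errors from the description would not be triggered.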

Comment 7 Yunfei Jiang 2021-12-15 08:20:08 UTC
verified. PASS.
OCP version: 4.10.0-0.nightly-2021-12-14-083101

Comment 12 errata-xmlrpc 2022-03-12 04:37:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

