Bug 2062998 - AWS GovCloud regions are recognized as the unknown regions
Summary: AWS GovCloud regions are recognized as the unknown regions
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Patrick Dillon
QA Contact: Yunfei Jiang
URL:
Whiteboard:
Depends On:
Blocks: 2068948
 
Reported: 2022-03-11 06:28 UTC by Yunfei Jiang
Modified: 2022-08-10 10:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:53:18 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 5731 0 None open Bug 2062998: Update region check for coreos AMIs 2022-03-21 17:01:36 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 10:54:10 UTC

Description Yunfei Jiang 2022-03-11 06:28:35 UTC
When installing a cluster in an AWS GovCloud region without an AMI specified in install-config.yaml, the installer tries to copy the image from us-east-1 and fails:

level=error msg=Error: InvalidRequest: Copy image not allowed from specified region.
level=error msg=    status code: 400, request id: ed112bd9-1738-4c6f-8e99-a6c848586e53
level=error
level=error msg=  with aws_ami_copy.imported[0],
level=error msg=  on main.tf line 98, in resource "aws_ami_copy" "imported":
level=error msg=  98: resource "aws_ami_copy" "imported" {
level=error
level=fatal msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: exit status 1
level=fatal
level=fatal msg=Error: InvalidRequest: Copy image not allowed from specified region.
level=fatal msg=    status code: 400, request id: ed112bd9-1738-4c6f-8e99-a6c848586e53
level=fatal
level=fatal msg=  with aws_ami_copy.imported[0],
level=fatal msg=  on main.tf line 98, in resource "aws_ami_copy" "imported":
level=fatal msg=  98: resource "aws_ami_copy" "imported" {
level=fatal
level=fatal

It looks like the GovCloud regions were removed from the known-regions list. This was caused by https://github.com/openshift/installer/pull/5595

OCP version:
4.11.0-0.nightly-2022-03-09-235248

Steps to Reproduce:
Install an IPI cluster on AWS GovCloud without configuring an AMI in install-config.yaml.

Actual results:
Install failed.

Expected results:
The cluster installs on AWS GovCloud successfully without specifying an AMI ID.
 
Additional info:

Comment 1 Patrick Dillon 2022-03-21 14:59:38 UTC
With https://github.com/openshift/installer/pull/5595, the functionality of the knownRegions function changed from listing all regions with published AMIs to listing all PUBLIC regions where AMIs are published. This change was made to remove the secret regions from the survey. It works as intended, but it incorrectly causes us-gov-(east,west)-1 to be set to us-east-1, even though those gov regions have published AMIs: https://github.com/openshift/installer/blob/master/pkg/asset/rhcos/image.go#L89-L91
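
To illustrate the regression, here is a minimal Go sketch. It is not the installer's actual code: the names amiRegion, knownPublicRegions, and publishedAMIRegions are hypothetical, and the region lists are sample data. It only shows how a lookup against a public-only list sends a GovCloud region down the "copy from us-east-1" path even though that region has its own published AMI.

```go
package main

import "fmt"

// Hypothetical sample data: regions where AMIs are published,
// including the GovCloud regions named in this bug.
var publishedAMIRegions = map[string]bool{
	"us-east-1":     true,
	"us-west-2":     true,
	"us-gov-east-1": true,
	"us-gov-west-1": true,
}

// Hypothetical sample data: the narrower, public-only list that is
// suitable for the interactive survey.
var knownPublicRegions = map[string]bool{
	"us-east-1": true,
	"us-west-2": true,
}

// amiRegion returns the region whose AMI should be used. If the target
// region is not in the supplied list, it falls back to us-east-1, which
// implies copying the image across regions.
func amiRegion(region string, known map[string]bool) string {
	if known[region] {
		return region
	}
	return "us-east-1"
}

func main() {
	// Checking against the public-only list reproduces the bug.
	fmt.Println(amiRegion("us-gov-west-1", knownPublicRegions))
	// Checking against the full published list picks the region's own AMI.
	fmt.Println(amiRegion("us-gov-west-1", publishedAMIRegions))
}
```

With the public-only list, the GovCloud region resolves to us-east-1, which matches the "Copy image not allowed from specified region" failure in the description.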

In order to get us-gov-(east,west)-1 to use their published amis, I see two possible approaches:

1. Evaluate whether isKnownRegion in https://github.com/openshift/installer/blob/master/pkg/asset/rhcos/image.go#L89-L91 still makes sense, or could be replaced by coreos streamutils: https://github.com/coreos/stream-metadata-go/blob/main/stream/stream_utils.go. Comments for isKnownRegion should be updated to reflect new intent. Note that isKnownRegion is also used for determining whether to skip region checks in terraform & that usage should also be considered: https://github.com/openshift/installer/blob/master/pkg/tfvars/aws/aws.go#L131

2. Revise the approach for isKnownRegion to account for the two different use cases: 1) we need all public regions where AMIs are published, and 2) we need all regions where AMIs are published.

I lean toward the first approach, because I believe the second use case for approach 2 is already handled in the stream utils.
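
For reference, the second approach could look roughly like the Go sketch below. This is only an illustration under assumptions: the regionInfo type, the surveyRegions, hasPublishedAMI, and amiSourceRegion functions, and the region table are all hypothetical and are not taken from the installer or from stream-metadata-go.

```go
package main

import "fmt"

// regionInfo is a hypothetical record for a region with a published AMI.
type regionInfo struct {
	name   string
	public bool // true if the region should be offered in the survey
}

// Hypothetical sample data; every entry is assumed to have a published AMI.
var amiRegions = []regionInfo{
	{name: "us-east-1", public: true},
	{name: "us-west-2", public: true},
	{name: "us-gov-east-1", public: false},
	{name: "us-gov-west-1", public: false},
}

// surveyRegions covers use case 1: public regions where AMIs are published.
func surveyRegions() []string {
	var out []string
	for _, r := range amiRegions {
		if r.public {
			out = append(out, r.name)
		}
	}
	return out
}

// hasPublishedAMI covers use case 2: any region where an AMI is published.
func hasPublishedAMI(region string) bool {
	for _, r := range amiRegions {
		if r.name == region {
			return true
		}
	}
	return false
}

// amiSourceRegion uses the region's own AMI when one is published and
// otherwise falls back to us-east-1 as the copy source.
func amiSourceRegion(region string) string {
	if hasPublishedAMI(region) {
		return region
	}
	return "us-east-1"
}

func main() {
	fmt.Println(surveyRegions())
	fmt.Println(amiSourceRegion("us-gov-east-1"))
	fmt.Println(amiSourceRegion("xx-unknown-1"))
}
```

Keeping the two checks separate means the survey can stay limited to public regions without affecting which regions are treated as having their own AMI.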

Comment 2 Patrick Dillon 2022-03-21 16:25:33 UTC
I'm marking this as a blocker because it introduces a regression.

Comment 4 Patrick Dillon 2022-03-22 13:28:36 UTC
Although I marked this as a blocker because of the regression, there is a workaround: specify the AMI ID of the RHCOS AMI for the GovCloud region in the install config.

Comment 7 Yunfei Jiang 2022-04-06 05:59:34 UTC
verified. PASS.
OCP version: 4.11.0-0.nightly-2022-04-01-172551

Comment 9 errata-xmlrpc 2022-08-10 10:53:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

