Bug 2062998

Summary: AWS GovCloud regions are recognized as the unknown regions
Product: OpenShift Container Platform Reporter: Yunfei Jiang <yunjiang>
Component: Installer Assignee: Patrick Dillon <padillon>
Installer sub component: openshift-installer QA Contact: Yunfei Jiang <yunjiang>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: padillon
Version: 4.11 Keywords: Regression
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-10 10:53:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 2068948    

Description Yunfei Jiang 2022-03-11 06:28:35 UTC
While trying to install a cluster in an AWS GovCloud region without an AMI specified in install-config.yaml, the installer tries to copy the image from us-east-1 and fails:

level=error msg=Error: InvalidRequest: Copy image not allowed from specified region.
level=error msg=    status code: 400, request id: ed112bd9-1738-4c6f-8e99-a6c848586e53
level=error msg=  with aws_ami_copy.imported[0],
level=error msg=  on main.tf line 98, in resource "aws_ami_copy" "imported":
level=error msg=  98: resource "aws_ami_copy" "imported" {
level=fatal msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: exit status 1
level=fatal msg=Error: InvalidRequest: Copy image not allowed from specified region.
level=fatal msg=    status code: 400, request id: ed112bd9-1738-4c6f-8e99-a6c848586e53
level=fatal msg=  with aws_ami_copy.imported[0],
level=fatal msg=  on main.tf line 98, in resource "aws_ami_copy" "imported":
level=fatal msg=  98: resource "aws_ami_copy" "imported" {

It looks like the GovCloud regions were removed from the known-regions list; this was caused by https://github.com/openshift/installer/pull/5595

OCP version:

Steps to Reproduce:
Install an IPI cluster on AWS GovCloud, and do not configure AMI in install-config.yaml.

Actual results:
Install failed.

Expected results:
The cluster installs successfully on AWS GovCloud without specifying an AMI ID.
Additional info:

Comment 1 Patrick Dillon 2022-03-21 14:59:38 UTC
With https://github.com/openshift/installer/pull/5595, the knownRegions function changed from listing all regions with published AMIs to listing all PUBLIC regions where AMIs are published. This change was made to remove the secret regions from the survey. It works as intended for the survey, but it incorrectly causes us-gov-(east,west)-1 to be set to us-east-1, even though those gov regions have published AMIs: https://github.com/openshift/installer/blob/master/pkg/asset/rhcos/image.go#L89-L91
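
To make the failure mode concrete, here is a minimal sketch of the fallback described above. It is not the installer's code: the names publishedAMIs, knownPublicRegions, and resolveAMI, and the AMI IDs, are hypothetical placeholders. It only assumes what this comment states, namely that a region missing from the known list is treated as unknown and mapped to us-east-1.

```go
package main

import "fmt"

// publishedAMIs is a hypothetical stand-in for the published RHCOS AMI
// metadata (region -> AMI ID). The IDs are placeholders, not real AMIs.
var publishedAMIs = map[string]string{
	"us-east-1":     "ami-placeholder-us-east-1",
	"us-gov-east-1": "ami-placeholder-us-gov-east-1",
	"us-gov-west-1": "ami-placeholder-us-gov-west-1",
}

// knownPublicRegions mimics the narrowed list: public regions only,
// so the GovCloud regions are absent even though they have AMIs.
var knownPublicRegions = map[string]bool{
	"us-east-1": true,
}

// resolveAMI returns the AMI to use and the region it is taken from.
// A region that is not in the known list falls back to us-east-1,
// which then requires a cross-region image copy.
func resolveAMI(region string) (ami, sourceRegion string) {
	if !knownPublicRegions[region] {
		return publishedAMIs["us-east-1"], "us-east-1"
	}
	return publishedAMIs[region], region
}

func main() {
	ami, src := resolveAMI("us-gov-west-1")
	// The GovCloud region has its own published AMI, but the lookup
	// still resolves to the us-east-1 image.
	fmt.Printf("target=us-gov-west-1 ami=%s source=%s\n", ami, src)
}
```

In this sketch the GovCloud request resolves to the us-east-1 image, which corresponds to the copy that fails with "Copy image not allowed from specified region" in the description.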

To get us-gov-(east,west)-1 to use their published AMIs, I see two possible approaches:

1. Evaluate whether isKnownRegion in https://github.com/openshift/installer/blob/master/pkg/asset/rhcos/image.go#L89-L91 still makes sense, or could be replaced by coreos streamutils: https://github.com/coreos/stream-metadata-go/blob/main/stream/stream_utils.go. Comments for isKnownRegion should be updated to reflect the new intent. Note that isKnownRegion is also used for determining whether to skip region checks in Terraform, and that usage should also be considered: https://github.com/openshift/installer/blob/master/pkg/tfvars/aws/aws.go#L131

2. Revise the approach for isKnownRegion to account for the two different use cases: 1) we need all public regions where AMIs are published; 2) we need all regions where AMIs are published.

I lean toward the first approach, because I believe the second use case for approach 2 is already handled in the stream utils.
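
For illustration only, approach 2 could look roughly like the sketch below, with one predicate per use case. All names here (regionsWithAMIs, isSurveyRegion, hasPublishedAMI, amiSourceRegion) are hypothetical and do not reflect the installer's or stream-metadata-go's actual API. The public/non-public flag on each region is an assumption made for the example.

```go
package main

import "fmt"

// regionsWithAMIs lists hypothetical regions where AMIs are published.
// The value records whether the region should be offered in the survey
// (assumed here to mean "public"); this flag is illustrative.
var regionsWithAMIs = map[string]bool{
	"us-east-1":     true,
	"us-west-2":     true,
	"us-gov-east-1": false,
	"us-gov-west-1": false,
}

// isSurveyRegion covers use case 1: public regions with published AMIs.
func isSurveyRegion(region string) bool {
	return regionsWithAMIs[region]
}

// hasPublishedAMI covers use case 2: any region with a published AMI,
// public or not.
func hasPublishedAMI(region string) bool {
	_, ok := regionsWithAMIs[region]
	return ok
}

// amiSourceRegion picks the region to take the AMI from. Only regions
// without a published AMI fall back to us-east-1.
func amiSourceRegion(region string) string {
	if hasPublishedAMI(region) {
		return region
	}
	return "us-east-1"
}

func main() {
	for _, r := range []string{"us-east-1", "us-gov-west-1", "example-region-1"} {
		fmt.Printf("%s: survey=%t source=%s\n", r, isSurveyRegion(r), amiSourceRegion(r))
	}
}
```

Keeping the two checks separate means the survey list can stay restricted without affecting which AMI a GovCloud install uses.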

Comment 2 Patrick Dillon 2022-03-21 16:25:33 UTC
I'm marking this as a blocker because it introduces a regression.

Comment 4 Patrick Dillon 2022-03-22 13:28:36 UTC
Although I marked this as a blocker (which it is, because of the regression), there is a workaround: specify the AMI ID of the RHCOS AMI for the GovCloud region in the install config.

Comment 7 Yunfei Jiang 2022-04-06 05:59:34 UTC
Verified. PASS.
OCP version: 4.11.0-0.nightly-2022-04-01-172551

Comment 9 errata-xmlrpc 2022-08-10 10:53:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.