Bug 1944268 - openshift-install AWS SDK is missing endpoints for the ap-northeast-3 region
Summary: openshift-install AWS SDK is missing endpoints for the ap-northeast-3 region
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Matthew Staebler
QA Contact: Yunfei Jiang
URL:
Whiteboard:
Depends On:
Blocks: 1945467
 
Reported: 2021-03-29 15:54 UTC by Katherine Dubé
Modified: 2024-10-01 17:48 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Installer does not recognize the ap-northeast-3 AWS region. Consequence: Unable to install to the ap-northeast-3 AWS region. Fix: Installer changed to allow installs to unknown regions that fit the pattern for a known partition. Result: Installer can create infrastructure in the ap-northeast-3 AWS region.
Clone Of:
Clones: 1945467
Environment:
Last Closed: 2021-07-27 22:56:24 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 4796 | 0 | None | open | rhcos[-amd64].json: Manual add of ap-northeast-3 | 2021-03-29 15:55:42 UTC
Github | openshift installer pull 4801 | 0 | None | open | Bug 1944268: aws: allow use of unknown regions in known partitions | 2021-03-31 16:25:09 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:56:50 UTC

Description Katherine Dubé 2021-03-29 15:54:45 UTC
Version:

$ openshift-install version
./openshift-install 4.6.23
built from commit 9c86c823fff234c104f574eaf25953485edfe4b1
release image quay.io/openshift-release-dev/ocp-release@sha256:5c8ab6c4a863f9ac077bc743579d9387bb7cd311a36c3197e609f8be63d17981

Platform:
AWS

Please specify:
IPI 

What happened?

% ./openshift-install create cluster --dir katherine
FATAL failed to fetch Metadata: failed to load asset "Install Config": platform.aws.serviceEndpoints: Invalid value: []aws.ServiceEndpoint(nil): [failed to find endpoint for service "ec2": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "ec2", region: "ap-northeast-3", failed to find endpoint for service "elasticloadbalancing": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "elasticloadbalancing", region: "ap-northeast-3", failed to find endpoint for service "iam": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "iam", region: "ap-northeast-3", failed to find endpoint for service "route53": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "route53", region: "ap-northeast-3", failed to find endpoint for service "s3": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "s3", region: "ap-northeast-3", failed to find endpoint for service "sts": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "sts", region: "ap-northeast-3", failed to find endpoint for service "tagging": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "tagging", region: "ap-northeast-3"]

What did you expect to happen?

Successful installation.

How to reproduce it (as minimally and precisely as possible)?

install-config.yaml: 

platform:
  aws:
    region: ap-northeast-3
    amiID: ami-0310bc3d6eec49e56

Anything else we need to know?

We are trying to enable support for the AWS ap-northeast-3 region, but the AWS SDK version used by the installer does not know the endpoints for this region.

Comment 1 Matthew Staebler 2021-03-29 16:16:07 UTC
The ap-northeast-3 endpoint was added to the AWS SDK in v1.37.24 [1]. The 4.6 installer is using v1.32.3.

As a workaround, the user could add the endpoints manually to the install config.

[1] https://github.com/aws/aws-sdk-go/commit/6a71e1594856a4350bed5c29ba63724b66372591

Comment 2 Katherine Dubé 2021-03-29 19:58:12 UTC
Manually defining AWS service endpoints for the ap-northeast-3 region doesn't appear to work.

Excerpt from install-config.yaml:

platform:
  aws:
    amiID: ami-0310bc3d6eec49e56
    region: ap-northeast-3
    serviceEndpoints:
    - name: ec2
      url: https://ec2.ap-northeast-3.amazonaws.com
    - name: elasticloadbalancing
      url: https://elasticloadbalancing.ap-northeast-3.amazonaws.com
    - name: iam
      url: https://iam.amazonaws.com
    - name: route53
      url: https://route53.amazonaws.com
    - name: s3
      url: https://s3.ap-northeast-3.amazonaws.com
    - name: sts
      url: https://sts.ap-northeast-3.amazonaws.com
    - name: tagging
      url: https://tagging.ap-northeast-3.amazonaws.com


% ./openshift-install create cluster --dir katherine
INFO Consuming Install Config from target directory
INFO Credentials loaded from the "openshift-dev" profile in file "/Users/katherine/.aws/credentials"
WARNING Failed to find information on quotas ec2/L-0263D0A3, ec2/L-1216C47A
INFO Creating infrastructure resources...
ERROR
ERROR Error: Error creating IAM Role katherine-2kjw8-bootstrap-role: SignatureDoesNotMatch: Credential should be scoped to a valid region, not 'ap-northeast-3'.
ERROR 	status code: 403, request id: 862e5492-15b9-4fa5-926a-c31518a9984a
ERROR
ERROR   on ../../../../private/var/folders/2m/t4ltl17174s0x9v935kr59v40000gn/T/openshift-install-702836295/bootstrap/main.tf line 51, in resource "aws_iam_role" "bootstrap":
ERROR   51: resource "aws_iam_role" "bootstrap" {
ERROR
ERROR
ERROR
ERROR Error: Error creating IAM Role katherine-2kjw8-worker-role: SignatureDoesNotMatch: Credential should be scoped to a valid region, not 'ap-northeast-3'.
ERROR 	status code: 403, request id: 8b4f0f0b-8299-4297-bf9b-c5f4f180dacb
ERROR
ERROR   on ../../../../private/var/folders/2m/t4ltl17174s0x9v935kr59v40000gn/T/openshift-install-702836295/iam/main.tf line 13, in resource "aws_iam_role" "worker_role":
ERROR   13: resource "aws_iam_role" "worker_role" {
ERROR
ERROR
ERROR
ERROR Error: Error creating IAM Role katherine-2kjw8-master-role: SignatureDoesNotMatch: Credential should be scoped to a valid region, not 'ap-northeast-3'.
ERROR 	status code: 403, request id: 6fd96bfa-636a-4df9-bd25-22e0384238ab
ERROR
ERROR   on ../../../../private/var/folders/2m/t4ltl17174s0x9v935kr59v40000gn/T/openshift-install-702836295/master/main.tf line 17, in resource "aws_iam_role" "master_role":
ERROR   17: resource "aws_iam_role" "master_role" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change

Additionally, if I omit the service endpoints for IAM and Route 53 (since they're not region specific), then the installer throws an error:

% ./openshift-install create cluster --dir katherine
FATAL failed to fetch Metadata: failed to load asset "Install Config": platform.aws.serviceEndpoints: Invalid value: []aws.ServiceEndpoint{aws.ServiceEndpoint{Name:"ec2", URL:"https://ec2.ap-northeast-3.amazonaws.com"}, aws.ServiceEndpoint{Name:"elasticloadbalancing", URL:"https://elasticloadbalancing.ap-northeast-3.amazonaws.com"}, aws.ServiceEndpoint{Name:"s3", URL:"https://s3.ap-northeast-3.amazonaws.com"}, aws.ServiceEndpoint{Name:"sts", URL:"https://sts.ap-northeast-3.amazonaws.com"}, aws.ServiceEndpoint{Name:"tagging", URL:"https://tagging.ap-northeast-3.amazonaws.com"}}: [failed to find endpoint for service "iam": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "iam", region: "ap-northeast-3", failed to find endpoint for service "route53": UnknownEndpointError: could not resolve endpoint
	partition: "all partitions", service: "route53", region: "ap-northeast-3"]

Comment 3 Matthew Staebler 2021-03-29 21:31:12 UTC
The installer does not need to be so restrictive when validating the service endpoints. The installer should accept a region [1] that matches the regex for a partition, even if the SDK does not know the region.

[1] https://github.com/openshift/installer/blob/6363f3ab700e3976e8655ba0e826843593c7c98f/pkg/asset/installconfig/aws/validation.go#L255-L264
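
For illustration only, the idea of accepting an unknown region that fits a known partition can be sketched as below. This is not the installer's code (the installer is written in Go). The function names, the region patterns, and the KNOWN_REGIONS subset are assumptions made for this sketch; the patterns are modeled on the per-partition region regexes in the AWS SDK endpoint metadata and should be checked against the SDK version in use.

```python
import re

# Assumption: per-partition region patterns, modeled on the AWS SDK endpoint
# metadata. Verify against the SDK version actually in use.
PARTITION_REGION_PATTERNS = {
    "aws": re.compile(r"^(us|eu|ap|sa|ca|me|af)-\w+-\d+$"),
    "aws-cn": re.compile(r"^cn-\w+-\d+$"),
    "aws-us-gov": re.compile(r"^us-gov-\w+-\d+$"),
}

# Hypothetical stand-in for the regions an older SDK already knows about.
KNOWN_REGIONS = {"us-east-1", "ap-northeast-1", "ap-northeast-2"}


def partition_for_region(region):
    """Return the name of the first partition whose pattern matches, else None."""
    for partition, pattern in PARTITION_REGION_PATTERNS.items():
        if pattern.match(region):
            return partition
    return None


def validate_region(region):
    """Classify a region as known, unknown but accepted, or rejected."""
    if region in KNOWN_REGIONS:
        return "known"
    if partition_for_region(region) is not None:
        # The SDK has no endpoint data for this region, but its name fits a
        # known partition, so validation lets it through.
        return "unknown-but-accepted"
    return "rejected"


if __name__ == "__main__":
    for name in ("us-east-1", "ap-northeast-3", "mars-central-1"):
        print(name, validate_region(name))
```

With these assumed patterns, ap-northeast-3 falls in the "aws" partition and passes validation even though it is missing from the known-region set, while a name that fits no partition is still rejected.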

Comment 8 Yunfei Jiang 2021-04-30 09:24:40 UTC
verified. FAILED.

OCP version: 4.8.0-0.nightly-2021-04-29-222100

Only the master nodes were created; no worker machines came up:

Status:
  Conditions:
    Last Transition Time:  2021-04-30T07:55:54Z
    Message:               Failed to check if machine exists: yunjiang-ap3-g5s68-worker-ap-northeast-3c-9lpgn: failed to create scope for machine: failed to create aws client: region "ap-northeast-3" not resolved: UnknownEndpointError: could not resolve endpoint
                           partition: "all partitions", service: "ec2", region: "ap-northeast-3"
    Reason:                ErrorCheckingProvider
    Status:                Unknown
    Type:                  InstanceExists
  Last Updated:            2021-04-30T07:55:54Z
  Phase:

Comment 9 Matthew Staebler 2021-06-07 13:14:56 UTC
This BZ just addresses installer support for the region. The in-cluster operators that failed when using the new region have been fixed in separate BZs. If there are other operators that fail, we should create additional BZs for those.

Comment 10 Yunfei Jiang 2021-06-08 08:12:55 UTC
verified. PASS.
OCP version: 4.8.0-0.nightly-2021-06-08-005718

Comment 13 errata-xmlrpc 2021-07-27 22:56:24 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

