Bug 2064723 - Installation fails in ap-southeast-3 region
Summary: Installation fails in ap-southeast-3 region
Keywords:
Status: CLOSED DUPLICATE of bug 2065510
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Joel Speed
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-16 12:30 UTC by Kirk Bater
Modified: 2022-03-18 14:25 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-18 14:25:19 UTC
Target Upstream Version:
Embargoed:


Attachments
AP-Southeast-3 Must Gather (9.94 MB, application/gzip)
2022-03-16 12:30 UTC, Kirk Bater


Links
Red Hat Issue Tracker OSD-10164 (last updated 2022-03-16 12:30:10 UTC)

Description Kirk Bater 2022-03-16 12:30:10 UTC
Created attachment 1866188 [details]
AP-Southeast-3 Must Gather

Version:

$ openshift-install version
v4.9.0

Platform:
AWS/OSD

Please specify:
* IPI

What happened?

When attempting to create a cluster in the new ap-southeast-3 region, the installer gets past the bootstrap phase but then fails to create worker nodes.

A must-gather is attached; I forgot to capture the install logs specifically.

When I was digging into the cluster, I saw the following errors in the machine-api-operator machine-controller pod:

I0315 17:09:29.768070       1 controller.go:174] kirk-aps3-nqj8z-worker-ap-southeast-3a-lhfpv: reconciling Machine
I0315 17:09:29.768081       1 actuator.go:104] kirk-aps3-nqj8z-worker-ap-southeast-3a-lhfpv: actuator checking if machine exists
E0315 17:09:29.768650       1 controller.go:281] kirk-aps3-nqj8z-worker-ap-southeast-3a-lhfpv: failed to check if machine exists: kirk-aps3-nqj8z-worker-ap-southeast-3a-lhfpv: failed to create scope for machine: failed to create aws client: region "ap-southeast-3" not resolved: UnknownEndpointError: could not resolve endpoint
        partition: "all partitions", service: "ec2", region: "ap-southeast-3"
E0315 17:09:29.776716       1 controller.go:304] controller-runtime/manager/controller/machine_controller "msg"="Reconciler error" "error"="kirk-aps3-nqj8z-worker-ap-southeast-3a-lhfpv: failed to create scope for machine: failed to create aws client: region \"ap-southeast-3\" not resolved: UnknownEndpointError: could not resolve endpoint\n\tpartition: \"all partitions\", service: \"ec2\", region: \"ap-southeast-3\"" "name"="kirk-aps3-nqj8z-worker-ap-southeast-3a-lhfpv" "namespace"="openshift-machine-api"
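
For context, this "could not resolve endpoint" failure is characteristic of the AWS SDK for Go v1 resolver when the endpoints table vendored into the binary predates the region. Below is a minimal sketch of that resolution path, assuming the aws-sdk-go v1 endpoints package with strict matching (which the error text suggests); it is an illustration, not the machine controller's actual code:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws/endpoints"
)

func main() {
	// With strict matching, only regions present in the SDK's vendored
	// endpoints table resolve; a table that predates ap-southeast-3
	// returns an UnknownEndpointError like the one logged above.
	resolver := endpoints.DefaultResolver()
	_, err := resolver.EndpointFor(
		endpoints.Ec2ServiceID,
		"ap-southeast-3",
		endpoints.StrictMatchingOption,
	)
	if err != nil {
		fmt.Println(err)
	}
}

If that is the cause, the usual remedy is rebuilding the component against an SDK release whose endpoints table includes the new region.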

What did you expect to happen?

Worker nodes to be spun up and the installation to finish.

How to reproduce it (as minimally and precisely as possible)?

Create a cluster in OCM with the following JSON, then wait for the install to fail:

$ cat << EOF > ocm.json
{
    "name": "[CLUSTER_NAME]",
    "ccs": {
        "enabled": true
    },
    "region": {
        "id": "ap-southeast-3"
    },
    "version": {
    "id": "openshift-v4.9.21",
    "channel_group": "stable"
     },
    "aws": {
        "account_id": "[ACCOUNT ID]",
        "access_key_id": "[ACCESS_KEY_ID]",
        "secret_access_key": "[SECRET_ACCESS_KEY]"
    }
}
EOF
$ ocm post /api/clusters_mgmt/v1/clusters --body ocm.json
...
wait for install to fail
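
To confirm that a given vendored SDK simply does not know the region (rather than the cluster being misconfigured), one quick check is to list the regions in the SDK's aws partition. This is a sketch assuming the same aws-sdk-go v1 endpoints package:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws/endpoints"
)

func main() {
	// ap-southeast-3 is absent from the vendored aws partition in SDK
	// releases that predate the region's launch.
	regions := endpoints.AwsPartition().Regions()
	_, known := regions["ap-southeast-3"]
	fmt.Println("ap-southeast-3 known to SDK:", known)
	for id := range regions {
		fmt.Println(id)
	}
}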


Anything else we need to know?

Comment 1 Matthew Staebler 2022-03-16 13:04:51 UTC
See https://issues.redhat.com/browse/OCPCLOUD-1159.

Comment 2 dmoiseev 2022-03-18 14:25:19 UTC
The same bug was already reported: https://bugzilla.redhat.com/show_bug.cgi?id=2065510

*** This bug has been marked as a duplicate of bug 2065510 ***

