The AWS Jakarta (ap-southeast-3) region is now open [1], but installing OCP in this region fails:

> oc logs -n openshift-machine-api machine-api-controllers-7f945cbc56-8hjvl -c machine-controller
E0318 02:00:28.875146 1 controller.go:303] yunjiang-se2-42qt8-worker-ap-southeast-3b-pt655: failed to check if machine exists: yunjiang-se2-42qt8-worker-ap-southeast-3b-pt655: failed to create scope for machine: failed to create aws client: region "ap-southeast-3" not resolved: UnknownEndpointError: could not resolve endpoint partition: "all partitions", service: "ec2", region: "ap-southeast-3"
E0318 02:00:28.884895 1 controller.go:317] controller/machine_controller "msg"="Reconciler error" "error"="yunjiang-se2-42qt8-worker-ap-southeast-3b-pt655: failed to create scope for machine: failed to create aws client: region \"ap-southeast-3\" not resolved: UnknownEndpointError: could not resolve endpoint\n\tpartition: \"all partitions\", service: \"ec2\", region: \"ap-southeast-3\"" "name"="yunjiang-se2-42qt8-worker-ap-southeast-3b-pt655" "namespace"="openshift-machine-api"

[1] https://aws.amazon.com/blogs/aws/now-open-aws-asia-pacific-jakarta-region/

Version-Release number of the following components:
4.10.5

How reproducible:
Always

Steps to Reproduce:
1. Create an IPI cluster on ap-southeast-3

Actual results:
Compute nodes cannot be created; the install fails.

Expected results:
Cluster successfully installed on ap-southeast-3

Additional info:
*** Bug 2064723 has been marked as a duplicate of this bug. ***
Hi Team,

I tried installing OCP 4.10.3 IPI on ap-southeast-3 and it succeeded.

With the default installation approach (assuming the region was already supported), the installation initially failed at worker node creation. It turned out that ap-southeast-3 support in the AWS SDK was released [1] just a day after AWS officially announced the region [2]. Since openshift-installer vendors aws-sdk, it seems the commit that adds the region was not included in the OCP 4.10 GA timeframe, so the installer has no visibility of the ap-southeast-3 region.

After learning that the AWS service endpoints can be overridden [3] and which services the installation requires [4], I successfully created the cluster using this approach:
1. Define custom AWS service endpoints for ap-southeast-3
2. Upload the RHCOS 4.10.3 AWS VMDK image to my AWS S3 bucket
3. Register RHCOS 4.10.3 as an AMI in the region (without this, MachineSet scaling fails with an error saying the amiID was not found)
4. Update install-config.yaml

I also briefly tested the following cluster functionality, all of which works:
1. Installation finished in around the normal duration (30-40 minutes)
2. Registry bucket created and S2I builds are successful (pushing/pulling to/from registry)
3. MachineSet scaling
4. AWS EBS CSI and built-in volume consumption, tested using a CrunchyData workload
5. Google login authentication (using an @redhat.com email)
6. Patch release cluster upgrade from 4.10.3 to 4.10.4 using the fast-4.10 channel

What I haven't tested yet, and plan to test, is a minor upgrade (e.g. 4.8 to 4.9) with only the OCP 4.8 AMI uploaded and specified. This should replicate the situation where a customer later wants to upgrade the cluster to 4.11.

Here is the snippet in install-config.yaml that I used:
...
platform:
  aws:
    region: ap-southeast-3
    userTags:
      adminContact: otrifirg
      costCenter: 420
      activity: poc
      customer: ptbc
    amiID: ami-0db1bdfbb0592c159
    serviceEndpoints:
    - name: ec2
      url: https://ec2.ap-southeast-3.amazonaws.com
    - name: elasticloadbalancing
      url: https://elasticloadbalancing.ap-southeast-3.amazonaws.com
    - name: s3
      url: https://s3.ap-southeast-3.amazonaws.com
    - name: autoscaling
      url: https://autoscaling.ap-southeast-3.amazonaws.com
    - name: servicequotas
      url: https://servicequotas.ap-southeast-3.amazonaws.com
    - name: sts
      url: https://sts.ap-southeast-3.amazonaws.com
    - name: kms
      url: https://kms.ap-southeast-3.amazonaws.com
...

[1] https://github.com/aws/aws-sdk-go/blob/v1.42.23/models/endpoints/endpoints.json
[2] https://aws.amazon.com/blogs/aws/now-open-aws-asia-pacific-jakarta-region/
[3] https://docs.openshift.com/container-platform/4.8/installing/installing_aws/installing-aws-account.html#nw-endpoint-route53_installing-aws-account
[4] https://docs.openshift.com/container-platform/4.8/installing/installing_aws/installing-aws-account.html#installation-aws-permissions_installing-aws-account
[5] https://access.redhat.com/documentation/en-us/openshift_container_platform/4.10/html/installing/installing-on-aws#installation-aws-user-infra-rhcos-ami_installing-aws-user-infra
[6] https://access.redhat.com/documentation/en-us/openshift_container_platform/4.10/html/installing/installing-on-aws#installation-aws-upload-custom-rhcos-ami_installing-aws-user-infra
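For anyone adapting this workaround, here is a minimal shell sketch that prints a serviceEndpoints stanza like the one in the snippet above. The function name render_service_endpoints is hypothetical, and the sketch assumes every service follows the https://<service>.<region>.amazonaws.com pattern seen in the snippet; that assumption may not hold for every AWS service or partition, so check each URL before using the output.

```shell
#!/bin/sh
# Hypothetical helper (not part of openshift-installer): print a
# serviceEndpoints stanza for install-config.yaml.
# Assumption: each endpoint follows https://<service>.<region>.amazonaws.com,
# as in the snippet above. Verify the URLs for your services before use.
render_service_endpoints() {
  region="$1"
  shift
  printf '    serviceEndpoints:\n'
  for svc in "$@"; do
    printf '    - name: %s\n' "$svc"
    printf '      url: https://%s.%s.amazonaws.com\n' "$svc" "$region"
  done
}

# Same region and services as the install-config.yaml snippet in this comment.
render_service_endpoints ap-southeast-3 \
  ec2 elasticloadbalancing s3 autoscaling servicequotas sts kms
```

The output is indented to sit under platform.aws, so it can be pasted into install-config.yaml in place of a hand-written list.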
Hi,

An update on the region. Using the same approach as before, I managed to deploy OCP 4.8.14 with RHCOS 4.8.14. I then upgraded the cluster to OCP 4.9.23 using the stable-4.9 channel; the upgrade of 3 masters and 3 workers completed after almost 2 hours. This was without first uploading and registering the RHCOS 4.9.x AMI in the ap-southeast-3 region. MachineSet scaling also still works.
Reproduced the issue on 4.10.5. Steps:
1. Create an IPI cluster on ap-southeast-3
2. Cluster install failed, check the cluster

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          65m     Unable to apply 4.10.5: some cluster operators have not yet rolled out

liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                                              PHASE   TYPE   REGION   ZONE   AGE
huliu-aws114-s46zx-master-0                                                      72m
huliu-aws114-s46zx-master-1                                                      72m
huliu-aws114-s46zx-master-2                                                      72m
huliu-aws114-s46zx-worker-ap-southeast-3a-cvhxz                                  63m
huliu-aws114-s46zx-worker-ap-southeast-3b-hzl8h                                  63m
huliu-aws114-s46zx-worker-ap-southeast-3c-tn5rn                                  63m

liuhuali@Lius-MacBook-Pro huali-test % oc get machineset
NAME                                        DESIRED   CURRENT   READY   AVAILABLE   AGE
huliu-aws114-s46zx-worker-ap-southeast-3a   1         1                             72m
huliu-aws114-s46zx-worker-ap-southeast-3b   1         1                             72m
huliu-aws114-s46zx-worker-ap-southeast-3c   1         1                             72m

liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                                              STATUS   ROLES    AGE   VERSION
ip-10-0-138-63.ap-southeast-3.compute.internal    Ready    master   69m   v1.23.3+e419edf
ip-10-0-161-100.ap-southeast-3.compute.internal   Ready    master   69m   v1.23.3+e419edf
ip-10-0-203-113.ap-southeast-3.compute.internal   Ready    master   69m   v1.23.3+e419edf

liuhuali@Lius-MacBook-Pro huali-test % oc logs machine-api-controllers-7f945cbc56-ssb26 -c machine-controller
...
E0322 03:24:51.944980 1 controller.go:303] huliu-aws114-s46zx-worker-ap-southeast-3a-cvhxz: failed to check if machine exists: huliu-aws114-s46zx-worker-ap-southeast-3a-cvhxz: failed to create scope for machine: failed to create aws client: region "ap-southeast-3" not resolved: UnknownEndpointError: could not resolve endpoint partition: "all partitions", service: "ec2", region: "ap-southeast-3"
E0322 03:24:51.952173 1 controller.go:317] controller/machine_controller "msg"="Reconciler error" "error"="huliu-aws114-s46zx-worker-ap-southeast-3a-cvhxz: failed to create scope for machine: failed to create aws client: region \"ap-southeast-3\" not resolved: UnknownEndpointError: could not resolve endpoint\n\tpartition: \"all partitions\", service: \"ec2\", region: \"ap-southeast-3\"" "name"="huliu-aws114-s46zx-worker-ap-southeast-3a-cvhxz" "namespace"="openshift-machine-api"

Verified on 4.11.0-0.nightly-2022-03-20-160505: the compute nodes were created successfully and there is no "failed to create scope for machine" error in the machine-controller log. The cluster install still failed, but that is due to https://bugzilla.redhat.com/show_bug.cgi?id=2065552, so moving this to Verified.

1. Create an IPI cluster on ap-southeast-3
2. Cluster install failed, check the cluster

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          59m     Unable to apply 4.11.0-0.nightly-2022-03-20-160505: the cluster operator image-registry has not yet successfully rolled out

liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                                              PHASE     TYPE        REGION           ZONE              AGE
huliu-aws116-ftcts-master-0                       Running   m5.xlarge   ap-southeast-3   ap-southeast-3a   60m
huliu-aws116-ftcts-master-1                       Running   m5.xlarge   ap-southeast-3   ap-southeast-3b   60m
huliu-aws116-ftcts-master-2                       Running   m5.xlarge   ap-southeast-3   ap-southeast-3c   60m
huliu-aws116-ftcts-worker-ap-southeast-3a-bf859   Running   m5.large    ap-southeast-3   ap-southeast-3a   51m
huliu-aws116-ftcts-worker-ap-southeast-3b-9zxmz   Running   m5.large    ap-southeast-3   ap-southeast-3b   51m
huliu-aws116-ftcts-worker-ap-southeast-3c-plhv9   Running   m5.large    ap-southeast-3   ap-southeast-3c   51m

liuhuali@Lius-MacBook-Pro huali-test % oc get machineset
NAME                                        DESIRED   CURRENT   READY   AVAILABLE   AGE
huliu-aws116-ftcts-worker-ap-southeast-3a   1         1         1       1           60m
huliu-aws116-ftcts-worker-ap-southeast-3b   1         1         1       1           60m
huliu-aws116-ftcts-worker-ap-southeast-3c   1         1         1       1           60m

liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                                              STATUS   ROLES    AGE   VERSION
ip-10-0-129-157.ap-southeast-3.compute.internal   Ready    worker   43m   v1.23.3+02aefbf
ip-10-0-143-131.ap-southeast-3.compute.internal   Ready    master   57m   v1.23.3+02aefbf
ip-10-0-161-81.ap-southeast-3.compute.internal    Ready    worker   43m   v1.23.3+02aefbf
ip-10-0-165-16.ap-southeast-3.compute.internal    Ready    master   57m   v1.23.3+02aefbf
ip-10-0-219-146.ap-southeast-3.compute.internal   Ready    master   57m   v1.23.3+02aefbf
ip-10-0-221-0.ap-southeast-3.compute.internal     Ready    worker   45m   v1.23.3+02aefbf

liuhuali@Lius-MacBook-Pro huali-test % oc logs machine-api-controllers-659f4bdc66-5qtvh -c machine-controller |grep "failed to create scope for machine"
liuhuali@Lius-MacBook-Pro huali-test %
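The log check used for verification above could be wrapped in a small helper so that it reports a count instead of relying on empty grep output. This is only a sketch: count_scope_errors is a hypothetical name, and it simply counts lines containing the same "failed to create scope for machine" string that was grepped for above.

```shell
#!/bin/sh
# Hypothetical helper: read machine-controller log text on stdin and print how
# many lines contain the "failed to create scope for machine" error.
count_scope_errors() {
  # grep -c prints the number of matching lines. It exits non-zero when that
  # number is 0, so "|| true" keeps the function usable under "set -e".
  grep -c 'failed to create scope for machine' || true
}
```

Usage would look like the following, where <pod> stands for the machine-api-controllers pod name; a result of 0 corresponds to the empty grep output on the 4.11 nightly:

oc logs -n openshift-machine-api <pod> -c machine-controller | count_scope_errors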
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069