Version:
$ openshift-install version
openshift-install 4.10.0-0.ci.test-2022-01-18-015330-ci-ln-c2rvwfb-latest
built from commit c4bc155f6de2494b9baca767cd74dc665e2ec468
release image registry.build01.ci.openshift.org/ci-ln-c2rvwfb/release@sha256:105a191b4183a002f36cd4421a8db27ccb1e352d20a428e3899b0da491859451
release architecture amd64

Platform: alibabacloud

Please specify:
* IPI

What happened?
IPI installation failed, due to 'resource type [[cloud_essd]] not exists in [ap-southeast-3a]', although the specified 'systemDiskCategory' is 'cloud_efficiency'.

What did you expect to happen?
The installer should use the specified 'defaultMachinePlatform' when launching any ECS instance.

How to reproduce it (as minimally and precisely as possible)?
Always.

Anything else we need to know?
$ openshift-install create install-config --dir work
? SSH Public Key /home/jiwei/.ssh/openshift-qe.pub
? Platform alibabacloud
? Region ap-southeast-3
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-204
? Pull Secret [? for help] *******
$ echo 'credentialsMode: Manual' >> work/install-config.yaml
$ vim work/install-config.yaml
$ yq e '.platform' work/install-config.yaml
alibabacloud:
  region: ap-southeast-3
  resourceGroupID: rg-aek2wky7lxk4f5y
  defaultMachinePlatform:
    instanceType: ecs.g6.xlarge
    systemDiskCategory: cloud_efficiency
    systemDiskSize: 200
$
$ openshift-install create manifests --dir work
INFO Consuming Install Config from target directory
INFO Manifests created in: work/manifests and work/openshift
$
$ openshift-install create cluster --dir work --log-level info
INFO Consuming Master Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Common Manifests from target directory
INFO Creating infrastructure resources...
ERROR
ERROR Error: [ERROR] terraform-provider-alicloud/alicloud/resource_alicloud_instance.go:452: Resource alicloud_instance RunInstances Failed!!! [SDK alibaba-cloud-sdk-go ERROR]:
ERROR SDK.ServerError
ERROR ErrorCode: InvalidResourceType.NotSupported
ERROR Recommend: https://error-center.aliyun.com/status/search?Keyword=InvalidResourceType.NotSupported&source=PopGw
ERROR RequestId: 961BAEA3-3F36-3C09-AC48-14BB985902A0
ERROR Message: user order resource type [[cloud_essd]] not exists in [ap-southeast-3a]
ERROR
ERROR   on ../../../tmp/openshift-install-bootstrap-3799552535/main.tf line 133, in resource "alicloud_instance" "bootstrap":
ERROR  133: resource "alicloud_instance" "bootstrap" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change
$
$ aliyun ecs DescribeAvailableResource --DestinationResource 'SystemDisk' --RegionId ap-southeast-3 --InstanceType 'ecs.g6.xlarge' --endpoint ecs.ap-southeast-3.aliyuncs.com --output cols=ZoneId,AvailableResources.AvailableResource[].SupportedResources.SupportedResource[] rows=AvailableZones.AvailableZone[]
ZoneId          | AvailableResources.AvailableResource[].SupportedResources.SupportedResource[]
------          | -----------------------------------------------------------------------------
ap-southeast-3a | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_ssd]]
ap-southeast-3b | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_essd] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_ssd]]
$
The region "cn-qingdao (China (Qingdao))" has a similar issue.

$ yq e '.controlPlane' work/install-config.yaml
architecture: amd64
hyperthreading: Enabled
name: master
platform:
  alibabacloud:
    systemDiskCategory: cloud_efficiency
replicas: 3
$ yq e '.compute' work/install-config.yaml
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    alibabacloud:
      systemDiskCategory: cloud_efficiency
  replicas: 3
$ yq e '.platform' work/install-config.yaml
alibabacloud:
  region: cn-qingdao
  resourceGroupID: rg-aek2wky7lxk4f5y
$
$ openshift-install create cluster --dir work --log-level info
INFO Consuming Common Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Creating infrastructure resources...
ERROR
ERROR Error: [ERROR] terraform-provider-alicloud/alicloud/resource_alicloud_instance.go:452: Resource alicloud_instance RunInstances Failed!!! [SDK alibaba-cloud-sdk-go ERROR]:
ERROR SDK.ServerError
ERROR ErrorCode: InvalidResourceType.NotSupported
ERROR Recommend: https://error-center.aliyun.com/status/search?Keyword=InvalidResourceType.NotSupported&source=PopGw
ERROR RequestId: 374A4CC9-2370-5998-899D-7C54C39A9533
ERROR Message: user order resource type [[cloud_essd]] not exists in [cn-qingdao-b]
ERROR
ERROR   on ../../../tmp/openshift-install-bootstrap-2812370522/main.tf line 133, in resource "alicloud_instance" "bootstrap":
ERROR  133: resource "alicloud_instance" "bootstrap" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change
$
$ aliyun ecs DescribeAvailableResource --DestinationResource 'SystemDisk' --RegionId cn-qingdao --InstanceType 'ecs.g6.xlarge' --endpoint ecs.cn-qingdao.aliyuncs.com --output cols=ZoneId,AvailableResources.AvailableResource[].SupportedResources.SupportedResource[] rows=AvailableZones.AvailableZone[]
ZoneId       | AvailableResources.AvailableResource[].SupportedResources.SupportedResource[]
------       | -----------------------------------------------------------------------------
cn-qingdao-b | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_ssd]]
$ aliyun ecs DescribeAvailableResource --DestinationResource 'SystemDisk' --RegionId cn-qingdao --InstanceType 'ecs.g6.large' --endpoint ecs.cn-qingdao.aliyuncs.com --output cols=ZoneId,AvailableResources.AvailableResource[].SupportedResources.SupportedResource[] rows=AvailableZones.AvailableZone[]
ZoneId       | AvailableResources.AvailableResource[].SupportedResources.SupportedResource[]
------       | -----------------------------------------------------------------------------
cn-qingdao-c | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_essd] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_ssd]]
cn-qingdao-b | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_ssd]]
$
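The per-zone listings above can be checked mechanically before an install. The following is a minimal sketch in Python, assuming the zone data has been hand-copied from the ap-southeast-3 `DescribeAvailableResource` output in this report; the names `SUPPORTED` and `zones_missing` are hypothetical and are not part of the installer or the aliyun CLI.

```python
# Supported system disk categories per zone for ecs.g6.xlarge,
# hand-copied from the ap-southeast-3 output quoted in this report.
SUPPORTED = {
    "ap-southeast-3a": {"cloud_efficiency", "cloud_ssd"},
    "ap-southeast-3b": {"cloud_essd", "cloud_efficiency", "cloud_ssd"},
}


def zones_missing(category, supported):
    """Return the sorted list of zones that do not offer `category`."""
    return sorted(zone for zone, cats in supported.items() if category not in cats)


if __name__ == "__main__":
    for category in ("cloud_essd", "cloud_efficiency"):
        missing = zones_missing(category, SUPPORTED)
        status = "available in every zone" if not missing else f"missing in {missing}"
        print(f"{category}: {status}")
```

With this data, `cloud_essd` is reported as missing in ap-southeast-3a, which matches the zone named in the RunInstances error, while `cloud_efficiency` is available in both zones.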
@husun The bootstrap VM is hard-coded to use cloud_essd. Is that intentional?
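For illustration, the behaviour the reporter expects could be modelled as below. This is a hypothetical sketch in Python, not the installer's actual Go code: the function `resolve_bootstrap_disk_category`, the helper `_get`, and the precedence order (control plane pool first, then `defaultMachinePlatform`, then a `cloud_essd` fallback) are assumptions. Only the install-config field names are taken from this report.

```python
# Assumed fallback: the category the bootstrap VM was observed to request.
DEFAULT_CATEGORY = "cloud_essd"


def _get(mapping, *keys):
    """Walk nested dicts, returning None as soon as a key is absent."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def resolve_bootstrap_disk_category(install_config):
    """Pick the system disk category for the bootstrap VM (assumed precedence)."""
    candidates = (
        # 1. Explicit setting on the control plane machine pool.
        _get(install_config, "controlPlane", "platform", "alibabacloud",
             "systemDiskCategory"),
        # 2. Cluster-wide default machine platform.
        _get(install_config, "platform", "alibabacloud",
             "defaultMachinePlatform", "systemDiskCategory"),
    )
    for category in candidates:
        if category:
            return category
    # 3. Nothing specified: fall back to the default.
    return DEFAULT_CATEGORY


if __name__ == "__main__":
    config = {
        "platform": {"alibabacloud": {
            "region": "ap-southeast-3",
            "defaultMachinePlatform": {"systemDiskCategory": "cloud_efficiency"},
        }},
    }
    print(resolve_bootstrap_disk_category(config))
```

Under these assumptions, both install-configs quoted in the report would resolve to `cloud_efficiency` for the bootstrap VM rather than `cloud_essd`.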
I am setting this as a non-blocker for now as it only affects regions that do not support cloud_essd.
The root cause has been found. sunhui is working on it and a PR will be submitted soon.
I have fixed it in PR https://github.com/openshift/installer/pull/5564
$ openshift-install create install-config --dir work
? SSH Public Key /home/fedora/.ssh/openshift-qe.pub
? Platform alibabacloud
? Region ap-southeast-3
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-408
? Pull Secret [? for help] ********
INFO Install-Config created in: work
$ vim work/install-config.yaml
$ yq e .platform work/install-config.yaml
alibabacloud:
  region: ap-southeast-3
  resourceGroupID: rg-aek2wky7lxk4f5y
  defaultMachinePlatform:
    instanceType: ecs.g6.xlarge
    systemDiskCategory: cloud_efficiency
    systemDiskSize: 200
$ yq e .metadata work/install-config.yaml
creationTimestamp: null
name: jiwei-408
$ yq e .credentialsMode work/install-config.yaml
Manual
$ openshift-install create manifests --dir work
INFO Consuming Install Config from target directory
INFO Manifests created in: work/manifests and work/openshift
$
$ openshift-install create cluster --dir work --log-level info
INFO Consuming Master Machines from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming Openshift Manifests from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s (until 11:57AM) for the Kubernetes API at https://api.jiwei-408.alicloud-qe.devcluster.openshift.com:6443...
INFO API v1.23.0+2135ac2 up
INFO Waiting up to 30m0s (until 12:11PM) for bootstrapping to complete...
INFO Destroying the bootstrap resources...
INFO Waiting up to 40m0s (until 12:31PM) for the cluster at https://api.jiwei-408.alicloud-qe.devcluster.openshift.com:6443 to initialize...
W0127 11:52:08.550078 430110 reflector.go:324] k8s.io/client-go/tools/watch/informerwatcher.go:146: failed to list *v1.ClusterVersion: Get "https://api.jiwei-408.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost
I0127 11:52:08.550251 430110 trace.go:205] Trace[1248183454]: "Reflector ListAndWatch" name:k8s.io/client-go/tools/watch/informerwatcher.go:146 (27-Jan-2022 11:51:51.019) (total time: 17530ms):
Trace[1248183454]: ---"Objects listed" error:Get "https://api.jiwei-408.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost 17530ms (11:52:08.550)
Trace[1248183454]: [17.530476537s] [17.530476537s] END
E0127 11:52:08.550279 430110 reflector.go:138] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ClusterVersion: failed to list *v1.ClusterVersion: Get "https://api.jiwei-408.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost
INFO Waiting up to 10m0s (until 12:12PM) for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/fedora/work/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.jiwei-408.alicloud-qe.devcluster.openshift.com
INFO Login to the console with user: "kubeadmin", and password: "3iUbd-G5R5G-skw2e-9LxZ9"
INFO Time elapsed: 27m5s
$
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-26-234447   True        False         2m17s   Cluster version is 4.10.0-0.nightly-2022-01-26-234447
$ oc get nodes
NAME                                           STATUS   ROLES    AGE     VERSION
jiwei-408-hv4sp-master-0                       Ready    master   21m     v1.23.0+2135ac2
jiwei-408-hv4sp-master-1                       Ready    master   19m     v1.23.0+2135ac2
jiwei-408-hv4sp-master-2                       Ready    master   19m     v1.23.0+2135ac2
jiwei-408-hv4sp-worker-ap-southeast-3a-rnmd7   Ready    worker   8m49s   v1.23.0+2135ac2
jiwei-408-hv4sp-worker-ap-southeast-3a-zhmkj   Ready    worker   8m45s   v1.23.0+2135ac2
jiwei-408-hv4sp-worker-ap-southeast-3b-8j2ws   Ready    worker   10m     v1.23.0+2135ac2
$
$ oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.nightly-2022-01-26-234447   True        False         False      2m48s
baremetal                                  4.10.0-0.nightly-2022-01-26-234447   True        False         False      18m
cloud-controller-manager                   4.10.0-0.nightly-2022-01-26-234447   True        False         False      21m
cloud-credential                           4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
cluster-autoscaler                         4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
config-operator                            4.10.0-0.nightly-2022-01-26-234447   True        False         False      19m
console                                    4.10.0-0.nightly-2022-01-26-234447   True        False         False      4m41s
csi-snapshot-controller                    4.10.0-0.nightly-2022-01-26-234447   True        False         False      18m
dns                                        4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
etcd                                       4.10.0-0.nightly-2022-01-26-234447   True        False         False      16m
image-registry                             4.10.0-0.nightly-2022-01-26-234447   True        False         False      10m
ingress                                    4.10.0-0.nightly-2022-01-26-234447   True        False         False      9m32s
insights                                   4.10.0-0.nightly-2022-01-26-234447   True        False         False      12m
kube-apiserver                             4.10.0-0.nightly-2022-01-26-234447   True        False         False      15m
kube-controller-manager                    4.10.0-0.nightly-2022-01-26-234447   True        False         False      16m
kube-scheduler                             4.10.0-0.nightly-2022-01-26-234447   True        False         False      15m
kube-storage-version-migrator              4.10.0-0.nightly-2022-01-26-234447   True        False         False      18m
machine-api                                4.10.0-0.nightly-2022-01-26-234447   True        False         False      13m
machine-approver                           4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
machine-config                             4.10.0-0.nightly-2022-01-26-234447   True        False         False      16m
marketplace                                4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
monitoring                                 4.10.0-0.nightly-2022-01-26-234447   True        False         False      7m8s
network                                    4.10.0-0.nightly-2022-01-26-234447   True        False         False      18m
node-tuning                                4.10.0-0.nightly-2022-01-26-234447   True        False         False      8m9s
openshift-apiserver                        4.10.0-0.nightly-2022-01-26-234447   True        False         False      12m
openshift-controller-manager               4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
openshift-samples                          4.10.0-0.nightly-2022-01-26-234447   True        False         False      12m
operator-lifecycle-manager                 4.10.0-0.nightly-2022-01-26-234447   True        False         False      18m
operator-lifecycle-manager-catalog         4.10.0-0.nightly-2022-01-26-234447   True        False         False      17m
operator-lifecycle-manager-packageserver   4.10.0-0.nightly-2022-01-26-234447   True        False         False      12m
service-ca                                 4.10.0-0.nightly-2022-01-26-234447   True        False         False      19m
storage                                    4.10.0-0.nightly-2022-01-26-234447   True        False         True       15m     AlibabaDiskCSIDriverOperatorCRDegraded: AlibabaCloudDriverStaticResourcesControllerDegraded: "rbac/snapshotter_role.yaml" (string): clusterroles.rbac.authorization.k8s.io "alibaba-disk-external-snapshotter-role" is forbidden: user "system:serviceaccount:openshift-cluster-csi-drivers:alibaba-disk-csi-driver-operator" (groups=["system:serviceaccounts" "system:serviceaccounts:openshift-cluster-csi-drivers" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:...
$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056