Bug 2042370 - [IPI on Alibabacloud] installer panics when the zone does not have an enhanced NAT gateway
Summary: [IPI on Alibabacloud] installer panics when the zone does not have an enhance...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: aos-install
QA Contact: Jianli Wei
URL:
Whiteboard:
Depends On: 2040143
Blocks:
Reported: 2022-01-19 10:57 UTC by Matthew Staebler
Modified: 2022-03-10 16:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2040143
Environment:
Last Closed: 2022-03-10 16:40:51 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 5575 0 None open Bug 2042370: [Alibaba] fix installer index panic 2022-01-25 07:09:43 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:41:06 UTC

Description Matthew Staebler 2022-01-19 10:57:30 UTC
+++ This bug was initially created as a clone of Bug #2040143 +++

Version:
$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-13-000150
built from commit 28cfc831cee01eb503a2340b4d5365fd281bf867
release image registry.ci.openshift.org/ocp/release@sha256:089541f3c2bb64b9561fefeed3dae688e422ad1f50a17400525c6cd0bab61f46
release architecture amd64
$ 

Platform: alibabacloud

Please specify:
* IPI

What happened?
Alibaba Cloud does not support VPC in the region "cn-nanjing" (China (Nanjing)), so there is no way to launch a cluster in that region.

What did you expect to happen?
Either remove the region from the list or provide a better error message. The current error message complains about instanceType, which is not the actual root cause.

How to reproduce it (as minimally and precisely as possible)?
Always.

Anything else we need to know?
>(1) The current error message.
$ openshift-install create install-config --dir work3
? SSH Public Key /home/fedora/.ssh/ali.pub
? Platform alibabacloud
? Region cn-nanjing
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-403
? Pull Secret [? for help] *************
FATAL failed to fetch Install Config: failed to generate asset "Install Config": [controlPlane.platform.alibabacloud.instanceType: Invalid value: "ecs.g6.xlarge": no available availability zones found, compute[0].platform.alibabacloud.instanceType: Invalid value: "ecs.g6.large": no available availability zones found]
$ 
>(2) You can see there's no VPC endpoint for the region "cn-nanjing".
$ aliyun vpc DescribeRegions | jq -c ".Regions.Region[] | select(.RegionId | contains(\"cn-nanjing\"))" | jq -r .RegionEndpoint
$ 
$ aliyun vpc DescribeRegions | jq -c ".Regions.Region[] | select(.RegionId | contains(\"cn-hangzhou\"))" | jq -r .RegionEndpoint
vpc.aliyuncs.com
$ 
$ aliyun vpc DescribeRegions | jq -c ".Regions.Region[] | select(.RegionId | contains(\"cn-zhangjiakou\"))" | jq -r .RegionEndpoint
vpc.cn-zhangjiakou.aliyuncs.com
$ 
>(3) Attempting an installation in region "cn-nanjing" by customizing install-config.yaml eventually ends in a runtime error (panic).
$ openshift-install create install-config --dir work
? SSH Public Key /home/fedora/.ssh/ali.pub
? Platform alibabacloud
? Region cn-hangzhou
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-403
? Pull Secret [? for help] ******
INFO Install-Config created in: work
>$ (edit install-config.yaml to change the region to 'cn-nanjing' and set a valid instanceType)
$ yq e '.platform' work/install-config.yaml 
alibabacloud:
  region: cn-nanjing
  defaultMachinePlatform:
    instanceType: ecs.g6e.xlarge
$ 
$ openshift-install create manifests --dir work
INFO Consuming Install Config from target directory 
INFO Manifests created in: work/manifests and work/openshift 
$ grep zoneId work -r
work/openshift/99_openshift-cluster-api_worker-machineset-0.yaml:          zoneId: cn-nanjing-a
work/openshift/99_openshift-cluster-api_master-machines-2.yaml:      zoneId: cn-nanjing-a
work/openshift/99_openshift-cluster-api_master-machines-0.yaml:      zoneId: cn-nanjing-a
work/openshift/99_openshift-cluster-api_master-machines-1.yaml:      zoneId: cn-nanjing-a
$ 
$ openshift-install create cluster --dir work --log-level info
INFO Consuming OpenShift Install (Manifests) from target directory 
INFO Consuming Openshift Manifests from target directory 
INFO Consuming Master Machines from target directory 
INFO Consuming Worker Machines from target directory 
INFO Consuming Common Manifests from target directory 
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running]:
github.com/openshift/installer/pkg/asset/cluster.(*TerraformVariables).Generate(0x25849760, 0x5)
        /go/src/github.com/openshift/installer/pkg/asset/cluster/tfvars.go:744 +0x289a
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc000fdedb0, {0x162dbdf8, 0x25849760}, {0x0, 0x0})
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:227 +0x604
github.com/openshift/installer/pkg/asset/store.(*storeImpl).Fetch(0x16451e58, {0x162dbdf8, 0x25849760}, {0x257fed00, 0x8, 0x8})
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:77 +0x48
main.runTargetCmd.func1({0x7fff94fa715e, 0x4})
        /go/src/github.com/openshift/installer/cmd/openshift-install/create.go:238 +0x116
main.runTargetCmd.func2(0x2580a460, {0xc000ffe480, 0x4, 0x4})
        /go/src/github.com/openshift/installer/cmd/openshift-install/create.go:265 +0xae
github.com/spf13/cobra.(*Command).execute(0x2580a460, {0xc000ffe440, 0x4, 0x4})
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:860 +0x5f8
github.com/spf13/cobra.(*Command).ExecuteC(0xc000fbaa00)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:902
main.installerMain()
        /go/src/github.com/openshift/installer/cmd/openshift-install/main.go:72 +0x29e
main.main()
        /go/src/github.com/openshift/installer/cmd/openshift-install/main.go:50 +0x125
$
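
The panic in the trace above is Go's standard failure when element 0 of an empty slice is read. A minimal sketch of the usual guard follows. It is not the code from the linked pull request: the function name firstNATGatewayZone, the sentinel error variable, and the zone ID "cn-hangzhou-b" are hypothetical, and only the error wording is borrowed from the message reported later in this bug.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoNATGatewayZones is a hypothetical sentinel error. Its text mirrors the
// message quoted in this bug report; the variable itself is an assumption.
var errNoNATGatewayZones = errors.New("enhanced NAT gateway is not supported in the current region")

// firstNATGatewayZone (hypothetical helper) returns the first zone in the
// list. It checks the length before indexing, so an empty list produces an
// error instead of "index out of range [0] with length 0".
func firstNATGatewayZone(zones []string) (string, error) {
	if len(zones) == 0 {
		return "", errNoNATGatewayZones
	}
	return zones[0], nil
}

func main() {
	// A nil slice stands in for a region where no zone is returned.
	if _, err := firstNATGatewayZone(nil); err != nil {
		fmt.Println("error:", err)
	}

	// "cn-hangzhou-b" is an illustrative zone ID, not taken from this report.
	zone, _ := firstNATGatewayZone([]string{"cn-hangzhou-b"})
	fmt.Println("zone:", zone)
}
```

Returning an error from the lookup lets the caller report the unsupported region through normal validation instead of crashing during Terraform variable generation.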

Comment 1 bteng 2022-01-24 10:05:29 UTC
The cn-nanjing region is not a normal region; some dependent services do not exist in cn-nanjing. There is no need to support cn-nanjing, so the installer will remove it as a region option.
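
For illustration only, the endpoint lookup from item (2) of the description can be sketched in Go. This is an assumption-laden sketch, not the installer's region filtering: the type describeRegionsResponse and the function vpcEndpoint are hypothetical names, the JSON shape is inferred solely from the jq paths used in the description (.Regions.Region[].RegionId and .RegionEndpoint), and the match is exact where the jq filter used a substring match.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describeRegionsResponse (hypothetical) models only the fields that the jq
// filters in the description read from the "aliyun vpc DescribeRegions" output.
type describeRegionsResponse struct {
	Regions struct {
		Region []struct {
			RegionID       string `json:"RegionId"`
			RegionEndpoint string `json:"RegionEndpoint"`
		} `json:"Region"`
	} `json:"Regions"`
}

// vpcEndpoint (hypothetical) returns the VPC endpoint for regionID. The bool
// is false when the region is absent from the response or has an empty
// endpoint.
func vpcEndpoint(raw []byte, regionID string) (string, bool, error) {
	var resp describeRegionsResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", false, err
	}
	for _, r := range resp.Regions.Region {
		if r.RegionID == regionID && r.RegionEndpoint != "" {
			return r.RegionEndpoint, true, nil
		}
	}
	return "", false, nil
}

func main() {
	// Sample payload built from the two endpoints shown in the description.
	raw := []byte(`{"Regions":{"Region":[
		{"RegionId":"cn-hangzhou","RegionEndpoint":"vpc.aliyuncs.com"},
		{"RegionId":"cn-zhangjiakou","RegionEndpoint":"vpc.cn-zhangjiakou.aliyuncs.com"}]}}`)

	for _, id := range []string{"cn-hangzhou", "cn-nanjing"} {
		endpoint, ok, err := vpcEndpoint(raw, id)
		fmt.Println(id, endpoint, ok, err)
	}
}
```

A region list could be filtered with such a check so that regions without a VPC endpoint are never offered as an option.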

Comment 4 Jianli Wei 2022-01-29 08:31:40 UTC
The error message 'enhanced NAT gateway is not supported in the current region' now shows up and the installer no longer panics. Marking as verified.

$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-29-015515
built from commit 4fc9fa88c22221b6cede2456b1c33847943b75c9
release image registry.ci.openshift.org/ocp/release@sha256:b6bded497818f2e07401988576f15c62cd6fe45c385d177b50a43d6dabaf4524
release architecture amd64
$ yq e .credentialsMode test1/install-config.yaml
Manual
$ yq e .metadata test1/install-config.yaml
creationTimestamp: null
name: jiwei-613
$ yq e .platform test1/install-config.yaml 
alibabacloud:
  region: cn-nanjing
  defaultMachinePlatform:
    instanceType: ecs.g6e.xlarge
$ openshift-install create manifests --dir test1
INFO Consuming Install Config from target directory 
INFO Manifests created in: test1/manifests and test1/openshift 
$ openshift-install create cluster --dir test1 --log-level info
INFO Consuming OpenShift Install (Manifests) from target directory 
INFO Consuming Master Machines from target directory 
INFO Consuming Openshift Manifests from target directory 
INFO Consuming Common Manifests from target directory 
INFO Consuming Worker Machines from target directory 
FATAL failed to fetch Terraform Variables: failed to fetch dependency of "Terraform Variables": failed to generate asset "Platform Provisioning Check": platform.alibabacloud.region: Invalid value: "cn-nanjing": enhanced NAT gateway is not supported in the current region 
$

Comment 7 errata-xmlrpc 2022-03-10 16:40:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

