Bug 2048067 - [IPI on Alibabacloud] "Platform Provisioning Check" tells '"ap-southeast-6": enhanced NAT gateway is not supported', which seems false
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Nobody
QA Contact: Jianli Wei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-29 10:15 UTC by Jianli Wei
Modified: 2022-08-10 10:45 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:45:08 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
slb don't have backend servers (233.35 KB, image/png)
2022-03-03 05:56 UTC, Jianli Wei


Links
System                 | ID                            | Private | Priority | Status | Summary                                                    | Last Updated
------                 | --                            | ------- | -------- | ------ | -------                                                    | ------------
Github                 | openshift installer pull 5664 | 0       | None     | open   | Bug 2048067: [Alibaba] fix location service endpoint       | 2022-02-28 16:48:22 UTC
Red Hat Product Errata | RHSA-2022:5069                | 0       | None     | None   | None                                                       | 2022-08-10 10:45:36 UTC

Description Jianli Wei 2022-01-29 10:15:08 UTC
Version:
$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-29-015515
built from commit 4fc9fa88c22221b6cede2456b1c33847943b75c9
release image registry.ci.openshift.org/ocp/release@sha256:b6bded497818f2e07401988576f15c62cd6fe45c385d177b50a43d6dabaf4524
release architecture amd64

Platform: alibabacloud

Please specify:
* IPI

What happened?
Installation failed in region 'ap-southeast-6' with an error saying that enhanced NAT gateway is not supported, but the Alibaba Cloud API reports that zone 'ap-southeast-6a' does support it. Please clarify.

What did you expect to happen?
Since enhanced NAT gateway is supported in zone 'ap-southeast-6a', the installer should report the correct reason for the failure.

How to reproduce it (as minimally and precisely as possible)?
Always.

Anything else we need to know?
We got this error after https://bugzilla.redhat.com/show_bug.cgi?id=2041750 was fixed.

> The error reported by the installer (https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/72116/):
01-29 17:20:23.753  level=debug msg=  Generating Platform Provisioning Check...
01-29 17:20:26.307  level=fatal msg=failed to fetch Terraform Variables: failed to fetch dependency of "Terraform Variables": failed to generate asset "Platform Provisioning Check": platform.alibabacloud.region: Invalid value: "ap-southeast-6": enhanced NAT gateway is not supported in the current region

> The API reports that zone 'ap-southeast-6a' supports enhanced NAT gateway, as well as the default instance types (ecs.g6.xlarge and ecs.g6.large) and the system disk category (cloud_essd):
$ aliyun vpc ListEnhanhcedNatGatewayAvailableZones --RegionId ap-southeast-6 --endpoint vpc.ap-southeast-6.aliyuncs.com --output cols=ZoneId rows=Zones[]
ZoneId
------
ap-southeast-6a

$ 
$ aliyun ecs DescribeAvailableResource --DestinationResource 'InstanceType' --RegionId ap-southeast-6 --IoOptimized 'optimized' --InstanceType ecs.g6.xlarge --endpoint ecs.ap-southeast-6.aliyuncs.com | jq -r .AvailableZones.AvailableZone[].ZoneId
ap-southeast-6a
$ aliyun ecs DescribeAvailableResource --DestinationResource 'InstanceType' --RegionId ap-southeast-6 --IoOptimized 'optimized' --InstanceType ecs.g6.large --endpoint ecs.ap-southeast-6.aliyuncs.com | jq -r .AvailableZones.AvailableZone[].ZoneId
ap-southeast-6a
$ 
$ aliyun ecs DescribeAvailableResource --DestinationResource 'SystemDisk' --RegionId ap-southeast-6 --IoOptimized 'optimized' --InstanceType ecs.g6.xlarge --endpoint ecs.ap-southeast-6.aliyuncs.com --output cols=ZoneId,AvailableResources.AvailableResource[].SupportedResources.SupportedResource[] rows=AvailableZones.AvailableZone[]
ZoneId          | AvailableResources.AvailableResource[].SupportedResources.SupportedResource[]
------          | -----------------------------------------------------------------------------
ap-southeast-6a | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_essd] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency]]

$ aliyun ecs DescribeAvailableResource --DestinationResource 'SystemDisk' --RegionId ap-southeast-6 --IoOptimized 'optimized' --InstanceType ecs.g6.large --endpoint ecs.ap-southeast-6.aliyuncs.com --output cols=ZoneId,AvailableResources.AvailableResource[].SupportedResources.SupportedResource[] rows=AvailableZones.AvailableZone[]
ZoneId          | AvailableResources.AvailableResource[].SupportedResources.SupportedResource[]
------          | -----------------------------------------------------------------------------
ap-southeast-6a | [map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_essd] map[Max:500 Min:20 Status:Available Unit:GiB Value:cloud_efficiency]]

$

Comment 1 husun 2022-02-24 06:32:49 UTC
This is because the installer queries the endpoint vpc.aliyuncs.com instead of the regional endpoint vpc.ap-southeast-6.aliyuncs.com, so the query returns an empty result.
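
The endpoint difference can be checked by hand with a small shell sketch. It is not the installer's code and not necessarily what the fix does. The helper names `regional_endpoint` and `compare_nat_zones` are hypothetical, and the `<product>.<region>.aliyuncs.com` host pattern is an assumption taken from the commands in the description; it may not hold for every region. The second function needs a configured aliyun CLI and simply runs the same query from the description against both endpoints.

```shell
# Hypothetical helper: build a region-specific endpoint host name.
# Assumption: the <product>.<region>.aliyuncs.com pattern used in the
# commands of this report applies to the region being queried.
regional_endpoint() {
  product="$1"
  region="$2"
  printf '%s.%s.aliyuncs.com\n' "$product" "$region"
}

# Hypothetical helper: run the same zone query against the central endpoint
# and the regional endpoint so the two results can be compared side by side.
# Requires a configured aliyun CLI with valid credentials.
compare_nat_zones() {
  region="$1"
  for endpoint in vpc.aliyuncs.com "$(regional_endpoint vpc "$region")"; do
    echo "== endpoint: $endpoint"
    aliyun vpc ListEnhanhcedNatGatewayAvailableZones \
      --RegionId "$region" \
      --endpoint "$endpoint" \
      --output cols=ZoneId 'rows=Zones[]'
  done
}
```

For example, `compare_nat_zones ap-southeast-6` would issue the query once per endpoint; per the analysis above, the central endpoint is expected to return no zones for this region.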

Comment 2 Jianli Wei 2022-03-03 05:56:25 UTC
Created attachment 1863928
slb don't have backend servers

Tested with a build that includes PR https://github.com/openshift/installer/pull/5664; the installation still failed. Please investigate.

FYI the QE flexy-install job: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/81383/

$ aliyun vpc DescribeVpcs --RegionId "ap-southeast-6" --endpoint "vpc.ap-southeast-6.aliyuncs.com" --VpcName "jiwei-401-rzhmm-vpc" --output cols=RegionId,VpcId,VSwitchIds.VSwitchId[],NatGatewayIds.NatGatewayIds[] rows=Vpcs.Vpc[]
RegionId       | VpcId                     | VSwitchIds.VSwitchId[]                                | NatGatewayIds.NatGatewayIds[]
--------       | -----                     | ----------------------                                | -----------------------------
ap-southeast-6 | vpc-5tsu2bg57cq75of568fh9 | [vsw-5tspyd7cc3xzrlf9kznmo vsw-5tsx7mw4jk10kzesn9zn2] | [ngw-5tsy7byzlugowug02bru7]

$ aliyun ecs DescribeInstances --RegionId "ap-southeast-6" --endpoint "ecs.ap-southeast-6.aliyuncs.com" --VpcId "vpc-5tsu2bg57cq75of568fh9" --output cols=CreationTime,ZoneId,InstanceType,Status,InstanceId,InstanceName rows=Instances.Instance[]
CreationTime      | ZoneId          | InstanceType  | Status  | InstanceId             | InstanceName
------------      | ------          | ------------  | ------  | ----------             | ------------
2022-03-03T03:23Z | ap-southeast-6a | ecs.g6.large  | Running | i-5ts67v7x8edvp9gk8qr7 | jiwei-401-rzhmm-worker-ap-southeast-6a-jxwzf
2022-03-03T03:23Z | ap-southeast-6a | ecs.g6.large  | Running | i-5ts9e4l2lk2k94xp6n9t | jiwei-401-rzhmm-worker-ap-southeast-6a-9qcwn
2022-03-03T03:21Z | ap-southeast-6a | ecs.g6.large  | Running | i-5ts67v7x8edvp9gk8qr6 | jiwei-401-rzhmm-worker-ap-southeast-6a-j7g6w
2022-03-03T02:58Z | ap-southeast-6a | ecs.g6.xlarge | Running | i-5ts67v7x8edvoxmdkewf | jiwei-401-rzhmm-master-2
2022-03-03T02:58Z | ap-southeast-6a | ecs.g6.xlarge | Running | i-5ts67v7x8edvoxmdkewg | jiwei-401-rzhmm-master-1
2022-03-03T02:58Z | ap-southeast-6a | ecs.g6.xlarge | Running | i-5ts67v7x8edvoxmdkewe | jiwei-401-rzhmm-master-0

$ aliyun slb DescribeLoadBalancers --RegionId "ap-southeast-6" --endpoint "slb.ap-southeast-6.aliyuncs.com" --Tags "[{'TagKey': 'ack.aliyun.com', 'TagValue': 'jiwei-401-rzhmm'}]" --output cols=CreateTime,MasterZoneId,AddressType,Address,LoadBalancerId,LoadBalancerName rows=LoadBalancers.LoadBalancer[]
CreateTime        | MasterZoneId    | AddressType | Address       | LoadBalancerId           | LoadBalancerName
----------        | ------------    | ----------- | -------       | --------------           | ----------------
2022-03-03T11:19Z | ap-southeast-6a | internet    | 8.212.145.246 | lb-5ts6obz6bja9o0nv2abvi | a85acd04434444b54a37040b62f80fc0

$ aliyun slb DescribeLoadBalancers --RegionId "ap-southeast-6" --endpoint "slb.ap-southeast-6.aliyuncs.com" --Tags "[{'TagKey': 'kubernetes.io/cluster/jiwei-401-rzhmm', 'TagValue': 'owned'}]" --output cols=CreateTime,MasterZoneId,AddressType,Address,LoadBalancerId,LoadBalancerName rows=LoadBalancers.LoadBalancer[]
CreateTime        | MasterZoneId    | AddressType | Address       | LoadBalancerId           | LoadBalancerName
----------        | ------------    | ----------- | -------       | --------------           | ----------------
2022-03-03T10:58Z | ap-southeast-6a | intranet    | 10.0.216.35   | lb-5tsah1k3hzq31d8169zhm | jiwei-401-rzhmm-slb-internal
2022-03-03T10:58Z | ap-southeast-6a | internet    | 8.212.176.218 | lb-5tsfppc8yuwwa1d7f5in0 | jiwei-401-rzhmm-slb-external

$

Comment 3 Jianli Wei 2022-03-03 05:58:39 UTC
FYI the build was generated by Slack App "cluster-bot" (by "build openshift/installer#5664"), see https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp-modern/1499207346130259968.

Comment 8 errata-xmlrpc 2022-08-10 10:45:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

