Bug 2042770 - [IPI on Alibabacloud] with vpcID & vswitchIDs specified, the installer would still try creating NAT gateway unexpectedly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: aos-install
QA Contact: Jianli Wei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-20 05:44 UTC by Jianli Wei
Modified: 2022-03-10 16:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:40:58 UTC
Target Upstream Version:
Embargoed:




Links
System                 | ID                            | Private | Priority | Status | Summary                                                           | Last Updated
------                 | --                            | ------- | -------- | ------ | -------                                                           | ------------
Github                 | openshift installer pull 5574 | 0       | None     | open   | Bug 2042770: [Alibaba] fix resource creation for existing network | 2022-01-25 10:13:13 UTC
Red Hat Product Errata | RHSA-2022:0056                | 0       | None     | None   | None                                                              | 2022-03-10 16:41:09 UTC

Description Jianli Wei 2022-01-20 05:44:13 UTC
Version:
$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-19-150530
built from commit 08d043c78dc3feb74b3593645550b3a55aa35bff
release image registry.ci.openshift.org/ocp/release@sha256:ed7dd03bcbc6fc023c140969acb511ce2adf2aa0ad92a1337664a8313e60e697
release architecture amd64

Platform: alibabacloud

Please specify:
* IPI

What happened?
With vpcID and vswitchIDs specified in install-config.yaml, the installer still tries to create a NAT gateway, an EIP, and a vswitch for the NAT gateway, which is unexpected.

What did you expect to happen?
In this case, the installer should use the existing NAT gateway of the VPC. 

How to reproduce it (as minimally and precisely as possible)?
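Always reproducible. Specify vpcID and vswitchIDs of an existing VPC in install-config.yaml, then run "openshift-install create cluster" (full session below).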

Anything else we need to know?
$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-19-150530
built from commit 08d043c78dc3feb74b3593645550b3a55aa35bff
release image registry.ci.openshift.org/ocp/release@sha256:ed7dd03bcbc6fc023c140969acb511ce2adf2aa0ad92a1337664a8313e60e697
release architecture amd64
$ openshift-install create install-config --dir work
? SSH Public Key /home/fedora/.ssh/openshift-qe.pub
? Platform alibabacloud
? Region us-east-1
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-402
? Pull Secret [? for help] ******
INFO Install-Config created in: work
$ vim work/install-config.yaml
$ yq e '.platform' work/install-config.yaml 
alibabacloud:
  region: us-east-1
  resourceGroupID: rg-aek2c4huej7f3ni
  vpcID: vpc-0xiomn2irqnxk2j8y3sf1
  vswitchIDs:
    - vsw-0xifjr0tccm34obxzdh3x
    - vsw-0xicfnx2sryqxxcux0art
$ openshift-install create cluster --dir work --log-level info
INFO Consuming Install Config from target directory
INFO Creating infrastructure resources...
ERROR
ERROR Error: [ERROR] terraform-provider-alicloud/alicloud/resource_alicloud_vswitch.go:127: Resource alicloud_vswitch CreateVSwitch Failed!!! [SDK alibaba-cloud-sdk-go ERROR]:
ERROR SDKError:
ERROR    Code: InvalidCidrBlock.Overlapped
ERROR    Message: code: 400, Specified CIDR block overlapped with other subnets. request id: 8C9AC716-6017-5FCB-8ECB-95EA96261400
ERROR    Data: {"Code":"InvalidCidrBlock.Overlapped","HostId":"vpc.aliyuncs.com","Message":"Specified CIDR block overlapped with other subnets.","Recommend":"https://error-center.aliyun.com/status/search?Keyword=InvalidCidrBlock.Overlapped\u0026source=PopGw","RequestId":"8C9AC716-6017-5FCB-8ECB-95EA96261400"}
ERROR
ERROR
ERROR   on ../../tmp/openshift-install-cluster-2599886064/vpc/vpc.tf line 45, in resource "alicloud_vswitch" "vswitch_nat_gateway":
ERROR   45: resource "alicloud_vswitch" "vswitch_nat_gateway" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change
$

$ aliyun vpc DescribeVpcs --RegionId us-east-1 --VpcId vpc-0xiomn2irqnxk2j8y3sf1 --endpoint vpc.aliyuncs.com --output cols=VpcName,NatGatewayIds.NatGatewayIds[],VSwitchIds.VSwitchId[] rows=Vpcs.Vpc[]
VpcName       | NatGatewayIds.NatGatewayIds[]                         | VSwitchIds.VSwitchId[]
-------       | -----------------------------                         | ----------------------
jiwei-401-vpc | [ngw-0xibzv7asxeuki1fkhhzj ngw-0xipspeaibwc4qdnuxzpb] | [vsw-0xijuix4xe6a5v3u8vtdm vsw-0xicfnx2sryqxxcux0art vsw-0xifjr0tccm34obxzdh3x]

$
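
The InvalidCidrBlock.Overlapped error above means the CIDR block requested for the new vswitch_nat_gateway collides with a vswitch that already exists in the VPC. The following is a minimal sketch of that overlap check, using only Python's standard ipaddress module. The function name and the CIDR values are illustrative assumptions; they are not taken from the failing VPC or from the installer's code.

```python
import ipaddress


def overlapping_cidrs(candidate, existing):
    """Return the existing CIDR blocks that overlap the candidate block."""
    cand = ipaddress.ip_network(candidate)
    return [c for c in existing if cand.overlaps(ipaddress.ip_network(c))]


# Illustrative values only: CIDR blocks of vswitches assumed to exist already.
existing_vswitch_cidrs = ["10.0.224.0/20", "10.0.240.0/20"]

# A block that is already in use is reported as overlapping.
print(overlapping_cidrs("10.0.240.0/20", existing_vswitch_cidrs))

# A block outside the existing ranges yields an empty list.
print(overlapping_cidrs("10.0.0.0/20", existing_vswitch_cidrs))
```

A non-empty result corresponds to the condition the VPC API rejects with InvalidCidrBlock.Overlapped.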

Comment 1 bteng 2022-01-24 10:02:32 UTC
Root cause has been found. Sun Hui will fix this soon.

Comment 5 Jianli Wei 2022-01-29 03:04:28 UTC
Verified with 4.10.0-0.nightly-2022-01-28-213019; see the QE flexy-install job https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/72020/

$ aliyun vpc DescribeVpcs --RegionId us-east-1 --VpcName jiwei-602-vpc --endpoint vpc.aliyuncs.com --output cols=CreationTime,VpcId,CidrBlock rows=Vpcs.Vpc[]
CreationTime         | VpcId                     | CidrBlock
------------         | -----                     | ---------
2022-01-29T02:11:47Z | vpc-0xisqnb8p43932tvhc1z8 | 10.0.0.0/16

$ aliyun vpc DescribeVSwitches --RegionId us-east-1 --VpcId vpc-0xisqnb8p43932tvhc1z8 --endpoint vpc.aliyuncs.com --output cols=Status,VSwitchName,VSwitchId,ZoneId rows=VSwitches.VSwitch[]
Status    | VSwitchName                  | VSwitchId                 | ZoneId
------    | -----------                  | ---------                 | ------
Available | jiwei-602-vswitch-us-east-1b | vsw-0xigy12jsaglfva8eb2ez | us-east-1b
Available | jiwei-602-vswitch-us-east-1a | vsw-0xiekixvoy0qwp91zh9sc | us-east-1a
Available | jiwei-602-vswitch-natgw      | vsw-0xim5ziaxoer065iv8myr | us-east-1a

$ aliyun vpc DescribeNatGateways --RegionId us-east-1 --VpcId vpc-0xisqnb8p43932tvhc1z8 --endpoint vpc.aliyuncs.com --output cols=NatGatewayId,NetworkType,IpLists.IpList[].IpAddress,SnatTableIds.SnatTableId rows=NatGateways.NatGateway[]
NatGatewayId              | NetworkType | IpLists.IpList[].IpAddress | SnatTableIds.SnatTableId
------------              | ----------- | -------------------------- | ------------------------
ngw-0xinjt5g5iimlw2xifz3l | internet    | [47.253.214.247]           | [stb-0xi5bg3d0bc8jh7ld79vz]

$ aliyun vpc DescribeSnatTableEntries --RegionId us-east-1 --SnatTableId stb-0xi5bg3d0bc8jh7ld79vz --endpoint vpc.aliyuncs.com --output cols=SnatEntryId,Status,SnatIp,SourceCIDR,SourceVSwitchId rows=SnatTableEntries.SnatTableEntry[]
SnatEntryId                | Status    | SnatIp         | SourceCIDR    | SourceVSwitchId
-----------                | ------    | ------         | ----------    | ---------------
snat-0xiw2wwcnbsj77vf3xc3a | Available | 47.253.214.247 | 10.0.224.0/20 | vsw-0xigy12jsaglfva8eb2ez
snat-0xi9hdfdvt0tv8f6fag4v | Available | 47.253.214.247 | 10.0.240.0/20 | vsw-0xiekixvoy0qwp91zh9sc

$
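
The manual check above (listing the vswitches, then the SNAT entries, and comparing IDs by eye) could be scripted. The following is a rough sketch, assuming the pipe-delimited table layout that the aliyun CLI printed above; the helper names are hypothetical and the sample text is copied from the DescribeSnatTableEntries output in this comment.

```python
def parse_cols_table(text):
    """Parse a pipe-delimited table (header, dashed separator, rows) into dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header line and the dashed separator line
        cells = [c.strip() for c in ln.split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def vswitches_without_snat(vswitch_ids, snat_rows):
    """Return the vswitch IDs that have no SNAT entry in snat_rows."""
    covered = {row["SourceVSwitchId"] for row in snat_rows}
    return [v for v in vswitch_ids if v not in covered]


SNAT_OUTPUT = """
SnatEntryId                | Status    | SnatIp         | SourceCIDR    | SourceVSwitchId
-----------                | ------    | ------         | ----------    | ---------------
snat-0xiw2wwcnbsj77vf3xc3a | Available | 47.253.214.247 | 10.0.224.0/20 | vsw-0xigy12jsaglfva8eb2ez
snat-0xi9hdfdvt0tv8f6fag4v | Available | 47.253.214.247 | 10.0.240.0/20 | vsw-0xiekixvoy0qwp91zh9sc
"""

snat_rows = parse_cols_table(SNAT_OUTPUT)

# The two zone vswitches from the DescribeVSwitches output above.
zone_vswitches = ["vsw-0xigy12jsaglfva8eb2ez", "vsw-0xiekixvoy0qwp91zh9sc"]
print(vswitches_without_snat(zone_vswitches, snat_rows))
```

With the data shown in this comment, both zone vswitches are covered by an SNAT entry, while the jiwei-602-vswitch-natgw vswitch (vsw-0xim5ziaxoer065iv8myr) has none.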

Comment 8 errata-xmlrpc 2022-03-10 16:40:58 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

