Bug 2041319 - [IPI on Alibabacloud] installation in region "cn-shanghai" failed, due to "Resource alicloud_vswitch CreateVSwitch Failed...InvalidCidrBlock.Overlapped"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: aos-install
QA Contact: Jianli Wei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-17 05:36 UTC by Jianli Wei
Modified: 2022-03-10 16:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:39:45 UTC
Target Upstream Version:
Embargoed:



Links
System                 | ID                            | Private | Priority | Status | Summary                                                   | Last Updated
------                 | --                            | ------- | -------- | ------ | -------                                                   | ------------
Github                 | openshift installer pull 5566 | 0       | None     | open   | Bug 2041319: [Alibaba] fix VSwitch subnets overlap        | 2022-01-22 18:10:53 UTC
Red Hat Product Errata | RHSA-2022:0056                | 0       | None     | None   | None                                                      | 2022-03-10 16:39:56 UTC

Description Jianli Wei 2022-01-17 05:36:47 UTC
Version:
$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-16-191814
built from commit 2f0b6c3dc404e0ac63557f44fcc644144e881d6d
release image registry.ci.openshift.org/ocp/release@sha256:bbcc0968909576836ddf063322f9b4b4485b3204985535d7490e42cb9ef339ca
release architecture amd64

Platform: alibabacloud

Please specify:
* IPI

What happened?
IPI installation in region "cn-shanghai" failed, due to "Resource alicloud_vswitch CreateVSwitch Failed...InvalidCidrBlock.Overlapped".

What did you expect to happen?
The installation should succeed in the region.

How to reproduce it (as minimally and precisely as possible)?
Always

Anything else we need to know?
$ openshift-install create install-config --dir work
? SSH Public Key /home/jiwei/.ssh/openshift-qe.pub
? Platform alibabacloud
? Region cn-shanghai
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-104
? Pull Secret [? for help] *******
INFO Install-Config created in: work
$ 
$ echo 'credentialsMode: Manual' >> work/install-config.yaml
$ 
$ openshift-install create manifests --dir work
INFO Consuming Install Config from target directory
INFO Manifests created in: work/manifests and work/openshift
$ 
$ openshift-install create cluster --dir work --log-level info
INFO Consuming Master Machines from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Creating infrastructure resources...
ERROR
ERROR Error: [ERROR] terraform-provider-alicloud/alicloud/resource_alicloud_vswitch.go:127: Resource alicloud_vswitch CreateVSwitch Failed!!! [SDK alibaba-cloud-sdk-go ERROR]:
ERROR SDKError:
ERROR    Code: InvalidCidrBlock.Overlapped
ERROR    Message: code: 400, Specified CIDR block overlapped with other subnets. request id: EC917D02-E392-5FF7-8352-24E88FC05ED7
ERROR    Data: {"Code":"InvalidCidrBlock.Overlapped","HostId":"vpc.aliyuncs.com","Message":"Specified CIDR block overlapped with other subnets.","Recommend":"https://error-center.aliyun.com/status/search?Keyword=InvalidCidrBlock.Overlapped\u0026source=PopGw","RequestId":"EC917D02-E392-5FF7-8352-24E88FC05ED7"}
ERROR
ERROR
ERROR   on ../../../tmp/openshift-install-cluster-3593658407/vpc/vpc.tf line 29, in resource "alicloud_vswitch" "vswitches":
ERROR   29: resource "alicloud_vswitch" "vswitches" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change
$ 
$ aliyun vpc DescribeVpcs --VpcName jiwei-104-hp8pb-vpc --RegionId cn-shanghai --endpoint vpc.aliyuncs.com --output cols=RegionId,VpcId,CidrBlock rows=Vpcs.Vpc[]
RegionId    | VpcId                     | CidrBlock
--------    | -----                     | ---------
cn-shanghai | vpc-uf656z32yxtociv1ikzmx | 10.0.0.0/16
$ 
$ aliyun vpc DescribeVSwitches --VpcId vpc-uf656z32yxtociv1ikzmx --RegionId cn-shanghai --endpoint vpc.aliyuncs.com --output cols=ZoneId,VSwitchName,CidrBlock rows=VSwitches.VSwitch[]
ZoneId        | VSwitchName                           | CidrBlock
------        | -----------                           | ---------
cn-shanghai-g | jiwei-104-hp8pb-vswitch-cn-shanghai-g | 10.0.32.0/20
cn-shanghai-m | jiwei-104-hp8pb-vswitch-cn-shanghai-m | 10.0.48.0/20
cn-shanghai-l | jiwei-104-hp8pb-vswitch-cn-shanghai-l | 10.0.0.0/20
cn-shanghai-b | jiwei-104-hp8pb-vswitch-cn-shanghai-b | 10.0.16.0/20
cn-shanghai-a | jiwei-104-hp8pb-vswitch-nat-gateway   | 10.0.64.0/20
$ 
$ aliyun ecs DescribeAvailableResource --DestinationResource 'InstanceType' --RegionId cn-shanghai --IoOptimized 'optimized' --InstanceType ecs.g6.xlarge --endpoint ecs.cn-shanghai.aliyuncs.com | jq -r .AvailableZones.AvailableZone[].ZoneId
cn-shanghai-l
cn-shanghai-b
cn-shanghai-g
cn-shanghai-m
cn-shanghai-n
$ aliyun vpc ListEnhanhcedNatGatewayAvailableZones --RegionId cn-shanghai --endpoint vpc.aliyuncs.com | jq -r .Zones[].ZoneId
cn-shanghai-a
cn-shanghai-b
cn-shanghai-e
cn-shanghai-f
cn-shanghai-g
cn-shanghai-l
$
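
The listing above can be sanity-checked offline. The following is a minimal sketch, not part of the installer: it takes the CIDR blocks and zone IDs exactly as reported by the `aliyun` commands, checks whether the existing vswitches overlap each other or fall outside the VPC, and lists the instance-capable zones that have no vswitch. The helper name `find_overlaps` and the `hypothetical` dictionary are illustrative assumptions; the last step only demonstrates what an overlapping request looks like and is not a confirmed root cause.

```python
import ipaddress
from itertools import combinations

# Values copied from the DescribeVpcs / DescribeVSwitches output above.
vpc = ipaddress.ip_network("10.0.0.0/16")
vswitches = {
    "cn-shanghai-g": "10.0.32.0/20",
    "cn-shanghai-m": "10.0.48.0/20",
    "cn-shanghai-l": "10.0.0.0/20",
    "cn-shanghai-b": "10.0.16.0/20",
    "nat-gateway": "10.0.64.0/20",  # created in zone cn-shanghai-a
}
# Zones reported by DescribeAvailableResource for ecs.g6.xlarge.
instance_zones = ["cn-shanghai-l", "cn-shanghai-b", "cn-shanghai-g",
                  "cn-shanghai-m", "cn-shanghai-n"]


def find_overlaps(blocks):
    """Return every pair of names whose CIDR blocks overlap (hypothetical helper)."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in blocks.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


overlaps = find_overlaps(vswitches)
outside_vpc = [name for name, cidr in vswitches.items()
               if not ipaddress.ip_network(cidr).subnet_of(vpc)]
missing_zones = [zone for zone in instance_zones if zone not in vswitches]

print("overlapping pairs:", overlaps)
print("outside VPC:", outside_vpc)
print("zones without a vswitch:", missing_zones)

# Illustration only (assumed value): requesting a block that is already taken
# is the kind of request that would be rejected as InvalidCidrBlock.Overlapped.
hypothetical = dict(vswitches, **{"cn-shanghai-n": "10.0.64.0/20"})
print("with hypothetical request:", find_overlaps(hypothetical))
```

The vswitches that were created are pairwise disjoint and inside the VPC, and the only instance-capable zone without a vswitch is cn-shanghai-n, which is consistent with the failure occurring while that vswitch was being created.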

Comment 1 Matthew Staebler 2022-01-17 17:51:10 UTC
I am marking this as a non-blocker as it appears to only affect the one region.

Comment 2 Brian Lu 2022-01-21 01:07:15 UTC
The root cause has been found. sunhui is working on it and will submit a PR soon.

Comment 3 husun 2022-01-24 09:08:32 UTC
Fixed in the PR https://github.com/openshift/installer/pull/5566

Comment 4 Jianli Wei 2022-01-26 09:58:29 UTC
$ openshift-install version
openshift-install 4.10.0-0.ci-2022-01-26-033956
built from commit 281380228dab000afcf5299a5a8fef6c03958340
release image registry.ci.openshift.org/ocp/release@sha256:26308dd5cde2edbfbafebff6521628779f64e492306a2bc457d973215c4773ae
release architecture amd64
$ 
$ openshift-install create install-config --dir work1
? SSH Public Key /home/fedora/.ssh/openshift-qe.pub
? Platform alibabacloud
? Region cn-shanghai
? Base Domain alicloud-qe.devcluster.openshift.com
? Cluster Name jiwei-305
? Pull Secret [? for help] *******
INFO Install-Config created in: work1
$
$ echo 'credentialsMode: Manual' >> work1/install-config.yaml
$ openshift-install create manifests --dir work1
INFO Consuming Install Config from target directory
INFO Manifests created in: work1/manifests and work1/openshift
$ openshift-install create cluster --dir work1 --log-level info
INFO Consuming Openshift Manifests from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Master Machines from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s (until 9:23AM) for the Kubernetes API at https://api.jiwei-305.alicloud-qe.devcluster.openshift.com:6443...
INFO API v1.22.1-4635+b259fd8b378e8b-dirty up
INFO Waiting up to 30m0s (until 9:38AM) for bootstrapping to complete...
INFO Destroying the bootstrap resources...
INFO Waiting up to 40m0s (until 10:10AM) for the cluster at https://api.jiwei-305.alicloud-qe.devcluster.openshift.com:6443 to initialize...
W0126 09:30:28.622676  428248 reflector.go:324] k8s.io/client-go/tools/watch/informerwatcher.go:146: failed to list *v1.ClusterVersion: Get "https://api.jiwei-305.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost
E0126 09:30:28.622876  428248 reflector.go:138] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ClusterVersion: failed to list *v1.ClusterVersion: Get "https://api.jiwei-305.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost
INFO Waiting up to 10m0s (until 9:53AM) for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/fedora/work1/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.jiwei-305.alicloud-qe.devcluster.openshift.com
INFO Login to the console with user: "kubeadmin", and password: "72U6H-uqakT-vzoz6-6WvuX"
INFO Time elapsed: 43m1s
$
$ oc get clusterversion
NAME      VERSION                         AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci-2022-01-26-033956   True        False         3m26s   Cluster version is 4.10.0-0.ci-2022-01-26-033956
$ oc get nodes
NAME                             STATUS   ROLES    AGE   VERSION
jiwei-305-vb95j-master-0         Ready    master   35m   v1.23.0+06791f6
jiwei-305-vb95j-master-1         Ready    master   36m   v1.23.0+06791f6
jiwei-305-vb95j-master-2         Ready    master   35m   v1.23.0+06791f6
jiwei-305-vb95j-worker-b-npqjb   Ready    worker   13m   v1.23.0+06791f6
jiwei-305-vb95j-worker-l-nf5w6   Ready    worker   13m   v1.23.0+06791f6
$ 
$ oc get co
NAME                                       VERSION                         AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.ci-2022-01-26-033956   True        False         False      3m56s
baremetal                                  4.10.0-0.ci-2022-01-26-033956   True        False         False      31m
cloud-controller-manager                   4.10.0-0.ci-2022-01-26-033956   True        False         False      35m
cloud-credential                           4.10.0-0.ci-2022-01-26-033956   True        False         False      31m     
cluster-autoscaler                         4.10.0-0.ci-2022-01-26-033956   True        False         False      31m     
config-operator                            4.10.0-0.ci-2022-01-26-033956   True        False         False      32m     
console                                    4.10.0-0.ci-2022-01-26-033956   True        False         False      8m18s   
csi-snapshot-controller                    4.10.0-0.ci-2022-01-26-033956   True        False         False      32m     
dns                                        4.10.0-0.ci-2022-01-26-033956   True        False         False      31m     
etcd                                       4.10.0-0.ci-2022-01-26-033956   True        False         False      30m     
image-registry                             4.10.0-0.ci-2022-01-26-033956   True        False         False      11m     
ingress                                    4.10.0-0.ci-2022-01-26-033956   True        False         False      10m     
insights                                   4.10.0-0.ci-2022-01-26-033956   True        False         False      26m     
kube-apiserver                             4.10.0-0.ci-2022-01-26-033956   True        False         False      25m     
kube-controller-manager                    4.10.0-0.ci-2022-01-26-033956   True        False         False      27m     
kube-scheduler                             4.10.0-0.ci-2022-01-26-033956   True        False         False      27m     
kube-storage-version-migrator              4.10.0-0.ci-2022-01-26-033956   True        False         False      32m     
machine-api                                4.10.0-0.ci-2022-01-26-033956   True        False         False      26m     
machine-approver                           4.10.0-0.ci-2022-01-26-033956   True        False         False      31m     
machine-config                             4.10.0-0.ci-2022-01-26-033956   True        False         False      12m     
marketplace                                4.10.0-0.ci-2022-01-26-033956   True        False         False      31m     
monitoring                                 4.10.0-0.ci-2022-01-26-033956   True        False         False      9m27s   
network                                    4.10.0-0.ci-2022-01-26-033956   True        False         False      30m     
node-tuning                                4.10.0-0.ci-2022-01-26-033956   True        False         False      31m     
openshift-apiserver                        4.10.0-0.ci-2022-01-26-033956   True        False         False      16m     
openshift-controller-manager               4.10.0-0.ci-2022-01-26-033956   True        False         False      11m
openshift-samples                          4.10.0-0.ci-2022-01-26-033956   True        False         False      15m
operator-lifecycle-manager                 4.10.0-0.ci-2022-01-26-033956   True        False         False      32m
operator-lifecycle-manager-catalog         4.10.0-0.ci-2022-01-26-033956   True        False         False      32m
operator-lifecycle-manager-packageserver   4.10.0-0.ci-2022-01-26-033956   True        False         False      16m
service-ca                                 4.10.0-0.ci-2022-01-26-033956   True        False         False      33m
storage                                    4.10.0-0.ci-2022-01-26-033956   True        False         False      27m
$ 
$ aliyun vpc DescribeVpcs --VpcName jiwei-305-vb95j-vpc --RegionId cn-shanghai --endpoint vpc.aliyuncs.com --output cols=RegionId,VpcId,CidrBlock rows=Vpcs.Vpc[]
RegionId    | VpcId                     | CidrBlock
--------    | -----                     | ---------
cn-shanghai | vpc-uf61wd9jtjsmypxq139hj | 10.0.0.0/16

$ aliyun vpc DescribeVSwitches --VpcId vpc-uf61wd9jtjsmypxq139hj --RegionId cn-shanghai --endpoint vpc.aliyuncs.com --output cols=ZoneId,VSwitchName,CidrBlock rows=VSwitches.VSwitch[]
ZoneId        | VSwitchName                           | CidrBlock
------        | -----------                           | ---------
cn-shanghai-b | jiwei-305-vb95j-vswitch-cn-shanghai-b | 10.0.64.0/19
cn-shanghai-g | jiwei-305-vb95j-vswitch-cn-shanghai-g | 10.0.96.0/19
cn-shanghai-a | jiwei-305-vb95j-vswitch-nat-gateway   | 10.0.0.0/19
cn-shanghai-m | jiwei-305-vb95j-vswitch-cn-shanghai-m | 10.0.128.0/19
cn-shanghai-n | jiwei-305-vb95j-vswitch-cn-shanghai-n | 10.0.160.0/19
cn-shanghai-l | jiwei-305-vb95j-vswitch-cn-shanghai-l | 10.0.32.0/19

$
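
The vswitch blocks in the verified cluster line up with the first six /19 subnets of the 10.0.0.0/16 VPC. The sketch below only confirms that observation from the output above using Python's `ipaddress` module; it does not claim to reproduce how the installer or its Terraform templates compute the blocks, and the variable names are illustrative.

```python
import ipaddress

# Values copied from the DescribeVpcs / DescribeVSwitches output above.
vpc = ipaddress.ip_network("10.0.0.0/16")
observed = {
    "cn-shanghai-b": "10.0.64.0/19",
    "cn-shanghai-g": "10.0.96.0/19",
    "nat-gateway": "10.0.0.0/19",  # created in zone cn-shanghai-a
    "cn-shanghai-m": "10.0.128.0/19",
    "cn-shanghai-n": "10.0.160.0/19",
    "cn-shanghai-l": "10.0.32.0/19",
}

# Splitting a /16 into /19 blocks yields eight disjoint subnets.
candidates = [str(net) for net in vpc.subnets(new_prefix=19)]

# Sort the observed blocks by network address for comparison.
observed_sorted = sorted(observed.values(), key=ipaddress.ip_network)

print("all /19 subnets:", candidates)
print("observed blocks:", observed_sorted)
print("observed == first six /19 subnets:", observed_sorted == candidates[:6])
```

Because the six blocks are distinct members of the same fixed-size partition of the VPC, they cannot overlap, and two further /19 blocks (10.0.192.0/19 and 10.0.224.0/19) remain unused.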

Comment 7 errata-xmlrpc 2022-03-10 16:39:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

