Bug 2033271 - [IPI on Alibabacloud] destroying cluster succeeded, but the resource group deletion wasn’t triggered
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: aos-install
QA Contact: Jianli Wei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-16 12:04 UTC by Jianli Wei
Modified: 2022-03-10 16:34 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:34:29 UTC
Target Upstream Version:
Embargoed:


Attachments
alicloud web console resource group page (175.59 KB, image/png)
2021-12-16 12:04 UTC, Jianli Wei


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 5536 | 0 | None | open | Bug 2033271: [Alibaba] fix deletion of resource group | 2022-01-14 09:59:24 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:34:42 UTC

Description Jianli Wei 2021-12-16 12:04:31 UTC
Created attachment 1846555
alicloud web console resource group page

Version:

$ openshift-install version
openshift-install 4.10.0-0.nightly-2021-12-15-151042
built from commit 7485aa4e85231dd09b3b1a693905483edfddf9a8
release image registry.ci.openshift.org/ocp/release@sha256:b4403653c3f3610e64c8334e50c48b018c8dba7d33b0d2d98905ce100e932a52
release architecture amd64

Platform: alibabacloud

Please specify:
* IPI (automated install with `openshift-install`. If you don't know, then it's IPI)

What happened?
After the cluster is destroyed successfully, deletion of the resource group created by the installer is not triggered.

$ openshift-install destroy cluster --dir work --log-level info
INFO OSS bucket deleted                            bucketName=jiwei-oo-nb4cf-bootstrap stage=OSS buckets
INFO ECS instances deleted                         stage=ECS instances
INFO RAM roles deleted                             stage=RAM roles
INFO Private zones deleted                         stage=private zones
INFO SLB instances deleted                         stage=SLBs
INFO Security groups deleted                       stage=ECS security groups
INFO NAT gateways deleted                          stage=Nat gateways
INFO EIPs deleted                                  stage=EIPs
INFO VSwitches deleted                             stage=VSwitchs
INFO VPCs deleted                                  stage=VPCs
INFO Time elapsed: 2m36s
$

What did you expect to happen?
Deletion of the resource group should be triggered by the end of the cluster destroy. (See the attached web console screenshot; a screenshot is provided because of an 'aliyun' CLI issue with listing resource groups.)

How to reproduce it (as minimally and precisely as possible)?
Always.
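
The expectation above can be checked against the destroy log itself. The following is a minimal sketch, not part of the installer: it assumes that `stage=` is the last key on each log line (as in the logs in this report) and that the resource group stage is named `resource groups` (the name printed by the 2022-01-20 build in Comment 4). The function names are hypothetical.

```python
import re

# Assumption: "stage=" is the last key on each log line, as in this report's logs,
# so everything after it (including spaces) is the stage name.
_STAGE = re.compile(r"\bstage=(.*\S)\s*$")


def stages_in_destroy_log(log_text):
    """Return the stage names found in a destroy log, in order of first appearance."""
    stages = []
    for line in log_text.splitlines():
        match = _STAGE.search(line)
        if match and match.group(1) not in stages:
            stages.append(match.group(1))
    return stages


def resource_group_stage_ran(log_text):
    # "resource groups" is the stage name seen in the Comment 4 log of this report.
    return "resource groups" in stages_in_destroy_log(log_text)


# Tail of the destroy log from the description.
sample = """\
INFO EIPs deleted                                  stage=EIPs
INFO VSwitches deleted                             stage=VSwitchs
INFO VPCs deleted                                  stage=VPCs
INFO Time elapsed: 2m36s
"""
print(stages_in_destroy_log(sample))
print(resource_group_stage_ran(sample))
```

For the log in the description, the check reports that no resource group stage ran, which matches the observed behavior.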

Comment 1 Jianli Wei 2022-01-11 14:09:19 UTC
FYI, the issue still occurs with today's build: after destroying the cluster, deletion of the resource group was not triggered.

$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-11-065245
built from commit 28cfc831cee01eb503a2340b4d5365fd281bf867
release image registry.ci.openshift.org/ocp/release@sha256:d9759e7c8ec5e2555419d84ff36aff2a4c8f9367236c18e722a3fe4d7c4f6dee
release architecture amd64
$ 
$ openshift-install create cluster --dir work --log-level info
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Openshift Manifests from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Common Manifests from target directory
INFO Consuming Worker Machines from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 20m0s (until 1:36PM) for the Kubernetes API at https://api.jiwei-202.alicloud-qe.devcluster.openshift.com:6443...
INFO API v1.22.1+6859754 up
INFO Waiting up to 30m0s (until 1:49PM) for bootstrapping to complete...
INFO Destroying the bootstrap resources...
INFO Waiting up to 40m0s (until 2:09PM) for the cluster at https://api.jiwei-202.alicloud-qe.devcluster.openshift.com:6443 to initialize...
I0111 13:29:25.630681  373929 trace.go:205] Trace[725340612]: "Reflector ListAndWatch" name:k8s.io/client-go/tools/watch/informerwatcher.go:146 (11-Jan-2022 13:29:08.887) (total time: 16742ms):
Trace[725340612]: [16.742779359s] [16.742779359s] END
E0111 13:29:25.630756  373929 reflector.go:138] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ClusterVersion: failed to list *v1.ClusterVersion: Get "https://api.jiwei-202.alicloud-qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusterversions?fieldSelector=metadata.name%3Dversion&limit=500&resourceVersion=0": http2: client connection lost
INFO Waiting up to 10m0s (until 2:01PM) for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/fedora/work/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.jiwei-202.alicloud-qe.devcluster.openshift.com
INFO Login to the console with user: "kubeadmin", and password: "BTgFf-LPACc-HjjZT-pdevu"
INFO Time elapsed: 38m22s
$ 
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-11-065245   True        False         57s     Cluster version is 4.10.0-0.nightly-2022-01-11-065245
$ oc get nodes
NAME                                           STATUS   ROLES    AGE   VERSION
jiwei-202-xsx5g-master-0                       Ready    master   30m   v1.22.1+6859754
jiwei-202-xsx5g-master-1                       Ready    master   31m   v1.22.1+6859754
jiwei-202-xsx5g-master-2                       Ready    master   29m   v1.22.1+6859754
jiwei-202-xsx5g-worker-ap-northeast-1a-w7pfz   Ready    worker   16m   v1.22.1+6859754
jiwei-202-xsx5g-worker-ap-northeast-1b-nlnmx   Ready    worker   18m   v1.22.1+6859754
jiwei-202-xsx5g-worker-ap-northeast-1b-zrln6   Ready    worker   18m   v1.22.1+6859754
$ 
$ openshift-install destroy cluster --dir work --log-level info
INFO ECS instances deleted                         stage=ECS instances
INFO RAM roles deleted                             stage=RAM roles
INFO Private zones deleted                         stage=private zones
INFO SLB instances deleted                         stage=SLBs
INFO Security groups deleted                       stage=ECS security groups
INFO NAT gateways deleted                          stage=Nat gateways
INFO EIPs deleted                                  stage=EIPs
INFO VSwitches deleted                             stage=VSwitchs
INFO VPCs deleted                                  stage=VPCs
INFO Time elapsed: 2m40s
$ 
$ aliyun resourcemanager ListResourceGroups --ResourceGroupId rg-aekzg4jz7uxwo3y --endpoint resourcemanager.ap-northeast-1.aliyuncs.com
{
        "PageNumber": 1,
        "PageSize": 10,
        "RequestId": "0EAED934-D0C9-2B2A-A42F-9EAF4C70B6AA",
        "ResourceGroups": {
                "ResourceGroup": [
                        {
                                "AccountId": "5724326381648897",
                                "CreateDate": "2022-01-11T21:14:39+08:00",
                                "DisplayName": "jiwei-202-xsx5g-rg",
                                "Id": "rg-aekzg4jz7uxwo3y",
                                "Name": "jiwei-202-xsx5g-rg",
                                "Status": "OK"
                        }
                ]
        },
        "TotalCount": 1
}
$
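
The manual check above can be sketched as a small script that reads the `ListResourceGroups` JSON and reports whether deletion was triggered. This is only an illustration: the function names are hypothetical, and the rule that `PendingDelete` is the only status counted as "deletion triggered" is an assumption based on the two status values that appear in this report.

```python
import json


def resource_group_status(response, group_id):
    """Return the Status of the group with the given Id, or None if it is not listed."""
    groups = response.get("ResourceGroups", {}).get("ResourceGroup", [])
    for group in groups:
        if group.get("Id") == group_id:
            return group.get("Status")
    return None


def deletion_triggered(response, group_id):
    # Assumption: only "PendingDelete" counts as "deletion triggered";
    # "OK" and a missing entry are both reported as not triggered.
    return resource_group_status(response, group_id) == "PendingDelete"


# Trimmed copy of the response shown above.
raw = """
{
  "ResourceGroups": {
    "ResourceGroup": [
      {"Id": "rg-aekzg4jz7uxwo3y", "Name": "jiwei-202-xsx5g-rg", "Status": "OK"}
    ]
  },
  "TotalCount": 1
}
"""
response = json.loads(raw)
print(resource_group_status(response, "rg-aekzg4jz7uxwo3y"))
print(deletion_triggered(response, "rg-aekzg4jz7uxwo3y"))
```

With the response above, the status is `OK`, so the check reports that deletion was not triggered.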

Comment 4 Jianli Wei 2022-01-20 11:07:10 UTC
$ openshift-install version
openshift-install 4.10.0-0.nightly-2022-01-20-082726
built from commit 9eade28a9ce4862a6ef092bc5f5fcfb499342d4d
release image registry.ci.openshift.org/ocp/release@sha256:bdc27b9ff4a1a482d00fc08924f1157d782ded9f3e91af09fe9f3596bcea877c
release architecture amd64
$ 
$ openshift-install destroy cluster --dir work --log-level info
INFO RAM roles deleted                             stage=RAM roles
INFO Private zones deleted                         stage=private zones
INFO SLB instances deleted                         stage=SLBs
INFO Security groups deleted                       stage=ECS security groups
INFO NAT gateways deleted                          stage=Nat gateways
INFO EIPs deleted                                  stage=EIPs
INFO VSwitches deleted                             stage=VSwitchs
INFO VPCs deleted                                  stage=VPCs
INFO Resource group deleted                        name=jiwei-404-5sq5q-rg stage=resource groups
INFO Time elapsed: 2m43s                          
$ 
$ aliyun resourcemanager ListResourceGroups --Name jiwei-404-5sq5q-rg --endpoint resourcemanager.aliyuncs.com
{
        "PageNumber": 1,
        "PageSize": 10,
        "RequestId": "52C15C1B-CB11-5751-AA77-36A31B14BCCE",
        "ResourceGroups": {
                "ResourceGroup": [
                        {
                                "AccountId": "5724326381648897",
                                "CreateDate": "2022-01-20T18:53:48+08:00",
                                "DisplayName": "jiwei-404-5sq5q-rg",
                                "Id": "rg-aek23stro2qhvay",
                                "Name": "jiwei-404-5sq5q-rg-rg-aek23stro2qhvay-20220120",
                                "Status": "PendingDelete"
                        }
                ]
        },
        "TotalCount": 1
}
$
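
In the response above, `DisplayName` is unchanged while `Name` has become the original name followed by the group Id and an eight-digit suffix. A minimal sketch of a check for that shape follows; the pattern is inferred from this single response and is not a documented naming rule, and the function name is hypothetical.

```python
import re


def has_pending_delete_name(group):
    """Check whether Name looks like '<DisplayName>-<Id>-<8 digits>'.

    Assumption: this shape is inferred from the one PendingDelete response
    in this report; it is not taken from API documentation.
    """
    pattern = re.escape(group["DisplayName"]) + "-" + re.escape(group["Id"]) + r"-\d{8}"
    return re.fullmatch(pattern, group["Name"]) is not None


# Entry from the response above (deletion triggered).
after_fix = {
    "DisplayName": "jiwei-404-5sq5q-rg",
    "Id": "rg-aek23stro2qhvay",
    "Name": "jiwei-404-5sq5q-rg-rg-aek23stro2qhvay-20220120",
    "Status": "PendingDelete",
}
# Entry from Comment 1 (deletion not triggered).
before_fix = {
    "DisplayName": "jiwei-202-xsx5g-rg",
    "Id": "rg-aekzg4jz7uxwo3y",
    "Name": "jiwei-202-xsx5g-rg",
    "Status": "OK",
}
print(has_pending_delete_name(after_fix))
print(has_pending_delete_name(before_fix))
```

Matching on `DisplayName` or `Id` rather than `Name` is likely the safer way to find the group once deletion has been triggered.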

Comment 7 errata-xmlrpc 2022-03-10 16:34:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

