Bug 2035705

Summary: Azure 'destroy cluster' gets stuck when the cluster resource group no longer exists.
Product: OpenShift Container Platform Reporter: Johnny Liu <jialiu>
Component: Installer Assignee: Kiran Thyagaraja <kiran>
Installer sub component: openshift-installer QA Contact: MayXu <maxu>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: maxu, mstaeble
Version: 4.10   
Target Milestone: ---   
Target Release: 4.10.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2022-03-10 16:36:35 UTC Type: Bug

Description Johnny Liu 2021-12-27 04:13:12 UTC
Version:

$ openshift-install version
openshift-install 4.10.0-0.nightly-2021-12-23-153012
built from commit 94a3ed9cbe4db66dc50dab8b85d2abf40fb56426
release image registry.ci.openshift.org/ocp/release@sha256:39cacdae6214efce10005054fb492f02d26b59fe9d23686dc17ec8a42f428534
release architecture amd64

Platform:
Azure

Please specify:
* IPI

What happened?
The Azure cluster's resource group had already been pruned. I ran the "destroy cluster" command to remove the remaining resources, e.g. IAM and public DNS records. The installer gets stuck, repeatedly logging "Resource group 'qeci-30896-q96nz-rg' could not be found."


What did you expect to happen?
The "destroy cluster" command completes successfully.

How to reproduce it (as minimally and precisely as possible)?
1. Ensure the cluster's resource group has been pruned and no longer exists.
2. Destroy the cluster. 
$ cat /tmp/tmp.zjvpNpLdqI/metadata.json 
{"clusterName":"qeci-30896","infraID":"qeci-30896-q96nz","azure":{"region":"centralus"}}
$ openshift-install destroy cluster --dir /tmp/tmp.zjvpNpLdqI/
DEBUG OpenShift Installer 4.10.0-0.nightly-2021-12-23-153012 
DEBUG Built from commit 94a3ed9cbe4db66dc50dab8b85d2abf40fb56426 
INFO Credentials loaded from file "/root/azure.cred" 
DEBUG deleting public records                      
DEBUG already deleted                              
DEBUG deleting resource group                      
DEBUG failed to delete qeci-30896-q96nz-rg: resources.GroupsClient#Delete: Failure sending request: StatusCode=0 -- Original Error: Code="ResourceGroupNotFound" Message="Resource group 'qeci-30896-q96nz-rg' could not be found." 
DEBUG deleting resource group                      
DEBUG failed to delete qeci-30896-q96nz-rg: resources.GroupsClient#Delete: Failure sending request: StatusCode=0 -- Original Error: Code="ResourceGroupNotFound" Message="Resource group 'qeci-30896-q96nz-rg' could not be found." 
DEBUG deleting resource group                      
DEBUG failed to delete qeci-30896-q96nz-rg: resources.GroupsClient#Delete: Failure sending request: StatusCode=0 -- Original Error: Code="ResourceGroupNotFound" Message="Resource group 'qeci-30896-q96nz-rg' could not be found." 
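
For reference, the resource group named in the log appears to be the infraID from metadata.json with an "-rg" suffix. A minimal Go sketch of that mapping follows. The struct, the helper name resourceGroupName, and the "-rg" rule are assumptions inferred from the output above, not taken from the installer source.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// clusterMetadata mirrors only the fields visible in the metadata.json
// shown in the reproduction steps.
type clusterMetadata struct {
	ClusterName string `json:"clusterName"`
	InfraID     string `json:"infraID"`
	Azure       struct {
		Region string `json:"region"`
	} `json:"azure"`
}

// resourceGroupName is a hypothetical helper. The "-rg" suffix is an
// assumption inferred from the log, where infraID "qeci-30896-q96nz"
// corresponds to resource group "qeci-30896-q96nz-rg".
func resourceGroupName(raw []byte) (string, error) {
	var m clusterMetadata
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", fmt.Errorf("parsing metadata: %w", err)
	}
	if m.InfraID == "" {
		return "", errors.New("metadata has no infraID")
	}
	return m.InfraID + "-rg", nil
}

func main() {
	raw := []byte(`{"clusterName":"qeci-30896","infraID":"qeci-30896-q96nz","azure":{"region":"centralus"}}`)
	name, err := resourceGroupName(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```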

Anything else we need to know?
This issue does not happen on 4.9.
DEBUG OpenShift Installer 4.9.0-0.nightly-2021-12-23-045233 
DEBUG Built from commit eb132dae953888e736c382f1176c799c0e1aa49e 
INFO Credentials loaded from file "/root/azure.cred" 
DEBUG deleting public records                      
DEBUG already deleted                              
DEBUG deleting resource group                      
DEBUG already deleted                               resource group=qeci-30896-q96nz-rg
DEBUG deleting application registrations           
INFO deleted                                       appID=fd514d62-d27b-4f28-9de0-83e1ab454f8c
INFO deleted                                       appID=066798b8-9a0b-4da6-8a3f-ae6d97fd0d1d
INFO deleted                                       appID=b4bb1844-82ee-42a6-a2cd-c0652fb57d68
INFO deleted                                       appID=41284f75-b1c4-4a84-bece-d32a03f83195
INFO deleted                                       appID=8fff4f80-35eb-4f6f-9bba-6dcffb8c8d92
INFO Time elapsed: 2s

Comment 1 Matthew Staebler 2022-01-04 15:35:37 UTC
It looks like this was broken by https://github.com/openshift/installer/pull/5314.

Comment 4 MayXu 2022-01-19 03:21:36 UTC
Verified OK on:
openshift-install 4.10.0-0.nightly-2022-01-18-044014

Comment 7 errata-xmlrpc 2022-03-10 16:36:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056