Bug 2099872 - openshift-install destroy cluster needs to be run twice on powervs
Summary: openshift-install destroy cluster needs to be run twice on powervs
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.11
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OCP Installer
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-06-21 21:03 UTC by Manoj Kumar
Modified: 2023-03-09 01:21 UTC (History)
CC List: 1 user

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-09 01:21:59 UTC
Target Upstream Version:
Embargoed:



Description Manoj Kumar 2022-06-21 21:03:58 UTC

Version:

$ openshift-install version
./openshift-install 4.11.0-0.nightly-ppc64le-2022-06-16-003709
built from commit f39a15787e72cd803e450500ce6be9ff846e5b81
release image quay.io/openshift-release-dev/ocp-release-nightly@sha256:1887990a385be4d835ccd83a0ff938bb3b9a117686364a10a6014c7f0170f180
release architecture ppc64le


Platform:

powervs
IPI

What happened?

Cluster creation failed. The first destroy run timed out; the second destroy completed successfully.

The first destroy shows this error:

DEBUG Waiting for job "78c9630b-55e2-451b-a5ab-1ff094d4c054" to delete (status is "running") 
DEBUG Waiting for job "78c9630b-55e2-451b-a5ab-1ff094d4c054" to delete (status is "running") 
DEBUG Waiting for job "78c9630b-55e2-451b-a5ab-1ff094d4c054" to delete (status is "running") 
DEBUG Waiting for job "78c9630b-55e2-451b-a5ab-1ff094d4c054" to delete (status is "running") 
DEBUG Waiting for job "78c9630b-55e2-451b-a5ab-1ff094d4c054" to delete (status is "running") 
DEBUG Waiting for job "78c9630b-55e2-451b-a5ab-1ff094d4c054" to delete (status is "running") 
DEBUG powervs.PolledRun: Failed destroyCluster                                                  
DEBUG powervs.Run: after wait.PollImmediateInfinite, err = failed to destroy cluster: destroyCluster: timed out 
FATAL Failed to destroy cluster: failed to destroy cluster: destroyCluster: timed out 
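
For context, the failure path matches the wait.PollImmediateInfinite pattern from k8s.io/apimachinery that the log itself names. Below is a minimal sketch of how such a poll surfaces a stage timeout as the fatal error above; the destroyCluster stub, interval, and error text are assumptions for illustration, not the installer's actual code:

package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// destroyCluster stands in for the PowerVS destroy pass; here it always
// reports the timeout seen in the log (e.g. a job stuck in "running").
func destroyCluster() error {
	return fmt.Errorf("destroyCluster: timed out")
}

func main() {
	// PollImmediateInfinite retries the condition on an interval until it
	// returns done=true or a non-nil error. Returning the error (instead of
	// done=false) aborts the poll, so a single stage timeout becomes
	// "FATAL Failed to destroy cluster: ...".
	err := wait.PollImmediateInfinite(10*time.Second, func() (bool, error) {
		if derr := destroyCluster(); derr != nil {
			return false, fmt.Errorf("failed to destroy cluster: %w", derr)
		}
		return true, nil
	})
	fmt.Println("powervs.Run: err =", err)
}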

The second destroy command always succeeds:

INFO Deleted job "7f19e4d5-7dc2-40a0-b4e3-48599d293d8a" 
DEBUG destroyCluster: <-wgDone                     
DEBUG executeStageFunction: Adding: DNS Records    
DEBUG executeStageFunction: Adding: SSH Keys       
DEBUG executeStageFunction: duration = 14m59.999996042s 
DEBUG executeStageFunction: duration = 14m59.999998285s 
DEBUG executeStageFunction: Adding: Cloud Object Storage Instances 
DEBUG executeStageFunction: Executing: DNS Records 
DEBUG executeStageFunction: duration = 14m59.999996356s 
DEBUG Listing DNS records                          
DEBUG executeStageFunction: Executing: Cloud Object Storage Instances 
DEBUG executeStageFunction: Executing: SSH Keys    
DEBUG Listing COS instances                        
DEBUG Listing SSHKeys                              
DEBUG listSSHKeys: FOUND: rdr-kumarmn-p57x5-key    
DEBUG listCOSInstances: FOUND rdr-kumarmn-p57x5-cos dd615706-7c65-4cde-8e7e-87d59caefbb5 
DEBUG listDNSRecords: FOUND: da860eb4509ac0379c3f792cb05c7009, api-int.rdr-kumarmn.scnl-ibm.com 
DEBUG listDNSRecords: FOUND: 68e20d7ec80e01bba689b3f14dbb56be, api.rdr-kumarmn.scnl-ibm.com 
DEBUG listDNSRecords: PerPage = 20, Page = 1, Count = 20 
DEBUG listDNSRecords: moreData = true              
DEBUG Deleting sshKey "rdr-kumarmn-p57x5-key"      
DEBUG listDNSRecords: PerPage = 20, Page = 2, Count = 20 
DEBUG listDNSRecords: moreData = true              
DEBUG listDNSRecords: PerPage = 20, Page = 3, Count = 4 
DEBUG listDNSRecords: moreData = false             
DEBUG Deleting DNS record "api.rdr-kumarmn.scnl-ibm.com" 
INFO Deleted sshKey "rdr-kumarmn-p57x5-key"       
DEBUG Deleting DNS record "api-int.rdr-kumarmn.scnl-ibm.com" 
DEBUG Deleting COS instance "rdr-kumarmn-p57x5-cos" 
INFO Deleted DNS record "api.rdr-kumarmn.scnl-ibm.com" 
INFO Deleted DNS record "api-int.rdr-kumarmn.scnl-ibm.com" 
INFO Deleted COS instance "rdr-kumarmn-p57x5-cos" 
DEBUG destroyCluster: <-wgDone                     
DEBUG powervs.Run: after wait.PollImmediateInfinite, err = <nil> 
DEBUG Purging asset "Metadata" from disk           
DEBUG Purging asset "Master Ignition Customization Check" from disk 
DEBUG Purging asset "Worker Ignition Customization Check" from disk 
DEBUG Purging asset "Terraform Variables" from disk 
DEBUG Purging asset "Kubeconfig Admin Client" from disk 
DEBUG Purging asset "Kubeadmin Password" from disk 
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk 
DEBUG Purging asset "Cluster" from disk            
INFO Time elapsed: 2m34s                          
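
The executeStageFunction durations hovering just under 15 minutes suggest the cleanup stages run concurrently under a roughly 15-minute bound. Below is a minimal sketch of that concurrent-stage pattern, with stage names taken from the log; the timeout value, signatures, and runner structure are assumptions, not the installer's actual implementation:

package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// executeStageFunction runs one cleanup stage (DNS Records, SSH Keys, ...)
// under the shared deadline and logs its outcome, loosely mirroring the
// DEBUG lines above.
func executeStageFunction(ctx context.Context, wg *sync.WaitGroup, name string, destroy func(context.Context) error) {
	defer wg.Done()
	fmt.Println("executeStageFunction: Executing:", name)
	start := time.Now()
	err := destroy(ctx)
	fmt.Printf("executeStageFunction: %s: duration = %s, err = %v\n", name, time.Since(start), err)
}

func main() {
	// Assumed ~15-minute budget, inferred from the logged 14m59.99s values.
	ctx, cancel := context.WithTimeout(context.Background(), 15*time.Minute)
	defer cancel()

	stages := map[string]func(context.Context) error{
		"DNS Records":                    func(context.Context) error { return nil },
		"SSH Keys":                       func(context.Context) error { return nil },
		"Cloud Object Storage Instances": func(context.Context) error { return nil },
	}

	var wg sync.WaitGroup
	for name, fn := range stages {
		wg.Add(1)
		go executeStageFunction(ctx, &wg, name, fn)
	}
	wg.Wait() // the log's "destroyCluster: <-wgDone"
	fmt.Println("destroyCluster: done")
}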


What did you expect to happen?

Expected the destroy to succeed on the first run, without having to run it a second time.

How to reproduce it (as minimally and precisely as possible)?

$ ./openshift-install destroy cluster --log-level=debug --dir=test
$ ./openshift-install destroy cluster --log-level=debug --dir=test

Comment 3 Shiftzilla 2023-03-09 01:21:59 UTC
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-9330

