Bug 1889779

Summary: error when destroying a vSphere installation that failed early
Product: OpenShift Container Platform
Reporter: Joel Diaz <jdiaz>
Component: Installer
Assignee: Patrick Dillon <padillon>
Installer sub component: openshift-installer
QA Contact: jima
Status: CLOSED ERRATA
Docs Contact:
Severity: medium    
Priority: high
CC: adahiya, jima, mstaeble
Version: 4.6   
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The installer did not tag the vSphere OVF template until after the upload had completed.
Consequence: The template upload can take a long time. If the installation terminated before the upload finished, the template was never tagged and therefore could not be removed. Because the template could not be removed, its folder was not empty and could not be deleted either.
Fix: Tagging was moved to the beginning of the OVF template upload.
Result: A partial template left behind by quitting the installation early can be removed. Destroying the template and folder succeeds even when the installation quits in the middle of the template upload.
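The ordering change described in the Doc Text can be illustrated with a small in-memory simulation. This is only a sketch: the installer itself is written in Go, and every name below (FakeVCenter, import_template, leftover_after_failed_install, the "demo-" identifiers) is hypothetical and not taken from the installer's code. It assumes that destroy locates objects purely by tag and that a placeholder template object already exists in the inventory before the disk upload finishes.

```python
class FakeVCenter:
    """Hypothetical in-memory stand-in for vCenter inventory and tagging."""

    def __init__(self):
        self.objects = set()   # names of objects that exist in the inventory
        self.tagged = {}       # tag name -> set of object names carrying it

    def create_template(self, name):
        self.objects.add(name)

    def attach_tag(self, tag, name):
        self.tagged.setdefault(tag, set()).add(name)

    def destroy_by_tag(self, tag):
        """Delete every existing object that carries the tag."""
        removed = self.tagged.pop(tag, set()) & self.objects
        self.objects -= removed
        return sorted(removed)


class UploadInterrupted(Exception):
    """Raised to simulate the installer quitting during the disk upload."""


def import_template(vc, name, tag, tag_first, fail_upload):
    vc.create_template(name)       # the object exists before the upload ends
    if tag_first:
        vc.attach_tag(tag, name)   # new ordering: tag before uploading
    if fail_upload:
        raise UploadInterrupted(name)
    if not tag_first:
        vc.attach_tag(tag, name)   # old ordering: tag only after the upload


def leftover_after_failed_install(tag_first):
    """Return the objects that survive a destroy after an interrupted upload."""
    vc = FakeVCenter()
    try:
        import_template(vc, "demo-rhcos", "demo-cluster", tag_first, fail_upload=True)
    except UploadInterrupted:
        pass
    vc.destroy_by_tag("demo-cluster")
    return sorted(vc.objects)
```

With tag_first=False the untagged template survives the destroy, which matches the consequence described above; with tag_first=True nothing is left behind.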
Last Closed: 2021-02-24 15:26:59 UTC
Type: Bug
Bug Blocks: 1920552, 1925282    
Attachments:
Description Flags
openshift install log none

Description Joel Diaz 2020-10-20 14:40:12 UTC
Created attachment 1722913
openshift install log

Version:

[jdiaz@minigoomba os-install-4.6-rc4]$ ./openshift-install version
./openshift-install 4.6.0-rc.4
built from commit ebdbda57fc18d3b73e69f0f2cc499ddfca7e6593
release image registry.svc.ci.openshift.org/ocp/release@sha256:2c22e1c56831935a24efb827d2df572855ccd555c980070f77c39729526037d5


Platform: vSphere

Installation type: IPI

What happened?
Cluster installation failed while creating/importing the RHCOS image.
I then tried to destroy the resources created before the failure so that I could attempt a second installation, but the destroy command failed with an error.


What did you expect to happen?

The destroy command should gracefully notice that there is nothing to delete/destroy.

How to reproduce it (as minimally and precisely as possible)?

Set up an environment where the RHCOS import fails, or just force quit the installer during the RHCOS image import. Now that we have an incomplete installation, try to run the destroy.

$ ./openshift-install destroy cluster --dir vsphere --log-level=debug 

Anything else we need to know?

Here are the logs of the end of the failed install:

DEBUG vsphereprivate_import_ova.import: Still creating... [1m40s elapsed] 
DEBUG vsphereprivate_import_ova.import: Still creating... [1m50s elapsed] 
DEBUG vsphereprivate_import_ova.import: Still creating... [2m0s elapsed] 
DEBUG vsphereprivate_import_ova.import: Still creating... [2m10s elapsed] 
ERROR                                              
ERROR Error: failed to upload: Post "https://10.3.32.7/nfc/528eb5b2-eca4-9d4a-3126-6c97584cb1fa/disk-0.vmdk": dial tcp 10.3.32.7:443: connect: connection timed out 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-278843614/main.tf line 43, in resource "vsphereprivate_import_ova" "import": 
ERROR   43: resource "vsphereprivate_import_ova" "import" { 
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change 


And here is the error when trying to 'destroy cluster':

[jdiaz@minigoomba os-install-4.6-rc4]$ ./openshift-install destroy cluster --dir vmc --log-level=debug
DEBUG OpenShift Installer 4.6.0-rc.4               
DEBUG Built from commit ebdbda57fc18d3b73e69f0f2cc499ddfca7e6593 
DEBUG find attached objects on tag                 
DEBUG find VirtualMachine objects                  
FATAL Failed to destroy cluster: object references is empty 
[jdiaz@minigoomba os-install-4.6-rc4]$ echo $?
1
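The "object references is empty" message suggests that a property lookup was called with an empty list of objects, although the log does not confirm where it originates. A minimal sketch of the graceful behavior requested under "What did you expect to happen?" is shown below. All names are hypothetical, and retrieve_properties is a stand-in that merely imitates a lookup rejecting empty input; it is not the installer's or any library's actual API.

```python
def retrieve_properties(refs):
    """Hypothetical lookup that, like the failing call, rejects empty input."""
    if not refs:
        raise ValueError("object references is empty")
    return [{"ref": ref} for ref in refs]


def find_virtual_machines(tagged_refs):
    """Return properties of tagged VMs, or an empty list if there are none."""
    vm_refs = [ref for ref in tagged_refs if ref.startswith("VirtualMachine:")]
    if not vm_refs:
        # Nothing carries the cluster tag: treat this as "nothing to destroy"
        # instead of passing an empty list to the lookup.
        return []
    return retrieve_properties(vm_refs)
```

Note that the fix recorded for this bug is the earlier tagging of the template; a guard like this would only address the error message, not the leftover template.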

Comment 2 Patrick Dillon 2020-11-18 16:09:02 UTC
To test this, run a vSphere install with log-level debug. Once you start seeing "vsphereprivate_import_ova.import: Still creating...", interrupt the installer, then run a destroy.
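The test steps above could be scripted roughly as follows. This is a sketch under several assumptions: a POSIX system, an installer that exits on a single SIGINT, and log output that reaches the pipe line by line. The helper names (interrupt_on_marker, reproduce) are made up for illustration; the only installer invocations used are the create cluster and destroy cluster commands with --dir and --log-level.

```python
import signal
import subprocess
import sys

MARKER = "vsphereprivate_import_ova.import: Still creating..."


def interrupt_on_marker(cmd, marker=MARKER):
    """Run cmd and send SIGINT the first time a log line contains marker.

    Returns (marker_seen, return_code).
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    seen = False
    for line in proc.stdout:
        sys.stdout.write(line)              # echo the log while watching it
        if not seen and marker in line:
            seen = True
            proc.send_signal(signal.SIGINT)  # interrupt during the OVA import
    return seen, proc.wait()


def reproduce(installer, directory):
    """Interrupt an install during the OVA import, then run destroy."""
    create = [installer, "create", "cluster", "--dir", directory,
              "--log-level", "debug"]
    seen, _ = interrupt_on_marker(create)
    if not seen:
        raise RuntimeError("installer exited before the OVA import started")
    destroy = [installer, "destroy", "cluster", "--dir", directory,
               "--log-level", "debug"]
    return subprocess.run(destroy).returncode
```

Calling reproduce("./openshift-install", "ipi/") against a prepared install directory should return 0 on a fixed build if destroy succeeds; on an affected build the destroy step is expected to exit non-zero, as in the report.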

Comment 4 jima 2020-11-23 03:36:02 UTC
Verified with an IPI installation on vSphere using 4.7.0-0.nightly-2020-11-22-204912; the test passed.

# ./openshift-install destroy cluster --dir ipi/ --log-level debug
DEBUG OpenShift Installer 4.7.0-0.nightly-2020-11-22-204912 
DEBUG Built from commit 68282c185253d4831514b20623b1717535c5e6f2 
DEBUG Find attached objects on tag                 
DEBUG Find VirtualMachine objects                  
DEBUG Delete VirtualMachines                       
INFO Destroyed                                     VirtualMachine=jimaipi-qwz86-rhcos
DEBUG Find Folder objects                          
DEBUG Delete Folder                                
INFO Destroyed                                     Folder=jimaipi-qwz86
DEBUG Delete tag                                   
DEBUG Delete tag category                          
DEBUG Purging asset "Metadata" from disk           
DEBUG Purging asset "Terraform Variables" from disk 
DEBUG Purging asset "Kubeconfig Admin Client" from disk 
DEBUG Purging asset "Kubeadmin Password" from disk 
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk 
INFO Time elapsed: 9m33s

Comment 7 errata-xmlrpc 2021-02-24 15:26:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633