Bug 1920552

Summary: error when destroying a vSphere installation that failed early
Product: OpenShift Container Platform
Reporter: Patrick Dillon <padillon>
Component: Installer
Assignee: Patrick Dillon <padillon>
Installer sub component: openshift-installer
QA Contact: jima
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: unspecified
CC: jima, mstaeble
Version: 4.6
Target Milestone: ---
Target Release: 4.6.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-03-09 20:16:08 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1889779
Bug Blocks: 1920554

Description Patrick Dillon 2021-01-26 14:55:58 UTC
This bug was initially created as a copy of Bug #1889779

I am copying this bug because: 




Version:

[jdiaz@minigoomba os-install-4.6-rc4]$ ./openshift-install version
./openshift-install 4.6.0-rc.4
built from commit ebdbda57fc18d3b73e69f0f2cc499ddfca7e6593
release image registry.svc.ci.openshift.org/ocp/release@sha256:2c22e1c56831935a24efb827d2df572855ccd555c980070f77c39729526037d5


Platform: vSphere

Please specify:
* IPI

What happened?
Cluster installation failed while creating/importing the RHCOS image.
I then tried to destroy whatever resources had been created up to the failure, to allow a second installation attempt, and the destroy command failed with an error.


What did you expect to happen?

The destroy command should gracefully notice that there is nothing to delete/destroy.

How to reproduce it (as minimally and precisely as possible)?

Set up an environment where the RHCOS import fails, or just force quit the installer during the RHCOS image import. With the installation left incomplete, run the destroy command:

$ ./openshift-install destroy cluster --dir vsphere --log-level=debug 

Anything else we need to know?

Here are the logs of the end of the failed install:

DEBUG vsphereprivate_import_ova.import: Still creating... [1m40s elapsed] 
DEBUG vsphereprivate_import_ova.import: Still creating... [1m50s elapsed] 
DEBUG vsphereprivate_import_ova.import: Still creating... [2m0s elapsed] 
DEBUG vsphereprivate_import_ova.import: Still creating... [2m10s elapsed] 
ERROR                                              
ERROR Error: failed to upload: Post "https://10.3.32.7/nfc/528eb5b2-eca4-9d4a-3126-6c97584cb1fa/disk-0.vmdk": dial tcp 10.3.32.7:443: connect: connection timed out 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-278843614/main.tf line 43, in resource "vsphereprivate_import_ova" "import": 
ERROR   43: resource "vsphereprivate_import_ova" "import" { 
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change 


And here is the error when trying to 'destroy cluster':

[jdiaz@minigoomba os-install-4.6-rc4]$ ./openshift-install destroy cluster --dir vmc --log-level=debug
DEBUG OpenShift Installer 4.6.0-rc.4               
DEBUG Built from commit ebdbda57fc18d3b73e69f0f2cc499ddfca7e6593 
DEBUG find attached objects on tag                 
DEBUG find VirtualMachine objects                  
FATAL Failed to destroy cluster: object references is empty 
[jdiaz@minigoomba os-install-4.6-rc4]$ echo $?
1
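
The failure mode above can be sketched in Go as a missing empty-list guard. This is a minimal illustration only, not the installer's code and not the change made in the linked pull request: the names ref, retrieve and destroyVMs are hypothetical, and the idea that the FATAL message comes from a batch lookup rejecting an empty list of references is an assumption based on the wording of the message.

```go
package main

import (
	"errors"
	"fmt"
)

// ref is a hypothetical stand-in for a vSphere managed object reference.
type ref string

// retrieve mimics a batch lookup that rejects an empty reference list.
// That this is what produces the FATAL message is an assumption.
func retrieve(refs []ref) ([]string, error) {
	if len(refs) == 0 {
		return nil, errors.New("object references is empty")
	}
	names := make([]string, 0, len(refs))
	for _, r := range refs {
		names = append(names, string(r))
	}
	return names, nil
}

// destroyVMs returns the names of the objects it would delete.
// An empty list means there is nothing to delete, so it returns
// early instead of passing the empty list on to retrieve.
func destroyVMs(refs []ref) ([]string, error) {
	if len(refs) == 0 {
		return nil, nil
	}
	return retrieve(refs)
}

func main() {
	// No virtual machines were attached to the tag: treated as a no-op.
	deleted, err := destroyVMs(nil)
	fmt.Println(len(deleted), err)
}
```

With a guard of this kind, a destroy run against an installation that failed before any virtual machine existed would move on to the next resource type rather than exit with status 1.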

Comment 2 Patrick Dillon 2021-01-26 15:14:24 UTC
This just needs a cherry-pick, but it depends on https://github.com/openshift/installer/pull/4579

Comment 4 jima 2021-03-01 02:01:47 UTC
Tested on 4.6.0-0.nightly-2021-02-26-224651 and passed.

Made the installation fail at the OVA template import step:
ERROR                                              
ERROR Error: failed to upload: Post "https://10.3.32.8/nfc/52239649-17a4-85da-43d4-c84c826e71a4/disk-0.vmdk": dial tcp 10.3.32.8:443: connect: connection timed out 
ERROR                                              
ERROR   on ../../../../tmp/openshift-install-638830781/main.tf line 43, in resource "vsphereprivate_import_ova" "import": 
ERROR   43: resource "vsphereprivate_import_ova" "import" { 
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change 

Then ran the destroy command to remove all resources created on vSphere:
# ./openshift-install destroy cluster --dir ipi/ --log-level debug
DEBUG OpenShift Installer 4.6.0-0.nightly-2021-02-26-224651 
DEBUG Built from commit 9c86c823fff234c104f574eaf25953485edfe4b1 
DEBUG Find attached objects on tag                 
DEBUG Find VirtualMachine objects                  
DEBUG Delete VirtualMachines                       
INFO Destroyed                                     VirtualMachine=jimavmc-cjctc-rhcos
DEBUG Find Folder objects                          
DEBUG Delete Folder                                
INFO Destroyed                                     Folder=jimavmc-cjctc
DEBUG Delete tag                                   
DEBUG Delete tag category                          
DEBUG Purging asset "Metadata" from disk           
DEBUG Purging asset "Terraform Variables" from disk 
DEBUG Purging asset "Kubeconfig Admin Client" from disk 
DEBUG Purging asset "Kubeadmin Password" from disk 
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk 
INFO Time elapsed: 2m37s

Comment 7 errata-xmlrpc 2021-03-09 20:16:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.20 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0674