Cause: The installer did not tag the vSphere OVF template until after the upload had finished.
Consequence: The template upload can take a long time. If installation terminated before the upload finished, the template was left untagged and therefore could not be removed by the destroy command. Because the template could not be removed, its folder was not empty and could not be deleted either.
Fix: Tagging was moved to the beginning of the OVF template upload.
Result: A partial template left behind by quitting installation early can be removed. Destroying the template and folder succeeds even when installation quits in the middle of the template upload.
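The ordering change described above can be illustrated with a small in-memory simulation. This is only a sketch and not the installer's code: FakeInventory, import_template, destroy, and the "demo-" names are all hypothetical stand-ins, and the only behavior taken from this report is that destroy locates objects by their tag.

```python
class UploadInterrupted(Exception):
    """Raised by the fake upload to mimic an install that dies mid-upload."""


class FakeInventory:
    """Hypothetical in-memory stand-in for vCenter: object name -> set of tags."""

    def __init__(self):
        self.objects = {}

    def create(self, name):
        self.objects[name] = set()

    def tag(self, name, tag):
        self.objects[name].add(tag)

    def find_tagged(self, tag):
        return sorted(n for n, tags in self.objects.items() if tag in tags)

    def delete(self, name):
        del self.objects[name]


def import_template(inv, name, cluster_tag, tag_first, upload):
    """Create the template object, then tag it before or after the upload."""
    inv.create(name)
    if tag_first:
        inv.tag(name, cluster_tag)  # new ordering: tag before the slow upload
    upload()  # may raise if the installation is interrupted
    if not tag_first:
        inv.tag(name, cluster_tag)  # old ordering: never reached on failure


def destroy(inv, cluster_tag):
    """Delete only what the tag points at, as the destroy flow in this report does."""
    removed = inv.find_tagged(cluster_tag)
    for name in removed:
        inv.delete(name)
    return removed


def failing_upload():
    raise UploadInterrupted("connection timed out")


def simulate(tag_first):
    """Return the objects left behind after a failed import followed by destroy."""
    inv = FakeInventory()
    try:
        import_template(inv, "demo-rhcos", "demo-cluster", tag_first, failing_upload)
    except UploadInterrupted:
        pass
    destroy(inv, "demo-cluster")
    return sorted(inv.objects)


if __name__ == "__main__":
    print("tag after upload, leftovers:", simulate(tag_first=False))
    print("tag before upload, leftovers:", simulate(tag_first=True))
```

With the old ordering the untagged partial template survives destroy. With the new ordering it is found through its tag and removed.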
Created attachment 1722913
openshift install log
Version:
[jdiaz@minigoomba os-install-4.6-rc4]$ ./openshift-install version
./openshift-install 4.6.0-rc.4
built from commit ebdbda57fc18d3b73e69f0f2cc499ddfca7e6593
release image registry.svc.ci.openshift.org/ocp/release@sha256:2c22e1c56831935a24efb827d2df572855ccd555c980070f77c39729526037d5
Platform: vSphere
Please specify:
* IPI
What happened?
Cluster installation failed while creating/importing the RHCOS image.
I then tried to destroy any resources created before the failure so that a second installation attempt would be possible, but the destroy command fails with an error.
What did you expect to happen?
The destroy command should gracefully notice that there is nothing to delete/destroy.
How to reproduce it (as minimally and precisely as possible)?
Set up an environment where the RHCOS import fails, or just force quit the installer during the RHCOS image import. With the installation left incomplete, run the destroy command:
$ ./openshift-install destroy cluster --dir vsphere --log-level=debug
Anything else we need to know?
Here are the logs of the end of the failed install:
DEBUG vsphereprivate_import_ova.import: Still creating... [1m40s elapsed]
DEBUG vsphereprivate_import_ova.import: Still creating... [1m50s elapsed]
DEBUG vsphereprivate_import_ova.import: Still creating... [2m0s elapsed]
DEBUG vsphereprivate_import_ova.import: Still creating... [2m10s elapsed]
ERROR
ERROR Error: failed to upload: Post "https://10.3.32.7/nfc/528eb5b2-eca4-9d4a-3126-6c97584cb1fa/disk-0.vmdk": dial tcp 10.3.32.7:443: connect: connection timed out
ERROR
ERROR on ../../../../tmp/openshift-install-278843614/main.tf line 43, in resource "vsphereprivate_import_ova" "import":
ERROR 43: resource "vsphereprivate_import_ova" "import" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply Terraform: failed to complete the change
And here is the error when trying to 'destroy cluster':
[jdiaz@minigoomba os-install-4.6-rc4]$ ./openshift-install destroy cluster --dir vmc --log-level=debug
DEBUG OpenShift Installer 4.6.0-rc.4
DEBUG Built from commit ebdbda57fc18d3b73e69f0f2cc499ddfca7e6593
DEBUG find attached objects on tag
DEBUG find VirtualMachine objects
FATAL Failed to destroy cluster: object references is empty
[jdiaz@minigoomba os-install-4.6-rc4]$ echo $?
1
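One way to get the graceful behavior requested above is to return early when the tag has no attached objects, instead of passing an empty list to a batch lookup. The sketch below is illustrative only: the log shows just that the failure follows "find VirtualMachine objects", so the idea that a batch call rejects an empty list is an assumption, and attached_objects, batch_lookup, and find_virtual_machines are hypothetical names, not installer or vSphere API functions.

```python
def attached_objects(tag_index, tag):
    """Hypothetical lookup: references attached to a tag (empty if none)."""
    return list(tag_index.get(tag, []))


def batch_lookup(refs):
    """Hypothetical batch API that, by assumption, rejects an empty request."""
    if not refs:
        raise ValueError("object references is empty")
    return [{"ref": r} for r in refs]


def find_virtual_machines(tag_index, tag):
    """Return VM records for a tag, treating 'nothing attached' as a no-op."""
    refs = [r for r in attached_objects(tag_index, tag)
            if r.startswith("VirtualMachine:")]
    if not refs:
        # Nothing was tagged, so there is nothing to destroy; skip the batch call.
        return []
    return batch_lookup(refs)


if __name__ == "__main__":
    print(find_virtual_machines({}, "demo-cluster"))
    print(find_virtual_machines(
        {"demo-cluster": ["VirtualMachine:vm-1", "Folder:group-1"]},
        "demo-cluster"))
```

The guard keeps the empty case out of the batch call, so a cluster with no tagged objects is reported as already clean rather than as a fatal error.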
To test this, run a vSphere install with the log level set to debug. Once you start seeing "vsphereprivate_import_ova.import: Still creating...", interrupt the installer and then run the destroy command.
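The test procedure above could be scripted roughly as follows. This is a sketch under assumptions: interrupt_on_marker and reproduce are hypothetical helper names, the installer is assumed to be at ./openshift-install, and SIGINT is assumed to be an acceptable way to interrupt it. The destroy invocation is the one shown in this report.

```python
import signal
import subprocess
import sys

# Log text that indicates the OVA import is in progress (taken from the logs in this report).
MARKER = "vsphereprivate_import_ova.import: Still creating..."


def interrupt_on_marker(cmd, marker=MARKER):
    """Run cmd, send SIGINT the first time marker appears, return whether it was seen."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    seen = False
    for line in proc.stdout:
        sys.stdout.write(line)  # echo the installer log as it arrives
        if not seen and marker in line:
            seen = True
            proc.send_signal(signal.SIGINT)  # interrupt mid-upload
    proc.wait()
    return seen


def reproduce(install_dir, installer="./openshift-install"):
    """Interrupt an install during the template import, then run destroy."""
    seen = interrupt_on_marker(
        [installer, "create", "cluster", "--dir", install_dir, "--log-level=debug"])
    if not seen:
        raise RuntimeError("installer exited before the OVA import started")
    result = subprocess.run(
        [installer, "destroy", "cluster", "--dir", install_dir, "--log-level=debug"])
    return result.returncode  # 0 is expected once the fix is in place


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(reproduce(sys.argv[1]))
```

Run it with the installation directory as the only argument. A non-zero exit status from destroy reproduces the original failure.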
Verified with an IPI install on vSphere using 4.7.0-0.nightly-2020-11-22-204912; the test passed.
# ./openshift-install destroy cluster --dir ipi/ --log-level debug
DEBUG OpenShift Installer 4.7.0-0.nightly-2020-11-22-204912
DEBUG Built from commit 68282c185253d4831514b20623b1717535c5e6f2
DEBUG Find attached objects on tag
DEBUG Find VirtualMachine objects
DEBUG Delete VirtualMachines
INFO Destroyed VirtualMachine=jimaipi-qwz86-rhcos
DEBUG Find Folder objects
DEBUG Delete Folder
INFO Destroyed Folder=jimaipi-qwz86
DEBUG Delete tag
DEBUG Delete tag category
DEBUG Purging asset "Metadata" from disk
DEBUG Purging asset "Terraform Variables" from disk
DEBUG Purging asset "Kubeconfig Admin Client" from disk
DEBUG Purging asset "Kubeadmin Password" from disk
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk
INFO Time elapsed: 9m33s
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:5633