Description of problem:
When an OCP4 installation fails and you run the `openshift-install destroy cluster` command, the bootstrap VM is left in the engine.

Version-Release number of the following components:
./openshift-install v4.4.0
built from commit 4010a2e42a95fc8eace70d629653b6f60a27b021
release image quay.io/openshift-release-dev/ocp-release-nightly@sha256:5c50516bd5669faec3729fa4e4705d073f9c720c769df4c77afe05dc20533963

How reproducible:
100%

Steps to Reproduce:
1. Run an OCP4 installation that fails but manages to create master nodes and a bootstrap node
2. Run the `openshift-install destroy cluster` command against your cluster

Actual results:
./openshift-install destroy cluster --dir=test-cluster --log-level=debug
DEBUG OpenShift Installer v4.4.0
DEBUG Built from commit 4010a2e42a95fc8eace70d629653b6f60a27b021
INFO searching VMs by tag=purple-zdcjt
INFO Found %!s(int=5) VMs
INFO Stopping VM purple-zdcjt-master-2 : errors: %s%!(EXTRA <nil>)
INFO Stopping VM purple-zdcjt-worker-0-9fhtt : errors: %s%!(EXTRA <nil>)
INFO Stopping VM purple-zdcjt-master-1 : errors: %s%!(EXTRA <nil>)
INFO Stopping VM purple-zdcjt-master-0 : errors: %s%!(EXTRA <nil>)
INFO Stopping VM purple-zdcjt-worker-0-m8clh : errors: %s%!(EXTRA <nil>)
INFO Removing VM purple-zdcjt-master-0 : errors: %s%!(EXTRA <nil>)
INFO Removing VM purple-zdcjt-master-1 : errors: %s%!(EXTRA <nil>)
INFO Removing VM purple-zdcjt-worker-0-9fhtt : errors: %s%!(EXTRA <nil>)
INFO Removing VM purple-zdcjt-master-2 : errors: %s%!(EXTRA <nil>)
INFO Removing VM purple-zdcjt-worker-0-m8clh : errors: %s%!(EXTRA <nil>)
ERROR Removing VMs - error: %!s(<nil>)
INFO Removing tag purple-zdcjt : errors: %s%!(EXTRA <nil>)
ERROR Removing Tag - error: %!s(<nil>)
ERROR Removing Template - error: %!s(<nil>)
DEBUG Purging asset "Terraform Variables" from disk
DEBUG Purging asset "Kubeconfig Admin Client" from disk
DEBUG Purging asset "Kubeadmin Password" from disk
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk
DEBUG Purging asset "Metadata" from disk
DEBUG Purging asset "Cluster" from disk

Expected results:
Bootstrap VM should be removed as well
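[Editor's note] For context, the destroy flow searches the engine for VMs by the cluster tag ("searching VMs by tag=purple-zdcjt" above). A minimal sketch of that kind of tag search, assuming the oVirt Go SDK (github.com/ovirt/go-ovirt) and made-up engine URL and credentials:

    package main

    import (
        "fmt"
        "log"

        ovirtsdk4 "github.com/ovirt/go-ovirt"
    )

    func main() {
        // Hypothetical engine URL and credentials, for illustration only.
        conn, err := ovirtsdk4.NewConnectionBuilder().
            URL("https://engine.example.com/ovirt-engine/api").
            Username("admin@internal").
            Password("secret").
            Insecure(true).
            Build()
        if err != nil {
            log.Fatalf("failed to connect to the engine: %v", err)
        }
        defer conn.Close()

        // Search for VMs carrying the cluster tag; only tagged VMs are returned.
        tag := "purple-zdcjt" // the infra ID used as the tag name in the log above
        resp, err := conn.SystemService().VmsService().List().
            Search(fmt.Sprintf("tag=%s", tag)).
            Send()
        if err != nil {
            log.Fatalf("failed to search VMs by tag: %v", err)
        }
        if vms, ok := resp.Vms(); ok {
            log.Printf("found %d VMs tagged %s", len(vms.Slice()), tag)
            for _, vm := range vms.Slice() {
                name, _ := vm.Name()
                log.Printf("  %s", name)
            }
        }
    }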
*** Bug 1818529 has been marked as a duplicate of this bug. ***
Also those "errors: %s%!(EXTRA <nil>)" prints should be cleaned up.
> Also those "errors: %s%!(EXTRA <nil>)" prints should be cleaned up.

This was handled for 4.5 via [1]. But if you want that backported to earlier 4.y, you should create a separate bug series, because the logging fix has nothing to do with the bootstrap cleanup issue this bug is about.

[1]: https://github.com/openshift/installer/pull/3445
(In reply to W. Trevor King from comment #3)
> > Also those "errors: %s%!(EXTRA <nil>)" prints should be cleaned up.
>
> This was handled for 4.5 via [1]. But if you want that backported to
> earlier 4.y, you should create a separate bug series, because the logging
> fix has nothing to do with the bootstrap cleanup issue this bug is about.
>
> [1]: https://github.com/openshift/installer/pull/3445

Well, I had opened Bug 1818529 but marked it as a dupe. You have a point -- it's probably not a dupe.
*** Bug 1836342 has been marked as a duplicate of this bug. ***
I don't see any problem with the code in installer/pkg/destroy/ovirt/destroyer.go.

What I have found: neither the bootstrap VM nor the tmp VM is tagged. Working on a possible patch.
Here is a PR with a Go example of how to detect whether VMs are tagged or not: https://github.com/oVirt/ovirt-engine-sdk-go/pull/205/commits/0da9b64a27fedc3b41ec0057d42c80223b559dc9
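[Editor's note] In the same spirit as that example, a rough sketch of checking whether a particular VM carries a given tag, assuming the oVirt Go SDK exposes the VM's assigned tags via VmService().TagsService() (the package name, helper name, and IDs below are made up):

    package tags

    import (
        ovirtsdk4 "github.com/ovirt/go-ovirt"
    )

    // vmHasTag reports whether the VM identified by vmID carries a tag named
    // tagName, by listing the tags assigned to that VM and comparing names.
    func vmHasTag(conn *ovirtsdk4.Connection, vmID, tagName string) (bool, error) {
        resp, err := conn.SystemService().VmsService().
            VmService(vmID).
            TagsService().
            List().
            Send()
        if err != nil {
            return false, err
        }
        if tags, ok := resp.Tags(); ok {
            for _, tag := range tags.Slice() {
                if name, ok := tag.Name(); ok && name == tagName {
                    return true, nil
                }
            }
        }
        return false, nil
    }

Used as, e.g., vmHasTag(conn, bootstrapVMID, clusterID), this would confirm that the bootstrap VM is missing the cluster tag and therefore never shows up in the destroy-by-tag search.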
(In reply to Greg Sheremeta from comment #2)
> Also those "errors: %s%!(EXTRA <nil>)" prints should be cleaned up.

+1
(In reply to Douglas Schilling Landgraf from comment #10)
> (In reply to Greg Sheremeta from comment #2)
> > Also those "errors: %s%!(EXTRA <nil>)" prints should be cleaned up.
>
> +1

Per comment 3, already done.
(In reply to Greg Sheremeta from comment #11)
> (In reply to Douglas Schilling Landgraf from comment #10)
> > (In reply to Greg Sheremeta from comment #2)
> > > Also those "errors: %s%!(EXTRA <nil>)" prints should be cleaned up.
> >
> > +1
>
> Per comment 3, already done.

Yep, I have just tried that in my local env. I can't see it anymore.
Moving back to ASSIGNED to keep this on my radar. We will need a second patch.
I think the temp VM should have a different tag name than $cluster_id, because that means it is only going to be removed when destroying the cluster, so it will keep being a waste during the lifetime of the cluster.

Instead we should destroy it on destroy bootstrap, where it more logically belongs. To do that, all we need is to tag it with $cluster_id-bootstrap, and then in the installer code remove all VMs by this tag. The bootstrap VM should also have the same tag, so both get removed.

To overcome the problem where you can't define the tag twice and have it updated, what we need is to declare the tag once, but have it assigned a list of VM ids, just like the masters:

resource "ovirt_tag" "cluster_tag" {
  name   = var.cluster_id
  vm_ids = [for instance in ovirt_vm.master.* : instance.id]
}

But instead of ovirt_vm.master.* we should replace it with ovirt_vm.bootstrap.* and concat ovirt_vm.tmp_import_vm, so:

resource "ovirt_tag" "cluster_bootstrap_tag" {
  name   = "${var.cluster_id}-bootstrap"
  vm_ids = [concat(ovirt_vm.bootstrap.id, tmp_import_vm_id)]
}

You would need to pass the tmp_import_vm_id to the bootstrap module.
(In reply to Roy Golan from comment #14)
> I think the temp VM should have a different tag name than $cluster_id,
> because that means it is only going to be removed when destroying the
> cluster, so it will keep being a waste during the lifetime of the cluster.
>
> Instead we should destroy it on destroy bootstrap, where it more logically
> belongs. To do that, all we need is to tag it with $cluster_id-bootstrap,
> and then in the installer code remove all VMs by this tag. The bootstrap
> VM should also have the same tag, so both get removed.
>
> To overcome the problem where you can't define the tag twice and have it
> updated, what we need is to declare the tag once, but have it assigned a
> list of VM ids, just like the masters:
>
> resource "ovirt_tag" "cluster_tag" {
>   name   = var.cluster_id
>   vm_ids = [for instance in ovirt_vm.master.* : instance.id]
> }
>
> But instead of ovirt_vm.master.* we should replace it with
> ovirt_vm.bootstrap.* and concat ovirt_vm.tmp_import_vm, so:
>
> resource "ovirt_tag" "cluster_bootstrap_tag" {
>   name   = "${var.cluster_id}-bootstrap"
>   vm_ids = [concat(ovirt_vm.bootstrap.id, tmp_import_vm_id)]
> }
>
> You would need to pass the tmp_import_vm_id to the bootstrap module.

Replied on GitHub.
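[Editor's note] For illustration, the installer-side half of that proposal (the "remove all VMs by this tag" step during destroy bootstrap) could look roughly like the sketch below. This is only a sketch, assuming the oVirt Go SDK (github.com/ovirt/go-ovirt) and a connection established elsewhere; removeVMsByTag and the ${cluster_id}-bootstrap tag name are hypothetical, and the real destroyer would also wait for each VM to actually report down before removing it.

    package destroy

    import (
        "fmt"
        "log"

        ovirtsdk4 "github.com/ovirt/go-ovirt"
    )

    // removeVMsByTag stops and removes every VM carrying the given tag.
    // Illustrative only: error handling is simplified and the wait for the
    // VM to reach the "down" state before removal is omitted.
    func removeVMsByTag(conn *ovirtsdk4.Connection, tag string) error {
        vmsService := conn.SystemService().VmsService()
        resp, err := vmsService.List().
            Search(fmt.Sprintf("tag=%s", tag)).
            Send()
        if err != nil {
            return fmt.Errorf("searching VMs by tag %q: %w", tag, err)
        }
        vms, ok := resp.Vms()
        if !ok {
            return nil // nothing carries the tag, nothing to do
        }
        for _, vm := range vms.Slice() {
            id, _ := vm.Id()
            name, _ := vm.Name()
            vmService := vmsService.VmService(id)
            // Stop first; the engine refuses to remove a running VM.
            if _, err := vmService.Stop().Send(); err != nil {
                log.Printf("stopping VM %s: %v", name, err)
            }
            if _, err := vmService.Remove().Send(); err != nil {
                return fmt.Errorf("removing VM %s: %w", name, err)
            }
            log.Printf("removed VM %s", name)
        }
        return nil
    }

With the Terraform tagging proposed above, destroy bootstrap would then call something like removeVMsByTag(conn, clusterID+"-bootstrap"), taking both the temp import VM and the bootstrap VM with it.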
Due to capacity constraints, we will be revisiting this bug in the upcoming sprint.
*** Bug 1855861 has been marked as a duplicate of this bug. ***
Verified on: 4.6.0-0.nightly-2020-07-22-074636

Steps:
1. # openshift-install create cluster --log-level=debug --dir=resources
2. Somehow cancel the installation once it passes the "Creating infrastructure resources" step (CTRL+C, kill the process, turn off the DNS...), then:
   # openshift-install destroy cluster --dir=resources
3. Check in the engine UI whether the bootstrap VM is still there (Compute -> Virtual Machines)

Results: bootstrap VM deleted
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196