Bug 1953019

Summary: [Installer][baremetal][metal3] The baremetal IPI installer fails on delete cluster with: failed to clean baremetal bootstrap storage pool
Product: OpenShift Container Platform
Reporter: Andreas Karis <akaris>
Component: Installer
Assignee: Kiran Thyagaraja <kiran>
Installer sub component: OpenShift on Bare Metal IPI
QA Contact: Polina Rabinovich <prabinov>
Status: CLOSED ERRATA
Docs Contact:
Severity: low
Priority: low
CC: prabinov, sdasu
Version: 4.7
Keywords: Triaged
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 23:03:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Andreas Karis 2021-04-23 18:04:52 UTC
[Installer][baremetal][metal3] The baremetal IPI installer fails on delete cluster with: failed to clean baremetal bootstrap storage pool

{code}
[root@openshift-jumpserver-0 ~]# ./openshift-baremetal-install version
./openshift-baremetal-install 4.7.5
built from commit e15f17c958b4a04e770c0cfe758ca69452874508
release image quay.io/openshift-release-dev/ocp-release@sha256:0a4c44daf1666f069258aa983a66afa2f3998b78ced79faa6174e0a0f438f0a5
{code}

After a successful installation, the libvirt bootstrap VM and the storage pool are already deleted by the installer:
{code}

[root@openshift-jumpserver-0 ~]# virsh pool-list
 Name      State    Autostart
-------------------------------
 default   active   yes

[root@openshift-jumpserver-0 ~]# virsh list --all
 Id   Name   State
--------------------

{code}

Therefore, when someone tries to destroy the cluster, it fails:
{code}
[root@openshift-jumpserver-0 ~]# ./openshift-baremetal-install destroy cluster  --log-level=debug --dir=openshift-install
DEBUG OpenShift Installer 4.7.5                    
DEBUG Built from commit e15f17c958b4a04e770c0cfe758ca69452874508 
DEBUG Deleting bare metal resources                
DEBUG Deleting baremetal bootstrap volumes         
FATAL Failed to destroy cluster: failed to clean baremetal bootstrap storage pool: get storage pool "ipi-cluster-kfczp-bootstrap": virError(Code=49, Domain=18, Message='Storage pool not found: no storage pool with matching name 'ipi-cluster-kfczp-bootstrap'') 
{code}

https://github.com/openshift/installer/blob/fae650e24e7036b333b2b2d9dfb5a08a29cd07b1/pkg/destroy/baremetal/baremetal.go#L59

I do not think the installer should throw a FATAL error when a resource that it wants to clean up cannot be found. If the pool cannot be found, the storage is presumably already gone. The installer should therefore log a warning or info message and continue the operation.
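
To illustrate the behavior I am asking for, here is a minimal Go sketch. It is not the installer's actual code: `PoolStore`, `memStore`, `ErrPoolNotFound`, and `CleanBootstrapPool` are hypothetical names, and the in-memory store only stands in for the libvirt connection. The point is that a "not found" result during cleanup is turned into a warning, while any other error is still returned to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPoolNotFound is a hypothetical sentinel standing in for the libvirt
// "Storage pool not found" error shown in the log above.
var ErrPoolNotFound = errors.New("storage pool not found")

// PoolStore is a hypothetical abstraction over the libvirt connection.
type PoolStore interface {
	// Delete removes the named pool, or returns an error wrapping
	// ErrPoolNotFound if no such pool exists.
	Delete(name string) error
}

// memStore is an in-memory PoolStore used only for this illustration.
type memStore map[string]bool

func (m memStore) Delete(name string) error {
	if !m[name] {
		return fmt.Errorf("get storage pool %q: %w", name, ErrPoolNotFound)
	}
	delete(m, name)
	return nil
}

// CleanBootstrapPool deletes the bootstrap pool. A missing pool is treated
// as already cleaned up: the function returns a warning message and a nil
// error. Any other failure is returned as an error.
func CleanBootstrapPool(s PoolStore, name string) (string, error) {
	err := s.Delete(name)
	switch {
	case err == nil:
		return "", nil
	case errors.Is(err, ErrPoolNotFound):
		// Nothing left to delete, so warn and let destroy continue.
		return fmt.Sprintf("Unable to get storage pool %s: %v", name, err), nil
	default:
		return "", fmt.Errorf("failed to clean baremetal bootstrap storage pool: %w", err)
	}
}

func main() {
	// Only the "default" pool exists, as in the virsh pool-list output above.
	store := memStore{"default": true}
	warning, err := CleanBootstrapPool(store, "ipi-cluster-kfczp-bootstrap")
	fmt.Println("warning:", warning)
	fmt.Println("err:", err)
}
```

With something along these lines, destroy cluster would log the warning and carry on purging assets instead of aborting.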

Comment 7 Andreas Karis 2021-05-14 08:49:03 UTC
I tested with the latest nightly, and this looks much better now:

[root@openshift-jumpserver-0 ~]#  ./openshift-baremetal-install-4.8.0-0.nightly-2021-05-13-222446 destroy cluster  --log-level=debug --dir=openshift-install
DEBUG OpenShift Installer 4.8.0-0.nightly-2021-05-13-222446 
DEBUG Built from commit 7ba3f375977b7e2a0adc856db3f258f2c53b8aef 
DEBUG Deleting bare metal resources                
DEBUG Deleting baremetal bootstrap volumes         
WARNING Unable to get storage pool ipi-cluster-lhmww-bootstrap: virError(Code=49, Domain=18, Message='Storage pool not found: no storage pool with matching name 'ipi-cluster-lhmww-bootstrap'') 
DEBUG FIXME: delete resources!                     
DEBUG Purging asset "Metadata" from disk           
DEBUG Purging asset "Master Ignition Customization Check" from disk 
DEBUG Purging asset "Worker Ignition Customization Check" from disk 
DEBUG Purging asset "Terraform Variables" from disk 
DEBUG Purging asset "Kubeconfig Admin Client" from disk 
DEBUG Purging asset "Kubeadmin Password" from disk 
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk 
DEBUG Purging asset "Cluster" from disk            
INFO Time elapsed: 1s

Comment 8 Polina Rabinovich 2021-05-14 08:53:36 UTC
Verified with Andreas Karis

Comment 11 errata-xmlrpc 2021-07-27 23:03:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform
4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438