Bug 1925249

Summary: KMS resources should be garbage collected when StorageCluster is deleted
Product: [Red Hat Storage] Red Hat OpenShift Container Storage Reporter: Filip Balák <fbalak>
Component: ocs-operator Assignee: arun kumar mohan <amohan>
Status: CLOSED ERRATA QA Contact: Filip Balák <fbalak>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.7 CC: amohan, gshanmug, jarrpa, jthottan, madam, mbukatov, mrajanna, muagarwa, nberry, ndevos, ocs-bugs, shan, shmohan, sostapov
Target Milestone: --- Keywords: AutomationBackLog
Target Release: OCS 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.7.0-324.ci Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-05-19 09:19:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Installation wizard with error message none

Description Filip Balák 2021-02-04 16:58:23 UTC
Created attachment 1755094 [details]
Installation wizard with error message

Description of problem (please be as detailed as possible and provide log
snippets):
When a user tries to recreate a StorageCluster after a StorageCluster with KMS was deleted, or tries to create additional StorageClusters with KMS enabled, the installation wizard shows an error message: An error occurred - secrets "ocs-kms-token" already exists

Version of all relevant components (if applicable):
ocs-operator.v4.7.0-250.ci


Is there any workaround available to the best of your knowledge?
To reinstall the StorageCluster, the user can delete the secret ocs-kms-token and the configmaps ocs-kms-connection-details and csi-kms-connection-details.
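
The manual cleanup can be sketched as a small script. This is only an illustration, not part of the original report: it assumes the resources live in the openshift-storage namespace and that the `oc` client is logged in to the cluster. The helper names `cleanup_commands` and `run_cleanup` are hypothetical. By default the script only prints the commands it would run.

```python
import subprocess

# Assumption: the StorageCluster was installed in the default OCS namespace.
NAMESPACE = "openshift-storage"

# Leftover KMS resources named in the workaround.
LEFTOVERS = [
    ("secret", "ocs-kms-token"),
    ("configmap", "ocs-kms-connection-details"),
    ("configmap", "csi-kms-connection-details"),
]


def cleanup_commands(namespace=NAMESPACE):
    """Return one `oc delete` argument list per leftover resource."""
    return [
        ["oc", "delete", kind, name, "-n", namespace, "--ignore-not-found"]
        for kind, name in LEFTOVERS
    ]


def run_cleanup(namespace=NAMESPACE, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in cleanup_commands(namespace):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_cleanup()
```

Calling `run_cleanup(dry_run=False)` would actually delete the resources; `--ignore-not-found` keeps the command from failing when a resource is already gone.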

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
yes


Steps to Reproduce:
1. Create StorageCluster with integrated KMS encryption (vault) in OCP Console.
2. Delete the created StorageCluster.
3. Try to create the StorageCluster with KMS encryption again.


Actual results:
The installation wizard cannot start the StorageCluster installation due to the error:
  An error occurred
  secrets "ocs-kms-token" already exists

Expected results:
Installation should pass and new KMS resources should be created.

Additional info:

Comment 2 Sébastien Han 2021-02-05 07:47:28 UTC
The Secret and CM need to have an ownerref to the StorageCluster object so they will be garbage collected on removal.
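
A minimal sketch of what such an owner reference looks like on the manifest level, using plain dictionaries. The apiVersion `ocs.openshift.io/v1`, the StorageCluster name, and the UID below are assumptions for illustration, and the helpers `owner_reference` and `with_owner` are hypothetical. This is not the ocs-operator code. Kubernetes requires a namespaced owner to be in the same namespace as its dependent, which the sketch checks.

```python
import copy


def owner_reference(owner):
    """Build an ownerReferences entry from an owner object's manifest."""
    return {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
    }


def with_owner(dependent, owner):
    """Return a copy of `dependent` that lists `owner` in ownerReferences."""
    dep_ns = dependent["metadata"].get("namespace")
    if dep_ns != owner["metadata"].get("namespace"):
        raise ValueError("owner and dependent must share a namespace")
    result = copy.deepcopy(dependent)
    refs = result["metadata"].setdefault("ownerReferences", [])
    ref = owner_reference(owner)
    if ref not in refs:  # avoid duplicate entries on repeated calls
        refs.append(ref)
    return result


# Illustrative objects; name and uid are placeholders, not real values.
storage_cluster = {
    "apiVersion": "ocs.openshift.io/v1",
    "kind": "StorageCluster",
    "metadata": {
        "name": "ocs-storagecluster",
        "namespace": "openshift-storage",
        "uid": "00000000-0000-0000-0000-000000000000",
    },
}
kms_secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "ocs-kms-token", "namespace": "openshift-storage"},
}

owned_secret = with_owner(kms_secret, storage_cluster)
```

With the reference in place, the Kubernetes garbage collector removes the Secret once the StorageCluster it points to is deleted.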

Comment 4 Filip Balák 2021-02-05 09:15:19 UTC
After discussion with Neha I understand that having more StorageClusters is not a valid use case (https://bugzilla.redhat.com/show_bug.cgi?id=1867400) and I agree with Neha's and Sébastien's solution. I am changing the title to reflect that.

Comment 5 Jose A. Rivera 2021-02-08 15:00:42 UTC
This should be fairly straightforward. The verification is to make sure the relevant resources are no longer present in the openshift-storage Namespace after deleting the StorageCluster. Giving devel_ack+.
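
One possible way to script that check, as a sketch only: the function below takes the text printed by `oc get secret,configmap -n openshift-storage -o name` (lines such as `secret/ocs-kms-token`) and reports which KMS resources are still present. The helper name `leftover_kms_resources` and the exact list of resources checked are assumptions based on the names mentioned in this bug.

```python
# KMS resources that should be gone after the StorageCluster is deleted.
EXPECTED_GONE = {
    "secret/ocs-kms-token",
    "configmap/ocs-kms-connection-details",
    "configmap/csi-kms-connection-details",
}


def leftover_kms_resources(oc_get_output):
    """Return the sorted KMS resources still listed in `-o name` output."""
    present = {
        line.strip() for line in oc_get_output.splitlines() if line.strip()
    }
    return sorted(present & EXPECTED_GONE)
```

An empty list means the verification passes; any returned entry is a resource that was not garbage collected.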

Comment 6 Martin Bukatovic 2021-02-08 19:05:28 UTC
Providing QE ack: the reproducer is clear and the bug affects a 4.7 feature.

Comment 9 arun kumar mohan 2021-02-18 18:11:12 UTC
Pushed PR: https://github.com/openshift/ocs-operator/pull/1087 to OCS-Operator master branch

Comment 10 Filip Balák 2021-03-09 14:32:38 UTC
When the cluster is deleted and the user tries to recreate it with cluster-wide encryption, the csi-kms-connection-details configmap is still present, which prevents the user from creating the new cluster (error message: 'configmaps "csi-kms-connection-details" already exists'). --> ASSIGNED

Secret ocs-kms-token and configmap ocs-kms-connection-details are garbage collected successfully.

Tested with:
ocs-operator.v4.7.0-284.ci

Comment 11 arun kumar mohan 2021-03-12 07:38:36 UTC
'csi-kms-connection-details' config map is not a resource that is being used in ocs-operator, so it won't be "correct" to gc/delete the resource from ocs-operator.
Assigning to @mrajanna to take a look as this resource is used in 'ceph-csi' project.

Comment 12 arun kumar mohan 2021-03-12 07:43:55 UTC
Assigning it to @ndevos, as he has worked on the KMS encryption part.

Comment 14 Niels de Vos 2021-03-12 08:17:33 UTC
Not sure who should be deleting it, but it definitely is not Ceph-CSI. If the UI creates it, it probably makes sense for the UI to delete it as well.

Or, can the UI set an ownerReference to the storage-cluster object it creates? That way, deleting the storage-cluster should also trigger an automatic deletion of the configmap:
- https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/

Gowtham, is adding an ownerReference something that can be done in the UI?

Comment 15 gowtham 2021-03-16 17:58:49 UTC
Let me try adding an owner reference to the KMS config map and test this.

Comment 17 arun kumar mohan 2021-03-24 05:07:41 UTC
As we discussed, adding the changes (for the deletion of csi configmap) into OCS Operator...
PR: https://github.com/openshift/ocs-operator/pull/1127

Jose, please take a look...

Comment 18 Filip Balák 2021-04-12 11:05:53 UTC
The secret ocs-kms-token and the configmaps ocs-kms-connection-details and csi-kms-connection-details are garbage collected when the cluster is uninstalled, and the user can create a new cluster with KMS without error. --> VERIFIED

Tested with:
ocs-operator.v4.7.0-344.ci

Comment 20 errata-xmlrpc 2021-05-19 09:19:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041