Bug 1975538 - [Storage] Remove stale cruft installed by CVO in earlier releases
Summary: [Storage] Remove stale cruft installed by CVO in earlier releases
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Jonathan Dobson
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-23 21:30 UTC by Jack Ottofaro
Modified: 2023-02-10 23:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1975533
Environment:
Last Closed: 2023-02-10 23:53:16 UTC
Target Upstream Version:


Attachments
Spreadsheet containing leaked resources. (11.48 KB, text/plain)
2021-06-23 21:30 UTC, Jack Ottofaro

Description Jack Ottofaro 2021-06-23 21:30:12 UTC
Created attachment 1793636
Spreadsheet containing leaked resources.

+++ This bug was initially created as a clone of Bug #1975533 +++

This "stale cruft" is created as a result of the following scenario. Release A had manifest M that led the CVO to reconcile resource R. But then the component maintainers decided they didn't need R any longer, so they dropped manifest M in release B. The new CVO will no longer reconcile R, but clusters updating from A to B will still have resource R in-cluster, as an unmaintained orphan.

Now that https://issues.redhat.com/browse/OTA-222 has been implemented, teams can go back through and create deletion manifests for these leaked resources.

The attachment delete-candidates.csv contains a list of leaked resources as compared to a freshly installed 4.9 cluster. Use this list to find your component's resources and use the manifest delete annotation (https://github.com/openshift/cluster-version-operator/pull/438) to remove them.

Note also that a cluster-scoped resource may not need to be removed; it may simply need to be modified to remove the namespace.

Comment 1 Tomas Smetana 2021-06-24 08:01:56 UTC
I checked the attached CSV: the potential "cruft" seems to belong to cluster-storage-operator or csi-snapshot-controller operator. Changing subcomponent accordingly.

Comment 2 Jan Safranek 2021-06-25 14:34:33 UTC
It looks like we should make sure these objects are cleaned:

Namespaced in openshift-cluster-storage-operator:
ClusterRoleBinding	csi-snapshot-controller-operator-role
RoleBinding	cluster-storage-operator
Role	cluster-storage-operator
ClusterRoleBinding	cluster-storage-operator-role

Non-namespaced:
ClusterRoleBinding	cluster-storage-operator
ClusterRole	cluster-storage-operator

Comment 3 Jonathan Dobson 2021-06-30 23:05:54 UTC
The following 2 objects are still present and do not need to be removed. It's just that the namespace was removed from each of them starting in 4.7 with the following commits.

ClusterRoleBinding      csi-snapshot-controller-operator-role   openshift-cluster-storage-operator      4.4     4.6     0000_50_cluster-csi-snapshot-controller-operator_05_operator_rbac.yaml
https://github.com/openshift/cluster-csi-snapshot-controller-operator/commit/fb5d0a4e2171276a81d319eedfe73b284e08f439

ClusterRoleBinding      cluster-storage-operator-role   openshift-cluster-storage-operator      4.6     4.6     0000_50_cluster-storage-operator_08_operator_rbac.yaml
https://github.com/openshift/cluster-storage-operator/commit/27fc35b95b5f71b218544b4b187f6f23f74b60ef
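As a rough illustration of what those two commits changed (a sketch only: the name and namespace are taken from the first row above, the apiVersion is an assumption, and roleRef and subjects are omitted rather than guessed), the older manifests set a namespace on a cluster-scoped object and the newer ones drop it:

```yaml
# Sketch: metadata only; roleRef and subjects are not reproduced here.
# Before (4.4 through 4.6): a namespace was set on a cluster-scoped object.
apiVersion: rbac.authorization.k8s.io/v1   # assumed; the real manifest may differ
kind: ClusterRoleBinding
metadata:
  name: csi-snapshot-controller-operator-role
  namespace: openshift-cluster-storage-operator
---
# After (4.7 onward): same kind and name, namespace field dropped.
apiVersion: rbac.authorization.k8s.io/v1   # assumed; the real manifest may differ
kind: ClusterRoleBinding
metadata:
  name: csi-snapshot-controller-operator-role
```

The kind and name are unchanged, so the object is still managed; presumably only the recorded namespace makes it show up in the CSV comparison.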


The following 4 objects were removed starting in 4.6:

ClusterRoleBinding      cluster-storage-operator        <none>  4.1     4.5     0000_50_cluster-storage-operator_01-cluster-role-binding.yaml

ClusterRole     cluster-storage-operator        <none>  4.1     4.5     0000_50_cluster-storage-operator_01-cluster-role.yaml

RoleBinding     cluster-storage-operator        openshift-cluster-storage-operator      4.1     4.5     0000_50_cluster-storage-operator_01-role-binding.yaml

Role    cluster-storage-operator        openshift-cluster-storage-operator      4.1     4.5     0000_50_cluster-storage-operator_01-role.yaml

They were renamed from manifests/01-* to manifests/03-* with this commit:
https://github.com/openshift/cluster-storage-operator/commit/471ec786f01cb00106691a2b43f7b6c571feaf37

And then later removed altogether with this commit:
https://github.com/openshift/cluster-storage-operator/commit/f0411e5a596164aeda9e74fc269278d3abf01bc3

These were removed in favor of creating driver-specific objects under assets/csidriveroperators/*

So these 4 files need to be restored from f0411e5a596164aeda9e74fc269278d3abf01bc3, with the release.openshift.io/delete annotation added so that CVO cleans up stale objects that may be left behind from previous releases:

manifests/03-cluster-role-binding.yaml
manifests/03-cluster-role.yaml
manifests/03-role-binding.yaml
manifests/03-role.yaml

See "Manifest Annotation For Object Deletion" doc:
https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/object-deletion.md
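For reference, a restored manifest with the annotation might look roughly like the sketch below. The name and namespace come from the rows above and the annotation follows the object-deletion doc. The apiVersion is an assumption, and the original rules are omitted rather than guessed.

```yaml
# Sketch of manifests/03-role.yaml after restoring it and marking it for deletion.
apiVersion: rbac.authorization.k8s.io/v1   # assumed; keep whatever the restored file has
kind: Role
metadata:
  name: cluster-storage-operator
  namespace: openshift-cluster-storage-operator
  annotations:
    # Asks the CVO to delete this object instead of reconciling it.
    release.openshift.io/delete: "true"
# rules: taken from the restored file (not reproduced here)
```

The other three files (ClusterRoleBinding, ClusterRole, RoleBinding) would get the same annotation under metadata.annotations; the two cluster-scoped ones have no namespace field.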

Comment 5 Jonathan Dobson 2022-02-02 23:34:51 UTC
Need to re-test this on a newer build; I let it go stale for too long:
https://github.com/openshift/cluster-storage-operator/pull/182

