Bug 1975542

Summary: [Insights] Remove stale cruft installed by CVO in earlier releases
Product: OpenShift Container Platform Reporter: Jack Ottofaro <jack.ottofaro>
Component: Insights Operator Assignee: Tomas Remes <tremes>
Status: CLOSED ERRATA QA Contact: Dmitry Misharov <dmisharo>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.9 CC: aos-bugs, inecas, mfojtik, mklika, sttts, tremes, xxia, yanyang
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1975533 Environment:
Last Closed: 2021-10-18 17:36:33 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Spreadsheet containing leaked resources. none

Description Jack Ottofaro 2021-06-23 21:36:28 UTC
Created attachment 1793638 [details]
Spreadsheet containing leaked resources.

+++ This bug was initially created as a clone of Bug #1975533 +++

This "stale cruft" is created as a result of the following scenario. Release A had manifest M that led the CVO to reconcile resource R. But then the component maintainers decided they didn't need R any longer, so they dropped manifest M in release B. The new CVO will no longer reconcile R, but clusters updating from A to B will still have resource R in-cluster, as an unmaintained orphan.

Now that https://issues.redhat.com/browse/OTA-222 has been implemented, teams can go back through and create deletion manifests for these leaked resources.

The attachment delete-candidates.csv contains a list of leaked resources as compared to a freshly installed 4.9 cluster. Use this list to find your component's resources and use the manifest delete annotation (https://github.com/openshift/cluster-version-operator/pull/438) to remove them.

Note also that a cluster-scoped resource may not need to be removed at all; it may simply need to be modified to remove the namespace.

Comment 1 Yang Yang 2021-07-29 01:42:04 UTC
The release.openshift.io/delete: "true" annotation is added in 0000_50_insights-operator_03-clusterrole.yaml in 4.9.0-0.nightly-2021-07-28-181504. But the RoleBinding insights-operator-obfuscation-secret is still present in a freshly installed cluster.

$ cat 0000_50_insights-operator_03-clusterrole.yaml
<snippet>
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: insights-operator-obfuscation-secret
  namespace: openshift-insights
  annotations:
    release.openshift.io/delete: "true"
roleRef:
  kind: Role
  name: insights-operator-obfuscation-secret
subjects:
- kind: ServiceAccount
  name: gather
  namespace: openshift-insights

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-07-28-181504   True        False         7m15s   Cluster version is 4.9.0-0.nightly-2021-07-28-181504

$ oc get RoleBinding insights-operator-obfuscation-secret -n openshift-insights
NAME                                   ROLE                                        AGE
insights-operator-obfuscation-secret   Role/insights-operator-obfuscation-secret   45m

Tomas Remes, we expect the RoleBinding not to be created, since it has the release.openshift.io/delete: "true" annotation. Could you please confirm whether you recreate the RoleBinding with that name?

Comment 2 Tomas Remes 2021-07-29 07:02:06 UTC
Yes, our definition is quite messy at the moment. We have to clean it up, but the goal is to preserve the role and the rolebinding in 4.9. I discussed it with Yang in Slack.

Comment 3 Jack Ottofaro 2021-07-29 15:37:31 UTC
I'm not seeing use of the 'release.openshift.io/delete: "true"' annotation in the PR. Also be sure to continue to use the annotation 'include.release.openshift.io/self-managed-high-availability: "true"', or else the CVO will ignore your manifest altogether and the delete will not happen.
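
As a rough illustration of the two annotations working together, a deletion manifest might look like the sketch below. The ConfigMap kind and the name example-stale-resource are hypothetical placeholders, not entries from the attached list; only the two annotation keys come from this bug.

```yaml
# Hypothetical deletion manifest; the kind and name are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-stale-resource   # hypothetical name, not from the attachment
  namespace: openshift-insights
  annotations:
    # Keeps the manifest in scope for the CVO; without it the manifest is ignored.
    include.release.openshift.io/self-managed-high-availability: "true"
    # Marks the resource for deletion rather than reconciliation.
    release.openshift.io/delete: "true"
```

With both annotations present, the expectation is that the CVO picks up the manifest and removes the resource; dropping the include annotation would leave the orphan in place.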

Comment 5 Yang Yang 2021-07-30 01:11:18 UTC
Jack, Tomas said they want to preserve the role and rolebinding in 4.9. So he just removed the duplicate definitions.

Comment 6 Dmitry Misharov 2021-08-05 12:18:06 UTC
Verified on 4.9.0-0.ci-2021-08-05-062416.

Comment 9 errata-xmlrpc 2021-10-18 17:36:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759