Bug 1946838

Summary: Copied CSVs show up as adopted components
Product: OpenShift Container Platform
Reporter: Evan Cordell <ecordell>
Component: OLM
Assignee: Evan Cordell <ecordell>
OLM sub component: OLM
QA Contact: xzha
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: krizza, kuiwang, simore, tflannag
Version: 4.6
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Copied CSVs (placeholders that indicate which namespaces an operator is available in) were occasionally adopted by the Operator-controller, which made them visible as components of the operator. Operator components have their status tracked on the top-level Operator object.
Consequence: The excess of watches on copied CSVs placed high load on the apiserver and increased the memory footprint of the olm-operator.
Fix: Copied CSVs are never included as Operator components.
Result: Lower memory footprint and lower apiserver load.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 22:57:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1947909

Description Evan Cordell 2021-04-07 01:06:16 UTC
Description of problem:

The Operator API lists the components of an operator, but it should not include copied CSVs in that component list.

The worst form of this bug happens when the CSV is adopted (gets labelled) before being copied out to every namespace.

It is also possible to trigger a form of this bug (even if the initial race hasn't happened) by adding a new namespace that falls into an existing operatorgroup's namespace set. 
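
The rule this report asks for can be illustrated as a filter applied before objects are treated as Operator components. This is a minimal sketch, not OLM's actual implementation: it assumes a copied CSV can be recognized by `status.reason: Copied` (the reason mentioned in the reproduction steps) and, as a further unverified assumption, by an `olm.copiedFrom` label. The function names are hypothetical.

```python
from typing import Any, Dict, List

COPIED_REASON = "Copied"  # reason reported on copied CSVs in this bug
COPIED_FROM_LABEL = "olm.copiedFrom"  # assumed label key, verify for your OLM version


def is_copied_csv(obj: Dict[str, Any]) -> bool:
    """Return True if obj looks like a copied ClusterServiceVersion."""
    if obj.get("kind") != "ClusterServiceVersion":
        return False
    labels = (obj.get("metadata") or {}).get("labels") or {}
    reason = (obj.get("status") or {}).get("reason")
    return reason == COPIED_REASON or COPIED_FROM_LABEL in labels


def eligible_components(objects: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop copied CSVs; everything else stays a component candidate."""
    return [obj for obj in objects if not is_copied_csv(obj)]


if __name__ == "__main__":
    # Hand-written sample objects, not real cluster output.
    original = {
        "kind": "ClusterServiceVersion",
        "metadata": {"name": "klusterlet.v0.3.0", "namespace": "openshift-operators"},
        "status": {"reason": "InstallSucceeded"},
    }
    copy = {
        "kind": "ClusterServiceVersion",
        "metadata": {"name": "klusterlet.v0.3.0", "namespace": "test-1"},
        "status": {"reason": "Copied"},
    }
    for obj in eligible_components([original, copy]):
        print(obj["metadata"]["namespace"])
```

With a filter like this, a namespace added later to an operatorgroup's namespace set would still receive its copied CSV, but that copy would never be listed as a component.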



Version-Release number of selected component (if applicable):

Confirmed to be an issue in 4.7/4.8/master, assumed to be an issue in earlier releases.


How reproducible:


Steps to Reproduce:
1. Install an AllNamespaces operator
2. Add a namespace
3. View the Operator object corresponding to the installed operator. A "reason: Copied" CSV will be visible.

(The worst form of this bug is racy, but it seems to happen consistently on 4.7+ clusters)

Actual results:
Copied CSVs found in Operator.Status.Components


Expected results:
No copied CSVs found in Operator.Status.Components
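
The expected result can be checked mechanically. The sketch below scans a parsed Operator object (for example, the output of `oc get operator <name> -o json`) for ClusterServiceVersion component references that carry a `Copied` condition reason. The `status.components.refs[].conditions[].reason` layout is an assumption based on the fields named in this report, and `copied_csv_refs` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List


def copied_csv_refs(operator: Dict[str, Any]) -> List[str]:
    """Return "namespace/name" for each CSV component ref with reason Copied."""
    components = (operator.get("status") or {}).get("components") or {}
    found = []
    for ref in components.get("refs") or []:
        if ref.get("kind") != "ClusterServiceVersion":
            continue
        reasons = {cond.get("reason") for cond in ref.get("conditions") or []}
        if "Copied" in reasons:
            found.append(f'{ref.get("namespace")}/{ref.get("name")}')
    return found


if __name__ == "__main__":
    # Hand-written sample shaped like the assumed layout, not real cluster output.
    sample = json.loads("""
    {"status": {"components": {"refs": [
      {"kind": "ClusterServiceVersion", "name": "klusterlet.v0.3.0",
       "namespace": "openshift-operators",
       "conditions": [{"type": "Succeeded", "reason": "InstallSucceeded"}]},
      {"kind": "ClusterServiceVersion", "name": "klusterlet.v0.3.0",
       "namespace": "test-1",
       "conditions": [{"type": "Succeeded", "reason": "Copied"}]}
    ]}}}
    """)
    print(copied_csv_refs(sample))
```

An empty list corresponds to the expected result; any entry corresponds to the actual result described above.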


Additional info:
This adds memory and CPU load to olm-operator in proportion to the number of namespaces and the number of operators in the cluster.
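
To make the scaling concrete, here is a rough back-of-the-envelope estimate. It assumes each AllNamespaces operator's CSV is copied into every namespace other than its own, so the number of copied CSVs that could be adopted grows as operators × (namespaces − 1). The function name and the example numbers are illustrative, not measurements.

```python
def copied_csv_count(all_namespace_operators: int, namespaces: int) -> int:
    """Estimate copied CSVs: one copy per operator per other namespace."""
    if all_namespace_operators < 0 or namespaces < 0:
        raise ValueError("counts must be non-negative")
    return all_namespace_operators * max(namespaces - 1, 0)


if __name__ == "__main__":
    # Illustrative numbers only: 10 operators across 200 namespaces.
    print(copied_csv_count(10, 200))  # 10 * 199 = 1990
```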

Comment 4 xzha 2021-04-12 05:06:51 UTC
verify:
zhaoxia@xzha-mac bug-1946838 % oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-04-09-222447   True        False         57m     Cluster version is 4.8.0-0.nightly-2021-04-09-222447
1. Install an AllNamespaces operator
zhaoxia@xzha-mac bug-1946838 % cat sub.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: klusterlet-operator
  namespace: openshift-operators
spec:
  channel: stable
  installPlanApproval: Automatic
  name: klusterlet
  source: community-operators
  sourceNamespace: openshift-marketplace
zhaoxia@xzha-mac bug-1946838 % oc apply -f sub.yaml 
subscription.operators.coreos.com/klusterlet-operator created

2. Add a namespace
zhaoxia@xzha-mac bug-1946838 % oc new-project test-1
zhaoxia@xzha-mac bug-1946838 % oc project test-1
Now using project "test-1" on server "https://api.xzha0412-4.8.qe.devcluster.openshift.com:6443".
zhaoxia@xzha-mac bug-1946838 % oc get csv
NAME                DISPLAY      VERSION   REPLACES            PHASE
klusterlet.v0.3.0   Klusterlet   0.3.0     klusterlet.v0.2.0   Succeeded

3. View the Operator object 
zhaoxia@xzha-mac bug-1946838 % oc get operator klusterlet.openshift-operators -o yaml | grep -i Copied 
zhaoxia@xzha-mac bug-1946838 % oc get operator klusterlet.openshift-operators -o yaml | grep -i reason 
        reason: MinimumReplicasAvailable
        reason: NewReplicaSetAvailable
        reason: NoConflicts
        reason: InitialNamesAccepted
        reason: AllCatalogSourcesHealthy
        reason: InstallSucceeded

LGTM, verified

Comment 8 errata-xmlrpc 2021-07-27 22:57:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438