Description of problem: The OLM logs contain a lot of debug traces. Version-Release number of selected component (if applicable): OCP 4.6.4 How reproducible: Getting the logs from the operator Steps to Reproduce: -- Actual results: ``` [...] 2020-11-16T14:59:38.274Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "subscription", "request": "openshift-operators/openshift-pipelines-operator-rh"} 2020-11-16T14:59:38.275Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "subscription", "request": "openshift-operators/jaeger-product"} time="2020-11-16T14:59:38Z" level=info msg="csv in operatorgroup" csv=redhat-openshift-pipelines-operator.v1.1.2 id=TqYh8 namespace=openshift-operators opgroup=global-operators phase=Succeeded time="2020-11-16T14:59:38Z" level=info msg="csv in operatorgroup" csv=elasticsearch-operator.4.6.0-202010311441.p0 id=Zkrs+ namespace=openshift-operators-redhat opgroup=openshift-operators-redhat phase=Succeeded time="2020-11-16T14:59:38Z" level=info msg="checking elasticsearch-operator.4.6.0-202010311441.p0" time="2020-11-16T14:59:38Z" level=info msg="checking jaeger-operator.v1.20.0" time="2020-11-16T14:59:38Z" level=info msg="checking kiali-operator.v1.24.2" time="2020-11-16T14:59:38Z" level=info msg="checking serverless-operator.v1.10.0" time="2020-11-16T14:59:38Z" level=info msg="checking web-terminal.v1.1.0" time="2020-11-16T14:59:38Z" level=info msg="checking servicemeshoperator.v2.0.0.2" time="2020-11-16T14:59:38Z" level=info msg="checking redhat-openshift-pipelines-operator.v1.1.2" 2020-11-16T14:59:38.401Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "openshift-operators/jaeger-operator.v1.20.0"} 2020-11-16T14:59:38.435Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "operator", "request": "/servicemeshoperator.openshift-operators"} 2020-11-16T14:59:38.435Z DEBUG controllers.operator reconciling operator {"request": "/jaeger-product.openshift-operators"} 2020-11-16T14:59:38.520Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "openshift-operators/web-terminal.v1.1.0"} 2020-11-16T14:59:38.520Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "ocp-ops-view/kiali-operator.v1.24.2"} 2020-11-16T14:59:38.520Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "kube-public/servicemeshoperator.v2.0.0.2"} 2020-11-16T14:59:38.520Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "openshift-kube-apiserver/kiali-operator.v1.24.2"} 2020-11-16T14:59:38.544Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "operator", "request": "/jaeger-product.openshift-operators"} 2020-11-16T14:59:38.544Z DEBUG controllers.operator reconciling operator {"request": "/kiali-ossm.openshift-operators"} 2020-11-16T14:59:38.565Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "iam-test/servicemeshoperator.v2.0.0.2"} 2020-11-16T14:59:38.609Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion", "request": "openshift-network-operator/kiali-operator.v1.24.2"} 2020-11-16T14:59:38.637Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "operator", "request": "/kiali-ossm.openshift-operators"} 2020-11-16T14:59:38.637Z DEBUG controllers.operator reconciling operator {"request": "/web-terminal.openshift-operators"} ``` Expected results: Only non-debug logs. Additional info: Customer had this problem for a while and their logs are filling up quickly. They updated to 4.6.4 due to BZ#1888073 but problem persists.
Hello, I am excited to provide an update and announce that a PR is available that should significantly reduce the number of logs generated by OLM. I have verified the effects of this change by conducting the following test. TEST CASE: Environment - A cluster with 100 namespaces and a ClusterScoped operator installed. OLM will copy the ClusterScoped operator into everynamespace, leaving OLM with 100 CSVs to reconcile. Data Collection Proceedure - I tracked the size of the OLM file over 10 minutes. Case 1: No Fix ``` 120K olm.log at 0 minutes 132K olm.log at 1 minutes 364K olm.log at 2 minutes 456K olm.log at 3 minutes 520K olm.log at 4 minutes 540K olm.log at 5 minutes 540K olm.log at 6 minutes 540K olm.log at 7 minutes 548K olm.log at 8 minutes 548K olm.log at 9 minutes ``` I reviewed the logs generated by OLM and found that three messages were constantly being logged: 1. DEBUG controller-runtime.controller Successfully Reconciled {"controller": "clusterserviceversion"... which appeared 604 times 2. level=info msg="csv in operatorgroup" ... which appeared 561 times 3. level=info msg="syncing APIService" .. which appeared 113 times CASE 2: Includes Fix For the 1st log mentioned above, I found that three new controllers recently introduced to OLM used a logger that was configured for developer mode which I have disabled in the provided fix. For the 2nd and 3rd log mentioned above, I reclassified them as debugs logs since they do not provide information that is useful outside of debugging. The changes above reduced the size of the log file significantly: ``` 40K olm.log at 0 minutes 44K olm.log at 1 minutes 76K olm.log at 2 minutes 76K olm.log at 3 minutes 76K olm.log at 4 minutes 80K olm.log at 5 minutes 80K olm.log at 6 minutes 80K olm.log at 7 minutes 80K olm.log at 8 minutes 80K olm.log at 9 minutes ```
Created attachment 1738087 [details] Includes files used to test effects of change Please look at the attached olm-log-noise-bug.tar.gz for files used in performing this testcase along with the results.
Follow up, I also installed two additional ClusterScoped operators on a cluster with the fix and the additional operators had a minimal effect on file size: ``` $ oc get csvs -n operators NAME DISPLAY VERSION REPLACES PHASE chaosblade-operator.v0.5.1 Chaosblade Operator 0.5.1 Succeeded credstash-operator.v1.13.0 credstash-operator 1.13.0 Succeeded sparkoperator.v1.0.2 Apache Spark Operator 1.0.2 Succeeded $ oc logs -n olm olm-operator-865b874cbc-gd6tz > olm.log $ ls -sh olm.log 156K olm.log ```
Cluster version is 4.7.0-0.nightly-2020-12-13-224840 [root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-66ccb94685-cpfqj -- olm --version OLM version: 0.17.0 git commit: 4b66803055a8ab611447c33ed86e755ad39cb313 1, Install two cluster-scoped operators, and there are no DEBUG level logs in the olm logs. [root@preserve-olm-env data]# oc get sub -A NAMESPACE NAME PACKAGE SOURCE CHANNEL openshift-operators akka-cluster-operator-certified akka-cluster-operator-certified certified-operators beta openshift-operators etcd etcd community-operators clusterwide-alpha [root@preserve-olm-env data]# oc logs catalog-operator-66ccb94685-cpfqj > olm.1214 [root@preserve-olm-env data]# ls -lh olm.1214 -rw-r--r--. 1 root root 81K Dec 14 07:29 olm.1214 [root@preserve-olm-env data]# [root@preserve-olm-env data]# oc get pods NAME READY STATUS RESTARTS AGE catalog-operator-66ccb94685-cpfqj 1/1 Running 0 72m olm-operator-d6dc885f4-lrx28 1/1 Running 0 72m packageserver-c4dcc95bb-22wfp 1/1 Running 0 65m packageserver-c4dcc95bb-rr67k 1/1 Running 0 65m [root@preserve-olm-env data]# cat olm.1214 |grep -i debug LGTM, verify it.
*** Bug 1924764 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633