{client="machine-config-operator/v0.0.0 (linux/amd64) kubernetes/$Format/machine-config",resource="clusterrolebindings",scope="cluster",verb="GET"} 0.4666666666666667 {client="machine-config-operator/v0.0.0 (linux/amd64) kubernetes/$Format/machine-config",resource="customresourcedefinitions",scope="cluster",verb="GET"} 0.3888888888888889 This looks like a misconfiguration, these resources only rarely change and the MCO shouldn't be reading them this frequently. Contributes to idle read traffic on the cluster.
We're just syncing the informer for ClusterRoleBindings and adding an event handler, so we're not really calling anything there. That informer has been there since I started working on the MCO, so I don't know why we kick an operator resync when it fires (Abhinav might be able to help with that). The CRD informer/lister are used in the operator sync:

func (optr *Operator) waitForCustomResourceDefinition(resource *apiextv1beta1.CustomResourceDefinition) error {

We basically make sure the CRD is there, and we do that every time the operator syncs (which sounds right: we need to make sure our resources exist at every sync so we can reconcile if something drifted).
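For readers not familiar with that helper, here is a minimal sketch of what a waitForCustomResourceDefinition with that signature typically does. This is not the MCO's actual code: the crdLister field name, the poll intervals, and the stripped-down Operator struct are assumptions for illustration only.

```
package operator

import (
	"time"

	apiextv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
	apiextlistersv1beta1 "k8s.io/apiextensions-apiserver/pkg/client/listers/apiextensions/v1beta1"
	"k8s.io/apimachinery/pkg/util/wait"
)

// Operator is a stand-in for the MCO's operator struct; only the lister
// this sketch needs is shown.
type Operator struct {
	crdLister apiextlistersv1beta1.CustomResourceDefinitionLister
}

// waitForCustomResourceDefinition polls until the CRD exists and reports
// Established=True, or times out.
func (optr *Operator) waitForCustomResourceDefinition(resource *apiextv1beta1.CustomResourceDefinition) error {
	return wait.Poll(time.Second, 30*time.Second, func() (bool, error) {
		crd, err := optr.crdLister.Get(resource.Name)
		if err != nil {
			// Not found yet; keep polling.
			return false, nil
		}
		for _, cond := range crd.Status.Conditions {
			if cond.Type == apiextv1beta1.Established && cond.Status == apiextv1beta1.ConditionTrue {
				return true, nil
			}
		}
		return false, nil
	})
}
```

The detail relevant to this bug is only the shape of the check: it runs on every operator sync, so whatever it does happens over and over even on an idle cluster.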
actually, for both we might be wrongly updating these resources when we shouldn't (CRDs aren't changing, so the ones in the cluster shouldn't be updated). I'll dig into this.
(In reply to Antonio Murdaca from comment #2)
> actually, for both we might be wrongly updating these resources when we
> shouldn't (CRDs aren't changing, so the ones in the cluster shouldn't be
> updated). I'll dig into this.

Update: we GET those resources at every sync of the operator itself to make sure they haven't changed, and if they have, we re-apply them. That's why we GET them.
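To make that pattern concrete, here is a minimal sketch of the GET-then-reapply step described above, using a ClusterRoleBinding as the example. This is not the MCO's actual apply code; the function name, the typed-client parameter, and the fields compared are assumptions for illustration.

```
package operator

import (
	rbacv1 "k8s.io/api/rbac/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	rbacclientv1 "k8s.io/client-go/kubernetes/typed/rbac/v1"
)

// ensureClusterRoleBinding reads the existing object on every sync and
// only writes it back when it has drifted from the desired state, so an
// idle cluster produces GETs but no UPDATEs.
func ensureClusterRoleBinding(client rbacclientv1.ClusterRoleBindingsGetter, required *rbacv1.ClusterRoleBinding) error {
	existing, err := client.ClusterRoleBindings().Get(required.Name, metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		_, err = client.ClusterRoleBindings().Create(required)
		return err
	}
	if err != nil {
		return err
	}
	if equality.Semantic.DeepEqual(existing.Subjects, required.Subjects) &&
		equality.Semantic.DeepEqual(existing.RoleRef, required.RoleRef) {
		// Nothing drifted; the sync stays read-only.
		return nil
	}
	existing.Subjects = required.Subjects
	existing.RoleRef = required.RoleRef
	_, err = client.ClusterRoleBindings().Update(existing)
	return err
}
```

On a quiet cluster the writes never happen, so the per-sync GET itself is the cost, and that is what the request rates in comment #0 are measuring.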
Not as urgent as upgrade issues, so moving this to 4.2; if cycles allow, we're going to work on it as soon as we can.
*** Bug 1706638 has been marked as a duplicate of this bug. ***
*** Bug 1707516 has been marked as a duplicate of this bug. ***
https://github.com/openshift/machine-config-operator/pull/813
Clayton, how can we access the metrics you've been showing in https://bugzilla.redhat.com/show_bug.cgi?id=1703234#c0 to check that this BZ is solved? We've been adding custom logs in a custom payload to check this, but QE needs something on a pristine payload.
I talked with Antonio about this and we agreed this is not an issue that an end user is likely to notice. That makes testing the change particularly difficult, if not impossible.

In the attached PR, Antonio notes that the MCD is spamming the logs with:

```
I0604 16:04:52.546178 1 operator.go:257] resyncing operator because "openshift-machine-config-operator/machine-config-controller" *v1.ConfigMap changed
I0604 16:04:59.840257 1 operator.go:257] resyncing operator because "openshift-machine-config-operator/machine-config" *v1.ConfigMap changed
```

If I inspect MCD logs on a 4.2 cluster, I see no evidence of those messages being repeated:

```
$ for i in $(oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | xargs); do oc -n openshift-machine-config-operator logs $i | grep "resyncing operator"; done
$
```

And just to show that the above query returns results correctly with a different message:

```
$ for i in $(oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | xargs); do oc -n openshift-machine-config-operator logs $i | grep "In desired config"; done
I0701 18:14:30.793968 4209 daemon.go:932] In desired config rendered-master-2abcc719829ba4399ef2bab30ad9de0f
I0701 18:22:59.323839 3082 daemon.go:932] In desired config rendered-worker-ceeb937c77626a4593a2feec69673203
I0701 18:22:56.663589 3128 daemon.go:932] In desired config rendered-worker-ceeb937c77626a4593a2feec69673203
I0701 18:14:20.316160 4851 daemon.go:932] In desired config rendered-master-2abcc719829ba4399ef2bab30ad9de0f
I0701 18:22:56.520222 3203 daemon.go:932] In desired config rendered-worker-ceeb937c77626a4593a2feec69673203
I0701 18:14:30.856923 4245 daemon.go:932] In desired config rendered-master-2abcc719829ba4399ef2bab30ad9de0f
```

Additionally, Antonio notes that he has been creating custom payloads to generate spammy logs while testing other changes: https://github.com/openshift/machine-config-operator/issues/815#issuecomment-499281762 ...but with the fix for this BZ applied, the MCD is not spamming the logs.

Since we don't have any direct steps to test/reproduce this, it would be hard for users to detect a change in behavior, and there is enough evidence to support that it has been fixed, I'm going to mark this VERIFIED using 4.2.0-0.nightly-2019-06-30-221852.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922