Based on audit logs of a cluster that has been running for a day:

$ cat * | grep '"resource":"rolebindings"' | grep '"verb":"update"' | jq -r '.user.username+"\t->\t "+.objectRef.namespace+":"+.objectRef.name' | sort | uniq -c
3577 system:serviceaccount:openshift-machine-config-operator:default -> default:machine-config-daemon-events
3577 system:serviceaccount:openshift-machine-config-operator:default -> openshift-machine-config-operator:machine-config-daemon-events
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> default:prometheus-k8s
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> kube-system:prometheus-k8s
 321 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> kube-system:resource-metrics-auth-reader
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-apiserver:prometheus-k8s
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-cluster-version:prometheus-k8s
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-etcd:prometheus-k8s
 321 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-kube-controller-manager:prometheus-k8s
 321 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-kube-scheduler:prometheus-k8s
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-monitoring:prometheus-k8s
 320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-monitoring:prometheus-k8s-config
 210 system:serviceaccount:openshift-network-operator:default -> openshift-infra:openshift-sdn-controller-account
 210 system:serviceaccount:openshift-network-operator:default -> openshift-sdn:openshift-sdn-controller-leaderelection
 210 system:serviceaccount:openshift-network-operator:default -> openshift-sdn:prometheus-k8s

CNO is continuously updating three role bindings.
It should only issue an update when the existing role binding differs from the expected value.
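For illustration, a compare-before-update reconcile step might look like the sketch below. This is not the operator's actual code and not necessarily how the fix was implemented: RoleBinding, Subject, Client, memClient, needsUpdate and applyRoleBinding are all hypothetical, simplified stand-ins (a real operator would use the k8s.io/api rbac/v1 types and a real API client), and the comparison assumes that the role reference and the subject list are the only fields the operator manages.

```go
package main

import "reflect"

// Subject and RoleBinding are simplified, hypothetical stand-ins for the
// Kubernetes rbac/v1 types; they exist only to keep this sketch self-contained.
type Subject struct {
	Kind, Namespace, Name string
}

type RoleBinding struct {
	Namespace, Name string
	RoleRef         string
	Subjects        []Subject
}

// Client is a hypothetical interface covering only the calls this sketch needs.
type Client interface {
	Get(namespace, name string) (*RoleBinding, bool)
	Create(rb *RoleBinding)
	Update(rb *RoleBinding)
}

// needsUpdate reports whether the live object differs from the desired one
// in the fields this sketch assumes the operator manages.
// Note: reflect.DeepEqual treats a nil slice and an empty slice as different.
func needsUpdate(existing, desired *RoleBinding) bool {
	return existing.RoleRef != desired.RoleRef ||
		!reflect.DeepEqual(existing.Subjects, desired.Subjects)
}

// applyRoleBinding creates the role binding if it is missing and writes an
// update only when the live object differs from the desired state.
func applyRoleBinding(c Client, desired *RoleBinding) string {
	existing, found := c.Get(desired.Namespace, desired.Name)
	if !found {
		c.Create(desired)
		return "created"
	}
	if !needsUpdate(existing, desired) {
		return "unchanged" // no write, so no "update" entry in the audit log
	}
	c.Update(desired)
	return "updated"
}

// memClient is an in-memory Client that counts update calls, for demonstration.
type memClient struct {
	objects map[string]RoleBinding
	updates int
}

func newMemClient() *memClient {
	return &memClient{objects: map[string]RoleBinding{}}
}

func (m *memClient) Get(namespace, name string) (*RoleBinding, bool) {
	rb, ok := m.objects[namespace+"/"+name]
	if !ok {
		return nil, false
	}
	return &rb, true
}

func (m *memClient) Create(rb *RoleBinding) {
	m.objects[rb.Namespace+"/"+rb.Name] = *rb
}

func (m *memClient) Update(rb *RoleBinding) {
	m.updates++
	m.objects[rb.Namespace+"/"+rb.Name] = *rb
}
```

With a loop like this, repeated reconciles of an unchanged role binding perform only reads, so the per-binding update counts shown in the audit log above would be expected to stay flat between real changes.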
There's no way we can safely make this change before code freeze. Pushing to 4.2.
It's possible that I fixed this a few hours before this bug was filed. But we can check in 4.2.
Fixed in https://github.com/openshift/cluster-network-operator/pull/170.
@mkhan, how did you collect those audit logs? I am trying to use your steps to verify this bug in v4.2.
@mkhan I checked the logs in /var/log/kube-apiserver on a master that had been running for more than 6 hours with version 4.2.0-0.nightly-2019-07-03-003353. There were no role binding updates from the network operator:

[root@ip-10-0-172-244 kube-apiserver]# cat audit.log | grep '"resource":"rolebindings"' | grep '"verb":"update"' | grep openshift-network-operator
[root@ip-10-0-172-244 kube-apiserver]#

So this bug should be fixed. Please correct me if these steps are not enough. Thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922