Description of problem:
If an admin manually updates one of the bindings managed by CMO, the operator fails to reconcile the resource to the expected state. By design, a binding's roleRef can't be changed after creation (see https://kubernetes.io/docs/reference/access-authn-authz/rbac/#clusterrolebinding-example). To change the roleRef of an existing binding, the binding needs to be deleted and recreated.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always

Steps to Reproduce:
1. cat <<EOF > custom-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: thanos-querier
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring
EOF

2. Update the "prometheus-k8s" cluster role binding to reference "thanos-querier" instead of "prometheus-k8s".

oc auth reconcile -f custom-binding.yaml

3. Verify that the binding has been updated.

oc get clusterrolebindings prometheus-k8s -o jsonpath='{.roleRef.name}'

4. Check the CMO logs.

Actual results:
CMO fails to reconcile the binding.

I0408 09:01:46.767141 1 operator.go:340] Updating ClusterOperator status to failed. Err: running task Updating Prometheus-k8s failed: reconciling Prometheus ClusterRoleBinding failed: updating ClusterRoleBinding object failed: ClusterRoleBinding.rbac.authorization.k8s.io "prometheus-k8s" is invalid: roleRef: Invalid value: rbac.RoleRef{APIGroup:"rbac.authorization.k8s.io", Kind:"ClusterRole", Name:"prometheus-k8s"}: cannot change roleRef
E0408 09:01:46.779512 1 operator.go:272] Syncing "openshift-monitoring/cluster-monitoring-config" failed
E0408 09:01:46.779647 1 operator.go:273] sync "openshift-monitoring/cluster-monitoring-config" failed: running task Updating Prometheus-k8s failed: reconciling Prometheus ClusterRoleBinding failed: updating ClusterRoleBinding object failed: ClusterRoleBinding.rbac.authorization.k8s.io "prometheus-k8s" is invalid: roleRef: Invalid value: rbac.RoleRef{APIGroup:"rbac.authorization.k8s.io", Kind:"ClusterRole", Name:"prometheus-k8s"}: cannot change roleRef

Expected results:
CMO reconciles the binding without error.

Additional info:
Relevant code in the CMO repository:
* https://github.com/openshift/cluster-monitoring-operator/blob/7a2264c8469aa8168b4f3c28d42f0982c200b538/pkg/client/client.go#L980-L1000
* https://github.com/openshift/cluster-monitoring-operator/blob/7a2264c8469aa8168b4f3c28d42f0982c200b538/pkg/client/client.go#L1032-L1045

Issue filed from https://bugzilla.redhat.com/show_bug.cgi?id=1820230#c8

Workaround:
Delete the offending (cluster)role binding and let CMO recreate it properly.
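
For illustration, the delete-and-recreate behaviour that the report asks for could be sketched as below. This is not the code from the CMO repository or the actual fix: `RoleRef`, `Binding`, `fakeAPI` and `reconcile` are hypothetical names, and `fakeAPI` is an in-memory stand-in that only imitates the API server's refusal to change roleRef on update. Subjects are omitted to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
)

// RoleRef and Binding are simplified, hypothetical stand-ins for the
// rbac.authorization.k8s.io/v1 types; only the fields needed here are kept.
type RoleRef struct {
	APIGroup string
	Kind     string
	Name     string
}

type Binding struct {
	Name    string
	RoleRef RoleRef
}

// errImmutableRoleRef imitates the "cannot change roleRef" rejection.
var errImmutableRoleRef = errors.New("cannot change roleRef")

// fakeAPI is an in-memory stand-in for the API server (assumption, not a real client).
type fakeAPI struct {
	bindings map[string]Binding
}

func (a *fakeAPI) Get(name string) (Binding, bool) {
	b, ok := a.bindings[name]
	return b, ok
}

func (a *fakeAPI) Create(b Binding) error {
	if _, ok := a.bindings[b.Name]; ok {
		return fmt.Errorf("binding %q already exists", b.Name)
	}
	a.bindings[b.Name] = b
	return nil
}

// Update rejects any change to roleRef, as the real API server does.
func (a *fakeAPI) Update(b Binding) error {
	cur, ok := a.bindings[b.Name]
	if !ok {
		return fmt.Errorf("binding %q not found", b.Name)
	}
	if cur.RoleRef != b.RoleRef {
		return errImmutableRoleRef
	}
	a.bindings[b.Name] = b
	return nil
}

func (a *fakeAPI) Delete(name string) error {
	if _, ok := a.bindings[name]; !ok {
		return fmt.Errorf("binding %q not found", name)
	}
	delete(a.bindings, name)
	return nil
}

// reconcile creates the binding if it is missing, updates it in place if
// roleRef already matches, and otherwise deletes and recreates it.
func reconcile(api *fakeAPI, desired Binding) error {
	cur, ok := api.Get(desired.Name)
	if !ok {
		return api.Create(desired)
	}
	if cur.RoleRef == desired.RoleRef {
		return api.Update(desired)
	}
	// roleRef differs and cannot be updated in place: delete, then recreate.
	if err := api.Delete(desired.Name); err != nil {
		return err
	}
	return api.Create(desired)
}

func main() {
	desired := Binding{
		Name: "prometheus-k8s",
		RoleRef: RoleRef{
			APIGroup: "rbac.authorization.k8s.io",
			Kind:     "ClusterRole",
			Name:     "prometheus-k8s",
		},
	}
	// Simulate the admin's manual change from the reproduction steps.
	drifted := desired
	drifted.RoleRef.Name = "thanos-querier"
	api := &fakeAPI{bindings: map[string]Binding{drifted.Name: drifted}}

	fmt.Println("plain update:", api.Update(desired))
	fmt.Println("reconcile:", reconcile(api, desired))
	got, _ := api.Get(desired.Name)
	fmt.Println("roleRef.name after reconcile:", got.RoleRef.Name)
}
```

A real implementation would also need to compare subjects and handle a failure between the delete and the create; both are left out of this sketch.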
Tested with 4.5.0-0.nightly-2020-05-05-205255, following the steps in Comment 0. The role bindings are reconciled without error after the roleRef has been changed:

# oc -n openshift-monitoring logs cluster-monitoring-operator-57cb74c7ff-x8tgl -c cluster-monitoring-operator | grep "cannot change roleRef"
(no result)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409