Bug 1706637
Summary: | RoleBindingRestriction being listed 4 times a second during e2e runs | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> |
Component: | apiserver-auth | Assignee: | Sally <somalley> |
Status: | CLOSED WONTFIX | QA Contact: | Chuan Yu <chuyu> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.1.0 | CC: | aos-bugs, eparis, gblomqui, mkhan, nagrawal, somalley |
Target Milestone: | --- | | |
Target Release: | 4.2.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-06-14 14:03:20 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Clayton Coleman
2019-05-05 20:32:17 UTC
Opened BZ1707516, BZ1707517, and BZ1707519 to track fixes to noisy components.

Based on audit logs of a cluster that has been running for a day:

$ cat * | grep '"resource":"rolebindings"' | grep '"verb":"update"' | jq -r '.user.username+"\t->\t "+.objectRef.namespace+":"+.objectRef.name' | sort | uniq -c
   3577 system:serviceaccount:openshift-machine-config-operator:default -> default:machine-config-daemon-events
   3577 system:serviceaccount:openshift-machine-config-operator:default -> openshift-machine-config-operator:machine-config-daemon-events
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> default:prometheus-k8s
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> kube-system:prometheus-k8s
    321 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> kube-system:resource-metrics-auth-reader
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-apiserver:prometheus-k8s
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-cluster-version:prometheus-k8s
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-etcd:prometheus-k8s
    321 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-kube-controller-manager:prometheus-k8s
    321 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-kube-scheduler:prometheus-k8s
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-monitoring:prometheus-k8s
    320 system:serviceaccount:openshift-monitoring:cluster-monitoring-operator -> openshift-monitoring:prometheus-k8s-config
    210 system:serviceaccount:openshift-network-operator:default -> openshift-infra:openshift-sdn-controller-account
    210 system:serviceaccount:openshift-network-operator:default -> openshift-sdn:openshift-sdn-controller-leaderelection
    210 system:serviceaccount:openshift-network-operator:default -> openshift-sdn:prometheus-k8s

The above is over 10k writes to role bindings.

https://github.com/openshift/origin/pull/22783 merged; however, we'll be reverting it because it's causing flakes. Opening a new BZ to track those, and will close this one in favor of the 3 BZs opened above.

The admission plugin for RoleBindingRestrictions is DefaultAllow. This was deemed OK because RoleBindingRestrictions are seldom used, and when the plugin was introduced, DefaultAllow caused the fewest backward-compatibility issues. It's impossible to know whether the cache is up to date, so using an informer with RBRs will never work. As @deads2k put it, "'am I up to date with the namespace I'm asserting has no restrictions' is a question you can't answer."

The best we can do for this bug is to ensure components are not continuously updating role bindings. Instead, they should update a role binding only when it differs from the expected value.

BZs were opened against the offending components (https://bugzilla.redhat.com/show_bug.cgi?id=1706637#c3), so this BZ can be closed. https://bugzilla.redhat.com/show_bug.cgi?id=1720678 was opened to track test failures introduced by https://github.com/openshift/origin/pull/22783.
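
For illustration, the "update only when the role binding differs from the expected value" approach could look roughly like the sketch below. It is not taken from any of the operators named above. The helper names (`desired_state`, `needs_update`, `apply_role_binding`) and the `create`/`update` callbacks are hypothetical, role bindings are modeled as plain dictionaries, and treating `subjects` as order-insensitive is an assumption.

```python
from typing import Callable, Optional


def desired_state(rb: dict) -> dict:
    """Extract only the fields that define what a role binding grants.

    Server-managed metadata (resourceVersion, uid, timestamps) is ignored,
    so it can never trigger a write on its own.
    """
    subjects = rb.get("subjects") or []
    return {
        "roleRef": rb.get("roleRef") or {},
        # Assumption: subject order carries no meaning, so sort before comparing.
        "subjects": sorted(
            subjects,
            key=lambda s: (
                s.get("kind") or "",
                s.get("namespace") or "",
                s.get("name") or "",
            ),
        ),
    }


def needs_update(existing: dict, expected: dict) -> bool:
    """Return True only when the live object differs from the expected one."""
    return desired_state(existing) != desired_state(expected)


def apply_role_binding(
    existing: Optional[dict],
    expected: dict,
    create: Callable[[dict], None],
    update: Callable[[dict], None],
) -> str:
    """Create or update a role binding, skipping the write when nothing changed."""
    if existing is None:
        create(expected)
        return "created"
    if not needs_update(existing, expected):
        return "unchanged"  # no API write, so no audit-log "update" entry
    update(expected)
    return "updated"
```

With a guard like this, a resync loop that finds the live role binding already matching its expected value issues no write at all, which is the behavior the BZs against the individual components ask for.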