+++ This bug was initially created as a clone of Bug #1959291 +++
+++ This bug was initially created as a clone of Bug #1959290 +++
+++ This bug was initially created as a clone of Bug #1959285 +++

Description of problem:
Apparently, kube-controller-manager-operator has a dependency on SubjectAccessReview (SAR) as part of its health check, which causes it to be restarted whenever the kube-apiserver rolls out in SNO.

How reproducible:
Using cluster-bot:
1. Launch nightly aws,single-node
2. Update audit log verbosity to: AllRequestBodies
3. Wait for the API rollout:
   oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}'
4. Reboot the node to clean up the caches (oc debug node/ip-10-0-136-254.ec2.internal)
5. Wait
6. Grep the audit log:
   oc adm node-logs ip-10-0-128-254.ec2.internal --path=kube-apiserver/audit.log | grep -i health | grep -i subjectaccessreviews | grep -v Unhealth > rbac.log
   cat rbac.log | jq . -C | less -r | grep 'username' | sort | uniq

Actual results:
~/work/installer [master]> cat rbac.log | jq . -C | less -r | grep 'username' | sort | uniq
  "username": "system:serviceaccount:openshift-kube-controller-manager-operator:kube-controller-manager-operator"

Expected results:
The kube-controller-manager-operator service account should not appear in SAR requests tied to health checks.

Additional info:
Affects SNO stability upon API rollout (certificate rotation).
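As a follow-up to step 6, and since the audit profile is AllRequestBodies, the captured events carry the request body, so the actual SAR spec the operator submits can be pulled out directly. A minimal sketch, assuming the standard Kubernetes audit event fields .user.username and .requestObject.spec (not taken from the original report):

  # Sketch: show which SAR specs the KCMO service account is sending, with counts
  cat rbac.log | jq -c 'select(.user.username == "system:serviceaccount:openshift-kube-controller-manager-operator:kube-controller-manager-operator") | .requestObject.spec' | sort | uniq -c

This makes it easier to see whether the SARs are issued as part of a health/readiness path or from ordinary controller startup traffic.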
Hi Rom,

The health checks we have for KCMO target the KCM endpoint; we only have health checks on port 10257, as you can see here: https://github.com/openshift/cluster-kube-controller-manager-operator/blob/dc54142035982bc44581936a7e90cdd9ac9ad24e/bindata/v4.1.0/kube-controller-manager/pod.yaml

When KCMO starts, it connects to the API server, and that is probably why you are noticing those entries in the audit log. So, closing this BZ for now. Feel free to reopen it if you feel otherwise.
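If it helps to verify on a live cluster, here is a minimal sketch for checking which port the kube-controller-manager liveness probes actually target (it assumes the rendered static pod carries the label app=kube-controller-manager in the openshift-kube-controller-manager namespace; adjust if your labels differ):

  # Sketch: print the livenessProbe port of each kube-controller-manager container
  oc -n openshift-kube-controller-manager get pods -l app=kube-controller-manager \
    -o jsonpath='{range .items[0].spec.containers[*]}{.name}{": "}{.livenessProbe.httpGet.port}{"\n"}{end}'

On the linked pod.yaml this should show only 10257, i.e. no probe that goes through a SubjectAccessReview.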