The etcd-operator PR was erroneously linked here. The correct PR is https://github.com/openshift/cluster-authentication-operator/pull/537 .
There are too many links above to read them all closely, but I went through them to understand the change. The main ones are:
https://github.com/kubernetes/kubernetes/pull/106852
https://github.com/kubernetes/kubernetes/issues/107454

Checked library-go and found that the library PR is https://github.com/openshift/library-go/pull/1282 . Read its code; the only difference is that leaderelection.go now returns ConfigMapsLeasesResourceLock instead of ConfigMapsResourceLock.

Checked the latest nightly, 4.10.0-0.nightly-2022-01-29-015515:
$ oc adm release info --commits registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-01-29-015515 | grep authentication-operator
  cluster-authentication-operator  https://github.com/openshift/cluster-authentication-operator  4770445...

Then checked the CAO repo, which is where this bug's PR lives:
$ cd /path/to/github.com/openshift/cluster-authentication-operator
$ git pull
$ git checkout -b 4.10.0-0.nightly-2022-01-29-015515 477044
$ vi vendor/github.com/openshift/library-go/pkg/config/leaderelection/leaderelection.go
...
	rl, err := resourcelock.New(
		resourcelock.ConfigMapsLeasesResourceLock,
...

This confirms that the PR has landed in the 4.10 payloads.

Then checked the definition and use of ConfigMapsLeasesResourceLock. It is in https://github.com/openshift/cluster-authentication-operator/blob/4770445/vendor/k8s.io/client-go/tools/leaderelection/resourcelock/interface.go#L137-L140 :
	case ConfigMapsLeasesResourceLock:
		return &MultiLock{
			Primary:   configmapLock,
			Secondary: leaseLock,

This means 4.10 uses both the old configmap-based election and the new lease-based election, which matches Dev's plan in https://github.com/kubernetes/kubernetes/issues/107454 for 4.10, i.e. "version x+1".
Further checked the lock objects and the openshift-authentication-operator pod logs:
$ oc get cm -n openshift-authentication-operator | grep lock
cluster-authentication-operator-lock   0   25h
$ oc get lease -n openshift-authentication-operator | grep lock
cluster-authentication-operator-lock   authentication-operator-84bd79899c-sh9lf_baf2761e-f0cd-4f1c-a4a5-c67e3788e45d   25h
$ oc get lease -n openshift-authentication-operator cluster-authentication-operator-lock -o yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
...
spec:
  acquireTime: "2022-01-30T04:01:50.000000Z"
  holderIdentity: authentication-operator-84bd79899c-sh9lf_baf2761e-f0cd-4f1c-a4a5-c67e3788e45d
  leaseDurationSeconds: 137
  leaseTransitions: 2
  renewTime: "2022-01-30T04:42:46.623458Z"

Both the configmap lock and the lease lock exist.

$ oc patch authentication.operator/cluster --type=merge -p="
spec:
  operatorLogLevel: TraceAll
"

Then deleted the openshift-authentication-operator pod, waited for the new pod to be created, and checked its logs. They contain:
2022-01-30T04:01:51.031563107Z I0130 04:01:51.031115 1 leaderelection.go:258] successfully acquired lease openshift-authentication-operator/cluster-authentication-operator-lock
2022-01-30T04:01:51.039809515Z I0130 04:01:51.033340 1 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-authentication-operator", Name:"cluster-authentication-operator-lock", UID:"7d02b348-61f8-4410-b1b7-d846493e8526", APIVersion:"v1", ResourceVersion:"542456", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' authentication-operator-84bd79899c-sh9lf_baf2761e-f0cd-4f1c-a4a5-c67e3788e45d became leader
2022-01-30T04:01:51.039809515Z I0130 04:01:51.033406 1 event.go:285] Event(v1.ObjectReference{Kind:"Lease", Namespace:"openshift-authentication-operator", Name:"cluster-authentication-operator-lock", UID:"f41ef5f7-354a-4b68-896a-2acfe531dd30", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"542458", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' authentication-operator-84bd79899c-sh9lf_baf2761e-f0cd-4f1c-a4a5-c67e3788e45d became leader

This means the configmap-based and lease-based elections both work in 4.10. By comparison, in 4.9 the openshift-authentication-operator pod logs show only configmap-based election lines and no lease-based ones. This further verifies that 4.10 behaves as intended by the bug's PR.

Given the above, no further testing can be done at this point IMO, so moving to VERIFIED. Per https://github.com/kubernetes/kubernetes/issues/107454 , we should watch QE upgrades from 4.9 (i.e. x) to 4.10 (i.e. x+1) to see whether any election issue shows up. If one does, we'll file a separate bug. Since 4.11 is not yet rebased to k8s 1.24, we cannot watch upgrades from 4.10 to 4.11 (i.e. x+2) right now.
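The log comparison between 4.9 and 4.10 can also be done mechanically. Below is a small Go sketch, under the assumption that the event.go lines keep the shape shown in the logs above; the function name electionKinds and the shortened sample lines in main are hypothetical and only for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// leaderEvent matches event.go LeaderElection lines and captures the object kind.
var leaderEvent = regexp.MustCompile(
	`Event\(v1\.ObjectReference\{Kind:"(\w+)".*reason: 'LeaderElection'.*became leader`)

// electionKinds returns the set of object kinds that emitted a LeaderElection event.
func electionKinds(logs string) map[string]bool {
	kinds := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		if m := leaderEvent.FindStringSubmatch(sc.Text()); m != nil {
			kinds[m[1]] = true
		}
	}
	return kinds
}

func main() {
	// Shortened, made-up sample lines in the same shape as the pod logs.
	logs := `I0130 04:01:51.031115 1 leaderelection.go:258] successfully acquired lease ns/lock
I0130 04:01:51.033340 1 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Name:"lock"}): type: 'Normal' reason: 'LeaderElection' pod-a became leader
I0130 04:01:51.033406 1 event.go:285] Event(v1.ObjectReference{Kind:"Lease", Name:"lock"}): type: 'Normal' reason: 'LeaderElection' pod-a became leader`

	kinds := electionKinds(logs)
	fmt.Println("configmap-based election seen:", kinds["ConfigMap"])
	fmt.Println("lease-based election seen:", kinds["Lease"])
}
```

Fed with real pod logs (for example saved from oc logs), a 4.10 operator would be expected to report both kinds, while a 4.9 operator would report only ConfigMap.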
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056