Bug 2052701 - kube-scheduler should use configmap lease
Summary: kube-scheduler should use configmap lease
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jan Chaloupka
QA Contact: RamaKasturi
URL:
Whiteboard: EmergencyRequest
Depends On:
Blocks: 2052598
 
Reported: 2022-02-09 19:55 UTC by ravig
Modified: 2022-08-10 10:49 UTC
CC: 3 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 2052598
Environment:
Last Closed: 2022-08-10 10:48:34 UTC
Target Upstream Version:
Embargoed:




Links
System                   ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHSA-2022:5069  0        None      None    None     2022-08-10 10:49:02 UTC

Comment 3 RamaKasturi 2022-02-10 11:53:12 UTC
Verified with the build below; I see that 4.11 uses both the old ConfigMap-based election and the new Lease-based election.

A further check of the openshift-kube-scheduler-operator namespace shows both locks.
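
For context, the dual-lock behavior matches client-go's hybrid "configmapsleases" resource lock, which writes the legacy ConfigMap lock and mirrors it into a coordination.k8s.io/v1 Lease. Below is a minimal sketch of that pattern, not the operator's actual code; the namespace, lock name, and lease duration are copied from the output further down for illustration, the other timings are made up, and the ConfigMapsLeasesResourceLock type only exists in client-go releases that still ship the ConfigMap-to-Lease migration path.

package main

import (
	"context"
	"os"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	id, _ := os.Hostname()

	// The "configmapsleases" multilock keeps the ConfigMap lock as primary
	// and mirrors every acquire/renew into a Lease, which is why both
	// objects show up in the namespace.
	lock, err := resourcelock.New(
		resourcelock.ConfigMapsLeasesResourceLock,
		"openshift-kube-scheduler-operator",              // namespace (from the output below)
		"openshift-cluster-kube-scheduler-operator-lock", // lock name (from the output below)
		client.CoreV1(),
		client.CoordinationV1(),
		resourcelock.ResourceLockConfig{Identity: id},
	)
	if err != nil {
		panic(err)
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 137 * time.Second, // matches leaseDurationSeconds below
		RenewDeadline: 107 * time.Second, // illustrative
		RetryPeriod:   26 * time.Second,  // illustrative
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				<-ctx.Done() // controller work runs here while the lock is held
			},
			OnStoppedLeading: func() {
				os.Exit(1) // lost the lock; exit and let the pod restart
			},
		},
	})
}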

[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-02-10-031822   True        False         116m    Cluster version is 4.11.0-0.nightly-2022-02-10-031822

[knarra@knarra ~]$ oc get cm -n openshift-kube-scheduler-operator | grep lock
openshift-cluster-kube-scheduler-operator-lock   0      135m
[knarra@knarra ~]$ oc get lease -n openshift-kube-scheduler-operator
NAME                                             HOLDER                                                                                    AGE
openshift-cluster-kube-scheduler-operator-lock   openshift-kube-scheduler-operator-6ff9dd6c64-rq244_29cbf69d-e028-4141-82e8-375cd7240b0a   135m
[knarra@knarra ~]$ oc get lease openshift-cluster-kube-scheduler-operator-lock -n openshift-kube-scheduler-operator -o yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  creationTimestamp: "2022-02-10T09:26:30Z"
  name: openshift-cluster-kube-scheduler-operator-lock
  namespace: openshift-kube-scheduler-operator
  resourceVersion: "68301"
  uid: 5a44e948-8128-4b94-992f-b0ecfc8da88d
spec:
  acquireTime: "2022-02-10T10:33:26.000000Z"
  holderIdentity: openshift-kube-scheduler-operator-6ff9dd6c64-rq244_29cbf69d-e028-4141-82e8-375cd7240b0a
  leaseDurationSeconds: 137
  leaseTransitions: 3
  renewTime: "2022-02-10T11:42:24.801086Z"

Both the ConfigMap and Lease locks are present.
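
The same Lease check can be done programmatically. Here is a hedged sketch using the coordination/v1 client, equivalent to the `oc get lease ... -o yaml` command above; the kubeconfig handling is illustrative.

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the default kubeconfig (~/.kube/config); illustrative only.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	lease, err := client.CoordinationV1().Leases("openshift-kube-scheduler-operator").
		Get(context.Background(), "openshift-cluster-kube-scheduler-operator-lock", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// spec.holderIdentity is a *string; nil means no current holder.
	if lease.Spec.HolderIdentity != nil {
		fmt.Println("holder:", *lease.Spec.HolderIdentity)
	}
}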

To confirm that this behavior is new in 4.11, compare it with OCP 4.9:

[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.21    True        False         112m    Cluster version is 4.9.21

[knarra@knarra ~]$ oc get cm -n openshift-kube-scheduler-operator | grep lock
openshift-cluster-kube-scheduler-operator-lock   0      18h

[knarra@knarra ~]$ oc get lease -n openshift-kube-scheduler-operator
No resources found in openshift-kube-scheduler-operator namespace.
[knarra@knarra ~]$ 

Only the ConfigMap lock exists.

To check the openshift-kube-scheduler-operator logs for both locks, on 4.9 & 4.10 run the command below to set the log level to "TraceAll":
# oc edit kubescheduler cluster
# set spec.logLevel to "TraceAll"

Wait for the openshift-kube-scheduler pods to restart, then delete the pod in the openshift-kube-scheduler-operator namespace and wait for it to be recreated. Once it is up, run the command below to check the logs for both locks.

4.11 kube-scheduler-operator pod logs:
=======================================
# oc logs -f openshift-kube-scheduler-operator-6ff9dd6c64-rq244 -n openshift-kube-scheduler-operator

I0210 10:33:26.036628       1 leaderelection.go:248] attempting to acquire leader lease openshift-kube-scheduler-operator/openshift-cluster-kube-scheduler-operator-lock...
I0210 10:33:26.062807       1 leaderelection.go:258] successfully acquired lease openshift-kube-scheduler-operator/openshift-cluster-kube-scheduler-operator-lock
I0210 10:33:26.063033       1 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-kube-scheduler-operator", Name:"openshift-cluster-kube-scheduler-operator-lock", UID:"2b76713e-a285-4ea2-8f61-c2b9b04314cc", APIVersion:"v1", ResourceVersion:"44512", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' openshift-kube-scheduler-operator-6ff9dd6c64-rq244_29cbf69d-e028-4141-82e8-375cd7240b0a became leader
I0210 10:33:26.063182       1 event.go:285] Event(v1.ObjectReference{Kind:"Lease", Namespace:"openshift-kube-scheduler-operator", Name:"openshift-cluster-kube-scheduler-operator-lock", UID:"5a44e948-8128-4b94-992f-b0ecfc8da88d", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"44513", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' openshift-kube-scheduler-operator-6ff9dd6c64-rq244_29cbf69d-e028-4141-82e8-375cd7240b0a became leader

4.9 kube-scheduler-operator pod logs:
=====================================
I0210 10:34:59.869183       1 leaderelection.go:258] successfully acquired lease openshift-kube-scheduler-operator/openshift-cluster-kube-scheduler-operator-lock
I0210 10:34:59.869681       1 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-kube-scheduler-operator", Name:"openshift-cluster-kube-scheduler-operator-lock", UID:"e4f2e90e-9497-4e24-aa85-a9050b3f0401", APIVersion:"v1", ResourceVersion:"348553", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' openshift-kube-scheduler-operator-58797bc45d-ph65k_ac1b6de0-fe54-4f67-913d-c7100becff23 became leader
I0210 10:34:59.891055       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-scheduler-operator", Name:"openshift-kube-scheduler-operator", UID:"17b05e11-9702-4213-b520-a7b125c1502c", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'FastControllerResync' Controller "RevisionController" resync interval is set to 0s which might lead to client request throttling

The 4.9 logs do not contain the Lease-based election.

Based on the above, moving the bug to VERIFIED.

Comment 7 errata-xmlrpc 2022-08-10 10:48:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

