Description of problem:

The kube-scheduler leader election lease duration is set to 15 seconds, which causes it to go through leader elections and restart during the kube-apiserver rollout. That rollout currently takes around 60 seconds now that shutdown-delay-duration and gracefulTerminationDuration are set to 0 and 15 seconds respectively (https://github.com/openshift/cluster-kube-apiserver-operator/pull/1168 and https://github.com/openshift/library-go/pull/1104). The kube-scheduler leader election timeout should be set to > 60 seconds to handle the downtime gracefully in SNO.

Recommended lease duration values to consider, as noted in https://github.com/openshift/enhancements/pull/832/files#diff-2e28754e69aa417e5b6d89e99e42f05bfb6330800fa823753383db1d170fbc2fR183: LeaseDuration=137s, RenewDeadline=107s, RetryPeriod=26s. These are the configurable values in k8s.io/client-go based leases, and controller-runtime exposes them (see the sketch at the end of this comment). This gives us:
1. clock skew tolerance == 30s
2. kube-apiserver downtime tolerance == 78s
3. worst non-graceful lease reacquisition == 163s
4. worst graceful lease reacquisition == 26s

We can see the leader lease failures in the log: http://dell-r510-01.perf.lab.eng.rdu2.redhat.com/chaos/sno/kube-scheduler/kube-scheduler.log. Here is the trace of the events during the rollout: http://dell-r510-01.perf.lab.eng.rdu2.redhat.com/chaos/sno/kube-scheduler/cerberus_api_rollout_trace.json. We can see that kube-scheduler restarted at 2021-07-21 18:05:29, and that maps to the leader election seen in the kube-scheduler logs and events: http://dell-r510-01.perf.lab.eng.rdu2.redhat.com/chaos/sno/kube-scheduler/. Alternatively, leader election could be disabled altogether, given that there's no HA in SNO.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-07-19-192457

How reproducible:
Always

Steps to Reproduce:
1. Install a SNO cluster using the latest nightly payload.
2. Trigger a kube-apiserver rollout or an outage which lasts for at least 60 seconds (a kube-apiserver rollout on a cluster built using a payload with https://github.com/openshift/cluster-kube-apiserver-operator/pull/1168 should take ~60 seconds):
   $ oc patch kubeapiserver/cluster --type merge -p '{"spec":{"forceRedeploymentReason":"ITERATIONX"}}' where X can be 1,2...n
3. Observe the state of kube-scheduler.

Actual results:
kube-scheduler goes through leader election and restarts.

Expected results:
kube-scheduler should handle the API rollout/outage gracefully.

Additional info:
Logs including must-gather: http://dell-r510-01.perf.lab.eng.rdu2.redhat.com/chaos/sno/kube-scheduler/
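For reference, the recommended values map onto the scheduler's leaderElection configuration. Below is a minimal sketch of what the settings would look like and one way to inspect them on a live cluster; the ConfigMap name ("config"), its "config.yaml" key, and the exact field layout are assumptions about how the operator renders its configuration, not details confirmed in this report:

# Hypothetical check: dump the leaderElection stanza rendered for the scheduler.
# The ConfigMap name and key below are assumed, not confirmed in this bug.
$ oc get configmap config -n openshift-kube-scheduler -o jsonpath='{.data.config\.yaml}' | grep -A 3 'leaderElection'
leaderElection:
  leaseDuration: 137s
  renewDeadline: 107s
  retryPeriod: 26s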
Hello Ravi,

Regarding the validation part here, could you please help confirm whether we need to verify that there are no errors in the kube-scheduler log during the apiserver rollout, or also make sure that kube-scheduler does not restart?

Thanks,
kasturi
Hello Rama,

kube-scheduler typically restarts while trying to re-acquire the leader lease after the lease duration expires, so we need to make sure there are no:
1) traces of lease expiry/leader elections in the kube-scheduler pod logs
2) pod restarts after the API server rollout, which can be triggered using
   $ oc patch kubeapiserver/cluster --type merge -p '{"spec":{"forceRedeploymentReason":"ITERATIONX"}}' where X can be 1,2...n
A sketch of these checks follows below. Please feel free to let me know if any other details are needed. Thanks.
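To make the two checks above concrete, here is a minimal sketch; the app=openshift-kube-scheduler label and the kube-scheduler container name are taken from outputs elsewhere in this bug, and the grep pattern is only illustrative:

# Trigger a kube-apiserver rollout (X can be 1,2...n)
$ oc patch kubeapiserver/cluster --type merge -p '{"spec":{"forceRedeploymentReason":"ITERATION1"}}'

# 1) Scan the scheduler logs for lease expiry / leader election traces
$ oc logs -n openshift-kube-scheduler -l app=openshift-kube-scheduler -c kube-scheduler | grep -iE 'leaderelection|lease'

# 2) Confirm the RESTARTS column has not increased across the rollout
$ oc get pods -n openshift-kube-scheduler -l app=openshift-kube-scheduler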
Verified with the build below, and I do not see traces of lease expiry/leader elections in the kube-scheduler pod logs, or pod restarts, after the API server rollout.

[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-29-010334   True        False         61m     Cluster version is 4.9.0-0.nightly-2021-08-29-010334

Below are the steps followed to verify the bug:
===============================================
1) Roll out kube-apiserver by running the command oc patch kubeapiserver/cluster --type merge -p '{"spec":{"forceRedeploymentReason":"ITERATIONX"}}' where X can be 1,2...n.
2) Observe the kube-scheduler logs & pod restarts for the kube-scheduler pods.
3) There are no logs related to lease expiry or leader elections, and no pod restarts either.

Logs from kube-scheduler:
=========================
I0830 06:05:24.295099 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-ld4fm" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:05:44.944969 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-ttwgs" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:06:34.296849 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-pks9x" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:08:08.298636 1 factory.go:381] "Unable to schedule pod; no fit; waiting" pod="openshift-authentication/oauth-openshift-5987d9b57-kqq9c" err="0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules."
I0830 06:08:34.226414 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-authentication/oauth-openshift-5987d9b57-kqq9c" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:14:37.720134 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-2zc5l" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:15:00.172505 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-operator-lifecycle-manager/collect-profiles-27171735--1-wkvcg" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:16:19.300865 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-9psvv" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:16:47.543207 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-5zz62" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:17:43.342258 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-2vq9k" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:27:10.337663 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-fztkk" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:28:33.048586 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-2ln7f" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
E0830 06:29:15.678114 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: unknown (get pods)
E0830 06:29:15.678737 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: unknown (get nodes)
E0830 06:29:15.678811 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: unknown (get replicasets.apps)
E0830 06:29:15.679015 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: unknown (get csidrivers.storage.k8s.io)
E0830 06:29:15.679356 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: unknown (get persistentvolumeclaims)
E0830 06:29:15.679433 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: unknown (get csistoragecapacities.storage.k8s.io)
E0830 06:29:15.679468 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: unknown (get csinodes.storage.k8s.io)
E0830 06:29:15.679516 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
E0830 06:29:15.679588 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: unknown (get replicationcontrollers)
E0830 06:29:15.679649 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: unknown (get services)
E0830 06:29:15.679806 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PodDisruptionBudget: unknown (get poddisruptionbudgets.policy)
E0830 06:29:15.680553 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: unknown (get persistentvolumes)
E0830 06:29:15.680716 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Namespace: unknown (get namespaces)
E0830 06:29:15.682191 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: unknown (get statefulsets.apps)
E0830 06:29:15.927237 1 reflector.go:138] k8s.io/apiserver/pkg/authentication/request/headerrequest/requestheader_controller.go:172: Failed to watch *v1.ConfigMap: unknown (get configmaps)
E0830 06:29:15.928255 1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Failed to watch *v1.ConfigMap: unknown (get configmaps)
E0830 06:29:15.929317 1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Failed to watch *v1.ConfigMap: unknown (get configmaps)
I0830 06:29:25.556297 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-j8smz" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:29:25.560620 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-868xz" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:30:00.154980 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-operator-lifecycle-manager/collect-profiles-27171750--1-plt5j" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:38:21.942415 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-4d9v2" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:39:28.645011 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-47fjb" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:39:53.294110 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-xpbpx" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:43:55.362626 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-k8rtj" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
E0830 06:44:45.450104 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PodDisruptionBudget: unknown (get poddisruptionbudgets.policy)
E0830 06:44:45.450127 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: unknown (get pods)
E0830 06:44:45.450183 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
E0830 06:44:45.450191 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: unknown (get csidrivers.storage.k8s.io)
E0830 06:44:45.450228 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: unknown (get services)
E0830 06:44:45.450238 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: unknown (get nodes)
E0830 06:44:45.450272 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: unknown (get csistoragecapacities.storage.k8s.io)
E0830 06:44:45.450282 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: unknown (get statefulsets.apps)
E0830 06:44:45.450319 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Namespace: unknown (get namespaces)
E0830 06:44:45.450334 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: unknown (get replicasets.apps)
E0830 06:44:45.450369 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: unknown (get csinodes.storage.k8s.io)
E0830 06:44:45.450534 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: unknown (get replicationcontrollers)
E0830 06:44:45.450577 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: unknown (get persistentvolumeclaims)
E0830 06:44:45.450588 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: unknown (get persistentvolumes)
I0830 06:45:00.174656 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-operator-lifecycle-manager/collect-profiles-27171765--1-rh9mh" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
E0830 06:49:44.491665 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: unknown (get csistoragecapacities.storage.k8s.io)
E0830 06:49:44.491786 1 leaderelection.go:330] error retrieving resource lock openshift-kube-scheduler/kube-scheduler: configmaps "kube-scheduler" is forbidden: User "system:kube-scheduler" cannot get resource "configmaps" in API group "" in the namespace "openshift-kube-scheduler"
E0830 06:49:44.491924 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: unknown (get persistentvolumes)
E0830 06:49:44.491927 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: unknown (get nodes)
E0830 06:49:44.491969 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: unknown (get persistentvolumeclaims)
E0830 06:49:44.492006 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
E0830 06:49:44.492042 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: unknown (get services)
E0830 06:49:44.492113 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: unknown (get pods)
E0830 06:49:44.492185 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: unknown (get csidrivers.storage.k8s.io)
E0830 06:49:44.492222 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: unknown (get csinodes.storage.k8s.io)
E0830 06:49:44.492274 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: unknown (get replicasets.apps)
E0830 06:49:44.492312 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Namespace: unknown (get namespaces)
E0830 06:49:44.492347 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: unknown (get replicationcontrollers)
E0830 06:49:44.492073 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: unknown (get statefulsets.apps)
E0830 06:49:44.492363 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PodDisruptionBudget: unknown (get poddisruptionbudgets.policy)
I0830 06:49:54.838208 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-5bzz4" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:49:54.857451 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-r8pjx" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:49:55.669169 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-xg64w" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 06:54:00.586761 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-plh7s" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
E0830 06:55:11.423932 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: unknown (get csidrivers.storage.k8s.io)
E0830 06:55:11.424080 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Namespace: unknown (get namespaces)
E0830 06:55:11.443579 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: unknown (get replicasets.apps)
E0830 06:55:11.443847 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: unknown (get statefulsets.apps)
E0830 06:55:11.456139 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: unknown (get nodes)
E0830 06:55:11.456424 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
E0830 06:55:11.456532 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: unknown (get csistoragecapacities.storage.k8s.io)
E0830 06:55:11.466652 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: unknown (get persistentvolumes)
E0830 06:55:11.484963 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: unknown (get services)
E0830 06:55:11.485087 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: unknown (get replicationcontrollers)
E0830 06:55:11.485153 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: unknown (get csinodes.storage.k8s.io)
E0830 06:55:11.485213 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PodDisruptionBudget: unknown (get poddisruptionbudgets.policy)
E0830 06:55:11.485281 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: unknown (get pods)
E0830 06:55:11.505682 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: unknown (get persistentvolumeclaims)
E0830 06:55:11.563911 1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Failed to watch *v1.ConfigMap: unknown (get configmaps)
E0830 06:55:11.563915 1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Failed to watch *v1.ConfigMap: unknown (get configmaps)
E0830 06:55:11.563918 1 reflector.go:138] k8s.io/apiserver/pkg/authentication/request/headerrequest/requestheader_controller.go:172: Failed to watch *v1.ConfigMap: unknown (get configmaps)
I0830 07:00:00.160038 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-operator-lifecycle-manager/collect-profiles-27171780--1-gvt9q" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:00:22.970343 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-8tvth" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:00:40.908905 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-g8hkn" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:04:39.404093 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-j7bfh" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:09:40.603524 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-xzcd6" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:10:42.638968 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-c4d86" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:10:56.599286 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-grr5b" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
"Successfully bound pod to node" pod="openshift-operator-lifecycle-manager/collect-profiles-27171795--1-lwmzg" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1 I0830 07:15:04.648838 1 scheduler.go:672] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-qfcbg" node="knarra0830-j5nhv-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1 Also no pod restarts as well as per below [knarra@knarra ~]$ oc get pods -l app=openshift-kube-scheduler -n openshift-kube-scheduler NAME READY STATUS RESTARTS AGE openshift-kube-scheduler-knarra0830-j5nhv-master-0.c.openshift-qe.internal 3/3 Running 0 74m Tested the same with 4.8 and i could lease lease expiry/leader elections in the kube-scheduler pod and pod restarts as well. 4.8 kube-scheduler logs: ========================= I0830 07:05:29.815713 1 trace.go:205] Trace[1348259986]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:134 (30-Aug-2021 07:04:59.814) (total time: 30000ms): Trace[1348259986]: [30.000737405s] [30.000737405s] END E0830 07:05:29.815742 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: Get "https://api-int.knarra0831.qe.gcp.devcluster.openshift.com:6443/apis/storage.k8s.io/v1beta1/csistoragecapacities?limit=500&resourceVersion=0": dial tcp 10.0.0.3:6443: i/o timeout I0830 07:05:29.827982 1 trace.go:205] Trace[1502019862]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:134 (30-Aug-2021 07:04:59.826) (total time: 30001ms): Trace[1502019862]: [30.001324676s] [30.001324676s] END E0830 07:05:29.828019 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: failed to list *v1.PersistentVolume: Get "https://api-int.knarra0831.qe.gcp.devcluster.openshift.com:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0": dial tcp 10.0.0.3:6443: i/o timeout I0830 07:05:31.305737 1 node_tree.go:65] Added node "knarra0831-vzwlb-master-0.c.openshift-qe.internal" in group "us-central1:\x00:us-central1-a" to NodeTree I0830 07:05:32.515602 1 leaderelection.go:243] attempting to acquire leader lease openshift-kube-scheduler/kube-scheduler... 
I0830 07:05:49.504422 1 leaderelection.go:253] successfully acquired lease openshift-kube-scheduler/kube-scheduler
I0830 07:05:49.513647 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-jv6xq" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:05:49.514057 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-cdh94" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:05:49.514165 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-sjqfv" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:10:58.938572 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/redhat-marketplace-qc4lb" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:17:15.022998 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/community-operators-vhsrd" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:17:25.544829 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/redhat-operators-vqg8n" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1
I0830 07:17:32.297554 1 scheduler.go:604] "Successfully bound pod to node" pod="openshift-marketplace/certified-operators-25bb7" node="knarra0831-vzwlb-master-0.c.openshift-qe.internal" evaluatedNodes=1 feasibleNodes=1

Pod restarts of kube-scheduler pods:
====================================
[knarra@knarra ~]$ oc get pods -l app=openshift-kube-scheduler -n openshift-kube-scheduler
NAME                                                                         READY   STATUS    RESTARTS   AGE
openshift-kube-scheduler-knarra0831-vzwlb-master-0.c.openshift-qe.internal   3/3     Running   9          74m

The kube-scheduler pod was restarted about 2 times after the rollout of the apiserver. Based on the above, moving the bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759