2042029 – kubedescheduler fails to install completely

Summary: kubedescheduler fails to install completely

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-scheduler
Sub Component:
Version:	4.10
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.10.0
Assignee:	ravig
QA Contact:	RamaKasturi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-01-18 17:35 UTC by RamaKasturi
Modified:	2022-03-12 04:41 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-03-12 04:41:03 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2022:0056	0	None	None	None	2022-03-12 04:41:20 UTC

Description RamaKasturi 2022-01-18 17:35:44 UTC

Description of problem:
When I try to install kube-descheduler, the kubedescheduler cluster pod is not created, and the operator logs show the errors below:

[knarra@knarra cucushift]$ oc logs -f descheduler-operator-89cff94d4-k84pc -n openshift-kube-descheduler-operator
W0118 17:29:53.057926       1 cmd.go:213] Using insecure, self-signed certificates
I0118 17:29:53.418651       1 observer_polling.go:159] Starting file observer
I0118 17:29:54.458869       1 builder.go:262] openshift-cluster-kube-descheduler-operator version -
W0118 17:29:54.996505       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256' detected.
W0118 17:29:54.996521       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256' detected.
W0118 17:29:54.997908       1 builder.go:321] unable to get cluster infrastructure status, using HA cluster values for leader election: infrastructures.config.openshift.io "cluster" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "infrastructures" in API group "config.openshift.io" at the cluster scope
I0118 17:29:54.998233       1 leaderelection.go:248] attempting to acquire leader lease openshift-kube-descheduler-operator/openshift-cluster-kube-descheduler-operator-lock...
I0118 17:29:54.998356       1 event.go:285] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-descheduler-operator", Name:"descheduler-operator", UID:"c792a3aa-0163-4354-b40b-a1fbbb0b083b", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'ClusterInfrastructureStatus' unable to get cluster infrastructure status, using HA cluster values for leader election: infrastructures.config.openshift.io "cluster" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "infrastructures" in API group "config.openshift.io" at the cluster scope
I0118 17:29:55.001951       1 secure_serving.go:266] Serving securely on [::]:8443
I0118 17:29:55.002276       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0118 17:29:55.002291       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0118 17:29:55.002326       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/serving-cert-930443424/tls.crt::/tmp/serving-cert-930443424/tls.key"
I0118 17:29:55.003700       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0118 17:29:55.004380       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0118 17:29:55.004397       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0118 17:29:55.004416       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0118 17:29:55.004425       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0118 17:29:55.011160       1 leaderelection.go:334] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "openshift-kube-descheduler-operator"
I0118 17:29:55.103433       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I0118 17:29:55.104609       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0118 17:29:55.104658       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
E0118 17:30:21.123922       1 leaderelection.go:330] error retrieving resource lock openshift-kube-descheduler-operator/openshift-cluster-kube-descheduler-operator-lock: leases.coordination.k8s.io "openshift-cluster-kube-descheduler-operator-lock" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "openshift-kube-descheduler-operator"
E0118 17:30:54.353657       1 leaderelection.go:330] error retrieving resource lock openshift-kube-descheduler-operator/openshift-cluster-kube-descheduler-operator-lock: leases.coordination.k8s.io "openshift-cluster-kube-descheduler-operator-lock" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "openshift-kube-descheduler-operator"
E0118 17:31:25.471776       1 leaderelection.go:330] error retrieving resource lock openshift-kube-descheduler-operator/openshift-cluster-kube-descheduler-operator-lock: leases.coordination.k8s.io "openshift-cluster-kube-descheduler-operator-lock" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "openshift-kube-descheduler-operator"
E0118 17:32:18.516791       1 leaderelection.go:330] error retrieving resource lock openshift-kube-descheduler-operator/openshift-cluster-kube-descheduler-operator-lock: leases.coordination.k8s.io "openshift-cluster-kube-descheduler-operator-lock" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "openshift-kube-descheduler-operator"
E0118 17:33:05.371036       1 leaderelection.go:330] error retrieving resource lock openshift-kube-descheduler-operator/openshift-cluster-kube-descheduler-operator-lock: leases.coordination.k8s.io "openshift-cluster-kube-descheduler-operator-lock" is forbidden: User "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "openshift-kube-descheduler-operator"
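
The repeated "forbidden" messages above all follow the same pattern, so the missing RBAC permissions can be summarized mechanically. The following is a minimal Python sketch, not part of the operator or of the fix; the regular expression and the function name `missing_permissions` are assumptions made for illustration, based only on the message format visible in this log.

```python
import re
from collections import defaultdict

# Assumed pattern, derived from the "forbidden" messages in the log above.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" '
    r'(?:in the namespace "(?P<namespace>[^"]+)"|at the cluster scope)'
)


def missing_permissions(log_text):
    """Group denied verbs by (user, API group, resource, scope)."""
    denied = defaultdict(set)
    for match in FORBIDDEN.finditer(log_text):
        # Cluster-scoped denials carry no namespace, so mark them explicitly.
        scope = match.group("namespace") or "<cluster>"
        key = (match.group("user"), match.group("group"),
               match.group("resource"), scope)
        denied[key].add(match.group("verb"))
    return {key: sorted(verbs) for key, verbs in denied.items()}


if __name__ == "__main__":
    # Message portions copied from the operator log above (line prefixes dropped).
    sa = "system:serviceaccount:openshift-kube-descheduler-operator:openshift-descheduler"
    ns = "openshift-kube-descheduler-operator"
    sample = "\n".join([
        f'User "{sa}" cannot get resource "infrastructures" '
        'in API group "config.openshift.io" at the cluster scope',
        f'User "{sa}" cannot create resource "leases" '
        f'in API group "coordination.k8s.io" in the namespace "{ns}"',
        f'User "{sa}" cannot get resource "leases" '
        f'in API group "coordination.k8s.io" in the namespace "{ns}"',
    ])
    for (user, group, resource, scope), verbs in missing_permissions(sample).items():
        print(f"{resource}.{group} in {scope}: missing {', '.join(verbs)}")
```

Applied to this log, the summary points at the `leases` resource in the `coordination.k8s.io` API group, which the operator's service account needs for leader election.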


Version-Release number of selected component (if applicable):
[knarra@knarra cucushift]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-18-044014   True        False         69m     Cluster version is 4.10.0-0.nightly-2022-01-18-044014
[knarra@knarra cucushift]$ oc get csv -n openshift-kube-descheduler-operator
NAME                                                 DISPLAY                     VERSION               REPLACES   PHASE
clusterkubedescheduleroperator.4.10.0-202201172127   Kube Descheduler Operator   4.10.0-202201172127              Succeeded


How reproducible:
Always

Steps to Reproduce:
1. Install latest 4.10 nightly
2. Install latest descheduler operator
3. Create kubedescheduler cluster instance

Actual results:
The cluster pod is not running, and the kubedescheduler operator pod logs the error messages shown in the description.

Expected results:
The user should be able to install the kubedescheduler operator completely, with the cluster pod running.


Additional info:

Comment 1 RamaKasturi 2022-01-18 17:36:36 UTC

Marking this as TestBlocker because I cannot run any of the descheduler tests due to this bug.

Comment 4 Maciej Szulik 2022-01-19 10:12:14 UTC

This will be fixed in https://github.com/openshift/cluster-kube-descheduler-operator/pull/237

Comment 5 Maciej Szulik 2022-01-19 10:19:44 UTC

https://github.com/openshift/cluster-kube-descheduler-operator/pull/237 has merged; moving to MODIFIED.

Comment 7 RamaKasturi 2022-01-20 09:40:00 UTC

Verified the bug with the build below; I was able to install kubedescheduler successfully.

[knarra@knarra verification-tests]$ oc get csv -n openshift-kube-descheduler-operator
NAME                                                 DISPLAY                     VERSION               REPLACES   PHASE
clusterkubedescheduleroperator.4.10.0-202201191212   Kube Descheduler Operator   4.10.0-202201191212              Succeeded


[knarra@knarra verification-tests]$ oc get pods -n openshift-kube-descheduler-operator
NAME                                    READY   STATUS    RESTARTS   AGE
cluster-684c798996-xhppt                1/1     Running   0          2m15s
descheduler-operator-6f4b4f7dff-4bq9z   1/1     Running   0          3m54s

Based on the above, moving the bug to the verified state.
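
The manual check on the `oc get pods` output in this comment could be automated along the following lines. This is a hedged Python sketch that only parses the tabular text shown above; the function name `pods_not_ready` is hypothetical, and the sketch assumes the default column layout (`NAME`, `READY`, `STATUS`) with whitespace-separated columns.

```python
def pods_not_ready(table_text):
    """Return names of pods that are not Running with all containers ready."""
    lines = [line for line in table_text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    name_i, ready_i, status_i = (header.index(col)
                                 for col in ("NAME", "READY", "STATUS"))
    not_ready = []
    for line in lines[1:]:
        cols = line.split()
        # READY is reported as "<ready containers>/<total containers>".
        ready, total = cols[ready_i].split("/")
        if cols[status_i] != "Running" or ready != total:
            not_ready.append(cols[name_i])
    return not_ready


if __name__ == "__main__":
    # Output copied from the verification comment above.
    sample = """\
NAME                                    READY   STATUS    RESTARTS   AGE
cluster-684c798996-xhppt                1/1     Running   0          2m15s
descheduler-operator-6f4b4f7dff-4bq9z   1/1     Running   0          3m54s
"""
    print(pods_not_ready(sample))
```

An empty list means every listed pod is running and ready. The sketch cannot tell whether an expected pod is absent altogether, so the presence of the `cluster` pod, which was missing in the original failure, still has to be checked separately.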

Comment 8 RamaKasturi 2022-01-20 09:40:57 UTC

cluster version:
====================
[knarra@knarra verification-tests]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-19-150530   True        False         154m    Cluster version is 4.10.0-0.nightly-2022-01-19-150530

Comment 11 errata-xmlrpc 2022-03-12 04:41:03 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056