Bug 1973567 - Autoscaler log reports error “Failed to watch *v1.CSIDriver”
Summary: Autoscaler log reports error “Failed to watch *v1.CSIDriver”
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.9.0
Assignee: Michael McCune
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks: 1995595
Reported: 2021-06-18 07:51 UTC by sunzhaohua
Modified: 2021-10-18 17:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The cluster autoscaler did not have permission to read csidrivers.storage.k8s.io or csistoragecapacities.storage.k8s.io resources.
Consequence: The cluster autoscaler would report errors in its logs stating that its service account does not have access to interact with these resources.
Fix: The Role for the cluster autoscaler has been updated to include the new resources.
Result: The cluster autoscaler no longer creates error messages in its logs about interacting with these resources.
Clone Of:
Clones: 1995595
Environment:
Last Closed: 2021-10-18 17:35:35 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-autoscaler-operator pull 210 | 0 | None | open | Bug 1973567: add csidrivers to the cluster-autoscaler cluster role | 2021-06-22 22:28:02 UTC
Github | openshift cluster-autoscaler-operator pull 212 | 0 | None | open | Bug 1973567: add csistoragecapacities to cluster-autoscaler cluster role | 2021-06-25 14:00:56 UTC
Red Hat Product Errata | RHSA-2021:3759 | 0 | None | None | None | 2021-10-18 17:35:37 UTC

Description sunzhaohua 2021-06-18 07:51:27 UTC
Description of problem:
The autoscaler log reports the error “E0618 07:23:15.255288       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:openshift-machine-api:cluster-autoscaler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope”


Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-06-14-145150

How reproducible:
always

Steps to Reproduce:
1. Create a clusterautoscaler
2. Create a machineautoscaler 
3. Add workload
4. Check autoscaler logs

Actual results:
The autoscaler logs repeatedly output the error message:

oc logs -f cluster-autoscaler-default-75c55cf9d7-kwtt8
I0618 06:25:05.288288       1 klogx.go:86] Pod openshift-machine-api/scale-up-6cc4bdd5db-69c8z is unschedulable
I0618 06:25:05.289389       1 klogx.go:86] Pod openshift-machine-api/scale-up-6cc4bdd5db-jlrxc is unschedulable
I0618 06:25:05.290411       1 klogx.go:86] Pod openshift-machine-api/scale-up-6cc4bdd5db-shjtv is unschedulable
I0618 06:25:05.294384       1 scale_up.go:453] No expansion options
I0618 06:25:05.297314       1 scale_down.go:917] No candidates for scale down
E0618 06:25:07.144282       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:openshift-machine-api:cluster-autoscaler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
I0618 06:25:15.454335       1 klogx.go:86] Pod openshift-machine-api/scale-up-6cc4bdd5db-shjtv is unschedulable
I0618 06:25:15.454660       1 klogx.go:86] Pod openshift-machine-api/scale-up-6cc4bdd5db-svp5v is unschedulable
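
To see at a glance which permissions the service account is missing, the "forbidden" lines in a log excerpt like the one above can be summarized. The following is only a minimal sketch: the regular expression is derived from the message format shown in this report, and the names FORBIDDEN_RE and missing_permissions are hypothetical helpers, not part of the autoscaler or of oc.

```python
import re

# Pattern derived from the error text quoted in this report (an assumption:
# other Kubernetes versions may word the message differently).
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def missing_permissions(log_lines):
    """Return sorted, de-duplicated (user, verb, resource, group) tuples
    found in klog error lines (lines starting with "E")."""
    found = set()
    for line in log_lines:
        if not line.startswith("E"):
            continue  # skip info (I) and warning (W) lines
        match = FORBIDDEN_RE.search(line)
        if match:
            found.add(
                (match["user"], match["verb"], match["resource"], match["group"])
            )
    return sorted(found)
```

Feeding it the saved output of `oc logs` (one string per line) yields one tuple per distinct missing permission, no matter how often the error repeats.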


Expected results:
The autoscaler logs do not contain such error messages.

Additional info:

Comment 1 Michael McCune 2021-06-21 17:55:38 UTC
i think we just need to update the role for the machine-api service account. i am starting to investigate.
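
For illustration, the kind of rule such a role update would need can be modeled as plain data. This is a sketch only: the rule below is an assumption based on the error messages in this bug (the informer needs to list and watch the resources), it is not copied from the linked pull requests, and CSI_RULE and rule_allows are hypothetical names.

```python
# Hypothetical RBAC policy rule, written as the dict form of a role's
# "rules" entry. The verbs are an assumption: a client-go reflector lists
# the resource and then watches it.
CSI_RULE = {
    "apiGroups": ["storage.k8s.io"],
    "resources": ["csidrivers", "csistoragecapacities"],
    "verbs": ["list", "watch"],
}


def rule_allows(rule, api_group, resource, verb):
    """Simplified check of whether one rule grants verb on resource.

    Handles exact matches and the "*" wildcard only; real RBAC evaluation
    also covers subresources, resourceNames and non-resource URLs.
    """
    def matches(allowed, wanted):
        return "*" in allowed or wanted in allowed

    return (
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
    )
```

With this rule, rule_allows(CSI_RULE, "storage.k8s.io", "csidrivers", "list") is true, while a rule that omits "csidrivers" would leave the request forbidden, which matches the error reported above.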

Comment 3 sunzhaohua 2021-06-25 01:03:24 UTC
Failed to verify

 oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-06-24-082405   True        False         61m     Cluster version is 4.9.0-0.nightly-2021-06-24-082405

E0624 16:04:42.548972       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:serviceaccount:openshift-machine-api:cluster-autoscaler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
I0624 16:04:46.391301       1 static_autoscaler.go:319] 2 unregistered nodes present
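
Besides reading the logs, the permissions can be checked directly by impersonating the service account with `oc auth can-i`. The sketch below only builds and runs that command; it assumes oc is on the PATH, that the logged-in user is allowed to impersonate service accounts, and the helper names can_i_command and can_i are hypothetical.

```python
import subprocess

# Service account name taken from the error messages in this report.
SERVICE_ACCOUNT = "system:serviceaccount:openshift-machine-api:cluster-autoscaler"


def can_i_command(verb, resource, api_group, user=SERVICE_ACCOUNT,
                  all_namespaces=False):
    """Build the argv for an `oc auth can-i` check as the given user."""
    cmd = ["oc", "auth", "can-i", verb, f"{resource}.{api_group}", f"--as={user}"]
    if all_namespaces:
        # Needed for namespaced resources when the access is cluster-wide.
        cmd.append("--all-namespaces")
    return cmd


def can_i(verb, resource, api_group, all_namespaces=False):
    """Run the check; `oc auth can-i` prints "yes" or "no"."""
    result = subprocess.run(
        can_i_command(verb, resource, api_group, all_namespaces=all_namespaces),
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() == "yes"
```

For example, can_i("list", "csistoragecapacities", "storage.k8s.io", all_namespaces=True) would be expected to return False on the build tested above and True once the role includes the resource.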

Comment 6 sunzhaohua 2021-06-28 08:58:52 UTC
verified
oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-06-27-223612   True        False         3m43s   Cluster version is 4.9.0-0.nightly-2021-06-27-223612

I0628 08:53:12.609381       1 scale_down.go:917] No candidates for scale down
I0628 08:53:22.628438       1 static_autoscaler.go:401] No unschedulable pods
I0628 08:53:22.628875       1 scale_down.go:917] No candidates for scale down
W0628 08:53:32.712601       1 clusterstate.go:432] AcceptableRanges have not been populated yet. Skip checking
I0628 08:53:33.288638       1 static_autoscaler.go:401] No unschedulable pods
I0628 08:53:33.691161       1 pre_filtering_processor.go:66] Skipping ip-10-0-242-215.us-east-2.compute.internal - node group min size reached
I0628 08:53:34.088723       1 scale_down.go:917] No candidates for scale down
I0628 08:53:44.909621       1 klogx.go:86] Pod openshift-machine-api/scale-up-6cc4bdd5db-2k59x is unschedulable

Comment 8 Michael McCune 2021-08-19 13:25:18 UTC
@ancollin i have created https://bugzilla.redhat.com/show_bug.cgi?id=1995595 and am working on the backports

Comment 11 errata-xmlrpc 2021-10-18 17:35:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

