Bug 1948505 - [vSphere csi driver operator] vmware-vsphere-csi-driver-operator pod restart every 10 minutes
Summary: [vSphere csi driver operator] vmware-vsphere-csi-driver-operator pod restart ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.8
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: 4.8.0
Assignee: Fabio Bertinatto
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-04-12 10:17 UTC by Wei Duan
Modified: 2021-07-27 22:59 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:59:25 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-storage-operator pull 156 0 None open Bug 1948505: Add missing RBAC rules for vSphere 2021-04-13 10:44:54 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:59:46 UTC

Description Wei Duan 2021-04-12 10:17:20 UTC
Description of problem:
vmware-vsphere-csi-driver-operator pod restart every 10 minutes, and full with following error:
Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-04-09-222447

How reproducible:
Aways

Steps to Reproduce:
1. Set Up an OCP4.8 cluster on vSphere
2. Enable TechPreviewNoUpgrade featureset
3. Check the vmware-vsphere-csi-driver-operator pod and its log 
$ oc -n openshift-cluster-csi-drivers get pod
NAME                                                    READY   STATUS    RESTARTS   AGE
vmware-vsphere-csi-driver-controller-8495967fc5-qzj97   6/6     Running   0          6h53m
vmware-vsphere-csi-driver-node-d7mt2                    3/3     Running   0          6h53m
vmware-vsphere-csi-driver-node-db6ht                    3/3     Running   0          6h53m
vmware-vsphere-csi-driver-node-kgjf9                    3/3     Running   0          6h53m
vmware-vsphere-csi-driver-node-sl4xr                    3/3     Running   0          6h53m
vmware-vsphere-csi-driver-node-v9qrv                    3/3     Running   0          6h53m
vmware-vsphere-csi-driver-operator-658558ff9b-vr6wv     1/1     Running   37         6h53m

Actual results:
vmware-vsphere-csi-driver-operator pod restarted 37 times in about 7 hours, the following log thrown in every ~40 seconds
Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope

Expected results:

Master Log:

Node Log (of failed PODs):

PV Dump:
  
PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
Full logs:
$ oc -n openshift-cluster-csi-drivers logs vmware-vsphere-csi-driver-operator-658558ff9b-vr6wv -f
W0412 09:53:40.923933       1 cmd.go:204] Using insecure, self-signed certificates
I0412 09:53:41.747018       1 observer_polling.go:159] Starting file observer
I0412 09:53:41.781121       1 builder.go:238] vmware-vsphere-csi-driver-operator version 7b3e2ca-7b3e2caa96f2fef9b2e446035768eb6968d293d4
W0412 09:53:41.989534       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256' detected.
W0412 09:53:41.989559       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256' detected.
I0412 09:53:41.990477       1 leaderelection.go:243] attempting to acquire leader lease openshift-cluster-csi-drivers/vmware-vsphere-csi-driver-operator-lock...
I0412 09:53:41.995346       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0412 09:53:41.995370       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0412 09:53:41.995403       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 09:53:41.995409       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 09:53:41.995355       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 09:53:41.995482       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 09:53:41.995803       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/serving-cert-744926360/tls.crt::/tmp/serving-cert-744926360/tls.key
I0412 09:53:41.995824       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0412 09:53:41.995805       1 secure_serving.go:197] Serving securely on [::]:8443
I0412 09:53:42.096295       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0412 09:53:42.096730       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0412 09:53:42.096802       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
[wduan@preserve-wduan-ws vsphere-problem-detector]$ oc -n openshift-cluster-csi-drivers logs vmware-vsphere-csi-driver-operator-658558ff9b-vr6wv -f 
W0412 09:53:40.923933       1 cmd.go:204] Using insecure, self-signed certificates
I0412 09:53:41.747018       1 observer_polling.go:159] Starting file observer
I0412 09:53:41.781121       1 builder.go:238] vmware-vsphere-csi-driver-operator version 7b3e2ca-7b3e2caa96f2fef9b2e446035768eb6968d293d4
W0412 09:53:41.989534       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256' detected.
W0412 09:53:41.989559       1 secure_serving.go:69] Use of insecure cipher 'TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256' detected.
I0412 09:53:41.990477       1 leaderelection.go:243] attempting to acquire leader lease openshift-cluster-csi-drivers/vmware-vsphere-csi-driver-operator-lock...
I0412 09:53:41.995346       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0412 09:53:41.995370       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0412 09:53:41.995403       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 09:53:41.995409       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 09:53:41.995355       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 09:53:41.995482       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 09:53:41.995803       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/serving-cert-744926360/tls.crt::/tmp/serving-cert-744926360/tls.key
I0412 09:53:41.995824       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0412 09:53:41.995805       1 secure_serving.go:197] Serving securely on [::]:8443
I0412 09:53:42.096295       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0412 09:53:42.096730       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0412 09:53:42.096802       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I0412 09:54:49.268915       1 leaderelection.go:253] successfully acquired lease openshift-cluster-csi-drivers/vmware-vsphere-csi-driver-operator-lock
I0412 09:54:49.270593       1 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-cluster-csi-drivers", Name:"vmware-vsphere-csi-driver-operator-lock", UID:"88174873-afb0-44bc-aaa8-7f60b09116f2", APIVersion:"v1", ResourceVersion:"196554", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' vmware-vsphere-csi-driver-operator-658558ff9b-vr6wv_d50eb82a-fb43-422a-9ed8-a51264e6bdfd became leader
I0412 09:54:49.274871       1 starter.go:140] Starting the informers
I0412 09:54:49.274915       1 starter.go:145] Starting targetconfigcontroller
I0412 09:54:49.274928       1 starter.go:149] Starting controllerset
I0412 09:54:49.275322       1 base_controller.go:66] Waiting for caches to sync for VMwareVSphereDriverTargetConfigController
I0412 09:54:49.275369       1 base_controller.go:66] Waiting for caches to sync for VMwareVSphereDriverStorageClassController
E0412 09:54:49.282550       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
I0412 09:54:49.375099       1 base_controller.go:66] Waiting for caches to sync for VMwareVSphereDriverNodeServiceController
I0412 09:54:49.375123       1 base_controller.go:72] Caches are synced for VMwareVSphereDriverNodeServiceController 
I0412 09:54:49.375123       1 base_controller.go:66] Waiting for caches to sync for ManagementStateController
I0412 09:54:49.375143       1 base_controller.go:72] Caches are synced for ManagementStateController 
I0412 09:54:49.375149       1 base_controller.go:109] Starting #1 worker of ManagementStateController controller ...
I0412 09:54:49.375149       1 base_controller.go:66] Waiting for caches to sync for LoggingSyncer
I0412 09:54:49.375162       1 base_controller.go:72] Caches are synced for LoggingSyncer 
I0412 09:54:49.375167       1 base_controller.go:109] Starting #1 worker of LoggingSyncer controller ...
I0412 09:54:49.375130       1 base_controller.go:109] Starting #1 worker of VMwareVSphereDriverNodeServiceController controller ...
I0412 09:54:49.375183       1 base_controller.go:66] Waiting for caches to sync for ConfigObserver
I0412 09:54:49.375253       1 base_controller.go:66] Waiting for caches to sync for VMwareVSphereDriverControllerServiceController
I0412 09:54:49.375263       1 base_controller.go:72] Caches are synced for VMwareVSphereDriverControllerServiceController 
I0412 09:54:49.375266       1 base_controller.go:109] Starting #1 worker of VMwareVSphereDriverControllerServiceController controller ...
I0412 09:54:49.375379       1 base_controller.go:72] Caches are synced for VMwareVSphereDriverTargetConfigController 
I0412 09:54:49.375386       1 base_controller.go:109] Starting #1 worker of VMwareVSphereDriverTargetConfigController controller ...
I0412 09:54:49.375394       1 base_controller.go:72] Caches are synced for VMwareVSphereDriverStorageClassController 
I0412 09:54:49.375397       1 base_controller.go:109] Starting #1 worker of VMwareVSphereDriverStorageClassController controller ...
I0412 09:54:49.383237       1 base_controller.go:66] Waiting for caches to sync for StaticResourceController
I0412 09:54:49.383259       1 base_controller.go:72] Caches are synced for StaticResourceController 
I0412 09:54:49.383263       1 base_controller.go:109] Starting #1 worker of StaticResourceController controller ...
I0412 09:54:49.511361       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-csi-drivers", Name:"vmware-vsphere-csi-driver-operator", UID:"b00a0d3d-c196-47a3-abea-0f0ca337822e", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'StorageClassUpdated' Updated StorageClass.storage.k8s.io/thin-csi because it changed
E0412 09:54:50.622948       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:54:53.024835       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:54:56.505854       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:55:07.125065       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:55:31.029439       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:56:20.600688       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:56:50.691044       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:57:39.168256       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:58:18.598090       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:59:09.844095       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 09:59:44.890907       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
I0412 09:59:49.471417       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-cluster-csi-drivers", Name:"vmware-vsphere-csi-driver-operator", UID:"b00a0d3d-c196-47a3-abea-0f0ca337822e", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'StorageClassUpdated' Updated StorageClass.storage.k8s.io/thin-csi because it changed
E0412 10:00:16.297937       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 10:00:52.263490       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 10:01:32.749515       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 10:02:18.199465       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 10:02:58.477753       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 10:03:32.638110       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
E0412 10:04:20.954062       1 reflector.go:138] k8s.io/client-go.4/tools/cache/reflector.go:167: Failed to watch *v1.Proxy: failed to list *v1.Proxy: proxies.config.openshift.io is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:vmware-vsphere-csi-driver-operator" cannot list resource "proxies" in API group "config.openshift.io" at the cluster scope
I0412 10:04:49.376773       1 shared_informer.go:266] stop requested
F0412 10:04:49.376924       1 base_controller.go:95] unable to sync caches for ConfigObserver

Comment 2 Wei Duan 2021-04-19 03:41:17 UTC
Verified pass on 4.8.0-0.nightly-2021-04-18-101412.

$ oc -n openshift-cluster-csi-drivers get pod 
NAME                                                    READY   STATUS    RESTARTS   AGE
...
vmware-vsphere-csi-driver-operator-d4b7b59df-rw7fd      1/1     Running   0          102m

Comment 5 errata-xmlrpc 2021-07-27 22:59:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438


Note You need to log in before you can comment on or make changes to this bug.