Bug 2101520 - csi-snapshot-controller-operator occasionally establishes an unusual number of watch requests
Summary: csi-snapshot-controller-operator occasionally establishes an unusual number o...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.11
Hardware: Unspecified
OS: Unspecified
medium
low
Target Milestone: ---
: 4.12.0
Assignee: Fabio Bertinatto
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-06-27 17:05 UTC by Devan Goodwin
Modified: 2023-01-17 19:51 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:50:47 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-csi-snapshot-controller-operator pull 129 0 None open Bug 2101520: bump library-go to get a that prevents watchers leaks 2022-10-11 12:33:43 UTC
Github openshift library-go pull 1400 0 None open Bug 2101520: Log right error when initializing delegating authentication 2022-09-29 09:55:19 UTC
Github openshift library-go pull 1401 0 None open Bug 2101520: test connectivity before retry getting extension-apiserver-authentication 2022-10-11 11:06:11 UTC
Red Hat Product Errata RHSA-2022:7399 0 None None None 2023-01-17 19:51:07 UTC

Description Devan Goodwin 2022-06-27 17:05:18 UTC
In investigating https://issues.redhat.com/browse/TRT-347 we found that this operator appears to be going beyond it's past expected limits for watch requests. To some extent we expect these values to grow over time, but a couple runs were identified at nearly twice the expected limit from last summer. 

Examples pulled from: https://search.ci.openshift.org/?search=Operator.*csi-snapshot-controller-operator.*produces+more+watch+requests+than+expected&maxAge=336h&context=0&type=bug%2Bjunit&name=gcp&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Note that the limit coded for GCP last summer was 134.

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/operator-framework_operator-marketplace/478/pull-ci-operator-framework-operator-marketplace-master-e2e-gcp/1539988916965216256

   "Operator \"csi-snapshot-controller-operator\" produces more watch requests than expected: watchrequestcount=256, upperbound=134, ratio=1.9104477611940298",


https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.12-e2e-gcp/1537183611613089792

  "Operator \"csi-snapshot-controller-operator\" produces more watch requests than expected: watchrequestcount=220, upperbound=134, ratio=1.6417910447761195",



For this bug we were just hoping for an explanation for why the number of watch requests can vary so heavily, and if it's expected. If so and this is not a concerning bug, this can be closed.

Note that this situation is very very rare, but it does happen.

Pod logs are available for both and there definitely looks to be some funky things going on with watches, but it's hard for us external to the code to know why.

We are drafting a PR to bump the limits a little in general separately, though not quite high enough to cover these outliers.

Comment 1 Fabio Bertinatto 2022-09-28 18:23:02 UTC
Seems like it fails to start a controller here, then it retries multiple times. This is likely the cause of the leaks:

> I0615 21:46:44.754123       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:45.757131       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:46.757610       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:47.757257       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:48.757348       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:49.757570       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:50.763997       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:51.758744       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:52.757022       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:53.757490       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:54.757458       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:55.757647       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:56.757318       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:57.757696       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:58.757371       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:46:59.757287       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:00.757527       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:01.757276       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:02.757422       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:03.757743       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:04.757850       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:05.758162       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:06.756991       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:07.757692       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:08.757005       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:09.757522       1 server.go:50] Error initializing delegating authentication (will retry): <nil>
> I0615 21:47:10.758219       1 server.go:50] Error initializing delegating authentication (will retry): <nil>

This is the call that's failing: https://github.com/bertinatto/library-go/blob/2ba6eb4e665d297e2337bcabe4a09efc426b4f10/pkg/config/serving/server.go#L48

However, judging from the entries below, the controller is in fact being started *at every retry*, so are the informers:

> I0615 21:47:16.164614       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.164921       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.168369       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.168434       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.168863       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.170983       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.171060       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.171955       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172015       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172073       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172568       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172632       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172970       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172919       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.172944       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173021       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173069       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173085       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173102       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173242       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173261       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173275       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173325       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173345       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173362       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173378       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173409       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> I0615 21:47:16.173291       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
> 
> (...)
> 
> I0615 21:47:16.371445       1 shared_informer.go:262] Caches are synced for RequestHeaderAuthRequestController
> I0615 21:47:16.374774       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
> I0615 21:47:16.389071       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.389430       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.389568       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.389802       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.390384       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.390966       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.391479       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.392956       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.394572       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.395056       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.395502       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.396148       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.397962       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.402784       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.406326       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.406293       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.410299       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.410450       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.410816       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.422048       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.422171       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.467400       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.467505       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.469395       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.471798       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.471926       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.472709       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:47:16.478355       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
> I0615 21:50:14.909697       1 base_controller.go:73] Caches are synced for LoggingSyncer 
> I0615 21:50:14.909756       1 base_controller.go:73] Caches are synced for ManagementStateController 
> I0615 21:50:14.909794       1 base_controller.go:73] Caches are synced for StatusSyncer_csi-snapshot-controller 
> I0615 21:50:14.909836       1 base_controller.go:73] Caches are synced for CSISnapshotWebhookController 
> I0615 21:50:14.910261       1 base_controller.go:73] Caches are synced for StaticResourceController

Comment 3 Wei Duan 2022-10-05 09:21:07 UTC
Don't see such error after the fix in recent CI, will check if the right error message will be logged.

Comment 4 Fabio Bertinatto 2022-10-11 11:06:02 UTC
Moving back to POST as we need to merge 2 other PRs.

Comment 5 Fabio Bertinatto 2022-10-11 12:33:26 UTC
Moving back to POST as we need to merge another PR.

Comment 7 Wei Duan 2022-10-14 10:39:15 UTC
Checked csi-snapshot-controller-operator log from many 4.12 CI job but did not find "Error initializing delegating authentication" error yet.

Comment 12 errata-xmlrpc 2023-01-17 19:50:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399


Note You need to log in before you can comment on or make changes to this bug.