Bug 1745671 - cluster-kube-scheduler-operator makes 4 req/s at idle
Summary: cluster-kube-scheduler-operator makes 4 req/s at idle
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: 4.3.0
Assignee: Jan Chaloupka
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-26 15:30 UTC by Clayton Coleman
Modified: 2020-01-23 11:05 UTC (History)
4 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-23 11:05:22 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Github openshift cluster-kube-scheduler-operator pull 167 'None' closed Bug 1745671: use cached clients in resource syncer 2020-03-26 09:30:05 UTC
Github openshift cluster-kube-scheduler-operator pull 174 'None' closed bug 1745671: Use informers instead of directly accessing clientset 2020-03-26 09:30:06 UTC
Github openshift cluster-kube-scheduler-operator pull 176 'None' closed Bug 1745671: Do not register event handlers twice for configmaps under openshift-kube-scheduler 2020-03-26 09:30:05 UTC
Red Hat Product Errata RHBA-2020:0062 None None None 2020-01-23 11:05:41 UTC

Description Clayton Coleman 2019-08-26 15:30:52 UTC
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="secrets",scope="namespace",service="kubernetes",verb="GET",version="v1"}	1.9777777777777776
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"}	1.2444444444444442
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="404",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"}	0.8333333333333334
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/json",endpoint="https",group="operator.openshift.io",job="apiserver",namespace="default",resource="kubeschedulers",scope="resource",service="kubernetes",verb="GET",version="v1"}	0.8111111111111111
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",group="rbac.authorization.k8s.io",job="apiserver",namespace="default",resource="rolebindings",scope="namespace",service="kubernetes",verb="GET",version="v1"}	0.6222222222222221

The Kube scheduler operator appears to be hot looping on certain API requests in 4.2.0 CI latest.

None of these calls should be this frequent.  This is the #1 most impacting steady state component on the cluster.
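Summing the per-series rates shown above gives the total idle GET load; a quick arithmetic sketch (values rounded from the metric samples above):

```go
package main

import "fmt"

func main() {
	// Per-series GET rates (req/s) copied from the metric samples above.
	rates := []float64{1.9778, 1.2444, 0.8333, 0.8111, 0.6222}
	total := 0.0
	for _, r := range rates {
		total += r
	}
	fmt.Printf("total idle GET rate: %.2f req/s\n", total) // ~5.49 req/s
}
```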

Comment 1 Michal Fojtik 2019-08-26 18:01:41 UTC
Maciej, making this a 4.2 blocker; somebody needs to investigate.

Comment 2 Mike Dame 2019-08-28 17:20:49 UTC
I don't have much experience parsing metrics, so I may be off here, but when I look at a new running cluster I see similar request counts for other components (KAS-O and KCM-O):

apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="WATCH",version="v1"}	46
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="WATCH",version="v1"}	22
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="LIST",version="v1"}	24
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="LIST",version="v1"}	1
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="404",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="GET",version="v1"}	803

apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="WATCH",version="v1"}	42
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="WATCH",version="v1"}	23
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="LIST",version="v1"}	22
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="LIST",version="v1"}

However, these appear to be list/watches, whereas the scheduler ones are plain GETs. Should those list/watches be making this many requests too? I'm trying to narrow down whether this is specific to the kube-scheduler-operator or due to our resourcesync functions in library-go.

Comment 3 Mike Dame 2019-08-28 18:08:02 UTC
Looking at the operator logs we're definitely doing a lot of resource syncing attempts here:

I0828 17:57:32.678155       1 config_observer_controller.go:86] decode of existing config failed with error: EOF
I0828 17:57:32.738008       1 request.go:530] Throttling request took 396.296489ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/policy-configmap
I0828 17:57:32.938013       1 request.go:530] Throttling request took 388.220327ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/services/scheduler
I0828 17:57:33.138011       1 request.go:530] Throttling request took 395.852882ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/kube-scheduler-client-cert-key
I0828 17:57:33.338000       1 request.go:530] Throttling request took 397.737447ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/serviceaccounts/openshift-kube-scheduler-sa
I0828 17:57:33.538030       1 request.go:530] Throttling request took 397.035456ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/kube-scheduler-client-cert-key
I0828 17:57:33.738009       1 request.go:530] Throttling request took 397.533044ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/config
I0828 17:57:33.938014       1 request.go:530] Throttling request took 395.999506ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/policy-configmap
I0828 17:57:34.138006       1 request.go:530] Throttling request took 395.527066ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/serviceaccount-ca
I0828 17:57:34.338022       1 request.go:530] Throttling request took 395.755406ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/kube-scheduler-client-cert-key
I0828 17:57:34.538024       1 request.go:530] Throttling request took 395.044149ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/serving-cert
I0828 17:57:34.685855       1 config_observer_controller.go:86] decode of existing config failed with error: EOF
I0828 17:57:34.737992       1 request.go:530] Throttling request took 396.780179ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/kube-scheduler-client-cert-key
I0828 17:57:34.938011       1 request.go:530] Throttling request took 396.228828ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/kube-scheduler-pod
I0828 17:57:35.138013       1 request.go:530] Throttling request took 396.212538ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/policy-configmap
I0828 17:57:35.338015       1 request.go:530] Throttling request took 389.542288ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler
I0828 17:57:35.538010       1 request.go:530] Throttling request took 396.271653ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/kube-scheduler-client-cert-key
I0828 17:57:35.738024       1 request.go:530] Throttling request took 397.901615ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/scheduler-kubeconfig
I0828 17:57:35.938005       1 request.go:530] Throttling request took 396.748567ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/kube-scheduler-client-cert-key

Comment 4 Mike Dame 2019-08-28 21:19:08 UTC
Adding some debugging to the configobserver functions, I see a lot of updates to the kube-scheduler configmap in the openshift-kube-scheduler namespace. Its resourceversion is also increasing rapidly; the only other change is the lease renewal time being updated. I think this would trigger the entire resource/config sync loop, so the issue could lie there. Is it supposed to be renewing its lease this often?

Comment 6 ge liu 2019-09-19 03:27:36 UTC
Hi Mike, is there any suggestion on how to verify this?

Comment 14 Maciej Szulik 2019-09-23 14:03:07 UTC
Quoting Clayton:

Election is the only thing that has to be that fast - we want short election windows on kcm and sched (the lock config map), but everything else should be slower.
Each cm should be pinged every 2 sec, so the rate should be 1.5 req/s.

We are currently at ~2.5:

{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"}	2.4789473684210526

which is about twice the desired rate. Jan is looking at the details, but it's not a blocking issue; we can backport if needed.
Lowering the priority on this one.
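For the record, the 1.5 req/s budget quoted above is simple arithmetic: a lock configmap pinged every retry period costs 1/period req/s, and reaching 1.5 req/s at a 2 s period implies three such configmaps (an inference from the quoted numbers, not stated explicitly in the comment):

```go
package main

import "fmt"

func main() {
	const retryPeriodSec = 2.0 // the 2 s ping interval from the quote
	const lockConfigMaps = 3.0 // assumption: inferred from the 1.5 req/s figure
	perLock := 1.0 / retryPeriodSec
	fmt.Printf("expected election traffic: %.1f req/s\n", perLock*lockConfigMaps)
}
```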

Comment 15 Maciej Szulik 2019-09-23 14:03:37 UTC
For future generations:

sort_desc(sum without (instance) (rate(apiserver_request_count{verb="GET", client=~"cluster-kube-.*-operator.*"}[10m])))

is the query I've used to get the metric.
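For anyone re-running this programmatically rather than in the UI, the query can be sent to Prometheus's HTTP API by URL-encoding it as the `query` parameter of `/api/v1/query`. A stdlib-only sketch (the host is a placeholder; substitute the cluster's Prometheus route):

```go
package main

import (
	"fmt"
	"net/url"
)

// queryURL builds a Prometheus instant-query URL for the given PromQL string.
func queryURL(host, promQL string) string {
	u := url.URL{
		Scheme:   "https",
		Host:     host,
		Path:     "/api/v1/query",
		RawQuery: url.Values{"query": {promQL}}.Encode(),
	}
	return u.String()
}

func main() {
	const promQL = `sort_desc(sum without (instance) (rate(apiserver_request_count{verb="GET", client=~"cluster-kube-.*-operator.*"}[10m])))`
	// Placeholder host; use the cluster's Prometheus route in practice.
	fmt.Println(queryURL("prometheus.example.com", promQL))
}
```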

Comment 16 Jan Chaloupka 2019-09-23 16:13:28 UTC
https://github.com/openshift/cluster-kube-scheduler-operator/blob/master/pkg/operator/target_config_reconciler_v410_00.go#L33

```go
...
func createTargetConfigReconciler_v311_00_to_latest(c TargetConfigReconciler, recorder events.Recorder, operatorConfig *operatorv1.KubeScheduler) (bool, error) {
	...
	directResourceResults := resourceapply.ApplyDirectly(c.kubeClient, c.eventRecorder, v410_00_assets.Asset,
		...
	)
	...
	_, _, err := manageKubeSchedulerConfigMap_v311_00_to_latest(c.configMapLister, c.kubeClient.CoreV1(), recorder, operatorConfig)
	...
	_, _, err = manageServiceAccountCABundle(c.configMapLister, c.kubeClient.CoreV1(), recorder)
	...
	_, _, err = managePod_v311_00_to_latest(c.kubeClient.CoreV1(), c.kubeClient.CoreV1(), recorder, operatorConfig, c.targetImagePullSpec, c.featureGateLister)
	...
}
...
```

createTargetConfigReconciler_v311_00_to_latest does not actually use informers; it accesses c.kubeClient.CoreV1() directly. It is called as part of targetConfigReconciler.
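The fix in the linked PRs is to serve these reads from informer-backed listers instead of direct clientset GETs: a lister reads from a local cache that a single watch keeps current, so repeated GETs cost no API round-trips. A stdlib-only sketch of the idea (the real machinery is client-go's SharedInformer and Lister; the names here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// configMapCache mimics an informer's local store: reads hit the map, and
// only watch events (Update) touch the API-derived state, so repeated GETs
// cost no API calls.
type configMapCache struct {
	mu    sync.RWMutex
	items map[string]string // name -> data (simplified)
}

// Update is what a watch event handler would call.
func (c *configMapCache) Update(name, data string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[name] = data
}

// Get is the "lister" read path: purely local, no network.
func (c *configMapCache) Get(name string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	d, ok := c.items[name]
	return d, ok
}

func main() {
	cache := &configMapCache{items: map[string]string{}}
	cache.Update("scheduler-kubeconfig", "v1") // one watch event...
	for i := 0; i < 1000; i++ {                // ...serves any number of reads
		cache.Get("scheduler-kubeconfig")
	}
	d, _ := cache.Get("scheduler-kubeconfig")
	fmt.Println("cached read:", d)
}
```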

Comment 17 Jan Chaloupka 2019-09-24 10:08:21 UTC
There's a bunch of other GET requests:

I0924 11:57:35.058004    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler 200 OK in 108 milliseconds
I0924 11:57:35.167835    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/configmaps/scheduler-kubeconfig 200 OK in 109 milliseconds
I0924 11:57:35.278935    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/rolebindings/system:openshift:leader-locking-kube-scheduler 200 OK in 109 milliseconds
I0924 11:57:35.391748    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings/system:openshift:operator:kube-scheduler:public-2 200 OK in 111 milliseconds
I0924 11:57:35.503246    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/namespaces/openshift-kube-scheduler/roles/system:openshift:sa-listing-configmaps 200 OK in 110 milliseconds
I0924 11:57:35.614160    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/namespaces/openshift-kube-scheduler/rolebindings/system:openshift:sa-listing-configmaps 200 OK in 109 milliseconds
I0924 11:57:35.724722    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/services/scheduler 200 OK in 109 milliseconds
I0924 11:57:35.835701    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/serviceaccounts/openshift-kube-scheduler-sa 200 OK in 109 milliseconds

All of them originate from the https://github.com/openshift/library-go/tree/master/pkg/operator/resource/resourceapply code, which does not rely on informers.

Also:

I0924 12:04:51.154941    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/secrets/serving-cert 200 OK in 110 milliseconds

where, again, the secret is fetched directly instead of through an informer.

And lastly:

I0924 12:04:50.138667    6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/operator.openshift.io/v1/kubeschedulers/cluster 200 OK in 117 milliseconds

without an informer.

Comment 18 Jan Chaloupka 2019-09-27 14:05:44 UTC
Possible optimizations:
- increasing --leader-elect-retry-period from 2s to 3s does not bring significant improvement: 2.48 -> 2.3
- removing unused informers, replacing operatorclient with an informer, and removing a duplicated informer: https://github.com/openshift/cluster-kube-scheduler-operator/pull/175 + https://github.com/openshift/cluster-kube-scheduler-operator/pull/176: 2.48 -> 2
- replacing direct config getters with informers https://github.com/openshift/cluster-kube-scheduler-operator/pull/174: we can get below 1.5

Given where we are in the 4.2 release process, all of these optimizations represent stability risks. We need more time to assess the consequences of introducing such changes, so I'm moving this to 4.3. Once we have a better picture of the impact we may decide to backport some of them to 4.2.z.

Comment 22 Xingxing Xia 2019-12-26 14:40:43 UTC
Sorry for looking at this late; I was engaged in other testing.
I investigated the comments and PRs above and verified in an environment running version 4.3.0-0.nightly-2019-12-24-221900.
Ran the following query in the Prometheus UI: sort_desc(sum without (instance) (rate(apiserver_request_count{verb="GET", client=~"cluster-kube-scheduler-operator.*"}[10m])))
Got:
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"}	2.007017543859649
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",group="rbac.authorization.k8s.io",job="apiserver",namespace="default",resource="rolebindings",scope="namespace",service="kubernetes",verb="GET",version="v1"}	1.0035087719298246
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="secrets",scope="namespace",service="kubernetes",verb="GET",version="v1"}	0.5017543859649123
...

Comment 24 errata-xmlrpc 2020-01-23 11:05:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

