{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="secrets",scope="namespace",service="kubernetes",verb="GET",version="v1"} 1.9777777777777776 {client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"} 1.2444444444444442 {client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="404",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"} 0.8333333333333334 {client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/json",endpoint="https",group="operator.openshift.io",job="apiserver",namespace="default",resource="kubeschedulers",scope="resource",service="kubernetes",verb="GET",version="v1"} 0.8111111111111111 {client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",group="rbac.authorization.k8s.io",job="apiserver",namespace="default",resource="rolebindings",scope="namespace",service="kubernetes",verb="GET",version="v1"} 0.6222222222222221 The Kube scheduler operator appears to be hot looping on certain API requests in 4.2.0 CI latest. None of these calls should be this frequent. This is the #1 most impacting steady state component on the cluster.
Maciej, making this a 4.2 blocker; somebody needs to investigate.
I don't have much experience parsing metrics, so I may be off here, but when I look at a new running cluster I see similar request counts for other components (KAS-O and KCM-O):

apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="WATCH",version="v1"} 46
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="WATCH",version="v1"} 22
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="LIST",version="v1"} 24
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="LIST",version="v1"} 1
apiserver_request_count{client="cluster-kube-apiserver-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="404",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="GET",version="v1"} 803
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="WATCH",version="v1"} 42
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="0",component="apiserver",contentType="application/vnd.kubernetes.protobuf;stream=watch",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="WATCH",version="v1"} 23
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="configmaps",scope="namespace",verb="LIST",version="v1"} 22
apiserver_request_count{client="cluster-kube-controller-manager-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",instance="localhost:8080",job="apiserver",resource="secrets",scope="namespace",verb="LIST",version="v1"}

However, these appear to be list-watches, where the scheduler ones are just GETs. Should these list-watches be making this many requests too? I'm trying to narrow down whether this is specific to kube-scheduler-operator or possibly due to our resourcesync functions in library-go.
Looking at the operator logs, we're definitely doing a lot of resource sync attempts here:

I0828 17:57:32.678155 1 config_observer_controller.go:86] decode of existing config failed with error: EOF
I0828 17:57:32.738008 1 request.go:530] Throttling request took 396.296489ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/policy-configmap
I0828 17:57:32.938013 1 request.go:530] Throttling request took 388.220327ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/services/scheduler
I0828 17:57:33.138011 1 request.go:530] Throttling request took 395.852882ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/kube-scheduler-client-cert-key
I0828 17:57:33.338000 1 request.go:530] Throttling request took 397.737447ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/serviceaccounts/openshift-kube-scheduler-sa
I0828 17:57:33.538030 1 request.go:530] Throttling request took 397.035456ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/kube-scheduler-client-cert-key
I0828 17:57:33.738009 1 request.go:530] Throttling request took 397.533044ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/config
I0828 17:57:33.938014 1 request.go:530] Throttling request took 395.999506ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/policy-configmap
I0828 17:57:34.138006 1 request.go:530] Throttling request took 395.527066ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/serviceaccount-ca
I0828 17:57:34.338022 1 request.go:530] Throttling request took 395.755406ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/kube-scheduler-client-cert-key
I0828 17:57:34.538024 1 request.go:530] Throttling request took 395.044149ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/serving-cert
I0828 17:57:34.685855 1 config_observer_controller.go:86] decode of existing config failed with error: EOF
I0828 17:57:34.737992 1 request.go:530] Throttling request took 396.780179ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/kube-scheduler-client-cert-key
I0828 17:57:34.938011 1 request.go:530] Throttling request took 396.228828ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/kube-scheduler-pod
I0828 17:57:35.138013 1 request.go:530] Throttling request took 396.212538ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/policy-configmap
I0828 17:57:35.338015 1 request.go:530] Throttling request took 389.542288ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler
I0828 17:57:35.538010 1 request.go:530] Throttling request took 396.271653ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets/kube-scheduler-client-cert-key
I0828 17:57:35.738024 1 request.go:530] Throttling request took 397.901615ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/configmaps/scheduler-kubeconfig
I0828 17:57:35.938005 1 request.go:530] Throttling request took 396.748567ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-scheduler/secrets/kube-scheduler-client-cert-key
Adding some debugging to the configobserver functions, I see a lot of updates to the kube-scheduler configmap in the openshift-kube-scheduler namespace. Its resourceVersion is increasing rapidly, and the only change between revisions is the lease renewal time. I think each of these updates triggers the entire resource/config sync loop, so this could be the underlying issue. Is it supposed to be renewing its lease this often?
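As a rough way to double-check this, here is a minimal sketch (assuming a recent client-go and a local kubeconfig; the configmap name and namespace are the ones mentioned above) that dumps the leader-election annotation so you can see what changes between renewals:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: running outside the cluster with credentials in ~/.kube/config.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Fetch the leader-election lock configmap observed to churn above.
	cm, err := client.CoreV1().ConfigMaps("openshift-kube-scheduler").Get(context.TODO(), "kube-scheduler", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// The configmap resource lock stores the lease record (holder, renewTime, ...)
	// in this annotation; every renewal rewrites it and bumps resourceVersion.
	fmt.Println("resourceVersion:", cm.ResourceVersion)
	fmt.Println("leader record:  ", cm.Annotations["control-plane.alpha.kubernetes.io/leader"])
}
```

Running this twice a few seconds apart should show only the renewTime in the annotation changing while resourceVersion keeps climbing, which is consistent with the sync-loop churn described above.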
Hi Mike, do you have any suggestion on how to verify this?
Quoting Clayton: "Election is the only thing that has to be that fast - we want short election windows on kcm and sched (the lock configmap), but everything else should be slower. Each cm should be pinging every 2 sec, so the rate should be 1.5 req/s."

We are currently at ~2.5:

{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"} 2.4789473684210526

which is about twice the desired rate. Jan is looking at the details, but it's not a blocking issue; we can backport if needed. Lowering the priority on this one.
For future generations, this is the query I've used to get the metric:

sort_desc(sum without (instance) (rate(apiserver_request_count{verb="GET", client=~"cluster-kube-.*-operator.*"}[10m])))
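If it helps, here is a minimal sketch of running the same query programmatically with the Prometheus Go client (the local address is an assumption, e.g. reached via a port-forward; recent client_golang API assumed):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Assumption: Prometheus is reachable locally, e.g. behind a port-forward.
	client, err := api.NewClient(api.Config{Address: "http://localhost:9090"})
	if err != nil {
		panic(err)
	}
	promAPI := promv1.NewAPI(client)

	const query = `sort_desc(sum without (instance) (rate(apiserver_request_count{verb="GET", client=~"cluster-kube-.*-operator.*"}[10m])))`

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Query evaluates the expression at the given instant; warnings are ignored in this sketch.
	result, _, err := promAPI.Query(ctx, query, time.Now())
	if err != nil {
		panic(err)
	}
	fmt.Println(result)
}
```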
https://github.com/openshift/cluster-kube-scheduler-operator/blob/master/pkg/operator/target_config_reconciler_v410_00.go#L33

```go
...
func createTargetConfigReconciler_v311_00_to_latest(c TargetConfigReconciler, recorder events.Recorder, operatorConfig *operatorv1.KubeScheduler) (bool, error) {
	...
	directResourceResults := resourceapply.ApplyDirectly(c.kubeClient, c.eventRecorder, v410_00_assets.Asset, ...)
	...
	_, _, err := manageKubeSchedulerConfigMap_v311_00_to_latest(c.configMapLister, c.kubeClient.CoreV1(), recorder, operatorConfig)
	...
	_, _, err = manageServiceAccountCABundle(c.configMapLister, c.kubeClient.CoreV1(), recorder)
	...
	_, _, err = managePod_v311_00_to_latest(c.kubeClient.CoreV1(), c.kubeClient.CoreV1(), recorder, operatorConfig, c.targetImagePullSpec, c.featureGateLister)
	...
```

createTargetConfigReconciler_v311_00_to_latest is actually not using informers but directly accesses c.kubeClient.CoreV1(). It is called as part of targetConfigReconciler.
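That matches the request pattern above: every sync pass re-reads each managed object straight from the API server, so the GET count scales with the sync frequency even when nothing changes. A simplified sketch of that pattern (hypothetical helper name, not the actual library-go code; recent client-go signatures assumed):

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// applyConfigMapDirectly illustrates the "direct apply" style of sync:
// a live GET against the API server on every pass, even when the object is unchanged.
func applyConfigMapDirectly(ctx context.Context, client kubernetes.Interface, required *corev1.ConfigMap) error {
	existing, err := client.CoreV1().ConfigMaps(required.Namespace).Get(ctx, required.Name, metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		_, err = client.CoreV1().ConfigMaps(required.Namespace).Create(ctx, required, metav1.CreateOptions{})
		return err
	}
	if err != nil {
		return err
	}
	if equality.Semantic.DeepEqual(existing.Data, required.Data) {
		// Already up to date, but the GET above was still counted by the API server.
		return nil
	}
	existing.Data = required.Data
	_, err = client.CoreV1().ConfigMaps(required.Namespace).Update(ctx, existing, metav1.UpdateOptions{})
	return err
}
```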
There's a bunch of other GET requests:

I0924 11:57:35.058004 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler 200 OK in 108 milliseconds
I0924 11:57:35.167835 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/configmaps/scheduler-kubeconfig 200 OK in 109 milliseconds
I0924 11:57:35.278935 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/rolebindings/system:openshift:leader-locking-kube-scheduler 200 OK in 109 milliseconds
I0924 11:57:35.391748 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings/system:openshift:operator:kube-scheduler:public-2 200 OK in 111 milliseconds
I0924 11:57:35.503246 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/namespaces/openshift-kube-scheduler/roles/system:openshift:sa-listing-configmaps 200 OK in 110 milliseconds
I0924 11:57:35.614160 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/rbac.authorization.k8s.io/v1/namespaces/openshift-kube-scheduler/rolebindings/system:openshift:sa-listing-configmaps 200 OK in 109 milliseconds
I0924 11:57:35.724722 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/services/scheduler 200 OK in 109 milliseconds
I0924 11:57:35.835701 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/serviceaccounts/openshift-kube-scheduler-sa 200 OK in 109 milliseconds

All of them originate from the https://github.com/openshift/library-go/tree/master/pkg/operator/resource/resourceapply code, which does not rely on informers. Also:

I0924 12:04:51.154941 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/api/v1/namespaces/openshift-kube-scheduler/secrets/serving-cert 200 OK in 110 milliseconds

where again the secret is fetched directly rather than through an informer. And lastly:

I0924 12:04:50.138667 6416 round_trippers.go:438] GET https://api.jchaloup.devcluster.openshift.com:6443/apis/operator.openshift.io/v1/kubeschedulers/cluster 200 OK in 117 milliseconds

without an informer.
Possible optimizations:

- Increasing --leader-elect-retry-period from 2s to 3s does not bring a significant improvement: 2.48 -> 2.3.
- Removing unused informers, replacing the operator client with an informer, and removing a duplicated informer (https://github.com/openshift/cluster-kube-scheduler-operator/pull/175 + https://github.com/openshift/cluster-kube-scheduler-operator/pull/176): 2.48 -> 2.
- Replacing direct config getters with informers (https://github.com/openshift/cluster-kube-scheduler-operator/pull/174): we can get below 1.5 (see the sketch after this list).

Given where we are in the 4.2 release process, all of these optimizations represent stability risks. We need more time to assess the consequences of introducing such changes, so moving to 4.3. Once we have a better picture of the impact of the changes, we may decide to backport some of them to 4.2.z.
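For illustration, here is a minimal sketch of what "replacing direct config getters with informers" means in practice, using a shared informer factory and a configmap lister (names and wiring are illustrative, not the actual PR; recent client-go assumed). Reads are then served from the watch-backed local cache instead of issuing a GET per resource per sync:

```go
package sketch

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	listerscorev1 "k8s.io/client-go/listers/core/v1"
)

// newConfigMapLister wires a namespaced shared informer factory and returns a lister
// whose Get() calls are answered from the watch-backed cache, not by API GETs.
func newConfigMapLister(client kubernetes.Interface, namespace string, stopCh <-chan struct{}) listerscorev1.ConfigMapLister {
	factory := informers.NewSharedInformerFactoryWithOptions(client, 10*time.Minute, informers.WithNamespace(namespace))
	lister := factory.Core().V1().ConfigMaps().Lister()
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
	return lister
}

// readConfig shows the cache-backed read path: a single WATCH keeps the cache fresh,
// so the sync loop no longer pays one GET per managed configmap per pass.
func readConfig(lister listerscorev1.ConfigMapLister, namespace, name string) error {
	cm, err := lister.ConfigMaps(namespace).Get(name)
	if err != nil {
		return err
	}
	fmt.Println("resourceVersion:", cm.ResourceVersion)
	return nil
}
```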
https://github.com/openshift/cluster-kube-scheduler-operator/pull/176 merged
Sorry for the late look at this; I was engaged in other testing. I investigated the above comments and PRs, and verified in an env of version 4.3.0-0.nightly-2019-12-24-221900.

Queried below in the Prometheus UI:

sort_desc(sum without (instance) (rate(apiserver_request_count{verb="GET", client=~"cluster-kube-scheduler-operator.*"}[10m])))

Got:

{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="configmaps",scope="namespace",service="kubernetes",verb="GET",version="v1"} 2.007017543859649
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",group="rbac.authorization.k8s.io",job="apiserver",namespace="default",resource="rolebindings",scope="namespace",service="kubernetes",verb="GET",version="v1"} 1.0035087719298246
{client="cluster-kube-scheduler-operator/v0.0.0 (linux/amd64) kubernetes/$Format",code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",endpoint="https",job="apiserver",namespace="default",resource="secrets",scope="namespace",service="kubernetes",verb="GET",version="v1"} 0.5017543859649123
...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062