Bug 1934555
| Field | Value |
|---|---|
| Summary | Creating controller client much slower on some clusters |
| Product | OpenShift Container Platform |
| Component | Cloud Compute |
| Sub component | Other Providers |
| Version | 4.7 |
| Hardware | All |
| OS | All |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | medium |
| Keywords | Upgrades |
| Target Milestone | --- |
| Target Release | --- |
| Reporter | Michael Gugino <mgugino> |
| Assignee | Joel Speed <jspeed> |
| QA Contact | sunzhaohua <zhsun> |
| CC | akashem, aos-bugs, dahernan, dhellmann, lmohanty, mfojtik, mgugino, mimccune, scuppett, sreber, tmicheli, wking, xxia, zhsun |
| Doc Type | If docs needed, set a value |
| Clone Of | 1934216 |
| Last Closed | 2021-08-27 09:39:22 UTC |
Description
Michael Gugino, 2021-03-03 13:24:17 UTC
This is client-side throttling. Compare the oc fix https://github.com/openshift/oc/pull/696/files.

The cause of the above condition is twofold. First, the machineset controller was building 3 clients instead of just 1. Second, on some clusters, the time it takes to build a client is much longer than on others. On a typical CI run, I observed a 3 second client creation time in the machineset-controller. On the test cluster I had access to, I observed a 14 second client creation time. The cluster doesn't seem to be starved of any resource, such as CPU. One possible cause is that some aggregated API is slow to respond, or the API server is missing; TBD. The upstream docs state the spec for an aggregated API response is 5 seconds, which seems very high.

To make this actionable:
1. Fix the client throttling. It's not helpful to look into claimed slowness when your binary is slow because it throttles itself.
2. Show a request (timestamp and URL) that is slower than expected, and provide must-gather and audit logs for the cluster from when the request is processed.

Created attachment 1760396 [details]
machineset-controller with slow client creation
Attached is the slow client creation log from the in-cluster controller. The controller fails to start within the 30 seconds allowed by the liveness probe settings.
This controller performs the 3 client creations, which pushes the total time above 30 seconds. However, on CI, client creation takes 3 seconds, so we're trying to determine where the discrepancy comes from.
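For illustration only (this is not the actual machineset-controller code): the per-client cost is dominated by API discovery when each controller-runtime client builds its own rest mapper, so building three clients roughly triples it. A minimal sketch of building one mapper and sharing it across clients, assuming a controller-runtime release around v0.8/v0.9 where apiutil.NewDynamicRESTMapper takes only a *rest.Config:

```go
package main

import (
	"k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/apiutil"
)

func main() {
	cfg := ctrl.GetConfigOrDie()

	// Building the rest mapper is where the burst of discovery requests
	// (and hence the client-side throttling) happens, so do it once.
	mapper, err := apiutil.NewDynamicRESTMapper(cfg)
	if err != nil {
		panic(err)
	}

	// Reuse the same scheme and mapper for every client instead of letting
	// each client.New run discovery again.
	opts := client.Options{Scheme: scheme.Scheme, Mapper: mapper}

	c1, err := client.New(cfg, opts)
	if err != nil {
		panic(err)
	}
	c2, err := client.New(cfg, opts)
	if err != nil {
		panic(err)
	}
	_, _ = c1, c2
}
```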
The attached log still shows a lot of client-side throttling (> 5s waits for requests). Until those are fixed, there is no sense in digging deeper into server-side issues.

Created attachment 1760403 [details]
cluster logging operator slow client creation
This is another controller with very slow rest client creation. Something is causing this cluster to have excessively slow client creation times. Is this due to the number of API groups or the number of cluster objects?
If this is caused exclusively client-side, we need to come up with some best practices to ensure this doesn't happen.
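Some orientation for the numbers that follow: the waits come from client-go's own per-client rate limiter, configured via rest.Config (QPS defaults to 5, Burst to 10, as noted below). A minimal sketch, not taken from any OpenShift component, of where those knobs live when building a client; the values mirror the ones experimented with later in this bug and are not a recommendation:

```go
package main

import (
	"k8s.io/client-go/kubernetes"
	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Load the usual in-cluster / kubeconfig configuration.
	cfg := ctrl.GetConfigOrDie()

	// client-go throttles itself with a token bucket: QPS is the sustained
	// request rate, Burst the bucket size. The defaults (QPS=5, Burst=10)
	// are what produce the long client-side throttling waits in the logs.
	cfg.QPS = 75 // values tried on the impacted cluster later in this bug
	cfg.Burst = 150

	// Clients built from this config use the relaxed limits.
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	_ = cs
}
```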
The throttling messages in the log are completely client-side. We enabled log level '-v=4' or higher and did not see the following message, which would indicate server-side throttling:

> Got a Retry-After 1s response for attempt (https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/rest/request.go#L959)

Moving it back to the cloud team so they can investigate further. We are happy to look at it if there is evidence of slowness on the apiserver side or any other apiserver-related issues. But right now, based on the log, it is client-side throttling that is slowing the requests down (the server is not involved here).

Freshly installed cluster:
  grep 'GET https://api.mgugino-devb2.devcluster.openshift.com' machineset.c1.out | wc -l
  89
Impacted cluster:
  grep 'GET https://impacted-cluster.example.redhat.com:6443' machineset.c2.out | wc -l
  302

Freshly installed cluster:
  ./oc get apiservices | wc -l
  80
Impacted cluster:
  ./oc get apiservices | wc -l
  145

From the previous comment, the fresh install makes 89 calls in about 3 seconds total, roughly 33 ms per call. The impacted cluster makes 302 calls in about 14 seconds, roughly 46 ms per call.

We need a cluster-wide setting for operator rest clients. Other components may be hitting this issue, and we'll need to investigate what an appropriate setting is so we don't DoS the API server on upgrades.

On the impacted cluster: adjusting the rest client Burst to 100 (up from the default of 10) still results in 7 to 12 second delays (down slightly from 14). A Burst of 250 results in near-instantaneous completion. It seems that once we exhaust the first 100 requests we incrementally back off for the balance of the requests. Perhaps we need to explore QPS as well: https://pkg.go.dev/k8s.io/client-go/rest#Config

Testing on the impacted cluster, QPS seems to have a greater impact than Burst. QPS defaults to 5 and is what causes the excessive throttling on the client side. With

  rest.Burst = 150
  rest.QPS = 75.0

the time decreases to 4 seconds. This should allow us to scale to 1000 requests before encountering any problems. We should investigate what the upper limit with these settings might be so we can alert or do something useful when that limit is hit.

Note to future readers: we discussed this out of band and decided to investigate creating a static restmapper that our components can use, so that we don't have to dynamically build the rest mapping from the API on startup. This would resolve the huge rush of API calls that our components make when they start up. Alternatively, we could switch to typed clients and ditch the controller-runtime client.

I think we can use a DefaultRESTMapper [1], which uses the runtime.Scheme (which we always construct anyway), to avoid making the API calls to perform discovery. If we want a fallback to dynamic discovery, we can also use a MultiRESTMapper [2] to let the restmapper fall back to discovery when a resource isn't available in the scheme. If we take this approach, we should use a lazy DynamicRESTMapper [3]. Once we create this rest mapper, we can pass it into any controller-runtime manager we create by providing a rest mapper provider function in the options.
[1]: https://github.com/kubernetes/apimachinery/blob/57f2a0733447cfd41294477d833cce6580faaca3/pkg/api/meta/restmapper.go#L76-L97
[2]: https://github.com/kubernetes/apimachinery/blob/57f2a0733447cfd41294477d833cce6580faaca3/pkg/api/meta/multirestmapper.go#L27-L28
[3]: https://github.com/kubernetes-sigs/controller-runtime/blob/fa42462a01b0f33cfb42dd7396d198435a013122/pkg/client/apiutil/dynamicrestmapper.go#L55-L60

Bug 1934216 had an immediate workaround for this issue, so I'm removing UpgradeBlocker to pull it out of our suspect queue [1]. If folks are concerned about client creation time causing similar breakage even with bug 1934216's mitigation in place, I recommend making this bug blocker+ and fixing it before 4.8 GAs. If, in addition, we hear about folks on 4.7.z with the mitigation in place still breaking, please add UpgradeBlocker back here and float an impact statement.

[1]: https://github.com/openshift/enhancements/pull/426

Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1934555#c13: this doesn't actually work; building a static restmapper requires more information than we can gather from a runtime scheme. We will have to keep the dynamic rest mapper for now, as this is far simpler and the benefit is negligible.
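For readers wanting to see where such a mapper would plug in: a minimal sketch, not from any merged change, of supplying a rest mapper provider to a controller-runtime manager, here wiring up the lazy dynamic rest mapper from [3]. It assumes a controller-runtime release around v0.8/v0.9, where Options.MapperProvider takes only a *rest.Config:

```go
package main

import (
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client/apiutil"
)

func main() {
	cfg := ctrl.GetConfigOrDie()

	mgr, err := ctrl.NewManager(cfg, ctrl.Options{
		// Hand the manager a lazily-initialised dynamic rest mapper so the
		// discovery requests happen on first use rather than as a burst at
		// startup. MapperProvider's signature differs in newer releases.
		MapperProvider: func(c *rest.Config) (meta.RESTMapper, error) {
			return apiutil.NewDynamicRESTMapper(c, apiutil.WithLazyDiscovery)
		},
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // controllers would be registered against mgr here
}
```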