Bug 1934555
| Field | Value |
|---|---|
| Summary | Creating controller client much slower on some clusters |
| Product | OpenShift Container Platform |
| Component | Cloud Compute |
| Sub component | Other Providers |
| Version | 4.7 |
| Hardware | All |
| OS | All |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | medium |
| Keywords | Upgrades |
| Target Milestone | --- |
| Target Release | --- |
| Reporter | Michael Gugino <mgugino> |
| Assignee | Joel Speed <jspeed> |
| QA Contact | sunzhaohua <zhsun> |
| CC | akashem, aos-bugs, dahernan, dhellmann, lmohanty, mfojtik, mgugino, mimccune, scuppett, sreber, tmicheli, wking, xxia, zhsun |
| Doc Type | If docs needed, set a value |
| Clone Of | 1934216 |
| Last Closed | 2021-08-27 09:39:22 UTC |
Description
Michael Gugino, 2021-03-03 13:24:17 UTC
This is client-side throttling. Compare the oc fix https://github.com/openshift/oc/pull/696/files.

The cause of the above condition is twofold. First, the machineset controller was building 3 clients instead of just 1. Second, on some clusters, the time it takes to build a client is much longer than on others. On a typical CI run, I observed a 3 second client creation time in the machineset-controller. On the test cluster I had access to, I observed a 14 second client creation time. The cluster doesn't seem to be starved of any resource, such as CPU. One possible cause is that some aggregated API is slow to respond, or the API server is missing; TBD. The upstream docs state the spec for an aggregated API response is 5 seconds, which seems very high.

To make this actionable:
1. Fix the client throttling. It's not helpful to look into claimed slowness when your binary is slow because it throttles itself.
2. Show a request (timestamp and URL) that is slower than expected, and provide must-gather and audit logs for the cluster from when the request is processed.

Created attachment 1760396 [details]
machineset-controller with slow client creation
Attached is the slow client creation log from the in-cluster controller. The controller fails to start within the 30 seconds allowed by the liveness probe settings.
This controller performs the 3 client creations, which pushes the total time above 30 seconds. However, on CI, client creation takes 3 seconds, so we're trying to determine where the discrepancy comes from.
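For illustration only (this is not the actual machineset-controller code): the per-client cost is dominated by API discovery when each controller-runtime client builds its own rest mapper, so building three clients roughly triples it. A minimal sketch of building one mapper and sharing it across clients, assuming a controller-runtime release around v0.8/v0.9 where apiutil.NewDynamicRESTMapper takes only a *rest.Config:

```go
package main

import (
	"k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/apiutil"
)

func main() {
	cfg := ctrl.GetConfigOrDie()

	// Building the rest mapper is where the burst of discovery requests
	// (and hence the client-side throttling) happens, so do it once.
	mapper, err := apiutil.NewDynamicRESTMapper(cfg)
	if err != nil {
		panic(err)
	}

	// Reuse the same scheme and mapper for every client instead of letting
	// each client.New run discovery again.
	opts := client.Options{Scheme: scheme.Scheme, Mapper: mapper}

	c1, err := client.New(cfg, opts)
	if err != nil {
		panic(err)
	}
	c2, err := client.New(cfg, opts)
	if err != nil {
		panic(err)
	}
	_, _ = c1, c2
}
```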
The attached log still shows a lot of client-side throttling (> 5s waits for requests). Until those are fixed, there is no sense in digging deeper into server-side issues.

Created attachment 1760403 [details]
cluster logging operator slow client creation
This is another controller with very slow rest client creation. Something is causing this cluster to have excessively slow client creation times. Is this due to the number of API groups or the number of cluster objects?
If this is caused exclusively client-side, we need to come up with some best practices to ensure this doesn't happen.
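Some orientation for the numbers that follow: the waits come from client-go's own per-client rate limiter, configured via rest.Config (QPS defaults to 5, Burst to 10, as noted below). A minimal sketch, not taken from any OpenShift component, of where those knobs live when building a client; the values mirror the ones experimented with later in this bug and are not a recommendation:

```go
package main

import (
	"k8s.io/client-go/kubernetes"
	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Load the usual in-cluster / kubeconfig configuration.
	cfg := ctrl.GetConfigOrDie()

	// client-go throttles itself with a token bucket: QPS is the sustained
	// request rate, Burst the bucket size. The defaults (QPS=5, Burst=10)
	// are what produce the long client-side throttling waits in the logs.
	cfg.QPS = 75 // values tried on the impacted cluster later in this bug
	cfg.Burst = 150

	// Clients built from this config use the relaxed limits.
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	_ = cs
}
```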
The throttling messages in the log are completely client-side. We enabled log level '-v=4' or higher and did not see the following message, which would indicate server-side throttling:

> Got a Retry-After 1s response for attempt (https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/rest/request.go#L959)

Moving it back to the cloud team so they can investigate further. We are happy to look at it if there is evidence of slowness on the apiserver side or any other apiserver-related issues. But right now, based on the log, it is client-side throttling that is slowing the requests down (the server is not involved here).

Freshly installed cluster:
  grep 'GET https://api.mgugino-devb2.devcluster.openshift.com' machineset.c1.out | wc -l
  89
Impacted cluster:
  grep 'GET https://impacted-cluster.example.redhat.com:6443' machineset.c2.out | wc -l
  302

Freshly installed cluster:
  ./oc get apiservices | wc -l
  80
Impacted cluster:
  ./oc get apiservices | wc -l
  145

From the previous comment, the fresh install makes 89 calls in about 3 seconds total, roughly 33 ms per call. The impacted cluster makes 302 calls in about 14 seconds, roughly 46 ms per call.

We need a cluster-wide setting for operator rest clients. Other components may be hitting this issue, and we'll need to investigate what an appropriate setting is so we don't DoS the API server on upgrades.

On the impacted cluster: adjusting the rest client Burst to 100 (up from the default of 10) still results in 7 to 12 second delays (down slightly from 14). A Burst of 250 results in near-instantaneous completion. It seems that once we exhaust the first 100 requests we incrementally back off for the balance of the requests. Perhaps we need to explore QPS as well: https://pkg.go.dev/k8s.io/client-go/rest#Config

Testing on the impacted cluster, QPS seems to have a greater impact than Burst. QPS defaults to 5 and is what causes the excessive throttling on the client side. With

  rest.Burst = 150
  rest.QPS = 75.0

the time decreases to 4 seconds. This should allow us to scale to 1000 requests before encountering any problems. We should investigate what the upper limit with these settings might be so we can alert or do something useful when that limit is hit.

Note to future readers: we discussed this out of band and decided to investigate creating a static restmapper that our components can use, so that we don't have to dynamically build the rest mapping from the API on startup. This would resolve the huge rush of API calls that our components make when they start up. Alternatively, we could switch to typed clients and ditch the controller-runtime client.

I think we can use a DefaultRESTMapper [1], which uses the runtime.Scheme (which we always construct anyway), to avoid making the API calls to perform discovery. If we want a fallback to dynamic discovery, we can also use a MultiRESTMapper [2] to let the restmapper fall back to discovery when a resource isn't available in the scheme. If we take this approach, we should use a lazy DynamicRESTMapper [3]. Once we create this rest mapper, we can pass it into any controller-runtime manager we create by providing a rest mapper provider function in the options.
[1]: https://github.com/kubernetes/apimachinery/blob/57f2a0733447cfd41294477d833cce6580faaca3/pkg/api/meta/restmapper.go#L76-L97
[2]: https://github.com/kubernetes/apimachinery/blob/57f2a0733447cfd41294477d833cce6580faaca3/pkg/api/meta/multirestmapper.go#L27-L28
[3]: https://github.com/kubernetes-sigs/controller-runtime/blob/fa42462a01b0f33cfb42dd7396d198435a013122/pkg/client/apiutil/dynamicrestmapper.go#L55-L60

Bug 1934216 had an immediate workaround for this issue, so I'm removing UpgradeBlocker to pull it out of our suspect queue [1]. If folks are concerned about client creation time causing similar breakage even with bug 1934216's mitigation in place, I recommend making this bug blocker+ and fixing it before 4.8 GAs. If, in addition, we hear about folks on 4.7.z with the mitigation in place still breaking, please add UpgradeBlocker back here and float an impact statement.

[1]: https://github.com/openshift/enhancements/pull/426

Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1934555#c13: this doesn't actually work; building a static restmapper requires more information than we can gather from a runtime scheme. We will have to keep the dynamic rest mapper for now, as this is far simpler and the benefit is negligible.
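For readers wanting to see where such a mapper would plug in: a minimal sketch, not from any merged change, of supplying a rest mapper provider to a controller-runtime manager, here wiring up the lazy dynamic rest mapper from [3]. It assumes a controller-runtime release around v0.8/v0.9, where Options.MapperProvider takes only a *rest.Config:

```go
package main

import (
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client/apiutil"
)

func main() {
	cfg := ctrl.GetConfigOrDie()

	mgr, err := ctrl.NewManager(cfg, ctrl.Options{
		// Hand the manager a lazily-initialised dynamic rest mapper so the
		// discovery requests happen on first use rather than as a burst at
		// startup. MapperProvider's signature differs in newer releases.
		MapperProvider: func(c *rest.Config) (meta.RESTMapper, error) {
			return apiutil.NewDynamicRESTMapper(c, apiutil.WithLazyDiscovery)
		},
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // controllers would be registered against mgr here
}
```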