| Summary: | Limit the number of pods in the starting state on a node | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Frederic Giloux <fgiloux> |
| Component: | RFE | Assignee: | Derek Carr <decarr> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Xiaoli Tian <xtian> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.2.1 | CC: | aos-bugs, jeder, jmencak, jokerman, mmccomas, tkatarki |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-04-18 19:55:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Frederic Giloux
2016-11-08 15:27:19 UTC
We've built into our cluster-loader utility something called a tuningset, which is a way of enforcing some "pacing" on clients. These are ways to set intervals and rates so that we can load at maximum speed while keeping the system stable. We had to do this in OpenShift v2 as well, but v3 is even worse in terms of parallelism. In the case of container creation, much of the failure or fragility can be pinned on docker.

We're prototyping a way to measure the current "busy-ness" of docker by reading its API, and using that as auto-tuning backpressure in our client. In this way, we can load as fast as docker can safely go. I don't yet know if docker will have the features we need, and it might also not be the only source of information we need. It might be beneficial to look not only at docker but at the system resource profile as well, potentially detecting storage I/O saturation and pacing (queuing) client requests.

Essentially we need a way to "protect" docker (and any other runtime) from Kubernetes. Amazon does this by rate-limiting their API to protect their control plane.

This is an RFE to rate limit, via QPS, the number of container start operations made to the container runtime from the kubelet. I think this ask has been discussed thoroughly upstream; see https://github.com/kubernetes/kubernetes/issues/3312. It points to some best practices and other features and issues that can address the problem. I don't think there is any other work upstream in this problem space.
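For illustration only, the sketch below shows the kind of QPS gate this RFE asks for: a token-bucket limiter (golang.org/x/time/rate) placed in front of container start calls so that a burst of pod creations is paced rather than passed straight through to the runtime. The `ContainerRuntime` interface, `pacedStarter` type, and `fakeRuntime` are hypothetical stand-ins, not the actual kubelet or CRI code.

```go
// Hypothetical sketch of QPS-limiting container starts from a kubelet-like
// component. The runtime interface and type names are illustrative only.
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

// ContainerRuntime stands in for the docker/CRI client used by the kubelet.
type ContainerRuntime interface {
	StartContainer(ctx context.Context, id string) error
}

// pacedStarter wraps a runtime and enforces a maximum start rate (QPS) with a
// small burst, so a flood of pod creations cannot overwhelm the runtime.
type pacedStarter struct {
	runtime ContainerRuntime
	limiter *rate.Limiter
}

func newPacedStarter(rt ContainerRuntime, qps float64, burst int) *pacedStarter {
	return &pacedStarter{
		runtime: rt,
		limiter: rate.NewLimiter(rate.Limit(qps), burst),
	}
}

// StartContainer blocks until the limiter grants a token, then delegates to
// the underlying runtime.
func (p *pacedStarter) StartContainer(ctx context.Context, id string) error {
	if err := p.limiter.Wait(ctx); err != nil {
		return fmt.Errorf("rate limit wait cancelled for %s: %w", id, err)
	}
	return p.runtime.StartContainer(ctx, id)
}

// fakeRuntime just logs starts so the sketch is runnable end to end.
type fakeRuntime struct{}

func (fakeRuntime) StartContainer(_ context.Context, id string) error {
	fmt.Printf("%s started %s\n", time.Now().Format("15:04:05.000"), id)
	return nil
}

func main() {
	// Allow at most 2 container starts per second, with a burst of 1.
	starter := newPacedStarter(fakeRuntime{}, 2, 1)
	ctx := context.Background()
	for i := 0; i < 6; i++ {
		_ = starter.StartContainer(ctx, fmt.Sprintf("pod-%d", i))
	}
}
```

The point of the design is that backpressure shows up as latency (callers wait for a token) rather than as runtime errors, which matches the "protect docker from Kubernetes" intent described above.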