Bug 2050230
| Summary: | Implement LIST call chunking in openshift-sdn | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Amit Kesarkar <akesarka> |
| Component: | Networking | Assignee: | Jaime Caamaño Ruiz <jcaamano> |
| Networking sub component: | openshift-sdn | QA Contact: | Qiujie Li <qili> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | medium | CC: | ancollin, gvaughn, jcaamano, jpradhan, qili, rravaiol, trozet, zzhao |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.13.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-05-17 22:46:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Amit Kesarkar
2022-02-03 13:58:07 UTC
Filling in more information on the bug description.
Please let me know if I can provide anything else here, I am happy to assist.
Description of problem:
In a large cluster, the sdn daemonset can DoS the kube-apiserver with un-paginated LIST calls on high-count resources.
Version-Release number of selected component (if applicable):
4.8.23
How reproducible:
100%
Steps to Reproduce:
1. Create more than 500 pods, networkpolicies, services, endpoints, netnamespaces, or projects.
2. Restart one or more SDN pods.
Actual results:
kube-apiserver audit events show that LIST calls on these resources are executed without paging, and are thus querying >500 resources in a single LIST request.
Repeated, significantly large LIST requests (>15k items) can cause the kube-apiserver, openshift-apiserver, and etcd to consume extremely large amounts of memory, which can lead to further instability.
Expected results:
SDN should make fixed-size LIST calls using pagination so as to limit memory ballooning on the control plane.
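For illustration, here is a minimal sketch of what such a fixed-size, chunked LIST could look like with client-go's Limit/Continue options. This is not the actual openshift-sdn patch; the 500-item page size and the pod resource are assumptions for the example.
```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// listPodsChunked retrieves all pods in fixed-size pages instead of one
// giant LIST, bounding how much the apiserver must serve per request.
func listPodsChunked(client kubernetes.Interface) (int, error) {
	opts := metav1.ListOptions{Limit: 500} // fixed page size (assumed value)
	total := 0
	for {
		list, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), opts)
		if err != nil {
			return total, err
		}
		total += len(list.Items)
		// The server returns a continue token while more pages remain.
		if list.Continue == "" {
			return total, nil
		}
		opts.Continue = list.Continue
	}
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	n, err := listPodsChunked(kubernetes.NewForConfigOrDie(config))
	if err != nil {
		panic(err)
	}
	fmt.Println("pods:", n)
}
```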
Additional info:
These are the counts of the resources whose LIST calls were being executed at high frequency when the control plane became unstable, and which only contribute further to that instability:
```
$ oc get --raw '/api/v1/endpoints?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&resourceVersion=480361449' | jq -s ' .[].items[].metadata.name' | wc
10694 10694 251354
$ oc get --raw '/apis/network.openshift.io/v1/netnamespaces?resourceVersion=480360525' | jq -s ' .[].items[].metadata.name' | wc
4984 4984 106907
$ oc get --raw '/apis/network.openshift.io/v1/hostsubnets?resourceVersion=480361230' | jq -s ' .[].items[].metadata.name' | wc
256 256 10365
$ oc get --raw '/api/v1/pods?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&resourceVersion=480361631' | jq -s ' .[].items[].metadata.name' | wc
18134 18134 586113
$ oc get --raw '/api/v1/services?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&resourceVersion=480361112' | jq -s ' .[].items[].metadata.name' | wc
11012 11012 260087
$ oc get --raw '/apis/networking.k8s.io/v1/networkpolicies?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&resourceVersion=480360489' | jq -s ' .[].items[].metadata.name' | wc
15438 15438 456408
```
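For comparison, the same kind of query could be replayed in bounded chunks. The sketch below is an assumption-laden example (client-go, a 500-item limit, the endpoints query from above, and an existing clientset) that follows the continue token instead of fetching everything at once:
```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
)

// chunkedEndpointsCount replays the endpoints query above in 500-item
// chunks via the raw REST path, following the continue token. Sketch only.
func chunkedEndpointsCount(client kubernetes.Interface) (int, error) {
	total := 0
	cont := ""
	for {
		var list corev1.EndpointsList
		req := client.CoreV1().RESTClient().Get().
			AbsPath("/api/v1/endpoints").
			Param("labelSelector", "!service.kubernetes.io/headless,!service.kubernetes.io/service-proxy-name").
			Param("limit", "500")
		if cont != "" {
			req = req.Param("continue", cont)
		}
		if err := req.Do(context.TODO()).Into(&list); err != nil {
			return total, err
		}
		total += len(list.Items)
		if list.Continue == "" {
			return total, nil
		}
		cont = list.Continue
	}
}
```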
@qili Hi Qiujie, could you take a look and help verify this bug during your testing? Thanks.

> @ancollin is there any way to verify if this change has actually helped at all?
I took a look through the audit logs (thank you, Qiujie, for uploading them).
I do still see list calls with `limit=500&resourceVersion=0`, but they appear to be followed by a watch request.
These look like a consequence of the ListWatch pattern that we can't get around.
I looked for the request parameters that I originally filed on (i.e. "labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&resourceVersion=480361631") and I do see fewer occurrences of these.
Only two list calls: services and endpointslices.
Unfortunately both of these also have the page-negating `resourceVersion=0`, so I gather these are also the initial List of the ListWatch.
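To make the "page-negating" behavior concrete, here is a hedged sketch of the two request shapes seen in the audit log. As far as I understand, ResourceVersion "0" lets the apiserver answer from its watch cache, where limit/continue chunking is not applied, so the whole collection can come back in a single response. The clientset name is illustrative:
```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// compareListSemantics contrasts the two kinds of LIST seen in the audit
// logs. Sketch only; not the sdn or client-go code itself.
func compareListSemantics(client kubernetes.Interface) error {
	ctx := context.TODO()

	// Initial list of a ListWatch: ResourceVersion "0" means "any version
	// is acceptable", so the apiserver may serve it from its watch cache.
	// On that path the Limit is not guaranteed to be honored, which is why
	// limit=500&resourceVersion=0 entries show no continue follow-ups.
	if _, err := client.CoreV1().Services(metav1.NamespaceAll).List(ctx,
		metav1.ListOptions{Limit: 500, ResourceVersion: "0"}); err != nil {
		return err
	}

	// Explicit pagination: with ResourceVersion unset, the Limit is honored
	// and the Continue token must be followed for the remaining chunks.
	if _, err := client.CoreV1().Services(metav1.NamespaceAll).List(ctx,
		metav1.ListOptions{Limit: 500}); err != nil {
		return err
	}
	return nil
}
```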
To restate the problem: the large un-paginated list calls only come after the API has already become unstable, and they only generate additional load on already-burdened API servers.
Even though these un-paginated calls only happen under certain conditions, I believe the times when those conditions are met are precisely when the paginated calls are needed most.
I will not know whether these changes make a difference in the customer environment until they are released in a z-stream, but based on these results I do expect the unpaginated list calls to continue to be disruptive.
I do see pagination being used in two calls, as evidenced by the subsequent "continue" calls (resources: namespaces and netnamespaces, time: 2022-04-26T09:18:03), so I believe you have done as much as you can from the sdn side, and the rest is chasing down client-go (as you said).
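For reference, the chunked pattern those "continue" calls suggest is what client-go's pager helper (k8s.io/client-go/tools/pager) produces. A rough sketch, with an assumed 500-item page size:
```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/pager"
)

// listNamespacesChunked drives a LIST through client-go's pager, which
// issues limit/continue requests under the hood. Sketch only.
func listNamespacesChunked(client kubernetes.Interface) (runtime.Object, error) {
	p := pager.New(pager.SimplePageFunc(func(opts metav1.ListOptions) (runtime.Object, error) {
		return client.CoreV1().Namespaces().List(context.TODO(), opts)
	}))
	p.PageSize = 500 // assumed chunk size
	obj, _, err := p.List(context.TODO(), metav1.ListOptions{})
	return obj, err
}
```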
Thank you for your help to bring this to a close.
If there is some way to use this as supporting data for removing unpaginated ListWatch calls, for improving API stability on large clusters, or for similar client-go bugs, I am all for it; please let me know how you think it is best to approach those maintainers.
@jcaamano I didn't see where I can remove FailedQA.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1326