Bug 1951815
| Summary: | Reduce number of kubelet WATCH requests | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Evan Cordell <ecordell> |
| Component: | Node | Assignee: | Elana Hashman <ehashman> |
| Node sub component: | Kubelet | QA Contact: | Sunil Choudhary <schoudha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | urgent | CC: | akashem, andbartl, aos-bugs, bjarolim, dahernan, dgautam, ecordell, jiazha, krizza, nhale, openshift-bugs-escalate, pducai, skolicha |
| Version: | 4.6 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.7.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | 1943704 | Environment: | |
| Last Closed: | 2021-05-19 15:16:26 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1939734 | | |
| Bug Blocks: | 1960002 | | |

Doc Text:

Cause: The kubelet can sometimes open a large number of WATCH requests for Secrets and ConfigMaps, particularly on node reboot.
Consequence: Under load, the API servers may be overwhelmed.
Fix: The number of kubelet WATCH requests was reduced.
Result: Load on the API servers is reduced.
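The Doc Text describes the symptom as an excess of open WATCH requests against the API servers. One way to gauge that load is to filter an apiserver metrics dump for long-running WATCH entries. The sketch below works on a simulated dump so it is self-contained: `metrics.txt`, the sample series, and the label layout are placeholders, and the metric name `apiserver_longrunning_gauge` is an assumption for this Kubernetes version; on a live cluster the dump would come from `oc get --raw /metrics`.

```shell
# Simulated apiserver metrics dump; stand-in for a live capture via:
#   oc get --raw /metrics > metrics.txt    (requires cluster access)
# Metric name and labels below are illustrative assumptions.
printf '%s\n' \
  'apiserver_longrunning_gauge{resource="secrets",verb="WATCH"} 120' \
  'apiserver_longrunning_gauge{resource="configmaps",verb="WATCH"} 95' \
  'apiserver_longrunning_gauge{resource="pods",verb="LIST"} 3' > metrics.txt

# Count how many long-running series are WATCH requests
grep -c 'verb="WATCH"' metrics.txt
```

Comparing this count before and after a node reboot would show whether the kubelet is the source of the spike.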
Comment 1
Elana Hashman
2021-04-20 22:11:54 UTC
I have a PR up: https://github.com/openshift/kubernetes/pull/692

Patch is pending verification of https://bugzilla.redhat.com/show_bug.cgi?id=1939734, where this was initially reported.

---

Checked on 4.7.0-0.nightly-2021-05-12-004740, rebooted the node multiple times. The number of watch calls is low.

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-05-12-004740   True        False         119m    Cluster version is 4.7.0-0.nightly-2021-05-12-004740

$ oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-134-125.us-east-2.compute.internal   Ready    worker   136m   v1.20.0+75370d3
ip-10-0-139-107.us-east-2.compute.internal   Ready    master   141m   v1.20.0+75370d3
ip-10-0-182-213.us-east-2.compute.internal   Ready    worker   136m   v1.20.0+75370d3
ip-10-0-187-71.us-east-2.compute.internal    Ready    master   145m   v1.20.0+75370d3
ip-10-0-193-213.us-east-2.compute.internal   Ready    master   145m   v1.20.0+75370d3
ip-10-0-194-243.us-east-2.compute.internal   Ready    worker   136m   v1.20.0+75370d3

$ oc debug node/ip-10-0-139-107.us-east-2.compute.internal
Starting pod/ip-10-0-139-107us-east-2computeinternal-debug ...
...
sh-4.4# journalctl | grep -i "Starting reflector" | wc -l
252
```

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.11 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1550
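The journalctl check used in verification above counts "Starting reflector" lines, each of which corresponds to a kubelet informer opening a watch. That check can be repeated per node and compared across nodes. The sketch below is an offline version so it is self-contained: the `node-a.journal`/`node-b.journal` files and their contents are placeholders standing in for per-node journal dumps captured via `oc debug node/<name>` as shown in the comment above.

```shell
# Simulated per-node journal dumps (placeholders; live data would come from
# "journalctl" inside an "oc debug node/<name>" session, as shown above).
printf '%s\n' \
  'kubelet[1234]: Starting reflector *v1.Secret' \
  'kubelet[1234]: Starting reflector *v1.ConfigMap' > node-a.journal
printf '%s\n' \
  'kubelet[1234]: Starting reflector *v1.Secret' > node-b.journal

# Count reflector starts per node, a proxy for kubelet WATCH volume
for j in node-a.journal node-b.journal; do
  echo "$j: $(grep -ci 'starting reflector' "$j") reflector starts"
done
```

An unusually high count on one node relative to its peers, especially right after a reboot, would indicate the regression this bug tracks.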