Bug 1854434
| Field | Value |
|---|---|
| Summary: | Traffic spikes every 10 minutes to master API |
| Product: | OpenShift Container Platform |
| Component: | kube-controller-manager |
| Version: | 3.11.0 |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | urgent |
| Reporter: | Dirk Porter <dporter> |
| Assignee: | Mike Dame <mdame> |
| QA Contact: | RamaKasturi <knarra> |
| CC: | antgarci, aos-bugs, asoni, bbennett, bshirren, dmellado, gmarkley, jolee, knarra, maszulik, mdame, mfojtik, nchavan, nelluri, openshift-bugs-escalate, rcarrier, rkant, rvanderp, scuppett, suchaudh, svaughn, tsmetana, vlaad |
| Target Milestone: | --- |
| Target Release: | 4.6.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Doc Type: | Bug Fix |
| Doc Text: | Cause: Various inconsistencies in the storage and comparison of endpoints led the endpoints controller to incorrectly assume that many more updates needed to be made to the cluster than actually did. Consequence: The endpoints controller sent large numbers of API requests at every informer resync interval, causing degradation in large clusters with many endpoints. Fix: The storage and comparison functions for endpoints were updated to be consistent. Result: The endpoints controller now sends updates only when necessary. |
| Story Points: | --- |
| Clones: | 1870540, 1882030 (view as bug list) |
| Last Closed: | 2020-10-27 16:12:46 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| Category: | --- |
| oVirt Team: | --- |
| Cloudforms Team: | --- |
| Bug Blocks: | 1870540, 1882030 |
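The cause named in the Doc Text above, and the sorted-versus-unsorted comparison discussed in the comments below, are easy to demonstrate in isolation. The following is a minimal, hypothetical Go sketch (simplified types, not the actual kube-controller-manager code): a naive deep comparison of a stored, sorted address list against a freshly computed, unsorted one reports a spurious difference, while comparing canonicalized copies does not.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// subset is a simplified stand-in for an Endpoints subset; the real
// controller compares much richer structures.
type subset struct {
	Addresses []string
}

// equalNaive mirrors the buggy behavior: a deep comparison that is
// sensitive to element order.
func equalNaive(a, b subset) bool {
	return reflect.DeepEqual(a, b)
}

// equalCanonical compares order-insensitively by sorting copies first --
// the general direction of the fix (make storage and comparison consistent).
func equalCanonical(a, b subset) bool {
	as := append([]string(nil), a.Addresses...)
	bs := append([]string(nil), b.Addresses...)
	sort.Strings(as)
	sort.Strings(bs)
	return reflect.DeepEqual(as, bs)
}

func main() {
	stored := subset{Addresses: []string{"10.0.0.1", "10.0.0.2"}}   // sorted, as persisted
	computed := subset{Addresses: []string{"10.0.0.2", "10.0.0.1"}} // unsorted, as recomputed

	fmt.Println(equalNaive(stored, computed))     // false: spurious "update needed"
	fmt.Println(equalCanonical(stored, computed)) // true: no API write required
}
```

Every spurious "not equal" result translates into an unnecessary write against the API server, which is what multiplied into the traffic spikes reported on large clusters.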
*** Bug 1854435 has been marked as a duplicate of this bug. ***

Previously mentioned data has been uploaded.

Based on comment 6, the traffic is coming from system:serviceaccount:kube-system:endpoint-controller, which sounds like the kube-controller-manager... reassigning.

Created attachment 1711463 [details]
Requested increased log levels
Upon thorough inspection of the code and debugging trials, we found that behavior similar to what was described still exists in the current code (4.6/1.19). It is surprising that such an issue has gone undetected for so long, but we isolated the behavior to a key section of code that checks for changes in Services and Endpoints in the cluster. This code can incorrectly report two semantically equal endpoint lists as different, because it compares a sorted list against an unsorted one.

I have opened a 3.11 PR to fix the bug: https://github.com/openshift/origin/pull/25424

I have also opened an upstream PR against the latest Kubernetes: https://github.com/kubernetes/kubernetes/pull/94112. The community maintainers will review these changes, and I expect the fix to be backported through supported versions.

The 3.11 PR has moved to https://github.com/openshift/origin/pull/25426, which includes additional code changes that help with this.

Update on current progress: the proposed patch was posted for upstream review, and it turned out the problem actually lies in the way Kubernetes hashes the endpoint structures (https://github.com/kubernetes/kubernetes/pull/94112#discussion_r478397569); the proposed fix only mitigated the real problem. We need to fix the root issue first.

I am in constant contact with the people responsible for the endpoints controller, and I have provided them with possible solutions to the problem Jordan discovered yesterday. As soon as we reach final agreement on the approach, I will backport this to all necessary versions of OpenShift.

Hello Tomas and Maciej,

Is there any workaround or temporary fix we can provide to a customer? My customer is experiencing major critical production issues across their environments. Their case was opened on 6/26, and they have been dealing with constant application downtime across their entire network. Please let me know if there is anything we can do to help them work around this issue until a proper erratum is released.

Regards,
Tony Garcia
Red Hat OpenShift TAM

Both the upstream PR (https://github.com/kubernetes/kubernetes/pull/94112) and the 3.11 backport (https://github.com/openshift/origin/pull/25426) have been approved and are in the merge queue. These should merge later today if CI passes.

I'm adding UpcomingSprint, because the work for this bug is done and waiting to merge.

Moving the bug to the verified state, as I see far fewer updates after the fix. Below are the steps followed to verify the bug.
[ramakasturinarra@dhcp35-60 ~]$ oc version
Client Version: 4.6.0-202009150116.p0-3db46dc
Server Version: 4.6.0-0.nightly-2020-09-14-221526
Kubernetes Version: v1.19.0+35ab7c5
Before the fix:
===================
1) Install a 4.5 cluster with a nightly build.
2) Set the kubecontrollermanager logLevel to Trace so that kube-controller-manager runs at -v=6.
3) Check the kube-controller-manager logs to see the frequency of updates: a full burst of endpoint updates occurs every 10 minutes (see the sketch after the log counts below for why the cadence is exactly 10 minutes).
I0916 05:28:23.909056 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/node-exporter, ready: 6 not ready: 0
I0916 05:28:23.909662 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/alertmanager-main, ready: 6 not ready: 0
I0916 05:28:23.914992 1 endpoints_controller.go:510] Update endpoints for openshift-machine-api/cluster-autoscaler-operator, ready: 2 not ready: 0
I0916 05:28:23.913246 1 endpoints_controller.go:510] Update endpoints for openshift-marketplace/marketplace-operator-metrics, ready: 2 not ready: 0
I0916 05:28:23.919947 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/prometheus-operator, ready: 1 not ready: 0
I0916 05:28:23.923106 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/openshift-state-metrics, ready: 2 not ready: 0
I0916 05:28:23.924974 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/prometheus-operated, ready: 4 not ready: 0
I0916 05:28:23.926701 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/alertmanager-operated, ready: 9 not ready: 0
I0916 05:28:23.930065 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/cluster-monitoring-operator, ready: 1 not ready: 0
I0916 05:28:23.930581 1 endpoints_controller.go:510] Update endpoints for openshift-image-registry/image-registry-operator, ready: 1 not ready: 0
I0916 05:28:23.934368 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/kube-state-metrics, ready: 2 not ready: 0
I0916 05:28:23.936261 1 endpoints_controller.go:510] Update endpoints for openshift-dns/dns-default, ready: 18 not ready: 0
I0916 05:28:23.939285 1 endpoints_controller.go:510] Update endpoints for openshift-sdn/sdn, ready: 6 not ready: 0
I0916 05:28:23.943335 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/telemeter-client, ready: 1 not ready: 0
I0916 05:28:23.945914 1 endpoints_controller.go:510] Update endpoints for openshift-cluster-machine-approver/machine-approver, ready: 1 not ready: 0
I0916 05:28:23.947878 1 endpoints_controller.go:510] Update endpoints for openshift-multus/multus-admission-controller, ready: 6 not ready: 0
I0916 05:28:23.951329 1 endpoints_controller.go:510] Update endpoints for openshift-ingress/router-internal-default, ready: 6 not ready: 0
I0916 05:28:23.951379 1 endpoints_controller.go:510] Update endpoints for openshift-cluster-samples-operator/metrics, ready: 1 not ready: 0
I0916 05:38:23.915181 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/kube-state-metrics, ready: 2 not ready: 0
I0916 05:38:23.916672 1 endpoints_controller.go:510] Update endpoints for openshift-dns/dns-default, ready: 18 not ready: 0
I0916 05:38:23.917024 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/telemeter-client, ready: 1 not ready: 0
I0916 05:38:23.918275 1 endpoints_controller.go:510] Update endpoints for openshift-sdn/sdn, ready: 6 not ready: 0
I0916 05:38:23.928198 1 endpoints_controller.go:510] Update endpoints for openshift-cluster-machine-approver/machine-approver, ready: 1 not ready: 0
I0916 05:38:23.932283 1 endpoints_controller.go:510] Update endpoints for openshift-multus/multus-admission-controller, ready: 6 not ready: 0
I0916 05:38:23.932546 1 endpoints_controller.go:510] Update endpoints for openshift-ingress/router-internal-default, ready: 6 not ready: 0
I0916 05:38:23.932961 1 endpoints_controller.go:510] Update endpoints for openshift-cluster-samples-operator/metrics, ready: 1 not ready: 0
I0916 05:38:23.934526 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/node-exporter, ready: 6 not ready: 0
I0916 05:38:23.938472 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/alertmanager-main, ready: 6 not ready: 0
I0916 05:38:23.943770 1 endpoints_controller.go:510] Update endpoints for openshift-marketplace/marketplace-operator-metrics, ready: 2 not ready: 0
I0916 05:38:23.946585 1 endpoints_controller.go:510] Update endpoints for openshift-machine-api/cluster-autoscaler-operator, ready: 2 not ready: 0
I0916 05:38:23.947588 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/openshift-state-metrics, ready: 2 not ready: 0
I0916 05:38:23.948295 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/prometheus-operated, ready: 4 not ready: 0
I0916 05:38:23.950412 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/alertmanager-operated, ready: 9 not ready: 0
I0916 05:38:23.953437 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/cluster-monitoring-operator, ready: 1 not ready: 0
I0916 05:38:23.956470 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/prometheus-operator, ready: 1 not ready: 0
I0916 05:38:23.960298 1 endpoints_controller.go:510] Update endpoints for openshift-image-registry/image-registry-operator, ready: 1 not ready: 0
I0916 05:48:23.922660 1 endpoints_controller.go:510] Update endpoints for openshift-sdn/sdn, ready: 6 not ready: 0
I0916 05:48:23.923202 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/telemeter-client, ready: 1 not ready: 0
I0916 05:48:23.924504 1 endpoints_controller.go:510] Update endpoints for openshift-cluster-machine-approver/machine-approver, ready: 1 not ready: 0
I0916 05:48:23.935735 1 endpoints_controller.go:510] Update endpoints for openshift-multus/multus-admission-controller, ready: 6 not ready: 0
I0916 05:48:23.935916 1 endpoints_controller.go:510] Update endpoints for openshift-ingress/router-internal-default, ready: 6 not ready: 0
I0916 05:48:23.935922 1 endpoints_controller.go:510] Update endpoints for openshift-cluster-samples-operator/metrics, ready: 1 not ready: 0
I0916 05:48:23.940551 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/node-exporter, ready: 6 not ready: 0
I0916 05:48:23.940787 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/alertmanager-main, ready: 6 not ready: 0
I0916 05:48:23.945877 1 endpoints_controller.go:510] Update endpoints for openshift-marketplace/marketplace-operator-metrics, ready: 2 not ready: 0
I0916 05:48:23.946141 1 endpoints_controller.go:510] Update endpoints for openshift-machine-api/cluster-autoscaler-operator, ready: 2 not ready: 0
I0916 05:48:23.946793 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/prometheus-operator, ready: 1 not ready: 0
I0916 05:48:23.951126 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/prometheus-operated, ready: 4 not ready: 0
I0916 05:48:23.951178 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/openshift-state-metrics, ready: 2 not ready: 0
I0916 05:48:23.954907 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/alertmanager-operated, ready: 9 not ready: 0
I0916 05:48:23.955575 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/cluster-monitoring-operator, ready: 1 not ready: 0
I0916 05:48:23.959774 1 endpoints_controller.go:510] Update endpoints for openshift-image-registry/image-registry-operator, ready: 1 not ready: 0
I0916 05:48:23.962264 1 endpoints_controller.go:510] Update endpoints for openshift-monitoring/kube-state-metrics, ready: 2 not ready: 0
I0916 05:48:23.962831 1 endpoints_controller.go:510] Update endpoints for openshift-dns/dns-default, ready: 18 not ready: 0
[ramakasturinarra@dhcp35-60 ~]$ oc get services --all-namespaces | wc -l
67
[ramakasturinarra@dhcp35-60 ~]$ oc logs pod/kube-controller-manager-ip-10-0-204-252.ap-northeast-1.compute.internal -n openshift-kube-controller-manager -c kube-controller-manager | grep endpoints_controller | grep -c "About to update endpoints"
244
[ramakasturinarra@dhcp35-60 ~]$ oc logs pod/kube-controller-manager-ip-10-0-204-252.ap-northeast-1.compute.internal -n openshift-kube-controller-manager -c kube-controller-manager | grep endpoints_controller | grep -c "Update endpoints for"
72
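The exact 10-minute spacing of the timestamp bursts above matches a periodic informer resync: client-go re-delivers every cached object to the controller's update handler on each resync, even when nothing has changed. The following is a minimal, hypothetical client-go sketch of that mechanism (the 10-minute period and the handler are illustrative assumptions, not the controller's actual wiring):

```go
package main

import (
	"fmt"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// A 10-minute resync re-delivers every cached Endpoints object to
	// UpdateFunc, even when old and new are identical.
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	factory.Core().V1().Endpoints().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			ep := newObj.(*v1.Endpoints)
			// A controller that cannot cheaply prove oldObj == newObj here
			// may requeue and write the object anyway, producing bursts
			// like the log excerpts above.
			fmt.Printf("resync delivered %s/%s\n", ep.Namespace, ep.Name)
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	<-stop
}
```

With the comparison bug, each resync delivery could produce a write; with the fix, the controller still evaluates each object but skips the API call, which is what the before/after counts demonstrate.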
After the fix:
===================
1) Install the latest 4.6 cluster, which has the fix.
2) Set the kubecontrollermanager logLevel to Trace so that kube-controller-manager runs at -v=6 (using the spec shown below).
3) Observe that no endpoint updates are sent across the 10-minute resync intervals: "About to update endpoints" entries still appear because the controller still evaluates each endpoints object, but the count of "Update endpoints for" messages drops to 0.
[ramakasturinarra@dhcp35-60 ~]$ oc get services --all-namespaces | wc -l
76
[ramakasturinarra@dhcp35-60 ~]$ oc logs pod/kube-controller-manager-ip-10-0-211-68.us-east-2.compute.internal -n openshift-kube-controller-manager -c kube-controller-manager | grep endpoints_controller | grep -c "About to update endpoints"
213
[ramakasturinarra@dhcp35-60 ~]$ oc logs pod/kube-controller-manager-ip-10-0-211-68.us-east-2.compute.internal -n openshift-kube-controller-manager -c kube-controller-manager | grep endpoints_controller | grep -c "Update endpoints for"
0
The log level was raised with the following kubecontrollermanager spec:

spec:
  logLevel: Trace

Based on the above, moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
Description of problem:

The customer reports that their F5 VIP receives roughly 50G of additional traffic during a spike that occurs every 10 minutes. Looking at the pcaps, all of that traffic is going to the API. Looking at the number of nodes on the cluster, there appear to be over 800.

An early observation was the output in the sosreport for the sdn. However, the sdn pods looked fine when I examined them:

root 16910 0.0 0.0 13272 3044 ? Ss Jun17 1:45 /bin/bash -c
#!/bin/bash
set -euo pipefail

# cleanup old log files
rm -f /var/log/openvswitch-old/ovsdb-server.log /var/log/openvswitch-old/ovs-vswitchd.log
mkdir -p /var/log/openvswitch

# if another process is listening on the cni-server socket, wait until it exits
trap 'kill $(jobs -p); exit 0' TERM
retries=0
while true; do
  if /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then
    echo "warning: Another process is currently managing OVS, waiting 15s ..." 2>&1
    sleep 15 & wait
    (( retries += 1 ))
  else
    break
  fi
  if [[ "${retries}" -gt 40 ]]; then
    echo "error: Another process is currently managing OVS, exiting" 2>&1
    exit 1
  fi
done

# launch OVS
function quit {
    /usr/share/openvswitch/scripts/ovs-ctl stop
    exit 0
}
trap quit SIGTERM
/usr/share/openvswitch/scripts/ovs-ctl start --no-ovs-vswitchd --system-id=random

# Restrict the number of pthreads ovs-vswitchd creates to reduce the
# amount of RSS it uses on hosts with many cores
# https://bugzilla.redhat.com/show_bug.cgi?id=1571379
# https://bugzilla.redhat.com/show_bug.cgi?id=1572797
if [[ `nproc` -gt 12 ]]; then
  ovs-vsctl --no-wait set Open_vSwitch . other_config:n-revalidator-threads=4
  ovs-vsctl --no-wait set Open_vSwitch . other_config:n-handler-threads=10
fi
/usr/share/openvswitch/scripts/ovs-ctl start --no-ovsdb-server --system-id=random

tail --follow=name /var/log/openvswitch/ovs-vswitchd.log /var/log/openvswitch/ovsdb-server.log &
sleep 20
while true; do
  if ! /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then
    echo "OVS seems to have crashed, exiting"
    quit
  fi
  sleep 15
done

I have been trying to figure out what could be causing a spike every 10 minutes within a cluster, and cannot think of a process that reports every 10 minutes that would have this sort of impact. Looking for assistance on what to check next. I think this might be a consequence of the number of nodes. What could be sending traffic to the API?

Version-Release number of selected component (if applicable):
3.11

How reproducible:
I have not found anything in my investigation that would cause this.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info: