Bug 1975281

Summary: kube-rbac-proxy container with cpu throttling
Product: OpenShift Container Platform Reporter: Jordi Claret <jclaretm>
Component: Monitoring Assignee: Philip Gough <pgough>
Status: CLOSED ERRATA QA Contact: Junqi Zhao <juzhao>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.11.0 CC: alegrand, anpicker, aos-bugs, erooth, kakkoyun, mnoguera, pgough, pkrupa, spasquie
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-07 11:01:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Jordi Claret 2021-06-23 11:39:06 UTC
Description of problem:

The OpenShift monitoring operator deploys the node-exporter daemon set, and the kube-rbac-proxy container in each node-exporter pod shows CPU throttling without reaching its CPU limit.

Version-Release number of selected component (if applicable):

OpenShift v3.11.404 and v3.11.439
RHEL kernel 3.10.0-1160.25.1.el7.x86_64

Steps to Reproduce:

1. Deploy the monitoring stack [1].

2. View the throttling metrics in cpu.stat for the kube-rbac-proxy container:

# oc exec node-exporter-pkj6q -c kube-rbac-proxy -n openshift-monitoring -- cat /sys/fs/cgroup/cpu/cpu.stat
nr_periods 218213
nr_throttled 22498
throttled_time 1906932081759

3. View the daemon set resources. By default the monitoring operator sets a CPU limit of 20m for kube-rbac-proxy, and the container is not reaching that limit. The operator does not allow the daemon set resources to be modified.

# oc get ds node-exporter -o yaml
...
          name: kube-rbac-proxy
          resources:
            limits:
              cpu: 20m
              memory: 40Mi
            requests:
              cpu: 10m
              memory: 20Mi

...

4. By default cpu-cfs-quota is true, and the node config does not override it. No namespace limits or quotas are set.

# oc get cm -o yaml node-config-infra | grep -i cpu-cfs-quota
<no output>


Actual results:

The kube-rbac-proxy container in the node-exporter pod shows CPU throttling without reaching its CPU limit.
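
To quantify the throttling, the cpu.stat counters from step 2 can be turned into a ratio. The following is a minimal Python sketch and not part of the original report. The function names parse_cpu_stat and throttle_summary are hypothetical, and the sketch assumes the cgroup v1 cpu.stat format shown in step 2, where throttled_time is reported in nanoseconds.

```python
def parse_cpu_stat(text):
    """Parse cgroup v1 cpu.stat content into a dict of integer counters."""
    stats = {}
    for line in text.strip().splitlines():
        key, value = line.split()
        stats[key] = int(value)
    return stats


def throttle_summary(stats):
    """Return (fraction of periods throttled, total throttled seconds)."""
    periods = stats["nr_periods"]
    ratio = stats["nr_throttled"] / periods if periods else 0.0
    seconds = stats["throttled_time"] / 1e9  # nanoseconds -> seconds
    return ratio, seconds


# Values copied from step 2 of this report.
SAMPLE = """\
nr_periods 218213
nr_throttled 22498
throttled_time 1906932081759
"""

ratio, seconds = throttle_summary(parse_cpu_stat(SAMPLE))
print(f"throttled in {ratio:.1%} of periods, {seconds:.0f} s throttled in total")
```

For the counters in this report, that is throttling in roughly 10% of all enforcement periods.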

Expected results:
No CPU throttling, while the operator continues to manage CPU/memory resources for this container/pod.

Additional info:
[1] - https://docs.openshift.com/container-platform/3.11/install_config/prometheus_cluster_monitoring.html#installing-monitoring-stack
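
A possible explanation, which this report does not confirm, is that the CFS quota is enforced per scheduling period rather than on average usage. Assuming the kubelet's default 100 ms CFS period and its usual conversion of millicores to quota (quota_us = millicores * period_us / 1000), a 20m limit allows only 2 ms of CPU time in each 100 ms period. A short burst can therefore be throttled even when average usage stays below 20m. The arithmetic can be sketched in Python; cfs_quota_us is a hypothetical helper name.

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period (100 ms)


def cfs_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time in microseconds a container may use per CFS period."""
    return limit_millicores * period_us // 1000


quota = cfs_quota_us(20)  # the 20m limit from step 3
print(f"20m limit -> {quota} us of CPU per {CFS_PERIOD_US} us period")
```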

Comment 9 Junqi Zhao 2021-06-30 13:09:42 UTC
Tested with ose-cluster-monitoring-operator:v3.11.463. There is no throttling for the kube-rbac-proxy containers, and the resource limits have been removed. Affected pods: node-exporter and kube-state-metrics.
# oc -n openshift-monitoring get po | grep -E "node-exporter|kube-state-metrics"
kube-state-metrics-5764c88896-44bzf           3/3       Running   0          7m
node-exporter-6xwwh                           2/2       Running   0          8m
node-exporter-ff85f                           2/2       Running   0          8m
node-exporter-jx5bh                           2/2       Running   0          8m
# oc -n openshift-monitoring exec -c kube-rbac-proxy node-exporter-6xwwh -- cat /sys/fs/cgroup/cpu/cpu.stat
nr_periods 0
nr_throttled 0
throttled_time 0
# oc -n openshift-monitoring exec -c kube-rbac-proxy-main kube-state-metrics-5764c88896-44bzf -- cat /sys/fs/cgroup/cpu/cpu.stat
nr_periods 0
nr_throttled 0
throttled_time 0
# oc -n openshift-monitoring exec -c kube-rbac-proxy-self kube-state-metrics-5764c88896-44bzf -- cat /sys/fs/cgroup/cpu/cpu.stat
nr_periods 0
nr_throttled 0
throttled_time 0


resource settings
**********************************
kube-state-metrics-5764c88896-44bzf
Container Name: kube-rbac-proxy-main
resources: map[requests:map[cpu:10m memory:20Mi]]
Container Name: kube-rbac-proxy-self
resources: map[requests:map[cpu:10m memory:20Mi]]
Container Name: kube-state-metrics
resources: map[]


node-exporter-6xwwh
Container Name: node-exporter
resources: map[]
Container Name: kube-rbac-proxy
resources: map[requests:map[cpu:10m memory:20Mi]]
**********************************
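
The resource check above could also be scripted. The following is a minimal Python sketch, assuming pod JSON in the shape returned by `oc get pod <name> -o json` (the standard spec.containers[].resources fields). The helper name containers_with_limits is hypothetical, and the trimmed sample document is hand-written to mirror the node-exporter-6xwwh settings listed above; it is not captured output.

```python
import json


def containers_with_limits(pod):
    """Return names of containers in a pod object that still set resource limits."""
    return [
        c["name"]
        for c in pod["spec"]["containers"]
        if c.get("resources", {}).get("limits")
    ]


# Hand-written sample mirroring node-exporter-6xwwh as listed above.
POD_JSON = """
{
  "spec": {
    "containers": [
      {"name": "node-exporter", "resources": {}},
      {"name": "kube-rbac-proxy",
       "resources": {"requests": {"cpu": "10m", "memory": "20Mi"}}}
    ]
  }
}
"""

pod = json.loads(POD_JSON)
# An empty list means no container in the pod sets limits any more.
print(containers_with_limits(pod))
```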

Comment 12 errata-xmlrpc 2021-07-07 11:01:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 3.11.465 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2639