Bug 1953041

Summary: openshift-authentication-operator uses 3.9k% of its requested CPU
Product: OpenShift Container Platform
Reporter: aaleman
Component: apiserver-auth
Assignee: Sergiusz Urbaniak <surbania>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: low
Docs Contact:
Priority: low
Version: 4.7
CC: aos-bugs, ccoleman, kewang, mfojtik, pmali, surbania, wking
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 23:03:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                          Flags
screenshot                           none
cpu usage                            none
memory usage                         none
auth-o cpu-memory-usage-per-request  none

Description aaleman 2021-04-23 19:06:41 UTC
Created attachment 1774907 [details]
screenshot

Description of problem:

openshift-authentication-operator uses 3.9k% of its requested CPU. I would expect it to use no more than maybe 10 times its request (and even that seems like a lot).


Version-Release number of selected component (if applicable):
Server Version: 4.7.6



Comment 1 W. Trevor King 2021-04-23 19:23:31 UTC
To save folks a click, screenshot shows openshift-authentication-operator requesting 10 mCPU and consuming 391 mCPU.
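
The percentage in the summary follows directly from these two numbers. A minimal sketch of the arithmetic in Python; the function name `usage_percent_of_request` is made up for illustration and is not part of any OpenShift tooling:

```python
def usage_percent_of_request(usage_millicores: float, request_millicores: float) -> float:
    """Return CPU usage expressed as a percentage of the CPU request."""
    if request_millicores <= 0:
        raise ValueError("request must be positive")
    return 100.0 * usage_millicores / request_millicores


# Values read from the screenshot: 391 mCPU consumed, 10 mCPU requested.
percent = usage_percent_of_request(391, 10)
print(f"{percent:.0f}%")  # 3910%, which the summary rounds to 3.9k%
```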

Comment 2 Clayton Coleman 2021-04-23 19:59:47 UTC
The authentication operator historically ran at <10m, so whatever is going on is novel and broken. I would expect a fix to take us below 10m.

Comment 3 Standa Laznicka 2021-04-30 14:09:01 UTC
We've had a BZ for unreasonable memory consumption that might even have caused very high CPU consumption, and we fixed that (1954544). Whether or not that is the cause here, I'll hijack this Bugzilla to adjust the operator's memory/CPU requests.

Comment 4 Sergiusz Urbaniak 2021-05-06 08:10:50 UTC
Suggestion: I will measure CPU usage with the same query that is used for "kubectl top pod" (https://github.com/openshift/cluster-monitoring-operator/blob/4d6bf3d9ed8187ed13854fce3d75d32a0525b1db/assets/prometheus-adapter/config-map.yaml#L7) and then adapt the settings accordingly. We used the same pattern for other monitoring components.
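
For readers unfamiliar with how such a measurement turns into millicores, here is a rough sketch. It assumes the linked query computes a per-second rate over a cumulative CPU-seconds counter such as `container_cpu_usage_seconds_total`; the exact expression lives in the linked config map and is not reproduced here. The function name and the sample values are hypothetical, and counter resets are ignored.

```python
def cpu_millicores(t0: float, cpu_seconds_0: float, t1: float, cpu_seconds_1: float) -> float:
    """Average CPU usage in millicores between two samples of a cumulative CPU-seconds counter.

    Assumption: the counter only increases between the two samples (no reset).
    """
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    cores = (cpu_seconds_1 - cpu_seconds_0) / elapsed  # CPU-seconds per second = cores
    return cores * 1000.0


# Hypothetical samples: 6 CPU-seconds consumed over a 300-second window.
print(f"{cpu_millicores(0.0, 100.0, 300.0, 106.0):.1f}m")
```

A value obtained this way can be compared directly against the `cpu` request in the pod spec, since both are expressed in millicores.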

Comment 5 Sergiusz Urbaniak 2021-05-07 11:49:39 UTC
Created attachment 1780689 [details]
cpu usage

On an idle cluster we are consuming ~20 millicores. I will adjust the request accordingly.

The usage reported by the OP looks like a spike to me.

Comment 6 Sergiusz Urbaniak 2021-05-07 11:58:58 UTC
Created attachment 1780690 [details]
memory usage

In addition, we need to bump the memory request; usage idles at ~250Mi.

Comment 8 Xingxing Xia 2021-05-26 10:23:06 UTC
Created attachment 1787177 [details]
auth-o cpu-memory-usage-per-request

Tested in several envs like 4.8.0-0.nightly-2021-05-25-223219:
$ oc get po -n openshift-authentication-operator -o yaml # same as PR update
...
      resources:
        requests:
          cpu: 20m
          memory: 200Mi
...

As shown in my screenshot from the Web Console, CPU usage is about 173% of the request and memory usage is about 74% of the request, which is acceptable. Moving to VERIFIED.
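
To put those percentages in absolute terms, the following sketch converts them back using the requests shown above (20m CPU, 200Mi memory). The helper `implied_usage` is hypothetical, and the results are derived from the approximate percentages rather than measured:

```python
def implied_usage(request: float, percent_of_request: float) -> float:
    """Convert a usage/request percentage back into the request's own unit."""
    return request * percent_of_request / 100.0


cpu_m = implied_usage(20, 173)    # request in millicores
mem_mi = implied_usage(200, 74)   # request in Mi

print(f"cpu: ~{cpu_m:.1f}m, memory: ~{mem_mi:.0f}Mi")  # cpu: ~34.6m, memory: ~148Mi
```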

Comment 12 errata-xmlrpc 2021-07-27 23:03:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438