Bug 2012346 - prometheus-k8s-0 cpu usage keeps increasing for the first 3 days
Summary: prometheus-k8s-0 cpu usage keeps increasing for the first 3 days
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.9
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.9.z
Assignee: Luis Sanchez
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On: 2004585
Blocks: 2013405
 
Reported: 2021-10-08 22:21 UTC by OpenShift BugZilla Robot
Modified: 2021-11-02 23:43 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2013405
Environment:
Last Closed: 2021-10-26 17:22:42 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-apiserver-operator pull 1242 | 0 | None | Merged | [release-4.9] Bug 2012346: prometheus-k8s-0 cpu usage keeps increasing for the first 3 days | 2021-11-02 23:42:43 UTC
Red Hat Product Errata | RHBA-2021:3935 | 0 | None | None | None | 2021-10-26 17:23:04 UTC

Comment 5 Ke Wang 2021-10-16 07:42:42 UTC
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-10-14-182021   True        False         26h     Cluster version is 4.9.0-0.nightly-2021-10-14-182021

$ oc get no
NAME                                                  STATUS   ROLES           AGE   VERSION
master-00.kewang-15sno2.qe.devcluster.openshift.com   Ready    master,worker   26h   v1.22.0-rc.0+894a78b

The new PrometheusRule is present and working:
$ oc get PrometheusRule -n openshift-kube-apiserver kube-apiserver-slos-basic
NAME                        AGE
kube-apiserver-slos-basic   5h1m

Used clusterbuster to put some workload on the cluster:
$ ./clusterbuster -P server -b 5 -p 2 -D .01 -M 1 -N 3 -r 4 -d 2 -c 6 -m 1000 -v -s 5 -x

After one day, the CPU usage of the prometheus pod stays steady at ~0.25. Because of the retention limit, only one day of data can be observed. I will attach a screenshot of this.
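
The "stays steady" observation above can also be checked numerically instead of by eye. The following is a minimal sketch, not the method used in this verification: it builds a PromQL expression for the pod's CPU rate and fits a least-squares slope to (timestamp, value) samples. The function names, the `openshift-monitoring` namespace default, the 5m rate window, and the 0.02 cores/day tolerance are all assumptions made for illustration; fetching the samples from Prometheus is left out.

```python
from statistics import mean


def build_cpu_query(pod="prometheus-k8s-0",
                    namespace="openshift-monitoring",  # assumed namespace
                    window="5m"):                      # assumed rate window
    """Return a PromQL expression for the pod's CPU usage in cores."""
    return (
        "sum(rate(container_cpu_usage_seconds_total"
        f'{{namespace="{namespace}",pod="{pod}"}}[{window}]))'
    )


def slope_per_day(samples):
    """Least-squares slope of (unix_seconds, cores) samples, in cores per day."""
    xs = [t / 86400.0 for t, _ in samples]
    ys = [v for _, v in samples]
    mx, my = mean(xs), mean(ys)
    denom = sum((x - mx) ** 2 for x in xs)
    if denom == 0:
        raise ValueError("need samples at two or more distinct timestamps")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / denom


def is_flat(samples, tolerance=0.02):
    """True if CPU usage drifts by at most `tolerance` cores/day (hypothetical threshold)."""
    return abs(slope_per_day(samples)) <= tolerance


if __name__ == "__main__":
    # Synthetic hourly samples for one day, mimicking the ~0.25 reading above.
    steady = [(h * 3600, 0.25) for h in range(24)]
    print(build_cpu_query())
    print("flat:", is_flat(steady))
```

With real data, the samples would come from a range query over the retention window; a slope near zero over that window matches the steady ~0.25 reading, while a clearly positive slope would indicate the regression is still present.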

Comment 9 errata-xmlrpc 2021-10-26 17:22:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.4 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3935

Comment 10 W. Trevor King 2021-11-02 23:43:51 UTC
https://github.com/openshift/cluster-kube-apiserver-operator/pull/1242 landed after 4.9.0 and before 4.9.4, as described in comment 9.  Updating Target Release to match.

