Created attachment 1823359
prometheus pod cpu usage for the first 5 days
Description of problem:
On a freshly deployed SNO cluster, CPU usage of the prometheus pod keeps increasing, from ~0.15 to ~0.45, over the first 3 days. It takes too long to settle.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy an SNO cluster
2. Deploy some workload pods
3. Observe the CPU usage of each platform pod via Prometheus queries for a week
Actual results:
prometheus-k8s-0 CPU usage kept increasing steadily for the first 3 days, from ~0.15 to ~0.45, and settled on the 4th day.
Expected results:
It should settle earlier, possibly at a lower CPU usage.
Prom chart for prometheus-k8s-0 pod is attached
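For reference, a chart like this could be reproduced with a range query against the Prometheus HTTP API. The following is a minimal sketch, not the exact query behind the attachment. It assumes the pod runs in the `openshift-monitoring` namespace and that the cAdvisor metric `container_cpu_usage_seconds_total` is being scraped. The base URL, the 5m rate window, and the helper name `build_cpu_query_url` are placeholders.

```python
from urllib.parse import urlencode


def build_cpu_query_url(base_url, pod, namespace="openshift-monitoring",
                        start=0, end=0, step="5m"):
    """Build a /api/v1/query_range URL for a pod's CPU usage in cores.

    The namespace default and the 5m rate window are assumptions for
    illustration; adjust them to the cluster being measured.
    """
    # rate() over the cumulative CPU-seconds counter gives cores in use;
    # container!="" skips the pod-level aggregate series to avoid double counting.
    promql = (
        "sum(rate(container_cpu_usage_seconds_total{"
        f'namespace="{namespace}",pod="{pod}",container!=""'
        "}[5m]))"
    )
    params = {"query": promql, "start": start, "end": end, "step": step}
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


# Hypothetical usage: 5 days (432000 s) of data at 5-minute resolution.
url = build_cpu_query_url("https://prometheus.example", "prometheus-k8s-0",
                          start=0, end=432000, step="5m")
```

The resulting URL still needs an authenticated HTTP GET (for example with a bearer token) to return data; that part is omitted because it depends on the cluster setup.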
Looking at the attached screenshot, I don't see any major issues.
What alarms you about the situation and what results/threshold would you expect to see instead?
We are expecting < 300 millicores (0.3 CPU) based on previous measurements.
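To make the units concrete: 300 millicores is 0.3 CPU, so the ~0.45 peak reported above exceeds the expectation, while the initial ~0.15 is within it. A minimal sketch of that comparison, using only the figures quoted in this report (the function name is illustrative):

```python
def exceeds_threshold(cores, threshold_millicores=300):
    """Return True if a CPU usage value in cores is above the millicore threshold."""
    return cores * 1000 > threshold_millicores


# Figures from the report: ~0.15 at deployment, ~0.45 after 3 days.
samples = [0.15, 0.45]
flags = [exceeds_threshold(c) for c in samples]
```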
Verified on 4.9.1.
prometheus-k8s-0 pod CPU usage has been relatively stable over the past 4 days since a fresh deployment. It uses ~0.17 CPU in steady state.
Verified on a 4.10 nightly payload. prometheus-k8s-0 pod CPU usage has been around 0.1 over the past 2 days, with some workload pods created by the clusterbuster tool (./clusterbuster -P server -b 5 -p 2 -D .01 -M 1 -N 4 -r 4 -d 2 -c 6 -m 1000 -v -s 5 -x).
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.