Bug 1913618
| Summary: | Completed pods skew the Quota metrics | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | iwatson |
| Component: | Monitoring | Assignee: | Philip Gough <pgough> |
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.6 | CC: | anpicker, dahernan, dgrisonn, erooth, hongyli, ianwatson, janantha, lcosic, pducai, pgough, spasquie |
| Target Milestone: | --- | Keywords: | EasyFix, Reopened |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:28:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Doc Text:

Cause: Prometheus retains metrics for Pods/Jobs that have Failed or Completed, and these Pods/Jobs may have had associated resource requests and/or limits.

Consequence: Although the Failed/Completed Pods/Jobs were no longer requesting resources, the dashboards did not take that into account and skewed the CPU and memory values.

Fix: The PromQL expressions were rewritten to filter out irrelevant data by only accounting for running and pending containers.

Result: Dashboards display the correct memory and CPU resource requests and limits.
Description
iwatson
2021-01-07 09:15:15 UTC
I got info from the customer: the quota values were incorrect both in Grafana and on the console.

@Peter The CPU and memory quota considers pods in the Running/Pending state only, as specified in [1] and [2], which is the expected behaviour.

[1]: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/2785a9f0addd11c77c82a0c3e8580b556621049d/rules/apps.libsonnet#L69
[2]: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/2785a9f0addd11c77c82a0c3e8580b556621049d/rules/apps.libsonnet#L83

Can you please attach a screenshot of the dashboard you are referring to and let us know what the exact issue is?

Reviewed in sprint; awaiting a response from Peter to proceed further.

Closing this bug since there is insufficient data to proceed further. Please re-open the bug if this is still needed.

Reopening as it's still a bug. Please read the initial description of the bug, which has all the information required.

Can you help me with the screenshot of the dashboard you are referring to?

I had posted this comment https://bugzilla.redhat.com/show_bug.cgi?id=1913618#c5 in private mode; it has been changed to public in case it was not visible to you.

The query in question is:
sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster"}) by (namespace)
This is on the Default / Kubernetes / Compute Resources / Cluster dashboard under "CPU Quota". It is an incorrect metric, as it does not take completed pods into account.
The CPU Quota panel is made up of several queries; the query I have picked out is the one for "CPU Requests".
The same issue applies to the Memory Quota.
The correct query should be something along the lines of sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster"} join <query to determine running pods>) by (namespace), as sketched below.
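For illustration, a phase-filtered variant in the spirit of the kubernetes-mixin recording rules linked in [1] and [2] could look like the sketch below. The join via `kube_pod_status_phase` and the exact label set are assumptions for illustration, not necessarily the query that was eventually shipped:

***********************
# Sketch: only count CPU requests of pods that are still Pending or Running.
# The inner max() collapses the per-phase series to one series per pod;
# pods in the Succeeded/Failed phase produce no series and are dropped by the join.
sum by (namespace) (
  kube_pod_container_resource_requests_cpu_cores{cluster="$cluster"}
  * on (namespace, pod) group_left()
  max by (namespace, pod) (
    kube_pod_status_phase{cluster="$cluster", phase=~"Pending|Running"} == 1
  )
)
***********************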
@iwatson I believe this is the correct behaviour, as Kubernetes will release resource requests held by Completed/Failed jobs and builds. @ianwatson If you can, would you mind dropping in a screenshot of the dashboard that is causing you issues. Thanks

Kubernetes will indeed release the resources when the job/build completes. This is exactly the issue: the dashboard does not show this release, as it accounts for all pods regardless of their status. I.e. create a new project and start a job with a 1 CPU request/limit, then compare `oc describe quota` against your graph/Prometheus metric to see the difference. `oc describe quota` will show you at 0 CPU requested; the dashboard will show you at 1 CPU requested. I don't see how the screenshot will help; I've identified the exact Prometheus query that is behind the dashboard and have now provided exact steps to see this difference (see the sketch below the attachment).

Created attachment 1792025 [details]
Screenshot of Kube/Compute Resources / Cluster dashboard memory requests
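To reproduce the discrepancy described above directly in the Prometheus UI, the two expressions below can be compared after the job has completed (a sketch only; the namespace name `test` is an example, not taken from the report):

***********************
# Dashboard-style sum: still includes the completed job's pod, so it keeps showing 1 CPU.
sum by (namespace) (kube_pod_container_resource_requests_cpu_cores{namespace="test"})

# Pods that are actually Pending or Running in the namespace; the completed
# job's pod should no longer appear here, matching `oc describe quota`.
max by (pod) (kube_pod_status_phase{namespace="test", phase=~"Pending|Running"} == 1)
***********************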
Firstly, I take your point and see that it could potentially lead to some confusion in the dashboards, and we can extend the query to take the Pod phase into account. The metric you mention, `kube_pod_container_resource_requests`, has been deprecated in kube-state-metrics and is not present in the latest version, see https://github.com/kubernetes/kube-state-metrics/pull/1224. The metric is behaving correctly (see https://github.com/kubernetes/kube-state-metrics/issues/458 for an explanation), as is its replacement (https://github.com/kubernetes/kube-state-metrics/issues/1051).

Secondly, the reason you were asked for a screenshot is so we could determine exactly which dashboard and section you felt was misleading, since the metric you mention is used in several places, so let me make sure we are on the same page. In the screenshot I have attached of the latest Kube/Compute Resources / Cluster memory requests, I have requested 1Gi of memory for 6 jobs and run them to completion. The requests still show as 6 and there is no utilisation. Is the ask to ensure that this value should be 0, since the Pods owned by the Job are in the Completed phase? Besides doing the same for CPU requests, are there any other particular dashboards that you feel are misleading? Thanks

Thanks Phillip. Yes, that is the ask: in your scenario the CPU Request Quota / Memory Request Quota on Kube/Compute Resources / Cluster should be 0. The use case is for cluster administrators to identify users who are requesting a lot of resources but showing little to no utilization. This is tricky given the current behaviour, as the projects at the top are in general not the worst offenders, due to completed pods being present. The other graphs under Kube/Compute Resources/* also display the CPU Request / Memory Request quota at a finer level of detail; I would argue that the behaviour should be made consistent across all graphs.

Use the following manifest to deploy a Job in a test namespace:
***********************
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          requests:
            memory: "200Mi"
            cpu: "1000m"
          limits:
            memory: "200Mi"
            cpu: "1000m"
      restartPolicy: Never
***********************
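Once the Job above has completed, one way to double-check the fixed dashboards from the Prometheus side is sketched below. It assumes the newer kube-state-metrics metric with a `resource` label (the replacement discussed earlier) and a namespace called `test`; both are assumptions for illustration:

***********************
# Unfiltered sum: would keep reporting the job's 200Mi request after completion.
sum by (namespace) (kube_pod_container_resource_requests{namespace="test", resource="memory"})

# Phase-filtered sum: should return no data for the namespace once the pod is Completed,
# which is what the fixed dashboards are expected to show.
sum by (namespace) (
  kube_pod_container_resource_requests{namespace="test", resource="memory"}
  * on (namespace, pod) group_left()
  max by (namespace, pod) (
    kube_pod_status_phase{namespace="test", phase=~"Pending|Running"} == 1
  )
)
***********************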
Checked with 4.9.0-0.nightly-2021-07-04-140102: only the Kubernetes / Compute Resources / Cluster dashboard shows no value for the Memory/CPU limit/request.
The other three dashboards
Kubernetes / Compute Resources / Pod
Kubernetes / Compute Resources / Namespace (Pods)
Kubernetes / Compute Resources / Node (Pods)
still show a value for the Memory/CPU limit/request.
Tested with 4.9.0-0.nightly-2021-07-25-125326 and checked the following dashboards; none of them shows a value for the Memory/CPU limit/request any more:
Kubernetes / Compute Resources / Cluster
Kubernetes / Compute Resources / Pod
Kubernetes / Compute Resources / Namespace (Pods)
Kubernetes / Compute Resources / Node (Pods)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.