1948926 – Memory Usage of Dashboard 'Kubernetes / Compute Resources / Pod' contain wrong CPU query

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Monitoring
Sub Component:
Version:	4.8
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	4.8.0
Assignee:	Jan Fajerski
QA Contact:	hongyan li
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-04-13 04:27 UTC by hongyan li
Modified:	2021-07-27 23:00 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-07-27 22:59:54 UTC
Target Upstream Version:
Embargoed:

Attachments
query screenshot (87.99 KB, image/png) 2021-04-13 04:27 UTC, hongyan li	no flags
console screenshot (52.34 KB, image/png) 2021-04-13 04:57 UTC, hongyan li	no flags
CPU request (74.99 KB, image/png) 2021-04-13 05:55 UTC, hongyan li	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	kubernetes-monitoring kubernetes-mixin pull 590	0	None	closed	Kubernetes/Compute Resources/Pod: Fix resource label	2021-05-04 13:45:32 UTC
Red Hat Product Errata	RHSA-2021:2438	0	None	None	None	2021-07-27 23:00:11 UTC

Description hongyan li 2021-04-13 04:27:57 UTC

Created attachment 1771496 [details]
query screenshot

Description of problem:
The 'Memory Usage' panel of the following dashboard includes the wrong query:
'Kubernetes / Compute Resources / Pod'

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-04-09-222447

How reproducible:
always

Steps to Reproduce:
1. Go to Monitoring -> Dashboards.
2. Select DB 'Kubernetes / Compute Resources / Pod'.
3. Find the 'Memory Usage' panel and hover over the graph; a 'requests' series is shown.
4. Click Inspect; the queries are wrong:
---------------------------------
sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)
-----
sum(
    kube_pod_container_resource_limits{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)
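
The mismatch in the two queries above can be checked mechanically. The following is a minimal sketch, not part of the console or the dashboard code: the helper names `label_matchers` and `resource_mismatch` are hypothetical, and the regular expression assumes only simple `label="value"` equality matchers like the ones shown in this report.

```python
import re

# Matches simple equality matchers such as resource="cpu" inside a selector.
# Assumption: only `=` matchers with double-quoted values, as in the queries above.
MATCHER_RE = re.compile(r'(\w+)="([^"]*)"')


def label_matchers(expr: str) -> dict:
    """Return the label="value" pairs found in a PromQL expression."""
    return dict(MATCHER_RE.findall(expr))


def resource_mismatch(panel_resource: str, expr: str) -> bool:
    """True when the expression filters on a different resource than the panel shows."""
    resource = label_matchers(expr).get("resource")
    return resource is not None and resource != panel_resource


# The 'requests' query reported for the Memory Usage panel.
requests_expr = (
    "sum(\n"
    '    kube_pod_container_resource_requests{cluster="", '
    'namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}\n'
    ")"
)

# The panel shows memory, so a resource="cpu" matcher is flagged.
print(resource_mismatch("memory", requests_expr))
```

The same check applied with `"cpu"` as the panel resource would not flag the query, which matches the observation that the expression is the one belonging to the CPU Usage panel.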



Actual results:


Expected results:


Additional info:
Screenshot for DB 'Kubernetes / Compute Resources / Pod' is uploaded

Comment 1 hongyan li 2021-04-13 04:57:10 UTC

Created attachment 1771497 [details]
console screenshot

Comment 2 hongyan li 2021-04-13 05:55:00 UTC

The request series is already included in the CPU Usage panel, which corresponds to the query:
sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)

Please see the screenshot.

Comment 3 hongyan li 2021-04-13 05:55:43 UTC

Created attachment 1771505 [details]
CPU request

Comment 4 Junqi Zhao 2021-04-13 06:16:31 UTC

sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)
-----
sum(
    kube_pod_container_resource_limits{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)

resource="cpu" should be changed to resource="memory", or resource="cpu" should be removed.
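
As a rough illustration of the first option (changing the matcher), the sketch below rewrites the `resource` matcher in a query string. The function name `set_resource` is hypothetical and this is not how the dashboard is actually generated; it only shows the intended result. Dropping the matcher altogether is not sketched, since the sum would then presumably cover series for every resource reported by the metric.

```python
import re


def set_resource(expr: str, resource: str) -> str:
    """Rewrite every resource="..." matcher in expr to the given resource."""
    return re.sub(r'resource="[^"]*"', f'resource="{resource}"', expr)


# The 'limits' query reported for the Memory Usage panel.
broken = (
    "sum(\n"
    '    kube_pod_container_resource_limits{cluster="", '
    'namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}\n'
    ")"
)

fixed = set_resource(broken, "memory")
print(fixed)
```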

Comment 5 Junqi Zhao 2021-04-13 06:19:18 UTC

"Kubernetes / Compute Resources / Pod" dashboard
configmap is grafana-dashboard-k8s-resources-pod

"Kubernetes / Compute Resources / Namespace (Pods)" dashboard
configmap is grafana-dashboard-k8s-resources-namespace

 "Kubernetes / Compute Resources / Namespace (Workloads)" dashboard
configmap is grafana-dashboard-k8s-resources-workloads-namespace

Comment 6 Junqi Zhao 2021-04-13 06:29:47 UTC

(In reply to Junqi Zhao from comment #5)
> "Kubernetes / Compute Resources / Pod" dashboard
> configmap is grafana-dashboard-k8s-resources-pod
> 
> "Kubernetes / Compute Resources / Namespace (Pods)" dashboard
> configmap is grafana-dashboard-k8s-resources-namespace
> 
>  "Kubernetes / Compute Resources / Namespace (Workloads)" dashboard
> configmap is grafana-dashboard-k8s-resources-workloads-namespace

Only the "Kubernetes / Compute Resources / Pod" dashboard has the issue and needs to be fixed.
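
To check which dashboards are affected, the dashboard JSON stored in each configmap could be scanned for memory panels whose queries filter on `resource="cpu"`. This is a sketch under assumptions: the helper names are hypothetical, the `sample` dictionary is a made-up, heavily trimmed stand-in for a real dashboard, and the only structure relied on is that panels carry a `title` and a list of `targets` with an `expr` field (as in the 4.7 excerpt in this report).

```python
def find_targets(node, title=None):
    """Yield (panel_title, expr) for every target found anywhere in the dashboard."""
    if isinstance(node, dict):
        # The closest enclosing "title" is treated as the panel title.
        title = node.get("title", title)
        for target in node.get("targets", []):
            if isinstance(target, dict) and "expr" in target:
                yield title, target["expr"]
        for value in node.values():
            yield from find_targets(value, title)
    elif isinstance(node, list):
        for item in node:
            yield from find_targets(item, title)


def cpu_queries_in_memory_panels(dashboard):
    """Return (title, expr) pairs where a memory panel filters on resource="cpu"."""
    return [
        (title, expr)
        for title, expr in find_targets(dashboard)
        if title and "memory" in title.lower() and 'resource="cpu"' in expr
    ]


# Hypothetical, trimmed dashboard; real data would be loaded with json.load().
sample = {
    "title": "Kubernetes / Compute Resources / Pod",
    "rows": [
        {"panels": [{"title": "CPU Usage", "targets": [
            {"expr": 'sum(kube_pod_container_resource_requests{pod="$pod", resource="cpu"})'}
        ]}]},
        {"panels": [{"title": "Memory Usage", "targets": [
            {"expr": 'sum(kube_pod_container_resource_requests{pod="$pod", resource="cpu"})'}
        ]}]},
    ],
}

for title, expr in cpu_queries_in_memory_panels(sample):
    print(title, "->", expr)
```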

Comment 7 hongyan li 2021-04-13 06:51:40 UTC

The Memory Usage panels of DB 'Kubernetes / Compute Resources / Namespace (Pods)' and DB 'Kubernetes / Compute Resources / Namespace (Workloads)' don't need the following queries, which return no data:
scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring", type="hard",resource="requests.memory"})
and
scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring", type="hard",resource="limits.memory"})
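
One possible reading of "return no data": PromQL's `scalar()` is documented to return NaN when its input vector does not contain exactly one element, which would be the case when no matching `kube_resourcequota` series exists. The sketch below only mimics that rule in Python; the sample values are hypothetical, and whether the console renders a NaN result as "no data" is an assumption.

```python
import math


def scalar(instant_vector):
    """Mimic PromQL scalar(): the single sample's value, or NaN otherwise."""
    if len(instant_vector) == 1:
        return instant_vector[0]
    return math.nan


no_quota = []         # hypothetical: no matching kube_resourcequota series
with_quota = [2.0e9]  # hypothetical: one hard requests.memory quota, in bytes

print(scalar(no_quota))
print(scalar(with_quota))
```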

Comment 8 hongyan li 2021-04-13 07:21:01 UTC

On 4.7, the queries for DB 'Kubernetes / Compute Resources / Pod' are correct, as follows:

                            {
                                "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", image!=\"\"}) by (container)",
                                "format": "time_series",
                                "intervalFactor": 2,
                                "legendFormat": "{{container}}",
                                "legendLink": null,
                                "step": 10
                            },
                            {
                                "expr": "sum(\n    kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
                                "format": "time_series",
                                "intervalFactor": 2,
                                "legendFormat": "requests",
                                "legendLink": null,
                                "step": 10
                            },
                            {
                                "expr": "sum(\n    kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
                                "format": "time_series",
                                "intervalFactor": 2,
                                "legendFormat": "limits",
                                "legendLink": null,
                                "step": 10
                            }
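
The 4.7 excerpt above encodes the resource in the metric name (`..._memory_bytes`), while the 4.8 queries in this report select it with a `resource` label. The sketch below shows that correspondence. It is illustrative only: the function name `to_label_form` is hypothetical, and the assumption that old names follow the pattern `<base>_<resource>_<unit>` (including a `_cpu_cores` variant, which does not appear in this report) is mine.

```python
import re

# Assumption: 4.7-style names look like <base>_<resource>_<unit>, for example
# kube_pod_container_resource_requests_memory_bytes, and the 4.8-style form
# keeps <base> and moves the resource into a label.
OLD_NAME_RE = re.compile(
    r"^(kube_pod_container_resource_(?:requests|limits))_(cpu|memory)_(?:cores|bytes)$"
)


def to_label_form(old_metric: str, matchers: str = "") -> str:
    """Build a label-based selector from a 4.7-style metric name."""
    match = OLD_NAME_RE.match(old_metric)
    if match is None:
        raise ValueError(f"unrecognised metric name: {old_metric}")
    base, resource = match.groups()
    parts = [p for p in (matchers, f'resource="{resource}"') if p]
    return base + "{" + ", ".join(parts) + "}"


print(to_label_form(
    "kube_pod_container_resource_requests_memory_bytes",
    'cluster="$cluster", namespace="$namespace", pod="$pod"',
))
```

Under this mapping, the memory panel's selector ends in `resource="memory"`, which is the value the 4.8 dashboard was missing.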

Comment 9 hongyan li 2021-04-13 07:26:47 UTC

(In reply to hongyan li from comment #7)
> Memory usage for DB 'Kubernetes / Compute Resources / Namespace (Pods)' and
> DB 'Kubernetes / Compute Resources / Namespace (Workloads)' don't need the
> following queries which return no data
> scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring",
> type="hard",resource="requests.memory"})
> and
> scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring",
> type="hard",resource="limits.memory"})

For these two DBs, 4.7 includes the same queries, named 'quota - limits' and 'quota - request'.

Comment 10 hongyan li 2021-04-13 07:46:36 UTC

For the issue described in comments #6, #7 and #9, I filed a new bug, https://bugzilla.redhat.com/show_bug.cgi?id=1948972, which exists on both 4.7 and 4.8.

Comment 11 Damien Grisonnet 2021-04-14 10:12:07 UTC

This doesn't seem to be a bug, it's normal for the query to return no data since we don't define resource limits for the monitoring stack pods.

Comment 12 hongyan li 2021-04-15 03:15:51 UTC

This is a bug: the Memory Usage panel shows CPU data. Refer to https://bugzilla.redhat.com/show_bug.cgi?id=1948926#c4

Comment 13 Damien Grisonnet 2021-04-15 08:09:10 UTC

You are very right, there is definitely a bug with the memory limits/requests query for which we set `resource=cpu` instead of `resource=memory`.

This issue seems to be coming from upstream as we don't replace the resource value here: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/dashboards/resources/pod.libsonnet#L50-L58

Comment 15 Junqi Zhao 2021-05-25 10:52:19 UTC

Tested with 4.8.0-0.nightly-2021-05-21-233425. On the "Kubernetes / Compute Resources / Pod" dashboard, select any pod under any project, go to the "Memory Usage" section, and click "Inspect" to check the expression. The wrong expressions from Comment 4 are updated to the correct value; see resource="memory" in the expr:

sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="alertmanager-main-0", resource="memory"}
)

Comment 18 errata-xmlrpc 2021-07-27 22:59:54 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
