Bug 1948926 - Memory Usage of Dashboard 'Kubernetes / Compute Resources / Pod' contain wrong CPU query
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Jan Fajerski
QA Contact: hongyan li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-13 04:27 UTC by hongyan li
Modified: 2021-07-27 23:00 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:59:54 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
query screenshot (87.99 KB, image/png)
2021-04-13 04:27 UTC, hongyan li
console screenshot (52.34 KB, image/png)
2021-04-13 04:57 UTC, hongyan li
CPU request (74.99 KB, image/png)
2021-04-13 05:55 UTC, hongyan li


Links
System ID Private Priority Status Summary Last Updated
Github kubernetes-monitoring kubernetes-mixin pull 590 0 None closed Kubernetes/Compute Resources/Pod: Fix resource label 2021-05-04 13:45:32 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 23:00:11 UTC

Description hongyan li 2021-04-13 04:27:57 UTC
Created attachment 1771496 [details]
query screenshot

Description of problem:
The 'Memory Usage' panel of the following dashboard contains the wrong queries:
'Kubernetes / Compute Resources / Pod'

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-04-09-222447

How reproducible:
always

Steps to Reproduce:
1. Go to Monitoring -> Dashboards.
2. Select the dashboard 'Kubernetes / Compute Resources / Pod'.
3. Find the 'Memory Usage' panel and hover over the graph; a 'requests' series is shown.
4. Click Inspect; the queries are wrong:
---------------------------------
sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)
-----
sum(
    kube_pod_container_resource_limits{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)



Actual results:


Expected results:


Additional info:
A screenshot of the dashboard 'Kubernetes / Compute Resources / Pod' is attached.

Comment 1 hongyan li 2021-04-13 04:57:10 UTC
Created attachment 1771497 [details]
console screenshot

Comment 2 hongyan li 2021-04-13 05:55:00 UTC
The request is already included in the CPU Usage panel, which corresponds to the query:
sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)

Please see the screenshot.

Comment 3 hongyan li 2021-04-13 05:55:43 UTC
Created attachment 1771505 [details]
CPU request

Comment 4 Junqi Zhao 2021-04-13 06:16:31 UTC
sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)
-----
sum(
    kube_pod_container_resource_limits{cluster="", namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}
)

resource="cpu" should be changed to resource="memory", or the resource="cpu" matcher should be removed.
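
The first option suggested here can be sketched in Python as a small rewrite of the label matcher. This is only an illustration: the helper name set_resource_matcher is hypothetical and not part of any dashboard tooling, and it assumes the matcher is written exactly as resource="...".

```python
import re

# Hypothetical helper (not part of any dashboard tooling): rewrite the
# resource="..." label matcher in a PromQL expression.
_RESOURCE_MATCHER = re.compile(r'resource="[^"]*"')


def set_resource_matcher(expr: str, resource: str) -> str:
    """Return expr with every resource="..." matcher set to `resource`."""
    # A callable replacement avoids backslash handling in the new value.
    return _RESOURCE_MATCHER.sub(lambda _m: f'resource="{resource}"', expr)


# The request query reported for the Memory Usage panel.
wrong = (
    'sum(\n'
    '    kube_pod_container_resource_requests{cluster="", '
    'namespace="openshift-monitoring", pod="prometheus-k8s-0", resource="cpu"}\n'
    ')'
)
fixed = set_resource_matcher(wrong, "memory")
print(fixed)
```

Expressions without a resource matcher are returned unchanged, so the helper is safe to apply to the usage query as well.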

Comment 5 Junqi Zhao 2021-04-13 06:19:18 UTC
"Kubernetes / Compute Resources / Pod" dashboard
configmap is grafana-dashboard-k8s-resources-pod

"Kubernetes / Compute Resources / Namespace (Pods)" dashboard
configmap is grafana-dashboard-k8s-resources-namespace

"Kubernetes / Compute Resources / Namespace (Workloads)" dashboard
configmap is grafana-dashboard-k8s-resources-workloads-namespace

Comment 6 Junqi Zhao 2021-04-13 06:29:47 UTC
(In reply to Junqi Zhao from comment #5)
> "Kubernetes / Compute Resources / Pod" dashboard
> configmap is grafana-dashboard-k8s-resources-pod
> 
> "Kubernetes / Compute Resources / Namespace (Pods)" dashboard
> configmap is grafana-dashboard-k8s-resources-namespace
> 
>  "Kubernetes / Compute Resources / Namespace (Workloads)" dashboard
> configmap is grafana-dashboard-k8s-resources-workloads-namespace

Only the "Kubernetes / Compute Resources / Pod" dashboard has the issue and needs to be fixed.

Comment 7 hongyan li 2021-04-13 06:51:40 UTC
The Memory Usage panels of the dashboards 'Kubernetes / Compute Resources / Namespace (Pods)' and 'Kubernetes / Compute Resources / Namespace (Workloads)' don't need the following queries, which return no data:
scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring", type="hard",resource="requests.memory"})
and
scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring", type="hard",resource="limits.memory"})

Comment 8 hongyan li 2021-04-13 07:21:01 UTC
On 4.7, the queries for the dashboard 'Kubernetes / Compute Resources / Pod' are correct, as follows:

                            {
                                "expr": "sum(container_memory_working_set_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container!=\"\", image!=\"\"}) by (container)",
                                "format": "time_series",
                                "intervalFactor": 2,
                                "legendFormat": "{{container}}",
                                "legendLink": null,
                                "step": 10
                            },
                            {
                                "expr": "sum(\n    kube_pod_container_resource_requests_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
                                "format": "time_series",
                                "intervalFactor": 2,
                                "legendFormat": "requests",
                                "legendLink": null,
                                "step": 10
                            },
                            {
                                "expr": "sum(\n    kube_pod_container_resource_limits_memory_bytes{cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\"})\n",
                                "format": "time_series",
                                "intervalFactor": 2,
                                "legendFormat": "limits",
                                "legendLink": null,
                                "step": 10
                            }
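
For comparison with the 4.8 queries in the description, the 4.7 expressions above carry the resource in the metric name, while the 4.8 ones use a single metric with a resource label. A minimal Python sketch of that mapping follows. The function name to_label_form is hypothetical, and the cpu_cores suffix is an assumed CPU counterpart that does not appear in this report.

```python
import re

# Old-style names carry the resource as a suffix, e.g.
# kube_pod_container_resource_requests_memory_bytes.
_OLD_NAME = re.compile(
    r"^(kube_pod_container_resource_(?:requests|limits))_(memory_bytes|cpu_cores)$"
)
# Suffix -> value of the `resource` label in the label-based form.
# "cpu_cores" is an assumption; only the memory names appear in this report.
_SUFFIX_TO_RESOURCE = {"memory_bytes": "memory", "cpu_cores": "cpu"}


def to_label_form(old_metric: str) -> str:
    """Map an old-style metric name to a label-based selector."""
    m = _OLD_NAME.match(old_metric)
    if m is None:
        raise ValueError(f"unrecognised metric name: {old_metric}")
    base, suffix = m.groups()
    return f'{base}{{resource="{_SUFFIX_TO_RESOURCE[suffix]}"}}'


print(to_label_form("kube_pod_container_resource_requests_memory_bytes"))
```

Seen this way, the memory panel has to select resource="memory" explicitly in the label-based form, which is the step that went wrong in 4.8.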

Comment 9 hongyan li 2021-04-13 07:26:47 UTC
(In reply to hongyan li from comment #7)
> Memory usage for DB 'Kubernetes / Compute Resources / Namespace (Pods)' and
> DB 'Kubernetes / Compute Resources / Namespace (Workloads)' don't need the
> following queries which return no data
> scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring",
> type="hard",resource="requests.memory"})
> and
> scalar(kube_resourcequota{cluster="", namespace="openshift-monitoring",
> type="hard",resource="limits.memory"})

For these two dashboards, 4.7 includes the same queries, named "quota - limits" and "quota - request".

Comment 10 hongyan li 2021-04-13 07:46:36 UTC
For the issue described in comments #6, #7 and #9, I filed a new bug, https://bugzilla.redhat.com/show_bug.cgi?id=1948972, which exists on both 4.7 and 4.8.

Comment 11 Damien Grisonnet 2021-04-14 10:12:07 UTC
This doesn't seem to be a bug; it's normal for the query to return no data, since we don't define resource limits for the monitoring stack pods.

Comment 12 hongyan li 2021-04-15 03:15:51 UTC
This is a bug: the Memory Usage panel shows CPU data. Refer to https://bugzilla.redhat.com/show_bug.cgi?id=1948926#c4

Comment 13 Damien Grisonnet 2021-04-15 08:09:10 UTC
You are right; there is definitely a bug in the memory limits/requests queries, for which we set `resource=cpu` instead of `resource=memory`.

This issue seems to be coming from upstream as we don't replace the resource value here: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/dashboards/resources/pod.libsonnet#L50-L58

Comment 15 Junqi Zhao 2021-05-25 10:52:19 UTC
Tested with 4.8.0-0.nightly-2021-05-21-233425. On the "Kubernetes / Compute Resources / Pod" dashboard, select any pod under any project, go to the "Memory Usage" section, and click "Inspect" to check the expression. The wrong expression from Comment 4 has been updated to the correct value, as shown by resource="memory" in the expr:

sum(
    kube_pod_container_resource_requests{cluster="", namespace="openshift-monitoring", pod="alertmanager-main-0", resource="memory"}
)
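
The manual check described here could also be scripted. The sketch below scans a dashboard document for panels whose title mentions memory but whose queries select resource="cpu". It assumes a rows -> panels -> targets -> expr layout, which this report does not confirm beyond the targets fragment in Comment 8. The function name and the inline sample dashboard are made up for illustration.

```python
def find_mismatched_memory_queries(dashboard):
    """Return (panel title, expr) pairs where a memory panel selects resource="cpu"."""
    hits = []
    # Assumed layout: dashboard["rows"][i]["panels"][j]["targets"][k]["expr"].
    for row in dashboard.get("rows", []):
        for panel in row.get("panels", []):
            title = panel.get("title", "")
            if "memory" not in title.lower():
                continue
            for target in panel.get("targets", []):
                expr = target.get("expr", "")
                if 'resource="cpu"' in expr:
                    hits.append((title, expr))
    return hits


# Made-up sample, shaped like the queries in this report.
sample = {
    "rows": [
        {
            "panels": [
                {
                    "title": "CPU Usage",
                    "targets": [
                        {"expr": 'sum(kube_pod_container_resource_requests{pod="p", resource="cpu"})'}
                    ],
                },
                {
                    "title": "Memory Usage",
                    "targets": [
                        {"expr": 'sum(kube_pod_container_resource_requests{pod="p", resource="cpu"})'},
                        {"expr": 'sum(kube_pod_container_resource_limits{pod="p", resource="memory"})'},
                    ],
                },
            ]
        }
    ]
}

for title, expr in find_mismatched_memory_queries(sample):
    print(f"{title}: {expr}")
```

An empty result on the fixed build would correspond to the outcome verified manually in this comment.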

Comment 18 errata-xmlrpc 2021-07-27 22:59:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

