Bug 1754459 - The system dashboards show a significantly lower disk and memory usage than the hosts' dashboards
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Console Metal3 Plugin
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Paul Gier
QA Contact: Udi Kalifon
URL:
Whiteboard:
Depends On: 1759945 1774653
Blocks:
 
Reported: 2019-09-23 10:15 UTC by Udi Kalifon
Modified: 2019-11-20 15:56 UTC
3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-09 13:32:17 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Data in the system dashboards and the host dashboard (89.70 KB, image/png)
2019-09-23 10:15 UTC, Udi Kalifon

Description Udi Kalifon 2019-09-23 10:15:54 UTC
Created attachment 1618158 [details]
Data in the system dashboards and the host dashboard

Description of problem:
In the system dashboards I see a memory usage of 1-2 GB, and a disk usage of 1-2 GB as well. At the same time, any baremetal host that I check shows 10-20 GB of RAM used and 20-30 GB of disk. The aggregated values in the system dashboards cannot possibly be lower than those of any single host.


How reproducible:
100%


Steps to Reproduce:
1. Compare how much RAM and disk the baremetal hosts use with the total usage shown in the system dashboards


Additional info:
See attached screen shot

Comment 1 Honza Pokorny 2019-10-07 17:05:55 UTC
The two dashboards use different Prometheus exporters:

Overview dashboard: (sum(kube_node_status_capacity_memory_bytes) - sum(kube_node_status_allocatable_memory_bytes))[60m:5m]

Baremetal host dashboard: node_memory_Active_bytes

Comment 2 Honza Pokorny 2019-10-07 22:16:40 UTC
Lily, any ideas what could cause the large discrepancy?

Comment 3 Paul Gier 2019-10-09 09:04:35 UTC
I think the issue is that kube_node_status_allocatable_memory_bytes is the total memory which can be used for pods (including ones that are currently running).  This value is determined when the kubelet is first started.  It's the total node memory minus the memory reserved for the kubelet itself and for system processes (ssh, etc.).  So the overview dashboard calculation listed by Honza ends up being:

  (node_capacity - (node_capacity - mem_reserved_for_system_stuff))

The node_capacity values cancel, and the value shown in the dashboard is actually how much memory we're holding for kubelet and system processes.
Some additional explanation is available here: https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/
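The cancellation described above can be sketched with a few lines of Python (all byte values are made-up examples, not real cluster data):

```python
# Why the overview query reduces to the reserved amount rather than actual usage.
# All numbers below are hypothetical examples.

node_capacity = 64 * 2**30           # kube_node_status_capacity_memory_bytes
reserved_for_system = 2 * 2**30      # kubelet + system reservation
node_allocatable = node_capacity - reserved_for_system  # kube_node_status_allocatable_memory_bytes

# Overview dashboard query: sum(capacity) - sum(allocatable)
overview_value = node_capacity - node_allocatable

# The capacity terms cancel, leaving only the reservation:
assert overview_value == reserved_for_system
print(overview_value / 2**30, "GiB shown on the overview dashboard")
```

No matter how much memory pods actually use, the result of that subtraction stays fixed at the reserved amount, which is why the overview dashboard shows a small constant value while the per-host dashboard (node_memory_Active_bytes) reflects real usage.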

I'm not sure if the disk usage difference is a similar issue.

Comment 4 Lili Cosic 2019-10-09 13:23:32 UTC
I think this can be closed, as it's an issue in our console and your analysis is actually correct. Thanks for the ping on this!

