Bug 1754459
| Summary: | The system dashboards show a significantly lower disk and memory usage than the hosts' dashboards | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Udi Kalifon <ukalifon> |
| Component: | Console Metal3 Plugin | Assignee: | Paul Gier <pgier> |
| Status: | CLOSED NOTABUG | QA Contact: | Udi Kalifon <ukalifon> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | unspecified | CC: | aos-bugs, hpokorny, lcosic |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-10-09 13:32:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1759945, 1774653 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
The two dashboards use different Prometheus exporters:

Overview dashboard: `(sum(kube_node_status_capacity_memory_bytes) - sum(kube_node_status_allocatable_memory_bytes))[60m:5m]`

Baremetal host dashboard: `node_memory_Active_bytes`

Lily, any ideas what could cause the large discrepancy?

I think the issue is that kube_node_status_allocatable_memory_bytes is the total memory which can be used for pods (including ones that are currently running). This value is determined when the kubelet is first started: it's the total node memory minus memory reserved for the kubelet itself and for system processes (ssh, etc.). So the overview dashboard calculation listed by Honza ends up being:

(node_capacity - (node_capacity - mem_reserved_for_system_stuff))

The node_capacity values cancel, and the value shown in the dashboard is actually how much memory we're holding for the kubelet and system processes. Some additional explanation is available here: https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/

I'm not sure if the disk usage difference is a similar issue.

I think this can be closed, as it's an issue in our console and yours is actually correct. Thanks for the ping on this!
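The cancellation described above can be sketched numerically. This is an illustrative example with made-up values, not data from the affected cluster; the variable names mirror the metrics but the reservation sizes are assumptions:

```python
# Why the Overview dashboard query reduces to the *reserved* memory
# rather than actual memory usage. All numbers are illustrative.

GIB = 1024 ** 3

# kube_node_status_capacity_memory_bytes: total physical memory on the node
node_capacity = 64 * GIB

# Memory the kubelet holds back for itself and system daemons
# (kube-reserved / system-reserved); hypothetical value for this sketch
reserved_for_system = 2 * GIB

# kube_node_status_allocatable_memory_bytes: capacity minus reservations.
# Fixed when the kubelet starts -- it does NOT track live pod usage.
node_allocatable = node_capacity - reserved_for_system

# The Overview dashboard computed: capacity - allocatable
overview_value = node_capacity - node_allocatable

# What node_exporter's node_memory_Active_bytes might report on a busy host
actually_in_use = 15 * GIB

print(overview_value / GIB)   # 2.0  -> only the reserved slice, regardless of load
print(actually_in_use / GIB)  # 15.0 -> what the host dashboard shows
```

Because `node_capacity` appears in both terms, `overview_value` always equals `reserved_for_system`, so the Overview dashboard stays at a small constant (1-2 GB here) no matter how much memory pods consume.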
Created attachment 1618158 [details]
Data in the system dashboards and the host dashboard

Description of problem:
In the system dashboards I see a memory usage of 1-2 GB, and a disk usage of 1-2 GB as well. At the same time, any baremetal host that I check shows 10-20 GB of RAM used and 20-30 GB of disk. The aggregated results in the system dashboards can't possibly show less than any single host.

How reproducible:
100%

Steps to Reproduce:
1. Compare how much RAM and disk the baremetal hosts use with the total usage that you see in the system dashboards

Additional info:
See attached screenshot