Bug 1943265
Summary: | Negative Memory Utilization for Cluster Compute Resources Dashboard
---|---
Product: | OpenShift Container Platform
Component: | Monitoring
Status: | CLOSED CURRENTRELEASE
Severity: | low
Priority: | low
Version: | 4.8
Target Milestone: | ---
Target Release: | 4.9.0
Hardware: | s390x
OS: | Linux
Reporter: | jhusta <jhusta>
Assignee: | Simon Pasquier <spasquie>
QA Contact: | Yadan Pei <yapei>
Docs Contact: |
CC: | alegrand, anpicker, aos-bugs, erooth, jhadvig, jhusta, jokerman, juzhao, krmoser, lcosic, wolfgang.voesch
Whiteboard: |
Fixed In Version: |
Doc Type: | No Doc Update
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2021-10-01 14:22:23 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1934148
Attachments: |
Description
jhusta
2021-03-25 17:00:50 UTC
Created attachment 1766360
Screen Shots of Dashboard and usage by nodes
I see that this query has now been changed to `1 - sum(:node_memory_MemAvailable_bytes:sum{cluster=""}) / sum(kube_node_status_allocatable{resource="memory",cluster=""})` (changed by https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/534). I'm not sure whether this change is expected to resolve the issue. Pawel, could you confirm? FWIW, I am not seeing negative values on my test cluster.

This seems to happen when a large chunk of memory is reserved for other uses. In that scenario, the memory available on a node can be much higher than what the scheduler is allowed to allocate. The right-hand part of the expression (`sum(:node_memory_MemAvailable_bytes:sum{cluster=""}) / sum(kube_node_status_allocatable{resource="memory",cluster=""})`) then exceeds 1, which makes the overall result negative. The PR https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/534 won't fix this; we need a different way to track utilization, preferably one that doesn't subtract metric values from 1.

Checked with 4.9.0-0.nightly-2021-08-04-131508: on the Kubernetes/Compute Resources/Cluster dashboard, the "Memory Utilisation" expression is now `1 - sum(:node_memory_MemAvailable_bytes:sum{cluster=""}) / sum(node_memory_MemTotal_bytes{cluster=""})`, which guarantees that the value cannot be negative.

@juzhao I was able to validate the fix on 4.9. This defect can be closed. Thank you!
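
For anyone revisiting this report, the arithmetic behind the negative reading can be sketched in a few lines of Python. The byte counts below are hypothetical (a single node with 64 GiB total, 40 GiB allocatable, and 48 GiB currently available) and are not taken from the affected s390x cluster. The helper `utilisation` is an illustrative name that simply mirrors the `1 - available / denominator` shape of both dashboard expressions.

```python
GIB = 1024 ** 3


def utilisation(available_bytes: int, denominator_bytes: int) -> float:
    """Mirror the dashboard formula: 1 - available / denominator."""
    return 1 - available_bytes / denominator_bytes


# Hypothetical single-node figures, chosen only to illustrate the effect.
mem_total = 64 * GIB        # stands in for node_memory_MemTotal_bytes
mem_allocatable = 40 * GIB  # stands in for kube_node_status_allocatable{resource="memory"}
mem_available = 48 * GIB    # stands in for :node_memory_MemAvailable_bytes:sum

# Allocatable-based expression: available exceeds allocatable, so the ratio is above 1.
old = utilisation(mem_available, mem_allocatable)

# MemTotal-based expression (the 4.9 one): available cannot exceed total,
# so the ratio stays at or below 1 and the result stays at or above 0.
new = utilisation(mem_available, mem_total)

print(f"allocatable-based: {old:.2f}")  # -0.20
print(f"MemTotal-based:    {new:.2f}")  # 0.25
```

With these assumed numbers the allocatable-based expression reports -20% while the MemTotal-based one reports 25%, which is consistent with the behaviour described in the comments above.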