Bug 1955044
| Summary: | SystemMemoryExceedsReservation alert calculating incorrect system-reserved when hugepages reserved memory is configured. | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Sanket N <snalawad> |
| Component: | Monitoring | Assignee: | Sergiusz Urbaniak <surbania> |
| Status: | CLOSED DUPLICATE | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.6 | CC: | alegrand, anpicker, erooth, kakkoyun, lcosic, pkrupa, spasquie, surbania |
| Target Milestone: | --- | Flags: | snalawad: needinfo- |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-04-29 12:46:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
*** This bug has been marked as a duplicate of bug 1953846 ***
Description of problem:

The SystemMemoryExceedsReservation alert calculates an incorrect system-reserved value because hugepages memory is subtracted from allocatable memory.

~~~
SystemMemoryExceedsReservation alert:
sum by (node) (container_memory_rss{id="/system.slice"}) > ((sum by (node) (kube_node_status_capacity{resource="memory"} - kube_node_status_allocatable{resource="memory"})) * 0.9)
~~~

The alert monitors container_memory_rss{id="/system.slice"} and fires when system memory exceeds 90% of the system-reserved memory. The system-reserved memory is calculated by the expression:

~~~
(sum by (node) (kube_node_status_capacity{resource="memory"} - kube_node_status_allocatable{resource="memory"}))
~~~

System Reserved = Capacity.memory - Allocatable.memory

When hugepages are configured, Allocatable.memory is reduced by the hugepages reservation, and the computed system-reserved value absorbs that reduction:

| Resource | Without hugepages | After hugepages configured |
|---|---|---|
| Capacity cpu | 4 | 4 |
| Capacity ephemeral-storage | 41407468Ki | 41407468Ki |
| Capacity hugepages-1Gi | 0 | 0 |
| Capacity hugepages-2Mi | 0 | 100Mi |
| Capacity memory | 8153272Ki | 8153256Ki |
| Capacity pods | 250 | 250 |
| Allocatable cpu | 3500m | 3500m |
| Allocatable ephemeral-storage | 37087380622 | 37087380622 |
| Allocatable hugepages-1Gi | 0 | 0 |
| Allocatable hugepages-2Mi | 0 | 100Mi |
| Allocatable memory | 7002296Ki | 6899880Ki <=== |
| Allocatable pods | 250 | 250 |

In this example, Capacity.memory - Allocatable.memory grows from 1150976Ki to 1253376Ki, an increase of 102400Ki, which is exactly the 100Mi reserved for hugepages.

As a result, the system-reserved value used by the alert is larger than the value configured for the cluster, so the alert threshold no longer corresponds to 90% of the configured system-reserved memory.

Expected results:

The alert expression should compensate for the hugepages reduction so that the system-reserved value is unaffected:

(container_memory_rss{id="/system.slice"}) > (Capacity.memory - (Allocatable.memory + hugepages-1Gi + hugepages-2Mi)) * 0.9
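As a rough check of the proposed compensation, the following Python sketch redoes the arithmetic with the Capacity and Allocatable figures reported above. The names `reserved_kib` and `hugepages_kib` are illustrative only and do not belong to any monitoring component. The sketch also assumes that the 100Mi of hugepages-2Mi is the only hugepages reservation on the node.

```python
# Sketch only: plain arithmetic on the node figures from this report (KiB).
KIB_PER_MIB = 1024


def reserved_kib(capacity_kib, allocatable_kib, hugepages_kib=0):
    """Capacity minus allocatable, with any hugepages reservation added back.

    hugepages_kib is a hypothetical parameter standing in for the
    hugepages-1Gi + hugepages-2Mi terms of the proposed expression.
    """
    return capacity_kib - (allocatable_kib + hugepages_kib)


# Node before hugepages were configured.
baseline = reserved_kib(8153272, 7002296)

# Same node with 100Mi of 2Mi hugepages, using the current alert formula.
uncorrected = reserved_kib(8153256, 6899880)

# Same node, adding the hugepages reservation back as proposed.
corrected = reserved_kib(8153256, 6899880, hugepages_kib=100 * KIB_PER_MIB)

print("baseline reserved (KiB):   ", baseline)
print("uncorrected reserved (KiB):", uncorrected)
print("corrected reserved (KiB):  ", corrected)
print("inflation (KiB):           ", uncorrected - baseline)
```

With these figures the corrected value matches the baseline, which suggests the compensation restores the configured system-reserved amount for this node.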