1993218 – alerts: SystemMemoryExceedsReservation triggers too quickly

Bug 1993218 - alerts: SystemMemoryExceedsReservation triggers too quickly

Summary: alerts: SystemMemoryExceedsReservation triggers too quickly

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.6.z
Assignee:	Ryan Phillips
QA Contact:	Sunil Choudhary
Docs Contact:
URL:
Whiteboard:
Depends On:	1992687
Blocks:
TreeView+	depends on / blocked

Reported:	2021-08-12 14:51 UTC by OpenShift BugZilla Robot
Modified:	2024-12-20 20:41 UTC (History)
CC List:	9 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-09-09 01:52:52 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Knowledge Base (Solution)	5788171	0	None	None	None	2021-09-03 16:45:03 UTC
Red Hat Product Errata	RHBA-2021:3395	0	None	None	None	2021-09-09 01:53:14 UTC

Comment 1 Ryan Phillips 2021-08-12 19:23:36 UTC


*** This bug has been marked as a duplicate of bug 1980844 ***

Comment 3 Scott Dodson 2021-08-18 13:24:44 UTC

These are independent fixes, the PR on this moves from immediate alerts to 15m threshold while that may not fix the problem overall it does address the issues as described in the bug so re-opening and moving back to ON_QA. We'll see new bugs coming down from the change from 90% to 95% in the future but that's likely to be weeks out.

Comment 6 Sunil Choudhary 2021-09-02 16:28:11 UTC

Checked on 4.6.0-0.nightly-2021-08-31-113011, the alert is updated.
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-08-31-113011   True        False         4h21m   Cluster version is 4.6.0-0.nightly-2021-08-31-113011

Comment 8 errata-xmlrpc 2021-09-09 01:52:52 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.44 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3395

Comment 10 Jace Liang 2021-11-19 03:11:56 UTC

I have a customer already in OpenShift 4.6.44, but still having this kind of issue.

It's a new installed cluster, barely no customer's application yet. 
The node's memory is 128G total, the memory usage is about 10Gi during last 24hrs. 
But the SystemMemoryExceedsReservation alert triggered 7 times.

Comment 11 Lucian Maly (Red Hat) 2021-11-26 02:42:37 UTC

Customer is currently on 4.6.45, but still observing this issue on one node:                                                                       

# free -g                                                                 
              total        used        free      shared  buff/cache   available 
Mem:             62          15          17           0          29          46 
Swap:             0           0           0                                     

# oc describe node
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests           Limits
  --------  --------           ------
  cpu       5592m (74%)        11350m (151%)
  memory    27442798464 (41%)  38308997376 (57%)

# oc get --raw /api/v1/nodes/<NODE>/proxy/stats/summary | jq '.node.systemContainers[].memory.usageBytes'
kubelet =    461,053,952 B
runtime = 19,042,807,808 B
misc    = 24,289,464,320 B
pods    = 20,525,260,800 B

Note You need to log in before you can comment on or make changes to this bug.