Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1810136

Summary:	[4.2] A pod that gradually leaks memory causes node to become unreachable for 10 minutes
Product:	OpenShift Container Platform	Reporter:	Ryan Phillips <rphillips>
Component:	Node	Assignee:	Ryan Phillips <rphillips>
Status:	CLOSED ERRATA	QA Contact:	MinLi <minmli>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	4.4	CC:	alklein, aos-bugs, brad.williams, ccoleman, cfillekes, chanphil, christian.lapolt, hannsj_uhl, Holger.Wolf, jerzhang, jokerman, jshepherd, krmoser, lakshmi.ravichandran1, lmohanty, minmli, mwoodson, nbziouec, rcgingra, rphillips, schoudha, tdale, wvoesch
Target Milestone:	---	Keywords:	Reopened, Upgrades
Target Release:	4.2.z
Hardware:	s390x
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1808429	Environment:
Last Closed:	2020-07-01 16:08:20 UTC	Type:	---
Bug Depends On:	1808429
Bug Blocks:	1765215, 1766237, 1801826, 1801829, 1802687

Comment 1 Ryan Phillips 2020-03-04 19:47:02 UTC

*** Bug 1795185 has been marked as a duplicate of this bug. ***

Comment 2 Ryan Phillips 2020-03-26 15:36:12 UTC

*** Bug 1802639 has been marked as a duplicate of this bug. ***

Comment 6 MinLi 2020-06-10 10:30:00 UTC

Verified with version 4.4.0-0.nightly-2020-06-08-083627.

The memory-hog-pod pod was evicted after 3m2s.

$ oc get pod -o wide 
NAME             READY   STATUS    RESTARTS   AGE    IP       NODE                                         NOMINATED NODE   READINESS GATES
memory-hog-pod   0/1     Evicted   0          6m4s   <none>   ip-10-0-210-116.us-east-2.compute.internal   <none>           <none>

$ oc describe node ip-10-0-210-116.us-east-2.compute.internal
Name:               ip-10-0-210-116.us-east-2.compute.internal
Roles:              worker
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         547m (36%)    100m (6%)
  memory                      2027Mi (29%)  537Mi (7%)
  ephemeral-storage           0 (0%)        0 (0%)
  hugepages-1Gi               0 (0%)        0 (0%)
  hugepages-2Mi               0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:
  Type     Reason                     Age    From                                                 Message
  ----     ------                     ----   ----                                                 -------
  Warning  EvictionThresholdMet       2m40s  kubelet, ip-10-0-210-116.us-east-2.compute.internal  Attempting to reclaim memory
  Normal   NodeHasInsufficientMemory  2m33s  kubelet, ip-10-0-210-116.us-east-2.compute.internal  Node ip-10-0-210-116.us-east-2.compute.internal status is now: NodeHasInsufficientMemory
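
For anyone reproducing this verification, a gradually leaking workload could be sketched roughly as below. This report does not show what memory-hog-pod actually runs, so the function name `leak_memory`, the chunk size, and the interval are all assumptions made for illustration, not the test image used here.

```python
import time

PAGE_SIZE = 4096  # assumed page size; used only to touch each page once


def leak_memory(chunk_mib=10, steps=5, interval_s=1.0):
    """Allocate `chunk_mib` MiB per step and never release it.

    Hypothetical stand-in for a leaking container process.
    Returns the list of retained buffers so callers can inspect them.
    """
    hoard = []
    for _ in range(steps):
        size = chunk_mib * 1024 * 1024
        buf = bytearray(size)
        # Write one byte per page so the memory is likely to become
        # resident rather than staying as untouched zero pages.
        buf[::PAGE_SIZE] = b"\x01" * (size // PAGE_SIZE)
        hoard.append(buf)  # keeping the reference is the "leak"
        time.sleep(interval_s)
    return hoard


if __name__ == "__main__":
    # Small demo values; a real reproduction would loop far longer.
    held = leak_memory(chunk_mib=1, steps=3, interval_s=0.1)
    print(sum(len(b) for b in held) // (1024 * 1024), "MiB held")
```

Run in a pod without a tight memory limit, a loop like this would be expected to push the node toward its eviction threshold gradually rather than all at once, which is the scenario this bug describes.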

Comment 8 errata-xmlrpc 2020-07-01 16:08:20 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2589