Bug 1810136

Summary: [4.2] A pod that gradually leaks memory causes node to become unreachable for 10 minutes
Product: OpenShift Container Platform
Reporter: Ryan Phillips <rphillips>
Component: Node
Assignee: Ryan Phillips <rphillips>
Status: CLOSED ERRATA
QA Contact: MinLi <minmli>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.4
CC: alklein, aos-bugs, brad.williams, ccoleman, cfillekes, chanphil, christian.lapolt, hannsj_uhl, Holger.Wolf, jerzhang, jokerman, jshepherd, krmoser, lakshmi.ravichandran1, lmohanty, minmli, mwoodson, nbziouec, rcgingra, rphillips, schoudha, tdale, wvoesch
Target Milestone: ---
Keywords: Reopened, Upgrades
Target Release: 4.2.z
Hardware: s390x
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1808429
Environment:
Last Closed: 2020-07-01 16:08:20 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1808429
Bug Blocks: 1765215, 1766237, 1801826, 1801829, 1802687

Comment 1 Ryan Phillips 2020-03-04 19:47:02 UTC
*** Bug 1795185 has been marked as a duplicate of this bug. ***

Comment 2 Ryan Phillips 2020-03-26 15:36:12 UTC
*** Bug 1802639 has been marked as a duplicate of this bug. ***

Comment 6 MinLi 2020-06-10 10:30:00 UTC
Verified with version 4.4.0-0.nightly-2020-06-08-083627.

The memory-hog-pod was evicted after 3m2s:

$ oc get pod -o wide 
NAME             READY   STATUS    RESTARTS   AGE    IP       NODE                                         NOMINATED NODE   READINESS GATES
memory-hog-pod   0/1     Evicted   0          6m4s   <none>   ip-10-0-210-116.us-east-2.compute.internal   <none>           <none>
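
For anyone who wants to script this check rather than read the table by eye, here is a minimal sketch. It is not part of the original verification. The helper name `pod_status` and the `SAMPLE` constant are made up for illustration, and it assumes the column layout of the `oc get pod -o wide` output shown above.

```python
from typing import Optional


def pod_status(oc_get_pod_output: str, pod_name: str) -> Optional[str]:
    """Return the STATUS column for pod_name, or None if the pod is not listed.

    Hypothetical helper: expects the plain-text table printed by
    `oc get pod -o wide`, with a header row followed by one row per pod.
    """
    lines = [ln for ln in oc_get_pod_output.splitlines() if ln.strip()]
    if not lines:
        return None
    # STATUS is a single-word header and the columns before it (NAME, READY)
    # never contain spaces, so its index in the split header row matches its
    # index in each split data row.
    header = lines[0].split()
    if "STATUS" not in header:
        return None
    status_idx = header.index("STATUS")
    for row in lines[1:]:
        fields = row.split()
        if fields and fields[0] == pod_name and len(fields) > status_idx:
            return fields[status_idx]
    return None


# Output copied from the comment above.
SAMPLE = """\
NAME             READY   STATUS    RESTARTS   AGE    IP       NODE                                         NOMINATED NODE   READINESS GATES
memory-hog-pod   0/1     Evicted   0          6m4s   <none>   ip-10-0-210-116.us-east-2.compute.internal   <none>           <none>
"""

print(pod_status(SAMPLE, "memory-hog-pod"))  # prints "Evicted"
```

The function only parses text, so it can be fed captured command output without needing cluster access. A pod that is still running would come back as `Running` instead of `Evicted`.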

$ oc describe node ip-10-0-210-116.us-east-2.compute.internal
Name:               ip-10-0-210-116.us-east-2.compute.internal
Roles:              worker
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         547m (36%)    100m (6%)
  memory                      2027Mi (29%)  537Mi (7%)
  ephemeral-storage           0 (0%)        0 (0%)
  hugepages-1Gi               0 (0%)        0 (0%)
  hugepages-2Mi               0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:
  Type     Reason                     Age    From                                                 Message
  ----     ------                     ----   ----                                                 -------
  Warning  EvictionThresholdMet       2m40s  kubelet, ip-10-0-210-116.us-east-2.compute.internal  Attempting to reclaim memory
  Normal   NodeHasInsufficientMemory  2m33s  kubelet, ip-10-0-210-116.us-east-2.compute.internal  Node ip-10-0-210-116.us-east-2.compute.internal status is now: NodeHasInsufficientMemory

Comment 8 errata-xmlrpc 2020-07-01 16:08:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2589