Description of problem:
Customer is facing an issue with the SystemMemoryExceedsReservation alert on worker nodes.

How reproducible:
Every time

Actual results:
Kubelet is consuming high memory, around 40GB.

Expected results:
The SystemMemoryExceedsReservation alert should be gone after increasing the System Reserved Memory to 9GB.

Additional info:
We increased the System Reserved Memory to 9GB as per the KCS[0] and documentation[1]. Even after increasing the System Reserved Memory, we found that the Kubelet on all the worker nodes is consuming high memory.

~~~
[core@ocpnonprod-xxxx-worker-xxxx ~]$ top
top - 09:48:50 up 21:45, 1 user, load average: 7.69, 7.79, 8.49
Tasks: 379 total, 1 running, 377 sleeping, 0 stopped, 1 zombie
%Cpu(s): 74.7 us, 16.5 sy, 0.2 ni, 7.8 id, 0.0 wa, 0.7 hi, 0.2 si, 0.0 st
MiB Mem : 128919.9 total, 68571.9 free, 45363.9 used, 14984.1 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 82294.4 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1976 root 20 0 41.5g 37.2g 71340 S 614.6 29.5 2000:12 kubelet
------------------------------------------------------------------------------
[root@ocpnonprod-xxxx-worker-xxxx /]# top
top - 09:49:47 up 22:19, 1 user, load average: 15.75, 13.10, 14.09
Tasks: 1058 total, 7 running, 1048 sleeping, 0 stopped, 3 zombie
%Cpu(s): 64.6 us, 28.2 sy, 0.2 ni, 5.2 id, 0.0 wa, 1.0 hi, 0.8 si, 0.0 st
MiB Mem : 128919.9 total, 58637.4 free, 50217.3 used, 20065.1 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 78033.4 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1980 root 20 0 46.9g 41.8g 70724 S 522.0 33.2 1987:18 kubelet
------------------------------------------------------------------------------
[root@ocpnonprod-xxxx-worker-xxxx ~]# top
top - 09:50:27 up 21:51, 1 user, load average: 8.48, 9.43, 11.14
Tasks: 507 total, 1 running, 505 sleeping, 0 stopped, 1 zombie
%Cpu(s): 7.7 us, 64.4 sy, 0.4 ni, 25.9 id, 0.0 wa, 1.1 hi, 0.5 si, 0.0 st
MiB Mem : 128919.9 total, 58985.9 free, 53438.4 used, 16495.5 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 74253.3 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1973 root 20 0 50.1g 43.9g 71068 S 198.0 34.9 2236:23 kubelet
~~~

[0] https://access.redhat.com/solutions/5843241
[1] https://docs.openshift.com/container-platform/4.8/nodes/nodes/nodes-nodes-resources-configuring.html#nodes-nodes-resources-configuring-auto_nodes-nodes-resources-configuring
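For reference, the gap between the kubelet RES values above and the 9GB reservation can be checked with a rough sketch like the one below. It is not part of the customer data. It assumes the reservation is 9 GiB, that `top` reports RES in KiB when no suffix is shown, and that the alert compares usage against 95% of the reserved amount; the `threshold` value and the function names are illustrative assumptions, and the sketch only looks at the kubelet process, not the whole system slice.

```python
# Rough sketch: compare kubelet RES values from `top` with a system reservation.
# Assumptions (not from the bug report): "9GB" means 9 GiB, and the alert
# threshold is 95% of the reserved memory.

# Multipliers to convert top's size suffixes to KiB.
UNITS_TO_KIB = {"k": 1, "m": 1024, "g": 1024**2, "t": 1024**3}


def res_to_kib(field: str) -> float:
    """Convert a top memory field such as '37.2g' or '71340' to KiB."""
    suffix = field[-1].lower()
    if suffix in UNITS_TO_KIB:
        return float(field[:-1]) * UNITS_TO_KIB[suffix]
    # No suffix: top's default unit for this column is KiB.
    return float(field)


def exceeds_reservation(res_field: str, reserved_gib: float,
                        threshold: float = 0.95) -> bool:
    """Return True if the process RES alone is above threshold * reservation."""
    reserved_kib = reserved_gib * 1024**2
    return res_to_kib(res_field) > reserved_kib * threshold


if __name__ == "__main__":
    reserved_gib = 9  # reservation configured by the customer
    for res in ["37.2g", "41.8g", "43.9g"]:  # kubelet RES on the three workers
        ratio = res_to_kib(res) / (reserved_gib * 1024**2)
        print(f"kubelet RES {res}: {ratio:.1f}x the reservation, "
              f"exceeds={exceeds_reservation(res, reserved_gib)}")
```

On all three workers the kubelet alone uses roughly four to five times the reserved amount, so raising the reservation to 9GB would not be expected to clear the alert while the kubelet memory usage stays at this level.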
Hello Team,

I have a customer who is facing a similar issue in their RHOCP v4.8.14 cluster. The customer is getting a "SystemMemoryExceedsReservation" warning on all the master and worker nodes of the cluster, even well after configuring the reservation to 12G.

The customer has shared a must-gather, which I will share on this Bugzilla. Let me know if more information is needed from the customer's environment for further analysis.

Regards,
Mridul Markandey
*** Bug 2056502 has been marked as a duplicate of this bug. ***
Verified on 4.8.34
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.34 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0795
*** Bug 2067292 has been marked as a duplicate of this bug. ***