Bug 2002807

Summary: Increase in RSS memory in CRI-O and Kubelet
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Node Assignee: Peter Hunt <pehunt>
Node sub component: CRI-O QA Contact: MinLi <minmli>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: aos-bugs, kgordeev, minmli, nagrawal, rphillips, rsandu, rsevilla, schoudha, wking
Version: 4.9 Keywords: Performance, TestBlocker
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.6.44 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-09-29 12:06:50 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2002806    
Bug Blocks:    

Comment 1 Peter Hunt 2021-09-09 18:58:13 UTC
Introduced in 4.6.43, fixed in 4.6.44.

Comment 4 MinLi 2021-09-14 09:44:41 UTC
I compared 4.6.44 with a 4.9 nightly. CRI-O memory usage is in a similar range: RSS is 104056 KiB on the 4.6.44 worker and 111392 KiB on the 4.9 nightly worker.

$ oc get clusterversion 
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.44    True        False         113m    Cluster version is 4.6.44

$ oc get node 
NAME                                        STATUS   ROLES    AGE    VERSION
ip-10-0-48-134.us-east-2.compute.internal   Ready    worker   129m   v1.19.0+4c3480d
ip-10-0-60-90.us-east-2.compute.internal    Ready    master   139m   v1.19.0+4c3480d
ip-10-0-62-119.us-east-2.compute.internal   Ready    master   139m   v1.19.0+4c3480d
ip-10-0-77-184.us-east-2.compute.internal   Ready    master   139m   v1.19.0+4c3480d
ip-10-0-77-210.us-east-2.compute.internal   Ready    worker   128m   v1.19.0+4c3480d

$ oc debug node/ip-10-0-48-134.us-east-2.compute.internal
Starting pod/ip-10-0-48-134us-east-2computeinternal-debug ...
...
sh-4.4# ps aux | grep -e  "\/usr\/bin\/crio"
root        1459  2.6  0.6 2786464 104000 ?      Ssl  07:29   3:31 /usr/bin/crio --enable-metrics=true --metrics-port=9537

sh-4.4# ps -p 1459 -o pid,rss,vsz,cmd
    PID   RSS    VSZ CMD
   1459 104056 2786464 /usr/bin/crio --enable-metrics=true --metrics-port=9537
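
The RSS figure can also be read straight from /proc, which avoids parsing ps columns. This is a minimal sketch, assuming a Linux host with /proc mounted; the helper name rss_kib is hypothetical and is not part of any tool used in this bug. The kernel's VmRSS value should be close to what ps reports, though the two can differ slightly.

```shell
# rss_kib PID: print the resident set size of PID in KiB, taken from
# the VmRSS field of /proc/PID/status (hypothetical helper).
rss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Usage on the node above would be: rss_kib 1459
```

Sampling this at intervals for the same PID would show whether RSS keeps growing, which a single reading cannot.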



$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-01-193941   True        False         125m    Cluster version is 4.9.0-0.nightly-2021-09-01-193941

$ oc get nodes
NAME                                          STATUS   ROLES    AGE    VERSION
ip-10-0-134-113.ap-south-1.compute.internal   Ready    worker   157m   v1.22.0-rc.0+bbcc9ae
ip-10-0-136-105.ap-south-1.compute.internal   Ready    master   168m   v1.22.0-rc.0+bbcc9ae
ip-10-0-170-97.ap-south-1.compute.internal    Ready    master   168m   v1.22.0-rc.0+bbcc9ae
ip-10-0-183-52.ap-south-1.compute.internal    Ready    worker   157m   v1.22.0-rc.0+bbcc9ae
ip-10-0-193-186.ap-south-1.compute.internal   Ready    worker   156m   v1.22.0-rc.0+bbcc9ae
ip-10-0-201-41.ap-south-1.compute.internal    Ready    master   168m   v1.22.0-rc.0+bbcc9ae

$ oc debug node/ip-10-0-134-113.ap-south-1.compute.internal
Starting pod/ip-10-0-134-113ap-south-1computeinternal-debug ...
...

sh-4.4# ps -p 1295 -o pid,rss,vsz,cmd 
    PID   RSS    VSZ CMD
   1295 111392 2028600 /usr/bin/crio
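
To put a number on "similar range", the two RSS readings above can be compared directly. This is a rough sketch only: the helper name pct_diff is hypothetical, and the inputs are single samples from different clusters with different uptimes, so the result is indicative rather than a benchmark.

```shell
# pct_diff BASE NEW: print the change from BASE to NEW as a percentage
# of BASE, to two decimal places (hypothetical helper).
pct_diff() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", (new - base) * 100 / base }'
}

# RSS values in KiB copied from the two ps outputs above
# (4.6.44 worker first, 4.9 nightly worker second).
pct_diff 104056 111392   # prints 7.05
```

The 4.9 nightly reading is therefore about 7% above the 4.6.44 reading.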

Comment 5 MinLi 2021-09-15 04:16:29 UTC
*** Bug 2002809 has been marked as a duplicate of this bug. ***

Comment 8 errata-xmlrpc 2021-09-29 12:06:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.46
bug fix update), and where to find the updated files, follow the
link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3643