Bug 1755584

Summary: Memory leak CRIO due to no garbage collection in /run/crio/exits for exited containers [4.1.z]
Product: OpenShift Container Platform
Reporter: Derrick Ornelas <dornelas>
Component: Containers
Assignee: Mrunal Patel <mpatel>
Status: CLOSED ERRATA
QA Contact: weiwei jiang <wjiang>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.z
CC: aos-bugs, bbreard, dahernan, dornelas, dustymabe, dwalsh, imcleod, jforrest, jligon, jokerman, mpatel, nstielau, schoudha
Target Milestone: ---
Target Release: 4.1.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1755044
Environment:
Last Closed: 2019-11-07 06:32:58 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1755044
Bug Blocks: 1186913

Comment 3 Jessica Forrester 2019-10-22 14:43:37 UTC
I don't think this got picked up correctly by the BZ automation. If all fixes related to this are merged, please move the BZ to MODIFIED so the errata sweep will pick it up.

Comment 4 Derrick Ornelas 2019-10-22 16:13:42 UTC
Checking 4.1.z

# yum list cri-o --enablerepo=rhel-7-server-ose-4.1-rpms
Loaded plugins: product-id, search-disabled-repos, subscription-manager
Available Packages
cri-o.x86_64                                        1.13.11-0.10.dev.rhaos4.1.gitbdeb2ca.el7                                        rhel-7-server-ose-4.1-rpms  



$ git clone -q https://github.com/cri-o/cri-o.git && cd cri-o/

$ git checkout -b 1.13.11-0.10.dev.rhaos4.1.gitbdeb2ca.el7 bdeb2ca
Switched to a new branch '1.13.11-0.10.dev.rhaos4.1.gitbdeb2ca.el7'

$ git branch --contains c1333348b80cae27afaa13a5ee0ced89970ef109
* 1.13.11-0.10.dev.rhaos4.1.gitbdeb2ca.el7

$ git branch --contains 6e1cdec42aa9abca5f8b85a280ca9a0fff9c4ad8
* 1.13.11-0.10.dev.rhaos4.1.gitbdeb2ca.el7


It looks like the fixes from https://github.com/cri-o/cri-o/pull/2855 are included in that build.
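
The same containment check can be scripted with `git merge-base --is-ancestor`, which exits 0 when the first commit is reachable from the second. This is only a sketch of an alternative to the `git branch --contains` calls above, and the helper name `fix_in_build` is hypothetical; it assumes it is run inside a clone of the repository.

```shell
# Hypothetical helper: report whether each fix commit is reachable
# from the commit the package was built from.
# Usage: fix_in_build <build-commit> <fix-commit>...
fix_in_build() {
    build=$1
    shift
    for fix in "$@"; do
        # --is-ancestor exits 0 if $fix is an ancestor of $build.
        if git merge-base --is-ancestor "$fix" "$build"; then
            echo "$fix: included"
        else
            echo "$fix: missing"
        fi
    done
}

# Example call with the commits from this comment (run inside the cri-o clone):
# fix_in_build bdeb2ca c1333348b80cae27afaa13a5ee0ced89970ef109 6e1cdec42aa9abca5f8b85a280ca9a0fff9c4ad8
```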

Comment 6 weiwei jiang 2019-10-29 08:36:14 UTC
Checked with cri-o-1.13.11-0.14.dev.rhaos4.1.git3338d4d.el8.x86_64. Verified.


[root@ip-10-0-154-235 /]# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1a9fb9cbf18134339011c130c571130c7a68040711d7e738710d5f3979f9d76e
              CustomOrigin: Managed by pivot tool
                   Version: 410.8.20191025.0 (2019-10-25T14:09:05Z)

➜  ~ oc get nodes -o name | xargs -n 1 -I {} oc debug {} -- chroot /host bash -c 'ls /run/crio/exits | wc -l' 2>/dev/null
6
34
37
5
62
➜  ~ oc get nodes -o wide 
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION                CONTAINER-RUNTIME
ip-10-0-139-94.us-east-2.compute.internal    Ready    worker   153m   v1.13.4+493dbf621   10.0.139.94    <none>        Red Hat Enterprise Linux CoreOS 410.8.20191025.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.13.11-0.14.dev.rhaos4.1.git3338d4d.el8-dev
ip-10-0-141-60.us-east-2.compute.internal    Ready    master   158m   v1.13.4+493dbf621   10.0.141.60    <none>        Red Hat Enterprise Linux CoreOS 410.8.20191025.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.13.11-0.14.dev.rhaos4.1.git3338d4d.el8-dev
ip-10-0-151-11.us-east-2.compute.internal    Ready    master   159m   v1.13.4+493dbf621   10.0.151.11    <none>        Red Hat Enterprise Linux CoreOS 410.8.20191025.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.13.11-0.14.dev.rhaos4.1.git3338d4d.el8-dev
ip-10-0-154-235.us-east-2.compute.internal   Ready    worker   153m   v1.13.4+493dbf621   10.0.154.235   <none>        Red Hat Enterprise Linux CoreOS 410.8.20191025.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.13.11-0.14.dev.rhaos4.1.git3338d4d.el8-dev
ip-10-0-174-97.us-east-2.compute.internal    Ready    master   159m   v1.13.4+493dbf621   10.0.174.97    <none>        Red Hat Enterprise Linux CoreOS 410.8.20191025.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.13.11-0.14.dev.rhaos4.1.git3338d4d.el8-dev
➜  ~ for i in {1..10}; do echo ==$i; oc scale dc/h --replicas=10; sleep 10; oc scale dc/h --replicas=1; sleep 10; done 
==1
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==2
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==3
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==4
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==5
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==6
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==7
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==8
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==9
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
==10
deploymentconfig.apps.openshift.io/h scaled
deploymentconfig.apps.openshift.io/h scaled
➜  ~ oc get nodes -o name | xargs -n 1 -I {} oc debug {} -- chroot /host bash -c 'ls /run/crio/exits | wc -l' 2>/dev/null
4
32
35
3
60
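
The two sets of counts above can be compared node by node rather than by eye. A minimal sketch, assuming both runs list the nodes in the same order; the function name `exits_not_growing` is hypothetical and not part of `oc` or CRI-O.

```shell
# Hypothetical helper: succeed only if no per-node count grew between runs.
# Arguments: two space-separated lists of counts, in the same node order.
# Returns 0 if nothing grew, 1 if any count grew, 2 if the lists differ in length.
exits_not_growing() {
    before=$1
    after=$2
    # Count the entries in each list by word-splitting into positional parameters.
    set -- $before; nb=$#
    set -- $after; na=$#
    [ "$nb" -eq "$na" ] || return 2
    # Positional parameters now hold the "after" list; walk both lists in step.
    for b in $before; do
        a=$1
        shift
        # A larger count after the pod churn suggests exit files are piling up.
        [ "$a" -le "$b" ] || return 1
    done
    return 0
}

# Counts from the runs before and after the scale loop in this comment.
exits_not_growing "6 34 37 5 62" "4 32 35 3 60" && echo "no growth"
```

A non-growing count after ten scale-up and scale-down cycles is consistent with exit files being cleaned up, which is what this verification is looking for.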

Comment 8 errata-xmlrpc 2019-11-07 06:32:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3294