Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1506651

Summary: [CRI-O] CRI-O increasing memory usage over time
Product: OpenShift Container Platform
Reporter: Sebastian Jug <sejug>
Component: Containers
Assignee: Antonio Murdaca <amurdaca>
Status: CLOSED ERRATA
QA Contact: Walid A. <wabouham>
Severity: high
Priority: unspecified
Version: 3.7.0
CC: amurdaca, aos-bugs, bmcelvee, decarr, dma, jhonce, jokerman, mifiedle, mmccomas, sejug, wsun
Target Milestone: ---
Target Release: 3.9.0
Hardware: x86_64
OS: Linux
Whiteboard: aos-scalability-37
Type: Bug
Last Closed: 2018-03-28 14:08:55 UTC
Attachments:
- Graph of CRI-O memory usage over 7 runs
- CRI-O /metrics output on initial process start (before)
- CRI-O /metrics output steady state after many scale runs (after)
- Tar of 5 run heap dumps & metrics
- Graph of CRI-O memory usage over several runs (after fix)
- cri-o pidstat memory usage on 4th run deploying 500 pausepods

Description Sebastian Jug 2017-10-26 13:47:35 UTC
Created attachment 1343750
Graph of CRI-O memory usage over 7 runs

Description of problem:
There appears to be a memory/goroutine leak, as we see memory usage consistently increase over time.

Version-Release number of selected component (if applicable):
atomic-openshift-3.7.0-0.143.7.git.0.ee33a8f.el7.x86_64
crio version 1.8.0-dev, commit: "9ec09fa3ae89a5cda511a95bf51e3a8856a66632"
runc version 1.0.0-rc4+dev, commit: 74a17296470088de3805e138d3d87c62e613dfc4

How reproducible:
Every time


Steps to Reproduce:
1. Run a scale test involving a large number of pods.
2. Record memory usage over time (see the sampling sketch below the steps).
3. Repeat steps 1-2.
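
A minimal way to sample CRI-O RSS during such a run (the process name "crio" and the 60-second interval are assumptions, not part of the original test harness):

    # log CRI-O resident memory (KB) once a minute for later plotting
    while true; do
        echo "$(date +%FT%T) $(ps -o rss= -C crio)"
        sleep 60
    done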


Actual results:
CRI-O memory usage continues to grow over time and does not appear to stabilize.

Expected results:
CRI-O memory usage should remain the same or return to low usage after test completion.

Comment 1 Sebastian Jug 2017-10-26 13:48:39 UTC
Created attachment 1343751
CRI-O /metrics output on initial process start (before)

Comment 2 Sebastian Jug 2017-10-26 13:49:09 UTC
Created attachment 1343752
CRI-O /metrics output steady state after many scale runs (after)

Comment 3 Antonio Murdaca 2017-10-26 15:18:20 UTC
It would be great to get a goroutine stack dump for this (if you enabled --profile, http://localhost:6060/debug/pprof/goroutine?debug=3 should give you a goroutine dump). I'll keep trying to reproduce this myself on my host by starting/stopping a bunch of pods, but so far it is not reproducing.
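
For reference, a dump could be captured per run along these lines (this assumes the daemon was started with --profile so the pprof endpoint above is listening on localhost:6060; the output filename is arbitrary):

    # save a timestamped goroutine dump after each scale run
    curl -s 'http://localhost:6060/debug/pprof/goroutine?debug=3' \
        > "goroutine-dump-$(date +%s).txt"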

Comment 4 Sebastian Jug 2017-10-27 12:13:32 UTC
Created attachment 1344276
Tar of 5 run heap dumps & metrics

We can see progressively more IO-wait goroutines in the heap, which appear to come from the vendored cmux package.

Additionally, the lifetime of these goroutines progressively increases.

In the metrics, all memory-related values increase run after run, and the open file descriptor count appears to be closely tied to the goroutine count.
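
As a quick way to track this across runs, the relevant counters can be pulled from the saved /metrics dumps (the metrics-run*.txt filenames are hypothetical; go_goroutines, process_open_fds, and go_memstats_heap_inuse_bytes are the standard Prometheus Go client counters):

    # compare goroutine, fd, and heap counters across the attached runs
    grep -E '^(go_goroutines|process_open_fds|go_memstats_heap_inuse_bytes) ' metrics-run*.txt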

Comment 5 Antonio Murdaca 2017-10-27 12:18:49 UTC
Is this version of OpenShift using cAdvisor? That could explain the cmux overhead.

As for the open file descriptor count, I opened a PR to fix the systemd unit as well.

Comment 6 Antonio Murdaca 2017-10-27 12:21:22 UTC
I'll look at the attached files in the following hours.

Comment 7 Sebastian Jug 2017-10-27 12:25:10 UTC
Yes, we are using a version of OpenShift with the CRI-O-compatible cAdvisor.

Comment 8 Antonio Murdaca 2017-10-27 12:37:15 UTC
Alright, I think cmux isn't performing well then; we might need a patch. I'll debug this further. Do you have access to a cluster that exhibits this issue now?

Comment 9 Sebastian Jug 2017-10-27 13:20:21 UTC
Yes, I'll reach out to you privately and give you the details.

Comment 10 Antonio Murdaca 2017-10-27 22:01:45 UTC
Reproduced on my host with Kubernetes. We're leaking goroutines in our multiplexed gRPC/HTTP server. I already have a fix that I'm testing.

Comment 11 Antonio Murdaca 2017-10-27 22:14:40 UTC
The fix is here: https://github.com/kubernetes-incubator/cri-o/pull/1082

It also needs a cAdvisor patch to change where the info endpoint is found:

diff --git a/container/crio/client.go b/container/crio/client.go
index b588552..9dcafaa 100644
--- a/container/crio/client.go
+++ b/container/crio/client.go
@@ -24,7 +24,7 @@ import (
 )

 const (
-       CrioSocket            = "/var/run/crio.sock"
+       CrioSocket            = "/var/run/crio/info.sock"
        maxUnixSocketPathSize = len(syscall.RawSockaddrUnix{}.Path)
 )
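
A quick sanity check after applying both patches (the socket path is taken from the diff above; the rest is just an illustrative way to confirm a listener is present):

    # confirm the dedicated info socket exists and something is listening on it
    test -S /var/run/crio/info.sock && echo "info socket present"
    ss -xlp | grep -F /var/run/crio/info.sock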

Comment 13 Antonio Murdaca 2017-10-30 16:48:08 UTC
This is the PR that should actually fix this issue: https://github.com/kubernetes-incubator/cri-o/pull/1090

Sebastian is testing it out.

Comment 14 Sebastian Jug 2017-10-30 18:23:27 UTC
Created attachment 1345564
Graph of CRI-O memory usage over several runs (after fix)

It looks like this issue has been resolved (see the attached graph for verification).
Thanks!

Comment 16 Antonio Murdaca 2017-11-03 17:45:17 UTC
cri-o v1.0.2 has been released with the fix for this BZ

Comment 18 DeShuai Ma 2018-01-04 08:55:53 UTC
@Sebastian Jug, please help verify the bug. Thanks.

Comment 19 DeShuai Ma 2018-02-07 06:10:57 UTC
Per email communication, Sebastian Jug is working on reproducing and verifying the bug.

Comment 20 Walid A. 2018-02-14 17:32:49 UTC
Verified on a 4-node OCP 3.9 cluster with m4.xlarge instance types in AWS EC2: 1 master/etcd node, 2 compute nodes, and 1 infra node.

Executed 4 back-to-back Node Vertical tests deploying 500 pause pods on each compute node, while monitoring cri-o process memory usage with pidstat from pbench. cri-o RSS memory usage started around 74 MB and did not exceed 130-150 MB after 500 pods were deployed on the compute node. When the pause pods were deleted, cri-o process memory returned to the 74 MB range.
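
For reference, a minimal pidstat invocation along these lines captures per-process memory at a fixed interval (the 10-second interval and the pidof lookup are assumptions, not the exact pbench configuration):

    # report cri-o memory usage (RSS/VSZ) every 10 seconds
    pidstat -r -h -p "$(pidof crio)" 10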

# runc --version
runc version spec: 1.0.0

# oc describe node | grep Runtime
 Container Runtime Version:  cri-o://1.9.2

# openshift version
openshift v3.9.0-0.41.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

pidstat memory usage graph attached.

pbench data in next private comment.

Comment 22 Walid A. 2018-02-14 17:41:17 UTC
Created attachment 1396074
cri-o pidstat memory usage on 4th run deploying 500 pausepods

Comment 26 errata-xmlrpc 2018-03-28 14:08:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489