Created attachment 1343750 [details]
Graph of CRI-O memory usage over 7 runs

Description of problem:
There may be a memory/goroutine leak, as we see CRI-O memory usage consistently increase over time.

Version-Release number of selected component (if applicable):
atomic-openshift-3.7.0-0.143.7.git.0.ee33a8f.el7.x86_64
crio version 1.8.0-dev, commit: "9ec09fa3ae89a5cda511a95bf51e3a8856a66632"
runc version 1.0.0-rc4+dev, commit: 74a17296470088de3805e138d3d87c62e613dfc4

How reproducible:
Every time

Steps to Reproduce:
1. Run a scale test involving a large number of pods.
2. Record memory usage over time.
3. Repeat steps 1-2.

Actual results:
CRI-O memory usage continues to grow over time and does not appear to stabilize.

Expected results:
CRI-O memory usage should remain stable, or return to low usage after the test completes.
Created attachment 1343751 [details]
CRI-O /metrics output on initial process start (before)
Created attachment 1343752 [details]
CRI-O /metrics output at steady state after many scale runs (after)
It would be awesome to get a goroutine stack dump for this (if you enabled --profile, http://localhost:6060/debug/pprof/goroutine?debug=3 should give you a goroutine dump). I'll try to reproduce this myself on my host by starting/stopping a bunch of pods, but so far it isn't reproducing.
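For anyone collecting the dump: in a Go daemon, the pprof endpoint is typically exposed via net/http/pprof. A minimal sketch of that pattern (an assumption about what --profile does under the hood, not CRI-O's actual code):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // side effect: registers the /debug/pprof/* handlers
)

func main() {
	// Once this is listening, a full goroutine stack dump can be
	// fetched with e.g.:
	//   curl 'http://localhost:6060/debug/pprof/goroutine?debug=2'
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}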
Created attachment 1344276 [details]
Tar of 5 run heap dumps & metrics

We can see progressively more IO-wait goroutines in the dumps, potentially coming from the vendored cmux package, and the lifetime of those goroutines progressively increases. In the metrics, all memory-related values increase run after run, and the open file descriptor count appears to be closely tied to the goroutine count.
Is this version of OpenShift using cadvisor? That could explain the cmux overhead. As for the open file descriptor count, I opened a PR to fix the systemd unit as well.
I'll look at the attached files in the next few hours.
Yes, we are using a version of OpenShift with the CRI-O-compatible cadvisor.
Alright, I think cmux isn't really performing well then; we might need a patch. I'll debug this further. Do you have access to a cluster that exhibits this issue now?
Yes, I'll reach out to you privately and give you the details.
Reproduced on my host with Kubernetes. We're leaking goroutines in our multiplexed gRPC/HTTP server. I already have a fix that I'm testing.
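For context, the multiplexed server pattern in question looks roughly like this (a sketch assuming the vendored soheilhy/cmux package and the crio.sock path; not the exact CRI-O code). cmux sniffs each accepted connection to route it to gRPC or HTTP, so every connection ties up goroutines, and connections that are never torn down (e.g. a client repeatedly polling the info endpoint) let them accumulate:

package main

import (
	"net"
	"net/http"

	"github.com/soheilhy/cmux"
	"google.golang.org/grpc"
)

func main() {
	lis, err := net.Listen("unix", "/var/run/crio.sock")
	if err != nil {
		panic(err)
	}

	// One listener, two protocols: cmux matches each incoming
	// connection against the registered matchers.
	m := cmux.New(lis)
	grpcL := m.Match(cmux.HTTP2HeaderField("content-type", "application/grpc"))
	httpL := m.Match(cmux.HTTP1Fast())

	srv := grpc.NewServer()
	go srv.Serve(grpcL)                        // CRI runtime/image services
	go http.Serve(httpL, http.DefaultServeMux) // info, metrics, etc.

	// Serve blocks, doing per-connection work; if matched connections
	// are never closed, their goroutines pile up run after run.
	if err := m.Serve(); err != nil {
		panic(err)
	}
}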
The fix is here: https://github.com/kubernetes-incubator/cri-o/pull/1082

It needs a cadvisor patch as well, to change where to find the info endpoint:

diff --git a/container/crio/client.go b/container/crio/client.go
index b588552..9dcafaa 100644
--- a/container/crio/client.go
+++ b/container/crio/client.go
@@ -24,7 +24,7 @@ import (
 )

 const (
-	CrioSocket            = "/var/run/crio.sock"
+	CrioSocket            = "/var/run/crio/info.sock"
 	maxUnixSocketPathSize = len(syscall.RawSockaddrUnix{}.Path)
 )
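The direction of the fix, roughly (a sketch under the assumption that the HTTP info endpoint moves to its own Unix socket, matching the /var/run/crio/info.sock path in the diff above; not the actual PR code):

package main

import (
	"fmt"
	"net"
	"net/http"
)

func main() {
	// Dedicated listener for the HTTP endpoints: no cmux involved, so
	// nothing sniffs connections or holds per-connection goroutines
	// beyond the HTTP server's own lifecycle.
	infoL, err := net.Listen("unix", "/var/run/crio/info.sock")
	if err != nil {
		panic(err)
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/info", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, `{"storage_driver": "overlay"}`) // placeholder payload
	})

	http.Serve(infoL, mux)
}

The gRPC server then owns /var/run/crio.sock outright, which is why cadvisor has to be pointed at the new info socket.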
https://github.com/kubernetes-incubator/cri-o/pull/1082#issuecomment-340312000
This is the PR that should actually fix this issue: https://github.com/kubernetes-incubator/cri-o/pull/1090. Sebastian is testing it out.
Created attachment 1345564 [details]
Graph of CRI-O memory usage over several runs (after fix)

Looks like this issue has been resolved (see the attached graph for verification). Thanks!
cri-o v1.0.2 has been released with the fix for this BZ
@Sebastian Jug, please help verify the bug. Thanks!
Per email communication, Sebastian Jug is working on reproducing and verifying the bug.
Verified on a 4-node OCP 3.9 cluster (m4.xlarge instances in AWS EC2): 1 master/etcd, 2 compute nodes, 1 infra node. Executed 4 back-to-back Node Vertical tests, deploying 500 pause pods on each compute node and monitoring cri-o process memory usage with pidstat from pbench.

cri-o RSS memory usage started around 74 MB and did not exceed 130-150 MB after 500 pods were deployed on each compute node. When the pause pods were deleted, cri-o process memory returned to the 74 MB range.

# runc --version
runc version spec: 1.0.0

# oc describe node | grep Runtime
 Container Runtime Version:  cri-o://1.9.2

# openshift version
openshift v3.9.0-0.41.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

pidstat memory usage graph attached; pbench data in the next private comment.
Created attachment 1396074 [details]
cri-o pidstat memory usage on 4th run deploying 500 pause pods
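For reference, a tiny standalone sampler in the spirit of the pidstat collection above (a hypothetical helper, not part of pbench or pidstat; it reads the same RSS figure from /proc):

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
	"time"
)

// rssOf returns the VmRSS value for a PID from /proc/<pid>/status,
// e.g. "74512 kB".
func rssOf(pid string) (string, error) {
	f, err := os.Open("/proc/" + pid + "/status")
	if err != nil {
		return "", err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "VmRSS:") {
			return strings.TrimSpace(strings.TrimPrefix(sc.Text(), "VmRSS:")), nil
		}
	}
	return "", fmt.Errorf("VmRSS not found for pid %s", pid)
}

func main() {
	pid := os.Args[1] // e.g. the cri-o process PID
	for {
		if rss, err := rssOf(pid); err == nil {
			fmt.Printf("%s rss=%s\n", time.Now().Format(time.RFC3339), rss)
		}
		time.Sleep(10 * time.Second)
	}
}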
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489