Commit pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/4cb131929ba887aaea840c5357ad06e2fb750929
bug 1468987: kibana OOM

The V8 JavaScript engine used by nodejs splits its heap into four different spaces. Setting `max_old_space_size` to half of what the container has available leaves the other heap spaces some memory and should prevent the container from getting OOM killed. The issue originally occurred with kibana-proxy, but since both containers use nodejs, it is fixed here as well as a preventative measure.
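In practice the change amounts to deriving the Node.js old-space cap from the container's memory limit at startup. A minimal sketch of that approach in shell (the cgroup path, variable names, and Kibana entrypoint are illustrative assumptions, not the exact run.sh shipped in the image):

  # Read the container memory limit (cgroup v1 path assumed) and give half of it
  # to V8 as the old-space cap, leaving headroom for the other heap spaces.
  limit_bytes=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
  limit_mb=$((limit_bytes / 1024 / 1024))
  max_old_space_size=$((limit_mb / 2))
  # The Kibana entrypoint path below is illustrative; the flag is what matters.
  exec node --max-old-space-size=${max_old_space_size} ./src/cli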
See BZ https://bugzilla.redhat.com/show_bug.cgi?id=1465464 to track Kibana container restarts.
Verified with this command:

$ for i in {1..300}; do curl --fail --max-time 10 -H "Authorization: Bearer `oc whoami -t`" https://{kibana-route}/elasticsearch/ -sk > /dev/null; done

Ran it twice from the oc client side, and also tested twice against the {kibana-ops-route}:

$ for i in {1..300}; do curl --fail --max-time 10 -H "Authorization: Bearer `oc whoami -t`" https://{kibana-ops-route}/elasticsearch/ -sk > /dev/null; done

Checked that the kibana pods' status is Running, not OOMKilled:

# oc get po
NAME                              READY     STATUS      RESTARTS   AGE
logging-curator-1-fw4z5           1/1       Running     0          10h
logging-curator-ops-1-v87jy       1/1       Running     0          10h
logging-deployer-jnvkm            0/1       Completed   0          10h
logging-es-e5cm1fku-1-ydbmi       1/1       Running     0          10h
logging-es-ops-8tly0jj8-1-zemjy   1/1       Running     0          10h
logging-fluentd-5tayy             1/1       Running     0          10h
logging-kibana-1-tfc65            2/2       Running     6          10h
logging-kibana-ops-1-ctp3q        2/2       Running     5          10h

The kibana and kibana-ops pods have restarted several times because the kibana and kibana-proxy containers were OOMKilled; this may be related to https://bugzilla.redhat.com/show_bug.cgi?id=1465464.

Containers:
  kibana:
    ...........................
    Port:
    Limits:
      memory:   736Mi
    Requests:
      memory:   736Mi
    State:          Running
      Started:      Wed, 26 Jul 2017 21:28:06 -0400
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 26 Jul 2017 16:10:04 -0400
      Finished:     Wed, 26 Jul 2017 21:28:03 -0400
    Ready:          True
    Restart Count:  2
  kibana-proxy:
    ...........................
    Port:           3000/TCP
    Limits:
      memory:   96Mi
    Requests:
      memory:   96Mi
    State:          Running
      Started:      Wed, 26 Jul 2017 15:01:22 -0400
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 26 Jul 2017 12:10:45 -0400
      Finished:     Wed, 26 Jul 2017 15:01:20 -0400
    Ready:          True
    Restart Count:  3
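As a quicker check for OOM kills than reading the full `oc describe` output, something like the following lists the last termination reason per kibana container (the component label values are an assumption about how the logging deployer labels these pods; an empty reason means the container has not been OOM killed since it last started):

  $ oc get pods -l 'component in (kibana,kibana-ops)' \
      -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'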
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/711ba660dcfca9bb3739a5e45c4bc9a5f1e75cc1
bug 1468987: kibana_proxy OOM

We currently set the memory allocated to the kibana-proxy container to the same value as `max_old_space_size` for nodejs. But in V8 the heap consists of multiple spaces, and the old space holds only the memory that is ready to be GC'd; measuring the heap used by the kibana-proxy code shows that at least an additional 32MB is needed in the code space when `max_old_space_size` peaks. This sets the default memory limit to 256MB here and also changes the default calculation of `max_old_space_size` in the image repository to only half of what the container receives, so that some heap is left for the other `spaces`.

https://github.com/openshift/openshift-ansible/commit/099835cfd928e0bccf8c298d197ca06960bf954a
Merge pull request #4761 from wozniakjan/logging_kibana_oom

bug 1468987: kibana_proxy OOM
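For an already-deployed stack, the new 256Mi proxy limit can also be applied to the existing deployment configs without a full redeploy; a sketch using the dc names implied by the pod names above (the exact dc names and the availability of `oc set resources` on this cluster version are assumptions):

  $ oc set resources dc/logging-kibana -c kibana-proxy \
      --limits=memory=256Mi --requests=memory=256Mi
  $ oc set resources dc/logging-kibana-ops -c kibana-proxy \
      --limits=memory=256Mi --requests=memory=256Mi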
Don't we also need a fix to halve max-old-space-size for the kibana-proxy?
I do not see that fix in the version of the auth-proxy on which we depend, though it is in the upstream repo. Given that we are able to validate that the increased memory resolves the issue, I am hesitant to do anything else at this time.
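If it helps, one way to confirm which auth-proxy version a running image actually bundles might be something like the following (the module name openshift-auth-proxy, the container's working directory, and reuse of the pod name from the output above are assumptions):

  $ oc exec logging-kibana-1-tfc65 -c kibana-proxy -- \
      node -e 'console.log(require("openshift-auth-proxy/package.json").version)'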
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1828