Description of problem:
Kibana does not pass a limit to Node.js to cap its heap size, so if you set a memory limit around Kibana it will keep growing its memory usage and eventually get OOM killed by OpenShift/cgroups.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging 3.4 or later
2. Set containers.resources.limits.memory on the kibana pod (not kibana-proxy)
3. Wait a couple of hours and notice it gets OOM killed

Actual results:
Kibana gets OOM killed

Expected results:
Kibana does not get OOM killed

Additional info:
In my research I came across this GitHub issue that covers the details of what we are hitting: https://github.com/elastic/kibana/issues/5170#issuecomment-157655647
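For reference, step 2 can be done with a command like the one below. This is only an example: the dc and container names (logging-kibana, kibana) are assumed from a default logging deployment, and the 512Mi value is arbitrary.

# oc set resources dc/logging-kibana -c kibana --limits=memory=512Mi

If oc set resources is not available in your oc version, editing the dc directly with oc edit works as well.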
One way to fix this is to add the ability to set NODE_OPTIONS in the OCP Kibana dc. You can already do this with the origin Kibana image, but the OCP and origin Kibana run.sh scripts are out of sync. Another way would be to add explicit tuning parameters for --max-old-space-size and other Kibana/Node.js options we think would be useful to expose to users. How else might we solve this?
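To illustrate the kind of logic involved, here is a rough sketch of how a run.sh could derive --max-old-space-size from the container memory limit. This is only a sketch, not the actual image script; the KIBANA_MEMORY_LIMIT variable name is hypothetical and assumes the limit is exposed in bytes (for example via the downward API resourceFieldRef on limits.memory that shows up in the dc output below).

#!/bin/bash
# Sketch only: respect an explicitly set NODE_OPTIONS, otherwise compute
# --max-old-space-size (in MB) from a hypothetical KIBANA_MEMORY_LIMIT (bytes).
if [ -z "${NODE_OPTIONS}" ] && [ -n "${KIBANA_MEMORY_LIMIT}" ]; then
    heap_mb=$(( KIBANA_MEMORY_LIMIT / 1024 / 1024 ))
    NODE_OPTIONS="--max-old-space-size=${heap_mb}"
fi
echo "Using NODE_OPTIONS: '${NODE_OPTIONS}' Memory setting is in MB"
export NODE_OPTIONS
exec kibana   # launch Kibana; the actual binary path depends on the image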
@Rich,
Verified with the latest images; checked the kibana dc, pod info, and pod logs. The memory limit for the kibana container is 736Mi and for kibana-proxy it is 96Mi. I am curious about these values: are we using them as the defaults? For more info please see the attached file.
*******************************************************************************
# oc get dc ${KIBANA_DC} -o yaml | grep resources -A 5 -B 5
  replicas: 1
  selector:
    component: kibana
    provider: openshift
  strategy:
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
--
            divisor: "0"
            resource: limits.memory
        image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-kibana:3.4.1
        imagePullPolicy: Always
        name: kibana
        resources:
          limits:
            memory: 736Mi
        terminationMessagePath: /dev/termination-log
        volumeMounts:
        - mountPath: /etc/kibana/keys
--
        name: kibana-proxy
        ports:
        - containerPort: 3000
          name: oaproxy
          protocol: TCP
        resources:
          limits:
            memory: 96Mi
        terminationMessagePath: /dev/termination-log

# oc get pod ${KIBANA_POD} -o yaml | grep resources -A 5 -B 5
            divisor: "0"
            resource: limits.memory
        image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-kibana:3.4.1
        imagePullPolicy: Always
        name: kibana
        resources:
          limits:
            memory: 736Mi
          requests:
            memory: 736Mi
        securityContext:
--
        name: kibana-proxy
        ports:
        - containerPort: 3000
          name: oaproxy
          protocol: TCP
        resources:
          limits:
            memory: 96Mi
          requests:
            memory: 96Mi
        securityContext:
*******************************************************************************
# docker images | grep logging
openshift3/logging-kibana          3.4.1   8030fdf2193c   2 days ago   338.8 MB
openshift3/logging-deployer        3.4.1   8a54858599c2   2 days ago   857.5 MB
openshift3/logging-auth-proxy      3.4.1   f2750505bbf8   3 days ago   215 MB
openshift3/logging-elasticsearch   3.4.1   35b49fb0d73f   4 days ago   399.6 MB
openshift3/logging-fluentd         3.4.1   284080ecaf28   9 days ago   232.7 MB
openshift3/logging-curator         3.4.1   b8da2d97e305   9 days ago   244.5 MB
Created attachment 1275542 [details] kibana dc, pods info
The kibana pod log also shows max-old-space-size is 736 (the setting is in MB):

# oc logs logging-kibana-1-g5ms9 -c kibana
Using NODE_OPTIONS: '--max-old-space-size=736' Memory setting is in MB
> Verified with the latest images; checked the kibana dc, pod info, and pod logs. The memory limit for the kibana container is 736Mi and for kibana-proxy it is 96Mi. I am curious about these values: are we using them as the defaults?

Yes, these are the default values. @jcantrill can provide further explanation if needed.
Set it to VERIFIED according to Comment 3 and Comment 5.
@jcantrill If you have time, could you please share why the default value is 736Mi for the kibana container and 96Mi for the kibana-proxy container? We usually set such values to a multiple of 128Mi.
@Junqi Zhao

These values come from OPs. Since we have limited memory on the infra nodes, it is a jigsaw puzzle getting everything running on them. Those values are 32Mi less than what OPs have set as the resource limits for Kibana.
https://github.com/openshift/openshift-tools/blob/prod/ansible/roles/openshift_logging/tasks/main.yml#L319
(In reply to Wesley Hearn from comment #9)
> @Junqi Zhao
>
> These values come from OPs. Since we have limited memory on the infra nodes,
> it is a jigsaw puzzle getting everything running on them. Those values are
> 32Mi less than what OPs have set as the resource limits for Kibana.
> https://github.com/openshift/openshift-tools/blob/prod/ansible/roles/openshift_logging/tasks/main.yml#L319

Thanks for the info; I have one more question. From the following lines:
**********************************************************************
content:
  # kibana
  spec.template.spec.containers[0].resources.limits.memory: "768M"
  spec.template.spec.containers[0].resources.requests.memory: "96M"
  # kibana-proxy
  spec.template.spec.containers[1].resources.limits.memory: "128M"
  spec.template.spec.containers[1].resources.requests.memory: "32M"
**********************************************************************
I understand the memory limit for the kibana container to be 768M and for kibana-proxy 128M, with the memory request for kibana being 96M and for kibana-proxy 32M. So I would expect to see the following in the kibana dc:
#########################################################################
        name: kibana
        resources:
          limits:
            memory: 768Mi
          requests:
            memory: 96Mi
--
        name: kibana-proxy
        resources:
          limits:
            memory: 128Mi
          requests:
            memory: 32Mi
#########################################################################
But I see the following in the kibana dc instead; the memory limits are not 768Mi and 128Mi, and resources.requests.memory is missing:
        name: kibana
        resources:
          limits:
            memory: 736Mi
--
        name: kibana-proxy
        resources:
          limits:
            memory: 96Mi
#########################################################################
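(For what it's worth: if, per Comment 9, the image defaults are the OPs limits minus 32Mi of headroom, the arithmetic matches the limits seen in the dc: 768 - 32 = 736 for kibana and 128 - 32 = 96 for kibana-proxy, treating M and Mi interchangeably here. This is only an inference from Comment 9, and it does not explain the missing requests.)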
The cluster you tested on is not an OPs cluster, as ours cannot access brew-pulp-docker01, so I cannot speak to anything about that cluster or how it was set up.
@Wesley Hearn, thanks a lot for your help.

@Jeff, could you help check Comment 10? Maybe my understanding is wrong. Thanks.
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/ba4c43fe61ca40b347a9f75891ba67ab36465871
bug 1441369. Kibana memory limits
bug 1439451. Kibana crash

(cherry picked from commit 66315ebbfcfda72d6f501c441359d92ec71af7d2)
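For anyone adjusting these values at deploy time, the limits can typically be overridden through openshift-ansible inventory variables. The variable names below are assumed from the openshift_logging role and may differ by release, so check the role defaults for your version; the playbook path is a placeholder.

# Sketch only: assumed variable names, placeholder inventory and playbook paths.
ansible-playbook -i <inventory> <openshift-logging playbook> \
  -e openshift_logging_kibana_memory_limit=1Gi \
  -e openshift_logging_kibana_proxy_memory_limit=256Mi

Alternatively, the limits can be changed on a running dc with oc set resources, as in the example under the problem description.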
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1235