Bug 1370115

Summary: ElasticSearch should auto tune its memory usage based on the container limit
Product: OpenShift Container Platform Reporter: Dan McPherson <dmcphers>
Component: Logging    Assignee: Jeff Cantrill <jcantril>
Status: CLOSED ERRATA QA Contact: chunchen <chunchen>
Severity: high Docs Contact:
Priority: high    
Version: 3.3.0    CC: aos-bugs, jcantril, lvlcek, penli, rmeggins, tdawson, wsun, xiazhao
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:    Doc Type: Enhancement
Doc Text:
Feature: Auto-tune Elasticsearch heap memory usage based on the container limit.
Reason: Elasticsearch recommends hard limits for proper usage, and these limits may significantly exceed what is available to the container. Elasticsearch should limit itself from the outset.
Result: The container run script evaluates the available memory and sets the min and max heap size.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-09-27 09:46:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Dan McPherson 2016-08-25 11:04:09 UTC
You can accomplish this in a couple of ways:

1) Read from the cgroup value directly

CONTAINER_MEMORY_IN_BYTES=`cat /sys/fs/cgroup/memory/memory.limit_in_bytes`


or 2) Use the downward api

env:
  - name: MEM_LIMIT
    valueFrom:
      resourceFieldRef:
        resource: limits.memory

A typical approach is to tune the heap size to 60-70% of the limit value to allow room for native memory from the JVM.
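The calculation described above can be sketched as a small shell helper. This is a hypothetical sketch; the function name heap_mb and the 60% ratio are illustrative choices, not the actual run.sh code:

```shell
# Hypothetical helper: compute a heap size (in MB) as 60% of the
# container memory limit (given in bytes), leaving headroom for
# JVM native memory.
heap_mb() {
    echo $(( $1 * 60 / 100 / 1024 / 1024 ))
}

# Inside the container, the limit would come from the cgroup file or the
# downward-API environment variable shown above, e.g.:
#   CONTAINER_MEMORY_IN_BYTES=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
heap_mb 1073741824    # 1 GiB limit -> 614 MB heap
```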

Comment 1 Lukas Vlcek 2016-08-25 11:27:58 UTC
Elasticsearch recommendation for heap sizing can be found here: https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html#_give_less_than_half_your_memory_to_lucene

(i.e. giving JVM more than 50% is not optimal)

On top of that, I do not know what range of limits.memory values occurs in practice, but it would make sense not to go below 1GB or above 32GB. At the very least, it would be good to log a warning if the value falls outside these limits.
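The suggested bounds could be enforced with something like the following. This is an illustrative sketch under the 1GB/32GB assumption above; the variable names and warning messages are hypothetical:

```shell
# Hypothetical clamp: keep the computed heap (in bytes) inside the
# suggested 1GB..32GB range, logging a warning to stderr when the
# input falls outside it.
MIN_HEAP_BYTES=$(( 1024 * 1024 * 1024 ))         # 1 GiB
MAX_HEAP_BYTES=$(( 32 * 1024 * 1024 * 1024 ))    # 32 GiB

clamp_heap() {
    heap=$1
    if [ "$heap" -lt "$MIN_HEAP_BYTES" ]; then
        echo "warning: heap ${heap} is below the recommended 1GB minimum" >&2
        heap=$MIN_HEAP_BYTES
    elif [ "$heap" -gt "$MAX_HEAP_BYTES" ]; then
        echo "warning: heap ${heap} exceeds the recommended 32GB maximum" >&2
        heap=$MAX_HEAP_BYTES
    fi
    echo "$heap"
}
```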

Comment 2 Dan McPherson 2016-08-25 11:47:38 UTC
@Lukas

There isn't any issue tuning to a recommended value.  We just need to make sure the container constrains itself within its specified limits.

Comment 3 Rich Megginson 2016-08-25 13:20:32 UTC
(In reply to Dan McPherson from comment #0)
> You can accomplish this in a couple of ways:
> 
> 1) Read from the cgroup value directly
> 
> CONTAINER_MEMORY_IN_BYTES=`cat /sys/fs/cgroup/memory/memory.limit_in_bytes`
> 

I suppose we could do this in the elasticsearch container run.sh?

> 
> or 2) Use the downward api
> 
> env:
>   - name: MEM_LIMIT
>     valueFrom:
>       resourceFieldRef:
>         resource: limits.memory
> 
> A typical approach is to tune the heap size to 60-70% of the limit value to
> allow room for native memory from the JVM.

Comment 5 openshift-github-bot 2016-08-25 22:56:44 UTC
Commit pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/efb942bd74622cd07c0be63344f4225a45405e32
adjust heap memory based on that allowed by container or recommended max

fixes bug 1370115 https://bugzilla.redhat.com/show_bug.cgi?id=1370115 by:
* comparing the requested memory to the max recommended by Elastic and
  that which is available to the container
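The comparison the commit message describes amounts to taking the smallest of three values; a minimal sketch follows (the helper name min3, the example values, and the 32GB constant are assumptions for illustration, not the commit's actual code):

```shell
# Hypothetical sketch: the effective heap is the smallest of the requested
# size, the container's memory limit, and Elastic's recommended maximum.
min3() {
    m=$1
    [ "$2" -lt "$m" ] && m=$2
    [ "$3" -lt "$m" ] && m=$3
    echo "$m"
}

REQUESTED=$(( 64 * 1024 * 1024 * 1024 ))    # e.g. user requested 64 GiB
CONTAINER=$(( 8 * 1024 * 1024 * 1024 ))     # container limit: 8 GiB
ES_MAX=$(( 32 * 1024 * 1024 * 1024 ))       # recommended maximum: 32 GiB

min3 "$REQUESTED" "$CONTAINER" "$ES_MAX"    # the 8 GiB container limit wins
```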

Comment 6 Rich Megginson 2016-08-25 23:02:31 UTC
To ssh://rmeggins.redhat.com/rpms/logging-elasticsearch-docker
   0393033..311801e  rhaos-3.3-rhel-7 -> rhaos-3.3-rhel-7

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11661661

repositories = brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:rhaos-3.3-rhel-7-docker-candidate-20160825225109, brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:3.3.0-5, brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:3.3.0, brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:latest
osbs_build_id = logging-elasticsearch-docker-unknown-8

Comment 7 Peng Li 2016-08-26 09:07:20 UTC
The image is not synced yet; will verify once it's ready.

Comment 8 Peng Li 2016-08-29 10:07:12 UTC
Verified; test environment listed below.

Comment 11 Xia Zhao 2016-08-31 02:47:31 UTC
Saw these lines from a recent es pod (when --from-literal es-instance-ram=1G is set on the logging deployer configmap):

$ oc logs -f logging-es-uc5hyl3m-1-ff3s2
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Des.path.home=/usr/share/java/elasticsearch -Des.config=/usr/share/java/elasticsearch/config/elasticsearch.yml -Xms128M -Xmx512m'

Comment 13 errata-xmlrpc 2016-09-27 09:46:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933