Bug 1488509 - OCP 3.6: openshift_logging_es_cpu_limit value not honored
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Jeff Cantrill
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-05 14:38 UTC by Mike Fiedler
Modified: 2017-10-16 18:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-16 18:55:16 UTC
Target Upstream Version:
Embargoed:


Description Mike Fiedler 2017-09-05 14:38:23 UTC
Description of problem:

Deployed logging with the byo/openshift-cluster/openshift-logging.yml playbook using an inventory which specified

openshift_logging_es_cpu_limit=4000m
openshift_logging_es_memory_limit=9Gi

After the install, the elasticsearch deployment config contained:

        resources:
          limits:
            cpu: "1"
            memory: 9Gi

The memory limit was honored and the cpu limit was not.

Version-Release number of selected component (if applicable): Logging v3.6.173.0.27; openshift-ansible is 3.6.173.0.27-1.git.0.be1701e.el7


How reproducible:  Always


Steps to Reproduce:
1.  Deploy logging with the inventory below, adjusting hostnames as needed
2.  oc get dc to list the elasticsearch deployment configs
3.  oc get dc <dc-name> -o yaml

Actual results:

          limits:
            cpu: "1"
            memory: 9Gi

Expected results:

          limits:
            cpu: "4"
            memory: 9Gi
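
For reference, 4000m and 4 denote the same amount of CPU in Kubernetes quantity notation, which is why the expected value is shown as "4" rather than "4000m". A minimal Python sketch of that comparison follows. The helper name cpu_to_millicores is hypothetical, and it only handles plain and milli-suffixed values, not the full quantity grammar.

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "4000m", "4" or "0.5" to millicores.

    Hypothetical helper for illustration; not part of oc or openshift-ansible.
    """
    q = quantity.strip()
    if q.endswith("m"):
        # Already expressed in millicores, e.g. "4000m".
        return int(q[:-1])
    # Whole or fractional cores, e.g. "4" or "0.5".
    return int(Decimal(q) * 1000)


requested = cpu_to_millicores("4000m")  # value set in the inventory
expected = cpu_to_millicores("4")       # what the DC would show if honored
actual = cpu_to_millicores("1")         # what the DC actually contained

print(requested == expected)  # True
print(requested == actual)    # False
```

The second comparison is the mismatch reported here: the deployment config carries 1000 millicores even though 4000 were requested.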

Additional info:

[oo_first_master]
ip-172-31-10-130

[oo_first_master:vars]
openshift_deployment_type=openshift-enterprise
openshift_release=v3.6.0

openshift_logging_install_logging=true
openshift_logging_master_url=https://ec2-54-191-194-206.us-west-2.compute.amazonaws.com:8443
openshift_logging_master_public_url=https://ec2-54-191-194-206.us-west-2.compute.amazonaws.com:8443
openshift_logging_kibana_hostname=kibana.0901-t58.qe.rhcloud.com
openshift_logging_namespace=logging
openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=v3.6.173.0.27
openshift_logging_es_cluster_size=3
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=40Gi
openshift_logging_fluentd_use_journal=true
openshift_logging_use_mux=true
openshift_logging_mux_client_mode=maximal
openshift_logging_use_ops=false

openshift_logging_es_cpu_limit=4000m
openshift_logging_fluentd_cpu_limit=500m
openshift_logging_mux_cpu_limit=1000m
openshift_logging_kibana_cpu_limit=200m
openshift_logging_kibana_proxy_cpu_limit=100m
openshift_logging_es_memory_limit=9Gi
openshift_logging_fluentd_memory_limit=512Mi
openshift_logging_mux_memory_limit=2Gi
openshift_logging_kibana_memory_limit=1Gi
openshift_logging_kibana_proxy_memory_limit=256Mi

openshift_logging_mux_file_buffer_storage_type=pvc
openshift_logging_mux_file_buffer_pvc_name=logging-mux-pvc
openshift_logging_mux_file_buffer_pvc_size=4Gi





Comment 1 Noriko Hosoi 2017-09-05 20:47:34 UTC
(In reply to Mike Fiedler from comment #0)
> 
> openshift_logging_es_cpu_limit=4000m
> openshift_logging_es_memory_limit=9Gi

Could you try these parameters instead?

openshift_logging_elasticsearch_cpu_limit
openshift_logging_elasticsearch_memory_limit

Comment 2 Mike Fiedler 2017-09-05 20:51:38 UTC
I will try those on the next deployment.  Putting this link here for reference in case this needs to be a doc update:

https://docs.openshift.com/container-platform/3.6/install_config/aggregate_logging.html#aggregate-logging-ansible-variables

Comment 3 Noriko Hosoi 2017-09-05 21:15:50 UTC
Sorry, Mike.  You are right (please ignore my comment #1).

The Elasticsearch CPU limit cannot be overridden; it is hard-coded in the role defaults.  Should the doc be updated to say so?

(openshift-ansible/roles/openshift_logging_elasticsearch/defaults/main.yml)
openshift_logging_elasticsearch_cpu_limit: 1000m
openshift_logging_elasticsearch_memory_limit: "{{ openshift_logging_es_memory_limit | default('1Gi') }}"

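The effect of the two defaults quoted in this comment can be modelled in a few lines of Python. This is only a sketch of the lookup behaviour, not openshift-ansible code: the function resolve_es_limits is hypothetical, and dict.get stands in for the Jinja2 default filter.

```python
def resolve_es_limits(inventory: dict) -> dict:
    """Model how the quoted role defaults resolve against inventory variables.

    Hypothetical illustration only; the real resolution is done by Ansible.
    """
    return {
        # Hard-coded literal: the inventory value is never consulted.
        "openshift_logging_elasticsearch_cpu_limit": "1000m",
        # Mirrors "{{ openshift_logging_es_memory_limit | default('1Gi') }}".
        "openshift_logging_elasticsearch_memory_limit": inventory.get(
            "openshift_logging_es_memory_limit", "1Gi"
        ),
    }


# The two values from the inventory in the description.
inventory = {
    "openshift_logging_es_cpu_limit": "4000m",
    "openshift_logging_es_memory_limit": "9Gi",
}

limits = resolve_es_limits(inventory)
print(limits["openshift_logging_elasticsearch_cpu_limit"])     # 1000m
print(limits["openshift_logging_elasticsearch_memory_limit"])  # 9Gi
```

The memory limit follows the inventory because its default reads the openshift_logging_es_* variable, while the CPU limit stays at 1000m regardless of what the inventory sets.
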
Comment 4 Mike Fiedler 2017-09-05 23:16:34 UTC
I'd prefer it to be configurable but will defer to others who might know a reason to hard-code it.  According to the ES docs [1], the thread pool is sized dynamically based on the available processors, and I've seen some ingestion improvements from raising the limit above 1.  I hope to have it quantified soon.

[1] - https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html

Comment 5 Peter Portante 2017-09-09 19:47:00 UTC
Why do we want to cap the CPU limit?  And what are we going to put in "requests"?  We should allow the user to specify the minimum CPU that ES should get, and support capping it, but by default I would recommend we don't cap it.

Comment 6 Jeff Cantrill 2017-10-16 18:55:16 UTC
Closing this issue as we intend to remove CPU limits to allow the cluster to provide infra components with as much CPU as they need: https://github.com/openshift/openshift-ansible/pull/5748
