Description of problem:

Deployed logging with the byo/openshift-cluster/openshift-logging.yml playbook using an inventory which specified:

  openshift_logging_es_cpu_limit=4000m
  openshift_logging_es_memory_limit=9Gi

After the install, the elasticsearch deployment config contained:

  resources:
    limits:
      cpu: "1"
      memory: 9Gi

The memory limit was honored; the cpu limit was not.

Version-Release number of selected component (if applicable):
Logging v3.6.173.0.27
openshift-ansible-3.6.173.0.27-1.git.0.be1701e.el7

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging with the inventory below, adjusting hostnames as needed
2. oc get dc to list the elasticsearch deployment configs
3. oc get dc <dc-name> -o yaml

Actual results:

  limits:
    cpu: "1"
    memory: 9Gi

Expected results:

  limits:
    cpu: "4"
    memory: 9Gi

Additional info:

[oo_first_master]
ip-172-31-10-130

[oo_first_master:vars]
openshift_deployment_type=openshift-enterprise
openshift_release=v3.6.0
openshift_logging_install_logging=true
openshift_logging_master_url=https://ec2-54-191-194-206.us-west-2.compute.amazonaws.com:8443
openshift_logging_master_public_url=https://ec2-54-191-194-206.us-west-2.compute.amazonaws.com:8443
openshift_logging_kibana_hostname=kibana.0901-t58.qe.rhcloud.com
openshift_logging_namespace=logging
openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=v3.6.173.0.27
openshift_logging_es_cluster_size=3
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=40Gi
openshift_logging_fluentd_use_journal=true
openshift_logging_use_mux=true
openshift_logging_mux_client_mode=maximal
openshift_logging_use_ops=false
openshift_logging_es_cpu_limit=4000m
openshift_logging_fluentd_cpu_limit=500m
openshift_logging_mux_cpu_limit=1000m
openshift_logging_kibana_cpu_limit=200m
openshift_logging_kibana_proxy_cpu_limit=100m
openshift_logging_es_memory_limit=9Gi
openshift_logging_fluentd_memory_limit=512Mi
openshift_logging_mux_memory_limit=2Gi
openshift_logging_kibana_memory_limit=1Gi
openshift_logging_kibana_proxy_memory_limit=256Mi
openshift_logging_mux_file_buffer_storage_type=pvc
openshift_logging_mux_file_buffer_pvc_name=logging-mux-pvc
openshift_logging_mux_file_buffer_pvc_size=4Gi
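For quick verification of just the rendered limits, a sketch (the dc name below is an example; actual names vary per deployment):

  # List the elasticsearch deployment configs in the logging project
  oc get dc -n logging

  # Print only the container resource limits for one of them
  oc get dc logging-es-data-master-abc123 -n logging \
    -o jsonpath='{.spec.template.spec.containers[0].resources.limits}'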
(In reply to Mike Fiedler from comment #0)
> openshift_logging_es_cpu_limit=4000m
> openshift_logging_es_memory_limit=9Gi

Could you try these parameters instead?

openshift_logging_elasticsearch_cpu_limit
openshift_logging_elasticsearch_memory_limit
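For example, keeping the same values as the original inventory but with the newer variable names (a sketch, untested here):

  openshift_logging_elasticsearch_cpu_limit=4000m
  openshift_logging_elasticsearch_memory_limit=9Gi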
I will try those on the next deployment. Putting this link here for reference in case this needs to be a doc update: https://docs.openshift.com/container-platform/3.6/install_config/aggregate_logging.html#aggregate-logging-ansible-variables
Sorry, Mike. You are right (and please ignore my comment #c1). As the role defaults stand, the Elasticsearch CPU limit cannot be overridden: only the memory limit falls back to the old openshift_logging_es_* variable, while the CPU limit is hard-coded. Should the docs be updated to reflect that?

From openshift-ansible/roles/openshift_logging_elasticsearch/defaults/main.yml:

  openshift_logging_elasticsearch_cpu_limit: 1000m
  openshift_logging_elasticsearch_memory_limit: "{{ openshift_logging_es_memory_limit | default('1Gi') }}"
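If the intent were parity with the memory handling, the CPU default could fall back the same way. A sketch of a possible change to defaults/main.yml, not a merged fix:

  openshift_logging_elasticsearch_cpu_limit: "{{ openshift_logging_es_cpu_limit | default('1000m') }}"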
I'd prefer it to be configurable but will defer to others who might know a reason to hard-code it. According to the ES docs [1], the thread pool sizes are derived from the number of available processors, and I've seen some ingestion improvements from raising the limit above 1. I hope to have that quantified soon.

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html
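One way to check how many processors Elasticsearch actually sees (and therefore sizes its thread pools from) is the nodes OS stats endpoint. A sketch against a logging ES pod; the pod name is an example, and the cert paths are the ones the logging images normally mount:

  oc exec -n logging logging-es-data-master-abc123 -- curl -s \
    --cacert /etc/elasticsearch/secret/admin-ca \
    --cert /etc/elasticsearch/secret/admin-cert \
    --key /etc/elasticsearch/secret/admin-key \
    'https://localhost:9200/_nodes/os?pretty'
  # look for "available_processors" in the output

Note that JVMs of this era may report the host's CPU count rather than the cgroup cpu limit, so the reported value can exceed what the pod is actually allowed to use.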
Why do we want to cap the CPU limit? And what are we going to put in "requests"? We should allow the user to specify the minimum CPU that ES should get, and support capping it, but by default I would recommend we don't cap it.
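Concretely, that would mean a pod spec along these lines (values are illustrative, not a proposed default):

  resources:
    requests:
      cpu: "1"        # minimum CPU the scheduler must reserve for ES
      memory: 9Gi
    limits:
      memory: 9Gi     # memory stays capped; cpu is left unbounded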
Closing this issue, as we intend to remove the cpu limits so the cluster can give infra components as much cpu as they need: https://github.com/openshift/openshift-ansible/pull/5748