Sharding the .operations* indices (and the .orphaned* indices) with a primary shard count that matches the number of nodes in the cluster spreads disk usage and load evenly across cluster members, which helps sustain high operational logging rates. It has little effect when the logging rate is low.
See the following gist for an example: https://gist.github.com/portante/f8cfecad1c6b69cdc1736ce464501d6f
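For illustration only (the linked gist has the authoritative version), here is a minimal sketch of that kind of template override, assuming a hypothetical 3-node cluster and a hypothetical template name, run from inside an Elasticsearch pod; the higher "order" lets it win over the default order-5 templates for the settings it defines:

  sh-4.2$ curl -s -XPUT --cacert /etc/elasticsearch/secret/admin-ca \
      --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key \
      https://localhost:9200/_template/operations.shards.example -d '
  {
    "template": ".operations.*",
    "order": 10,
    "settings": {
      "index.number_of_shards": 3,
      "index.number_of_replicas": 0
    }
  }'

The new shard count only applies to indices created after the template is registered; existing indices keep their original shard count.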
https://github.com/openshift/origin-aggregated-logging/pull/1019
Jeff, the fix is in logging-elasticsearch/images/v3.7.44-3 and the default shard count is now set to 3, but the shard and replica counts still cannot be updated via the environment variables: the changes are written to the template file, but the file is not loaded into Elasticsearch.

1. Set the environment variable:

   oc set env -c elasticsearch dc/logging-es-data-master-e1rbo8ai PRIMARY_SHARDS=1

2. Check the template JSON file:

   sh-4.2$ cat common.settings.operations.orphaned.json
   {
     "order": 5,
     "settings": {
       "index.refresh_interval": "5s",
       "index.number_of_replicas": 0,
       "index.number_of_shards": 1,
       "index.translog.flush_threshold_size": "256mb",
       "index.unassigned.node_left.delayed_timeout": "2m"
     },
     "template": ".orphaned*"
   }

3. Check the template in Elasticsearch:

   sh-4.2$ curl -s -XGET --cacert /etc/elasticsearch/secret/admin-ca --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key https://localhost:9200/_template/common.settings.operations.orphaned.json?pretty
   {
     "common.settings.operations.orphaned.json" : {
       "order" : 5,
       "template" : ".orphaned*",
       "settings" : {
         "index" : {
           "refresh_interval" : "5s",
           "unassigned" : {
             "node_left" : {
               "delayed_timeout" : "2m"
             }
           },
           "number_of_shards" : "3",
           "translog" : {
             "flush_threshold_size" : "256mb"
           },
           "number_of_replicas" : "0"
         }
       },
       "mappings" : { },
       "aliases" : { }
     }
   }
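A sketch of what could be tried next: the template files appear to be (re)registered in Elasticsearch at pod startup, so the env change may only take effect after a new deployment (DC name taken from the comment above; adjust for your cluster):

  $ oc set env -c elasticsearch dc/logging-es-data-master-e1rbo8ai PRIMARY_SHARDS=1
  $ oc rollout latest dc/logging-es-data-master-e1rbo8ai
  # once the new pod is running, re-check the template from inside it
  sh-4.2$ curl -s -XGET --cacert /etc/elasticsearch/secret/admin-ca \
      --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key \
      https://localhost:9200/_template/common.settings.operations.orphaned.json?pretty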
Deployed locally with image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:v3.7.44-3

Modified the DC to have:

  env:
  - name: PRIMARY_SHARDS
    value: "2"
  - name: REPLICA_SHARDS
    value: "1"

Rolled out the latest deployment and rsh'd into the pod:

  $ QUERY=_template/common.settings.operations.orphaned.json?pretty es_util
  {
    "common.settings.operations.orphaned.json" : {
      "order" : 5,
      "template" : ".orphaned*",
      "settings" : {
        "index" : {
          "refresh_interval" : "5s",
          "unassigned" : {
            "node_left" : {
              "delayed_timeout" : "2m"
            }
          },
          "number_of_shards" : "2",
          "translog" : {
            "flush_threshold_size" : "256mb"
          },
          "number_of_replicas" : "1"
        }
      },
      "mappings" : { },
      "aliases" : { }
    }
  }

Looks like the settings are applied as expected.
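As a follow-up check (a sketch only; the index name below is hypothetical and depends on when an orphaned index is actually created), the same es_util helper can confirm that newly created indices pick up the template values:

  $ QUERY=_cat/indices?v es_util                          # list indices with their pri/rep counts
  $ QUERY=.orphaned.2018.06.01/_settings?pretty es_util   # hypothetical index name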
@jeff, Are there persistent volumes attached to your Elasticsearch pods?
There are not. Are you suggesting adding storage alters the outcome?
@jeff, No, I am trying to understand why we get different results. Could persistent storage be the cause?
I don't see how persistent storage would make a difference, unless Elasticsearch refuses to overwrite a template that is already present. I would expect that to manifest as an error at startup.
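One way to narrow this down (just a sketch, reusing the commands from the earlier comment) is to compare the template file that the env vars rewrite with what Elasticsearch is actually serving, from inside the pod:

  sh-4.2$ cat common.settings.operations.orphaned.json
  sh-4.2$ curl -s -XGET --cacert /etc/elasticsearch/secret/admin-ca \
      --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key \
      https://localhost:9200/_template/common.settings.operations.orphaned.json?pretty
  # if the file shows the new shard count but Elasticsearch still shows the old one,
  # the template was not re-registered after the change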
*** Bug 1582225 has been marked as a duplicate of this bug. ***
The .operations.* and .orphaned.* indices can now be changed via the REPLICA_SHARDS and PRIMARY_SHARDS environment variables when using openshift3/logging-elasticsearch:v3.9.40-1. Note that:
1) Use the same values on every Elasticsearch DeploymentConfig.
2) Run oc rollout latest $ES_Deployment_configure to apply the changes.
3) To avoid conflicts, do not run oc rollout around midnight, when the new daily indices are being created.
4) The environment variables may be lost when the playbooks are re-run.
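A sketch of applying points 1) and 2) together, assuming the Elasticsearch DeploymentConfigs carry the component=es label used by the logging deployer and live in the logging project (adjust the selector, namespace, and values for your cluster):

  $ for dc in $(oc get dc -l component=es -o name -n logging); do
        oc set env -c elasticsearch $dc PRIMARY_SHARDS=3 REPLICA_SHARDS=1 -n logging
        oc rollout latest $dc -n logging
    done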
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2337