Bug 1697728 - The elasticsearch pods weren't updated when redundancyPolicy changed
Summary: The elasticsearch pods weren't updated when redundancyPolicy changed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: ewolinet
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-04-09 04:33 UTC by Qiaoling Tang
Modified: 2019-10-16 06:28 UTC (History)
5 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:28:05 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github openshift elasticsearch-operator pull 171 0 None closed Bug 1697728: updating the index template and existing indices replica count 2020-10-28 21:21:41 UTC
Github openshift elasticsearch-operator pull 173 0 None closed Bug 1697728: Fix template replica update 2020-10-28 21:21:42 UTC
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:28:22 UTC

Description Qiaoling Tang 2019-04-09 04:33:40 UTC
Description of problem:
Change redundancyPolicy in the clusterlogging CR; the index_settings in cm/elasticsearch is updated accordingly, but the elasticsearch pods aren't redeployed.

Version-Release number of selected component (if applicable):
quay.io/openshift/origin-logging-elasticsearch5@sha256:aa20dce94aeb394ec3e8305539fb004ba54855269b85eeb6129b067fce027453
quay.io/openshift/origin-cluster-logging-operator@sha256:949ee74661a3bac7d08084d01ce1375ff51a04f97d28ff59d7e35f49e5065a15
quay.io/openshift/origin-elasticsearch-operator@sha256:094754d814bc586f7d365f675ca7d005318ad8fe66278e467215abd3bdd94760

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging via OLM with 3 ES nodes and the FullRedundancy redundancyPolicy

2. Check the indices in the ES pod and the index settings in the elasticsearch configmap; the number of replica shards is 2
$ oc exec elasticsearch-cdm-bg2gr79r-1-78c754c47f-n7gv8 -- indices
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-bg2gr79r-1-78c754c47f-n7gv8 -n openshift-logging' to see all of the containers in this pod.
Tue Apr  9 03:19:54 UTC 2019
health status index                                                         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana                                                       -iTNBd4DSt-APZcAfCSYTQ   1   2          1            0          0              0
green  open   .operations.2019.04.09                                        JQ9zYJmKQFK7kjEjjRARXA   3   2      37782            0        104             34
green  open   .kibana.647a750f1787408bf50088234ec0edd5a6a9b2ac              1eomWYJdQpS83BgLs81uTQ   1   0          2            0          0              0
green  open   .searchguard                                                  LyrBy99XReqPvE88UVU7qg   1   2          5            1          0              0
green  open   project.test1.17c2b9af-5a73-11e9-8644-0aa491d4dce0.2019.04.09 qF5YUSi2TjSgXz-0SPL_UQ   3   2       1079            0          3              1
$ oc get cm elasticsearch -o yaml |grep index_settings -A 3
  index_settings: |2

    PRIMARY_SHARDS=3
    REPLICA_SHARDS=2

3. Change redundancyPolicy to "SingleRedundancy" in the clusterlogging CR, then check redundancyPolicy in the elasticsearch CR and the index settings in the elasticsearch configmap; the number of replica shards is 1
$ oc get clusterlogging -o yaml |grep Policy
        redundancyPolicy: SingleRedundancy
$ oc get elasticsearch -o yaml |grep Policy
    redundancyPolicy: SingleRedundancy
$ oc get cm elasticsearch -o yaml |grep index_settings -A 3
  index_settings: |2

    PRIMARY_SHARDS=3
    REPLICA_SHARDS=1

4. Create a new index in ES, then check the indices in the ES pod; the number of replica shards in the new index is still 2 when it should be 1, see the index "project.test2.xxxxxxxx"
$ oc exec elasticsearch-cdm-bg2gr79r-1-78c754c47f-n7gv8 -- indices
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-bg2gr79r-1-78c754c47f-n7gv8 -n openshift-logging' to see all of the containers in this pod.
Tue Apr  9 03:27:22 UTC 2019
health status index                                                         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana                                                       -iTNBd4DSt-APZcAfCSYTQ   1   2          1            0          0              0
green  open   project.test2.22e02269-5a77-11e9-8644-0aa491d4dce0.2019.04.09 uQOMz71uRMKrNyopjyyNHA   3   2         45            0          0              0
green  open   .operations.2019.04.09                                        JQ9zYJmKQFK7kjEjjRARXA   3   2      52270            0        146             48
green  open   .kibana.647a750f1787408bf50088234ec0edd5a6a9b2ac              1eomWYJdQpS83BgLs81uTQ   1   0          3            0          0              0
green  open   .searchguard                                                  LyrBy99XReqPvE88UVU7qg   1   2          5            1          0              0
green  open   project.test1.17c2b9af-5a73-11e9-8644-0aa491d4dce0.2019.04.09 qF5YUSi2TjSgXz-0SPL_UQ   3   2       1526            0          3              1
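For reference, the REPLICA_SHARDS value implied by each redundancyPolicy can be sketched as a function of the data-node count. This is assumed semantics inferred from the values observed in steps 2 and 3, not the operator's actual code:

```shell
# Sketch of the replica count implied by each redundancyPolicy for a
# given number of data nodes (assumed semantics, not the operator code).
replicas_for_policy() {
  local policy=$1 nodes=$2
  case "$policy" in
    FullRedundancy)     echo $((nodes - 1)) ;;         # a copy on every node
    MultipleRedundancy) echo $(( (nodes - 1) / 2 )) ;; # copies on half the nodes
    SingleRedundancy)   echo 1 ;;
    ZeroRedundancy)     echo 0 ;;
    *)                  echo "unknown policy" >&2; return 1 ;;
  esac
}

replicas_for_policy FullRedundancy 3    # prints 2, matching step 2
replicas_for_policy SingleRedundancy 3  # prints 1, matching step 3
```

With the 3-node cluster above, this reproduces the 2 -> 1 change seen in the configmap but not yet applied to pre-existing indices.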


Actual results:


Expected results:


Additional info:

Comment 2 Jeff Cantrill 2019-04-16 16:27:55 UTC
Moving to 4.2

Workaround is to wait for the CLO to update the Elasticsearch CR and manually run the init script:

"oc exec -c elasticsearch $espod -- bash -c '$HOME/init.sh'"

Comment 3 Qiaoling Tang 2019-04-17 02:24:38 UTC
(In reply to Jeff Cantrill from comment #2)
> Moving to 4.2
> 
> Workaround is to wait for the CLO to update the Elasticsearch CR and
> manually run the init script:
> 
> "oc exec -c elasticsearch $espod -- bash -c '$HOME/init.sh'"

The workaround doesn't work; the index.number_of_replicas in the template isn't changed after running the init script.

Comment 4 ewolinet 2019-04-17 16:58:20 UTC
(In reply to Qiaoling Tang from comment #3)
> (In reply to Jeff Cantrill from comment #2)
> > Moving to 4.2
> > 
> > Workaround is to wait for the CLO to update the Elasticsearch CR and
> > manually run the init script:
> > 
> > "oc exec -c elasticsearch $espod -- bash -c '$HOME/init.sh'"
> 
> The workaround doesn't work, the index.number_of_replicas in template isn't
> changed after running the init script.

I'm able to recreate this, but I think this issue actually stems from the fact that we run `sed -i` on the template files, so they will always keep the value from the first run (not from the operator):

$ oc exec example-elasticsearch-cdm-vy14npva-1-55b6667ff5-5vxxv -c elasticsearch -- bash -c 'source /usr/share/java/elasticsearch/config/index_settings; echo $PRIMARY_SHARDS; echo $REPLICA_SHARDS'
2
1
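That non-idempotence can be reproduced locally with a stand-in template file (a minimal sketch, not the actual init script): once `sed -i` has replaced the placeholder, a later run carrying a new value finds nothing to match.

```shell
# First run replaces the placeholder in place; the second run, carrying
# the new replica count, is a no-op because the placeholder is gone.
tmpl=$(mktemp)
echo '"index.number_of_replicas": REPLICAS,' > "$tmpl"

sed -i 's/REPLICAS/2/' "$tmpl"   # initial deploy: 2 replicas
sed -i 's/REPLICAS/1/' "$tmpl"   # after policy change: no effect

cat "$tmpl"   # still shows 2, not 1
rm -f "$tmpl"
```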

$ oc exec example-elasticsearch-cdm-vy14npva-1-55b6667ff5-5vxxv -c elasticsearch -- bash -c 'cat $ES_HOME/index_templates/common.settings.project.template.json'
{
  "order": 5,
  "settings": {
    "index.refresh_interval": "5s",
    "index.number_of_replicas": 0,
    "index.number_of_shards": 1,
    "index.translog.flush_threshold_size": "256mb",
    "index.unassigned.node_left.delayed_timeout": "2m"
  },
  "template": "project*"
}

Comment 5 ewolinet 2019-04-17 17:38:14 UTC
https://github.com/openshift/origin-aggregated-logging/pull/1604
to fix the workaround

Comment 8 ewolinet 2019-04-25 20:53:35 UTC
Workaround fix merged; we will need to wait for the latest elasticsearch image to be published before it is available

Comment 12 Qiaoling Tang 2019-07-09 01:32:16 UTC
Thanks for the comment.

The templates on the cluster are not updated after changing the redundancy policy from ZeroRedundancy to SingleRedundancy.

$ oc exec elasticsearch-cdm-y65z0agi-2-795b95c7fb-pq5rh -- es_util --query=_template/common.* |jq
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-y65z0agi-2-795b95c7fb-pq5rh -n openshift-logging' to see all of the containers in this pod.
{
  "common.settings.operations.orphaned.json": {
    "order": 5,
    "template": ".orphaned*",
    "settings": {
      "index": {
        "refresh_interval": "5s",
        "unassigned": {
          "node_left": {
            "delayed_timeout": "2m"
          }
        },
        "number_of_shards": "3",
        "translog": {
          "flush_threshold_size": "256mb"
        },
        "number_of_replicas": "0"
      }
    },
    "mappings": {},
    "aliases": {}
  },
  "common.settings.operations.template.json": {
    "order": 5,
    "template": ".operations*",
    "settings": {
      "index": {
        "refresh_interval": "5s",
        "unassigned": {
          "node_left": {
            "delayed_timeout": "2m"
          }
        },
        "number_of_shards": "3",
        "translog": {
          "flush_threshold_size": "256mb"
        },
        "number_of_replicas": "0"
      }
    },
    "mappings": {},
    "aliases": {}
  },
  "common.settings.project.template.json": {
    "order": 5,
    "template": "project*",
    "settings": {
      "index": {
        "refresh_interval": "5s",
        "unassigned": {
          "node_left": {
            "delayed_timeout": "2m"
          }
        },
        "number_of_shards": "3",
        "translog": {
          "flush_threshold_size": "256mb"
        },
        "number_of_replicas": "0"
      }
    },
    "mappings": {},
    "aliases": {}
  },
  "common.settings.kibana.template.json": {
    "order": 0,
    "template": ".kibana*",
    "settings": {
      "index": {
        "number_of_shards": "3",
        "number_of_replicas": "0"
      }
    },
    "mappings": {},
    "aliases": {}
  }
}
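The stale values are easier to spot by filtering just the replica counts out of the template dump. A local sketch against a saved sample of the output above (`grep` stands in for a `jq` filter; the file path is hypothetical):

```shell
# Saved sample of two templates from the _template dump above; on the
# cluster, pipe the es_util query output into the same grep instead.
cat <<'EOF' > /tmp/templates-sample.json
{
  "common.settings.project.template.json": {
    "settings": { "index": { "number_of_replicas": "0" } }
  },
  "common.settings.operations.template.json": {
    "settings": { "index": { "number_of_replicas": "0" } }
  }
}
EOF

# After the policy change these should all read "1"; any "0" is stale.
grep -o '"number_of_replicas": "[0-9]*"' /tmp/templates-sample.json
```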

$ oc get cm elasticsearch -oyaml |grep index_settings -A 3
  index_settings: |2

    PRIMARY_SHARDS=3
    REPLICA_SHARDS=1

Images:
ose-logging-elasticsearch5-v4.2.0-201907071316
ose-cluster-logging-operator-v4.2.0-201907071316
ose-elasticsearch-operator-v4.2.0-201907071316

Per comment 11, moving this bug to ASSIGNED.

Comment 13 ewolinet 2019-07-09 14:58:08 UTC
I can recreate this with an image built from the master branch and see this fixed with https://github.com/openshift/elasticsearch-operator/pull/173

Comment 15 Anping Li 2019-09-03 09:16:53 UTC
Both the configmap and the templates are updated when I change the redundancyPolicy.

Comment 16 errata-xmlrpc 2019-10-16 06:28:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

