Description of problem:
When cluster.routing.allocation.disk.watermark.low or cluster.routing.allocation.disk.watermark.high is reached on an ES node, there is no status message in the Elasticsearch CR.

Version-Release number of selected component (if applicable):
quay.io/openshift/origin-elasticsearch-operator@sha256:094754d814bc586f7d365f675ca7d005318ad8fe66278e467215abd3bdd94760

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging with the settings below in the ClusterLogging CR:

  spec:
    managementState: "Managed"
    logStore:
      type: "elasticsearch"
      elasticsearch:
        nodeCount: 2
        redundancyPolicy: "SingleRedundancy"

2. Use `dd` to create some large files in the ES pod's directory /elasticsearch/persistent/elasticsearch/data/nodes/0/indices/

3. Change the cluster.routing.allocation.disk.watermark.low and cluster.routing.allocation.disk.watermark.high settings:

$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- es_util --query=_cluster/settings -X PUT -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "45%",
    "cluster.routing.allocation.disk.watermark.high": "55%"
  }
}
'

4. Check the ES cluster settings:

$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- es_util --query=_cluster/settings |jq
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -n openshift-logging' to see all of the containers in this pod.
{
  "persistent": {
    "discovery": {
      "zen": {
        "minimum_master_nodes": "2"
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "45%",
              "high": "55%"
            }
          }
        }
      }
    }
  }
}

Check the ES pod's storage usage (see also the allocation query sketch at the end of this description):

$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- df -Th
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -n openshift-logging' to see all of the containers in this pod.
Filesystem     Type     Size  Used Avail Use% Mounted on
overlay        overlay  119G   59G   61G  49% /
tmpfs          tmpfs     64M     0   64M   0% /dev
tmpfs          tmpfs    3.9G     0  3.9G   0% /sys/fs/cgroup
shm            tmpfs     64M     0   64M   0% /dev/shm
tmpfs          tmpfs    3.9G  3.5M  3.9G   1% /run/secrets
/dev/xvda3     xfs      119G   59G   61G  49% /etc/hosts
tmpfs          tmpfs    3.9G   28K  3.9G   1% /etc/openshift/elasticsearch/secret
tmpfs          tmpfs    3.9G   24K  3.9G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs          tmpfs    3.9G     0  3.9G   0% /proc/acpi
tmpfs          tmpfs    3.9G     0  3.9G   0% /proc/scsi
tmpfs          tmpfs    3.9G     0  3.9G   0% /sys/firmware

5. Create a new index or delete an existing index in the ES pod, wait for a while, then check the indices in ES; the new index is in yellow status:

$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- indices
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -n openshift-logging' to see all of the containers in this pod.
Fri Apr 19 02:38:47 UTC 2019
health status index                                                            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .searchguard                                                     D9HRZtEZSjuXvKAAivDkjA   1   1          5            1          0              0
green  open   .kibana                                                          3rh0UWVgRyqv9lHdYYMpjQ   1   1          1            0          0              0
green  open   .operations.2019.04.19                                           Xu2-ybTMTdO20K-3-Pba9Q   2   1      70568            0        140             66
green  open   project.qitang1.c3f4d3a2-6244-11e9-b5f6-0238fbfbbfe0.2019.04.19  V7lIOALGT7iz1V1V30OUww   2   1       2779            0          4              2
green  open   .kibana.647a750f1787408bf50088234ec0edd5a6a9b2ac                 5OBfatibR0ePrle-JL9ZsQ   1   0          3            0          0              0
yellow open   project.test2.9a8e477b-6245-11e9-b5f6-0238fbfbbfe0.2019.04.19    9_k1XbgpQSC_80AVfvo0Xw   2   1       1162            0          1              0
6. Check the ES pod logs:

$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- logs
[2019-04-19T02:18:12,227][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,231][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]
[2019-04-19T02:18:12,232][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,232][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]

$ oc exec elasticsearch-cdm-cp3ve3lp-2-669f7fc5b9-zzfj8 -- logs
[2019-04-19T02:18:12,265][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,265][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]
[2019-04-19T02:18:12,266][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,266][INFO ][o.e.c.s.ClusterSettings ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]
[2019-04-19T02:18:26,316][INFO ][o.e.c.r.a.DiskThresholdMonitor] [elasticsearch-cdm-cp3ve3lp-2] low disk watermark [45%] exceeded on [wJwBeRgPQ9SP1RobhnQsOQ][elasticsearch-cdm-cp3ve3lp-1][/elasticsearch/persistent/elasticsearch/data/nodes/0] free: 61gb[51.2%], replicas will not be assigned to this node
<---snip---->
[2019-04-19T02:34:27,014][INFO ][o.e.c.r.a.DiskThresholdMonitor] [elasticsearch-cdm-cp3ve3lp-2] low disk watermark [45%] exceeded on [wJwBeRgPQ9SP1RobhnQsOQ][elasticsearch-cdm-cp3ve3lp-1][/elasticsearch/persistent/elasticsearch/data/nodes/0] free: 60.9gb[51.2%], replicas will not be assigned to this node

7. Check the Elasticsearch CR status; there is no message in the conditions part:

$ oc get elasticsearch -o yaml |grep status -A 8
  status:
    clusterHealth: yellow
    conditions: []
    nodes:
    - deploymentName: elasticsearch-cdm-cp3ve3lp-1
      upgradeStatus: {}
    - deploymentName: elasticsearch-cdm-cp3ve3lp-2
      upgradeStatus: {}
    pods:

Actual results:
The Elasticsearch CR status conditions stay empty even though the low disk watermark has been exceeded on an ES node.

Expected results:
The Elasticsearch CR should report a status condition when a node exceeds the low or high disk watermark.

Additional info:
Also, no message in the Elasticsearch CR when the high disk watermark is exceeded on an ES node.
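As a supplement to the df check in step 4 above, per-node disk usage as Elasticsearch itself sees it can be queried through the same es_util wrapper. This is a minimal sketch, not part of the original reproduction: it reuses the pod name from step 3, relies on the standard _cat/allocation and _cluster/settings APIs, and the last command simply clears the transient watermark overrides so they fall back to the defaults (85% low / 90% high).

# Per-node disk usage and shard counts as seen by Elasticsearch;
# disk.percent is what the DiskThresholdMonitor compares against the watermarks.
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- es_util --query=_cat/allocation?v

# After testing, reset the transient watermark settings by nulling them out.
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- es_util --query=_cluster/settings -X PUT -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": null,
    "cluster.routing.allocation.disk.watermark.high": null
  }
}
'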
https://github.com/openshift/elasticsearch-operator/pull/124
The issue isn't fixed.
I wonder if the image isn't being built... I can see the fix working when I build an image from master:

$ oc exec example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6 -c elasticsearch -- es_util --query=_cluster/settings -XPUT -d '{"transient":{"cluster.routing.allocation.disk.watermark.low": "20%"}}'
{"acknowledged":true,"persistent":{},"transient":{"cluster":{"routing":{"allocation":{"disk":{"watermark":{"low":"20%"}}}}}}}

$ oc get elasticsearch example-elasticsearch -o yaml
...
status:
  clusterHealth: green
  conditions: []
  nodes:
  - conditions:
    - lastTransitionTime: 2019-04-23T21:25:38Z
      message: Disk storage usage for node is 27.3Gb (36.54%). Shards will be not be allocated on this node.
      reason: Disk Watermark Low
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-ec18ta2a-1
    upgradeStatus: {}
  pods:
    client:
      failed: []
      notReady: []
      ready:
      - example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6
    data:
      failed: []
      notReady: []
      ready:
      - example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6
    master:
      failed: []
      notReady: []
      ready:
      - example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6
  shardAllocationEnabled: all
Oh, yes, I checked the image ID, it didn't change.
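In case it helps anyone checking the same thing: one way to confirm which operator image is actually running (as opposed to what the deployment references) is to compare the deployment's image field with the imageID resolved in the running pod's status. This is a rough sketch, not taken from this bug; the namespace, deployment name, and label selector are assumptions and should be adjusted to match the actual deployment.

# Image reference the operator deployment asks for (assumed name/namespace)
$ oc get deployment elasticsearch-operator -n openshift-logging -o jsonpath='{.spec.template.spec.containers[0].image}'

# Image digest the running operator pod actually pulled (assumed label selector)
$ oc get pods -n openshift-logging -l name=elasticsearch-operator -o jsonpath='{.items[0].status.containerStatuses[0].imageID}'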
*** Bug 1703133 has been marked as a duplicate of this bug. ***
I can get the status message in the Elasticsearch CR now.

Status:
  Cluster Health:  green
  Conditions:
  Nodes:
    Conditions:
      Last Transition Time:  2019-05-05T02:43:09Z
      Message:               Disk storage usage for node is 300Mb (30.74%). Shards will be relocated from this node.
      Reason:                Disk Watermark High
      Status:                True
      Type:                  NodeStorage
    Deployment Name:         elasticsearch-cdm-g342mj5c-1
    Upgrade Status:
    Conditions:
      Last Transition Time:  2019-05-05T02:43:10Z
      Message:               Disk storage usage for node is 300Mb (30.74%). Shards will be relocated from this node.
      Reason:                Disk Watermark High
      Status:                True
      Type:                  NodeStorage
    Deployment Name:         elasticsearch-cdm-g342mj5c-2
    Upgrade Status:

Image: quay.io/openshift/origin-elasticsearch-operator@sha256:d1f9375315fe5544e2e5e748e64a11f4d0983a36fd1bed3f74072772ea65f05b
Verified with ose-elasticsearch-operator:v4.1.0-201905081021

Cluster Health:  green
Conditions:
Nodes:
  Conditions:
    Last Transition Time:  2019-05-09T03:25:00Z
    Message:               Disk storage usage for node is 844.5Mb (86.54%). Shards will be not be allocated on this node.
    Reason:                Disk Watermark Low
    Status:                True
    Type:                  NodeStorage
  Deployment Name:         elasticsearch-cdm-s5up2ku2-1
  Upgrade Status:
  Conditions:
    Last Transition Time:  2019-05-09T03:25:01Z
    Message:               Disk storage usage for node is 863.4Mb (88.48%). Shards will be not be allocated on this node.
    Reason:                Disk Watermark Low
    Status:                True
    Type:                  NodeStorage
  Deployment Name:         elasticsearch-cdm-s5up2ku2-2
  Upgrade Status:
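For re-checking this quickly during verification, the node storage conditions can be pulled straight out of the CR instead of reading the whole YAML. A small sketch, assuming the CR is named elasticsearch in the openshift-logging namespace (as in the default deployment used above) and that jq is available for filtering:

$ oc get elasticsearch elasticsearch -n openshift-logging -o json | jq '.status.nodes[].conditions'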
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758