Bug 1701439 - No watermark status message in Elasticsearch CR.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.1.0
Assignee: ewolinet
QA Contact: Qiaoling Tang
URL:
Whiteboard:
Duplicates: 1703133
Depends On:
Blocks: 1703133
Reported: 2019-04-19 03:17 UTC by Qiaoling Tang
Modified: 2019-06-04 10:47 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:47:47 UTC
Target Upstream Version:


Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:0758  None      None    None     2019-06-04 10:47:55 UTC

Description Qiaoling Tang 2019-04-19 03:17:50 UTC
Description of problem:
When cluster.routing.allocation.disk.watermark.low or cluster.routing.allocation.disk.watermark.high is reached on an ES node, no status message appears in the Elasticsearch CR.

Version-Release number of selected component (if applicable):
quay.io/openshift/origin-elasticsearch-operator@sha256:094754d814bc586f7d365f675ca7d005318ad8fe66278e467215abd3bdd94760


How reproducible:
Always

Steps to Reproduce:
1. Deploy logging with the settings below in the ClusterLogging CR:
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    elasticsearch:
      nodeCount: 2
      redundancyPolicy: "SingleRedundancy"
2. Use the `dd` command to create some large files in the ES pod's directory /elasticsearch/persistent/elasticsearch/data/nodes/0/indices/

3. Change the cluster.routing.allocation.disk.watermark.low and cluster.routing.allocation.disk.watermark.high settings:
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- es_util --query=_cluster/settings -X PUT -H 'Content-Type: application/json'  -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "45%",
    "cluster.routing.allocation.disk.watermark.high": "55%"
  }
}
'
4. Check the ES cluster settings:
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- es_util --query=_cluster/settings |jq
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -n openshift-logging' to see all of the containers in this pod.
{
  "persistent": {
    "discovery": {
      "zen": {
        "minimum_master_nodes": "2"
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "45%",
              "high": "55%"
            }
          }
        }
      }
    }
  }
}

Check the ES pod's storage usage:
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- df -Th
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -n openshift-logging' to see all of the containers in this pod.
Filesystem     Type     Size  Used Avail Use% Mounted on
overlay        overlay  119G   59G   61G  49% /
tmpfs          tmpfs     64M     0   64M   0% /dev
tmpfs          tmpfs    3.9G     0  3.9G   0% /sys/fs/cgroup
shm            tmpfs     64M     0   64M   0% /dev/shm
tmpfs          tmpfs    3.9G  3.5M  3.9G   1% /run/secrets
/dev/xvda3     xfs      119G   59G   61G  49% /etc/hosts
tmpfs          tmpfs    3.9G   28K  3.9G   1% /etc/openshift/elasticsearch/secret
tmpfs          tmpfs    3.9G   24K  3.9G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs          tmpfs    3.9G     0  3.9G   0% /proc/acpi
tmpfs          tmpfs    3.9G     0  3.9G   0% /proc/scsi
tmpfs          tmpfs    3.9G     0  3.9G   0% /sys/firmware

5. Create a new index or delete an existing index in the ES pod, wait a while, then check the indices in ES. The new index is in yellow status:
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- indices
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -n openshift-logging' to see all of the containers in this pod.
Fri Apr 19 02:38:47 UTC 2019
health status index                                                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .searchguard                                                    D9HRZtEZSjuXvKAAivDkjA   1   1          5            1          0              0
green  open   .kibana                                                         3rh0UWVgRyqv9lHdYYMpjQ   1   1          1            0          0              0
green  open   .operations.2019.04.19                                          Xu2-ybTMTdO20K-3-Pba9Q   2   1      70568            0        140             66
green  open   project.qitang1.c3f4d3a2-6244-11e9-b5f6-0238fbfbbfe0.2019.04.19 V7lIOALGT7iz1V1V30OUww   2   1       2779            0          4              2
green  open   .kibana.647a750f1787408bf50088234ec0edd5a6a9b2ac                5OBfatibR0ePrle-JL9ZsQ   1   0          3            0          0              0
yellow open   project.test2.9a8e477b-6245-11e9-b5f6-0238fbfbbfe0.2019.04.19   9_k1XbgpQSC_80AVfvo0Xw   2   1       1162            0          1              0

6. Check the ES pod logs:
$ oc exec elasticsearch-cdm-cp3ve3lp-1-769d6b6584-kgvmg -- logs
[2019-04-19T02:18:12,227][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,231][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]
[2019-04-19T02:18:12,232][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,232][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-1] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]

$ oc exec elasticsearch-cdm-cp3ve3lp-2-669f7fc5b9-zzfj8 -- logs
[2019-04-19T02:18:12,265][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,265][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]
[2019-04-19T02:18:12,266][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.low] from [85%] to [45%]
[2019-04-19T02:18:12,266][INFO ][o.e.c.s.ClusterSettings  ] [elasticsearch-cdm-cp3ve3lp-2] updating [cluster.routing.allocation.disk.watermark.high] from [90%] to [55%]
[2019-04-19T02:18:26,316][INFO ][o.e.c.r.a.DiskThresholdMonitor] [elasticsearch-cdm-cp3ve3lp-2] low disk watermark [45%] exceeded on [wJwBeRgPQ9SP1RobhnQsOQ][elasticsearch-cdm-cp3ve3lp-1][/elasticsearch/persistent/elasticsearch/data/nodes/0] free: 61gb[51.2%], replicas will not be assigned to this node
<---snip---->
[2019-04-19T02:34:27,014][INFO ][o.e.c.r.a.DiskThresholdMonitor] [elasticsearch-cdm-cp3ve3lp-2] low disk watermark [45%] exceeded on [wJwBeRgPQ9SP1RobhnQsOQ][elasticsearch-cdm-cp3ve3lp-1][/elasticsearch/persistent/elasticsearch/data/nodes/0] free: 60.9gb[51.2%], replicas will not be assigned to this node

7. Check the Elasticsearch CR status. There is no message in the conditions section:
$ oc get elasticsearch -o yaml |grep status -A 8
  status:
    clusterHealth: yellow
    conditions: []
    nodes:
    - deploymentName: elasticsearch-cdm-cp3ve3lp-1
      upgradeStatus: {}
    - deploymentName: elasticsearch-cdm-cp3ve3lp-2
      upgradeStatus: {}
    pods:


Actual results:


Expected results:


Additional info:
Also, no message appears in the Elasticsearch CR when the high disk watermark is exceeded on an ES node.
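
For readers reproducing this, the relationship between the `df` usage figure and the two watermark settings can be sketched as a small shell function. This is only an illustration under stated assumptions: `watermark_state` is a hypothetical helper (not part of es_util or the operator), it takes whole-number percentages of disk used, and it treats a watermark as exceeded only when usage is strictly above it.

```shell
#!/bin/sh
# Classify a node's disk usage against the low/high disk watermarks.
# Arguments are whole-number percentages of disk *used*:
#   $1 = used, $2 = low watermark, $3 = high watermark
# watermark_state is a hypothetical helper written for this illustration.
watermark_state() {
  used=$1
  low=$2
  high=$3
  if [ "$used" -gt "$high" ]; then
    echo "high"   # above the high watermark
  elif [ "$used" -gt "$low" ]; then
    echo "low"    # above the low watermark only
  else
    echo "ok"     # below both watermarks
  fi
}

# Values from this report: df shows 49% used, watermarks set to 45% / 55%.
watermark_state 49 45 55
```

With the values from this report (49% used against 45% / 55%) the function prints `low`, which matches the "low disk watermark [45%] exceeded" lines in the pod log from step 6.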

Comment 3 Qiaoling Tang 2019-04-23 08:49:37 UTC
The issue isn't fixed.

Comment 4 ewolinet 2019-04-23 21:26:59 UTC
I wonder if the image isn't being built... I can see the fix when building an image from master:

$ oc exec example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6 -c elasticsearch -- es_util --query=_cluster/settings -XPUT -d '{"transient":{"cluster.routing.allocation.disk.watermark.low": "20%"}}'
{"acknowledged":true,"persistent":{},"transient":{"cluster":{"routing":{"allocation":{"disk":{"watermark":{"low":"20%"}}}}}}}

$ oc get elasticsearch example-elasticsearch -o yaml
...
status:
  clusterHealth: green
  conditions: []
  nodes:
  - conditions:
    - lastTransitionTime: 2019-04-23T21:25:38Z
      message: Disk storage usage for node is 27.3Gb (36.54%). Shards will be not
        be allocated on this node.
      reason: Disk Watermark Low
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-ec18ta2a-1
    upgradeStatus: {}
  pods:
    client:
      failed: []
      notReady: []
      ready:
      - example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6
    data:
      failed: []
      notReady: []
      ready:
      - example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6
    master:
      failed: []
      notReady: []
      ready:
      - example-elasticsearch-cdm-ec18ta2a-1-7d449d65d7-xs4p6
  shardAllocationEnabled: all
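
The per-node conditions in the output above can be pulled out with a small text filter when checking several nodes. The following is a minimal sketch, not the operator's own tooling: `node_storage_reasons` is a hypothetical helper, and it assumes the YAML layout shown above, where a node's conditions are listed before its deploymentName.

```shell
#!/bin/sh
# Print "<deploymentName>: <reason>" for every node in an Elasticsearch CR
# status that carries a condition. Reads the CR as YAML on stdin.
# node_storage_reasons is a hypothetical helper written for this illustration.
node_storage_reasons() {
  awk '
    # Remember the most recent condition reason seen for the current node.
    /^ *reason:/ {
      sub(/^ *reason: */, "")
      reason = $0
    }
    # deploymentName closes the node entry; report it if a reason was seen.
    /deploymentName:/ {
      sub(/^.*deploymentName: */, "")
      if (reason != "") print $0 ": " reason
      reason = ""
    }
  '
}

# Usage, with the command from this comment:
#   oc get elasticsearch example-elasticsearch -o yaml | node_storage_reasons
```

A node with no conditions, as in the original report, produces no output line, so an empty result corresponds to the broken behaviour described in step 7.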

Comment 5 Qiaoling Tang 2019-04-24 08:41:52 UTC
Oh, yes. I checked the image ID and it didn't change.

Comment 6 Lukas Vlcek 2019-04-25 15:21:25 UTC
*** Bug 1703133 has been marked as a duplicate of this bug. ***

Comment 7 Qiaoling Tang 2019-05-05 02:46:00 UTC
I can get the status message in the Elasticsearch CR now.

Status:
  Cluster Health:  green
  Conditions:
  Nodes:
    Conditions:
      Last Transition Time:  2019-05-05T02:43:09Z
      Message:               Disk storage usage for node is 300Mb (30.74%). Shards will be relocated from this node.
      Reason:                Disk Watermark High
      Status:                True
      Type:                  NodeStorage
    Deployment Name:         elasticsearch-cdm-g342mj5c-1
    Upgrade Status:
    Conditions:
      Last Transition Time:  2019-05-05T02:43:10Z
      Message:               Disk storage usage for node is 300Mb (30.74%). Shards will be relocated from this node.
      Reason:                Disk Watermark High
      Status:                True
      Type:                  NodeStorage
    Deployment Name:         elasticsearch-cdm-g342mj5c-2
    Upgrade Status:

Image: quay.io/openshift/origin-elasticsearch-operator@sha256:d1f9375315fe5544e2e5e748e64a11f4d0983a36fd1bed3f74072772ea65f05b

Comment 8 Qiaoling Tang 2019-05-09 03:26:28 UTC
Verified with ose-elasticsearch-operator:v4.1.0-201905081021

  Cluster Health:  green
  Conditions:
  Nodes:
    Conditions:
      Last Transition Time:  2019-05-09T03:25:00Z
      Message:               Disk storage usage for node is 844.5Mb (86.54%). Shards will be not be allocated on this node.
      Reason:                Disk Watermark Low
      Status:                True
      Type:                  NodeStorage
    Deployment Name:         elasticsearch-cdm-s5up2ku2-1
    Upgrade Status:
    Conditions:
      Last Transition Time:  2019-05-09T03:25:01Z
      Message:               Disk storage usage for node is 863.4Mb (88.48%). Shards will be not be allocated on this node.
      Reason:                Disk Watermark Low
      Status:                True
      Type:                  NodeStorage
    Deployment Name:         elasticsearch-cdm-s5up2ku2-2
    Upgrade Status:

Comment 10 errata-xmlrpc 2019-06-04 10:47:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

