Bug 1463046 - Change the Elasticsearch setting "node.max_local_storage_nodes" to 1 to prevent sharing EBS volumes
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.5.1
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.z
Assigned To: Jeff Cantrill
QA Contact: Xia Zhao
Duplicates: 1462281
Depends On: 1460564 1462277
Blocks: 1462281
Reported: 2017-06-19 22:04 EDT by Jeff Cantrill
Modified: 2017-07-11 06:47 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Elasticsearch default value for sharing storage between ES instances was wrong.
Consequence: The incorrect default allowed an ES pod that was starting up while another ES pod was shutting down (e.g. during dc redeployments) to create a new location on the PV for managing the storage volume. This duplicated data and, in some instances, could cause data loss.
Fix: All ES pods now run with "node.max_local_storage_nodes" set to 1.
Result: ES pods that are starting up or shutting down no longer share the same storage, which prevents the data duplication and data loss.
Story Points: ---
Clone Of: 1462277
Environment:
Last Closed: 2017-07-11 06:47:38 EDT
Type: Bug


Attachments: None
Description Jeff Cantrill 2017-06-19 22:04:43 EDT
+++ This bug was initially created as a clone of Bug #1462277 +++

+++ This bug was initially created as a clone of Bug #1460564 +++

Change the setting node.max_local_storage_nodes to 1 for all ES pods. This prevents two ES pods from sharing the same EBS volume when one pod does not shut down properly.

For an example of this, see https://bugzilla.redhat.com/show_bug.cgi?id=1443350#c33

See discussion from https://discuss.elastic.co/t/multiple-folders-inside-nodes-folder/85358, and the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/2.4/modules-node.html#max-local-storage-nodes.
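
The symptom discussed in the linked thread, more than one folder under the `nodes` directory of a data path, can be checked with a short script. This is a minimal sketch in Python, assuming the Elasticsearch 2.x on-disk layout in which each node sharing a data path takes its own numbered subdirectory (`nodes/0`, `nodes/1`, and so on). The function names are hypothetical, and the path to the `nodes` directory has to be supplied by the caller.

```python
import os


def node_slots(nodes_dir):
    """Return the sorted numeric node slots found under a `nodes` directory."""
    slots = []
    for entry in os.listdir(nodes_dir):
        # Only numbered directories count as node slots; files such as
        # lock files and non-numeric names are ignored.
        if entry.isdigit() and os.path.isdir(os.path.join(nodes_dir, entry)):
            slots.append(int(entry))
    return sorted(slots)


def shared_storage_detected(nodes_dir):
    """True when more than one node has created a slot on this data path."""
    return len(node_slots(nodes_dir)) > 1
```

With node.max_local_storage_nodes set to 1, a second node should not be able to claim a new slot on the same data path, so only slot 0 is expected.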

--- Additional comment from Jeff Cantrill on 2017-06-19 21:57:37 EDT ---

Merged in https://github.com/openshift/openshift-ansible/pull/4466/

--- Additional comment from Jeff Cantrill on 2017-06-19 22:03:29 EDT ---

Modifying this BZ to reference 3.4.1, because it is a clone of the BZ whose comment 1 PR references the cloned BZ.
Comment 1 Jeff Cantrill 2017-06-19 22:09:21 EDT
Backport PR: https://github.com/openshift/openshift-ansible/pull/4502
Comment 3 Xia Zhao 2017-06-30 01:50:24 EDT
The testing work is blocked by this new regression bug: https://bugzilla.redhat.com/show_bug.cgi?id=1466626
Comment 4 Jeff Cantrill 2017-06-30 13:18:09 EDT
*** Bug 1462281 has been marked as a duplicate of this bug. ***
Comment 5 Xia Zhao 2017-07-03 01:45:14 EDT
max_local_storage_nodes is now set to 1:
# oc get configmap logging-elasticsearch -o yaml | grep -i max_local_storage_nodes
      max_local_storage_nodes: 1
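
The same check can be scripted against saved output of the command above. This is a minimal sketch in Python, assuming the config map text contains a line of the form `max_local_storage_nodes: <integer>`. The function name is hypothetical, and the sketch does not call `oc` itself.

```python
import re


def max_local_storage_nodes(configmap_text):
    """Return the configured value as an int, or None if the setting is absent."""
    match = re.search(
        r"^\s*max_local_storage_nodes:\s*(\d+)\s*$",
        configmap_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


# Sample line copied from the grep output in this comment.
sample = "      max_local_storage_nodes: 1\n"
value = max_local_storage_nodes(sample)
print("OK" if value == 1 else "unexpected value: %r" % (value,))
```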

Testing env:
# openshift version
openshift v3.5.5.31
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Ansible version:
openshift-ansible-playbooks-3.5.91-1.git.0.28b3ddb.el7.noarch
Worked around bug #1466626 by adding the configuration from https://github.com/openshift/openshift-ansible/pull/4657/files

Images from brew registry:
openshift3/logging-kibana    277c4a616a5a
openshift3/logging-elasticsearch    a7989e457354
openshift3/logging-fluentd    c09565262cad
openshift3/logging-curator    0aa259fbc36e
openshift3/logging-auth-proxy    d79212db0381
Comment 7 errata-xmlrpc 2017-07-11 06:47:38 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1640
