Bug 1460564 - Change the Elasticsearch setting "node.max_local_storage_nodes" to 1 to prevent sharing EBS volumes
Summary: Change the Elasticsearch setting "node.max_local_storage_nodes" to 1 to prevent sharing EBS volumes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.6.0
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.7.0
Assignee: Jeff Cantrill
QA Contact: Xia Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1462277 1462281 1463046
 
Reported: 2017-06-12 02:38 UTC by Peter Portante
Modified: 2017-11-28 21:56 UTC
CC: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Elasticsearch default for "node.max_local_storage_nodes" allowed multiple ES instances to share the same storage.
Consequence: An ES pod starting up while another ES pod was shutting down (e.g. during dc redeployments) could create a new data location on the PV, duplicating data and, in some instances, potentially causing data loss.
Fix: All ES pods now run with "node.max_local_storage_nodes" set to 1.
Result: ES pods starting up or shutting down no longer share the same storage, preventing data duplication and/or data loss.
Clone Of:
Cloned to: 1462277
Environment:
Last Closed: 2017-11-28 21:56:55 UTC
Target Upstream Version:
Embargoed:




Links
System ID: Red Hat Product Errata RHSA-2017:3188
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update
Last Updated: 2017-11-29 02:34:54 UTC

Description Peter Portante 2017-06-12 02:38:46 UTC
Change the setting for node.max_local_storage_nodes to 1 for all ES pods, as this would prevent us from seeing problems where two ES pods end up sharing the same EBS volume if one pod does not shut down properly.

For an example of this, see https://bugzilla.redhat.com/show_bug.cgi?id=1443350#c33

See discussion from https://discuss.elastic.co/t/multiple-folders-inside-nodes-folder/85358, and the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/2.4/modules-node.html#max-local-storage-nodes.
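
For reference, a minimal sketch of how this reads in elasticsearch.yml; only the node.max_local_storage_nodes name and the value 1 come from this report, and the nesting under "node" is just one valid way to write the dotted setting:

# Sketch only; the surrounding node section is omitted.
node:
  # Allow at most one ES instance to open this data path; a second pod that
  # starts while another pod still holds the volume fails to start instead of
  # silently creating its own node directory on the same PV.
  max_local_storage_nodes: 1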

Comment 1 openshift-github-bot 2017-06-16 22:05:45 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/fd165fe201abb5fbd76306a16febaf1cb3c8ad0b
Ensure only one ES pod per PV

bug 1460564. Fixes [BZ #1460564](https://bugzilla.redhat.com/show_bug.cgi?id=1460564).

Unfortunately, the defaults for Elasticsearch prior to v5 allow more
than one "node" to access the same configured storage volume(s).

This change forces this value to 1 to ensure we don't have an ES pod
starting up accessing a volume while another ES pod is shutting down
when redeploying. This can lead to "1" directories being created in
`/elasticsearch/persistent/${CLUSTER_NAME}/data/${CLUSTER_NAME}/nodes/`.
By default ES uses a "0" directory there when only one node is accessing
it.
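
In the OpenShift logging stack this setting is delivered through the logging-elasticsearch ConfigMap consumed by the ES pods, as the verification in comment 3 below shows. A hedged sketch of the relevant slice of that ConfigMap; only the max_local_storage_nodes line (plus the ConfigMap name and namespace) is confirmed here, the surrounding layout is illustrative:

# Sketch of the rendered ConfigMap entry; actual key layout may differ.
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-elasticsearch
  namespace: logging
data:
  elasticsearch.yml: |
    node:
      max_local_storage_nodes: 1   # one ES instance per persistent volume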

Comment 3 Junqi Zhao 2017-06-26 03:05:00 UTC
max_local_storage_nodes is 1 now
# oc get configmap logging-elasticsearch -n logging -o yaml | grep -i max_local_storage_nodes
      max_local_storage_nodes: 1


Testing env:
# openshift version
openshift v3.6.122
kubernetes v1.6.1+5115d708d7
etcd 3.2.0

Images from brew registry
# docker images | grep logging
logging-kibana          v3.6                fd67e351dadf        2 days ago          342.4 MB
logging-elasticsearch   v3.6                1006eb106849        2 days ago          404.6 MB
logging-auth-proxy      v3.6                301fd39f57e0        2 days ago          229.6 MB
logging-fluentd         v3.6                dba31f5b54ba        2 days ago          232.5 MB
logging-curator         v3.6                a0148dd96b8d        2 weeks ago         221.5 MB

Comment 7 errata-xmlrpc 2017-11-28 21:56:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

