Bug 1371220
| Summary: | Scaling down ElasticSearch creates new node directories | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Eric Jones <erjones> |
| Component: | Logging | Assignee: | Luke Meyer <lmeyer> |
| Status: | CLOSED ERRATA | QA Contact: | Xia Zhao <xiazhao> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.2.1 | CC: | aos-bugs, ewolinet, jcantril, pportant, tdawson, wsun |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-10-27 15:43:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Doc Text

Feature:
The EFK deployer now configures terminationGracePeriodSeconds for the Elasticsearch and Fluentd pods.

Reason:
We observed that Elasticsearch in particular would sometimes end up in a state where it did not remove its node.lock file at shutdown. When Elasticsearch shuts down properly, this file is deleted, but if shutdown takes too long, OpenShift hard-kills the pod after 30 seconds by default. If node.lock is not removed from persistent storage, then when the instance is started again, Elasticsearch treats the existing data directory as locked and starts with a fresh data directory, effectively losing all of its data.

Result:
The explicit terminationGracePeriodSeconds gives both Fluentd and Elasticsearch more time to flush data and terminate properly, so this situation should occur less often. It cannot be completely eliminated; for example, if Elasticsearch runs into an out-of-memory situation, it may hang indefinitely, still end up being killed, and leave the node.lock file behind. But the extended termination time should make normal shutdown scenarios safer.
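The Doc Text above describes the deployer setting terminationGracePeriodSeconds on the Elasticsearch and Fluentd pod templates. As a rough sketch of how a similar change could be applied by hand to an existing deployment, something like the following might be used; the DC name "logging-es-example" and the 600-second value are placeholders, not taken from this bug:

```
# Hypothetical example: extend the termination grace period on an Elasticsearch
# DeploymentConfig so a slow shutdown is not hard-killed before node.lock is
# released. The DC name and the 600-second value are illustrative only.
oc patch dc/logging-es-example \
  -p '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":600}}}}'

# With the default ConfigChange trigger in place, patching the pod template
# should roll out a new deployment automatically.
```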
Description
Eric Jones
2016-08-29 15:28:16 UTC
With this issue being a current problem, does this change the recommendation we provide in our documentation [0]? Keep in mind that this recommendation has become the go-to recommendation for most changes that need to be made to the EFK stack.

[0] https://docs.openshift.com/enterprise/3.2/install_config/upgrading/manual_upgrades.html#manual-upgrading-efk-logging-stack

We can probably recommend increasing the terminationGracePeriodSeconds in the Elasticsearch pod spec (within the DC). The default is 30 seconds, and if ES isn't able to finish its tasks during this time, it is then issued a SIGKILL. If ES isn't able to release its locks, it will create these other directories.

I think Eric W. has covered all the right steps to take here.

Any word on how this worked out in the field? Did the change in https://github.com/openshift/origin-aggregated-logging/pull/227 get into a release yet? If so, we should probably attach this bug to an errata or close it. I don't see any more helpful fix for this coming along.

I didn't see it in there, but I'll sync it over now for the 3.3 and 3.4 deployer images.

Verified with this image; the issue has been fixed:

registry.ops.openshift.com/openshift3/logging-deployer 3.3.1 1e85b37518ba 14 hours ago 761.6 MB

Scaled down the ES cluster 3 times; only 1 node directory was created on the PV:

$ ls /elasticsearch/persistent/logging-es/data/logging-es/nodes/
0

# openshift version
openshift v3.3.1.3
kubernetes v1.3.0+52492b4
etcd 2.3.0+git

Hey Eric Jones, do we have a kbase on what it looks like when node.lock is left behind in ES storage and what to do about it?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:2085
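Regarding the kbase question above, a minimal sketch of what checking for the leftover lock might look like on the Elasticsearch persistent volume follows. The mount path is copied from the verification comment and will vary by installation; this is an assumption, not an official procedure:

```
# Run wherever the logging-es persistent volume is mounted (path is illustrative).
# A healthy instance keeps reusing a single node directory ("0"); additional
# directories ("1", "2", ...) suggest ES found the old one locked and started fresh.
ls /elasticsearch/persistent/logging-es/data/logging-es/nodes/

# With Elasticsearch stopped, look for a node.lock left behind by an unclean shutdown.
find /elasticsearch/persistent/logging-es -name node.lock
```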