Back to bug 1371220

Who When What Removed Added
Jeff Cantrill 2016-08-29 16:10:06 UTC CC ewolinet, jcantril
Eric Jones 2016-08-29 18:58:05 UTC CC pportant
Flags needinfo?(pportant)
Jeff Cantrill 2016-08-29 19:37:47 UTC Keywords UpcomingRelease
Priority unspecified medium
Severity unspecified medium
Eric Jones 2016-08-29 20:54:08 UTC Priority medium urgent
Severity medium high
Peter Portante 2016-09-12 11:13:30 UTC Flags needinfo?(pportant)
Dan McPherson 2016-09-16 13:49:44 UTC Priority urgent medium
Severity high medium
Keywords UpcomingRelease
John Skeoch 2016-09-30 02:18:01 UTC QA Contact chunchen wsun
Wei Sun 2016-09-30 02:37:44 UTC QA Contact wsun xiazhao
Luke Meyer 2016-10-12 14:26:22 UTC Status NEW MODIFIED
Troy Dawson 2016-10-18 16:08:31 UTC Status MODIFIED ASSIGNED
Status ASSIGNED MODIFIED
CC tdawson
errata-xmlrpc 2016-10-18 16:09:12 UTC Status MODIFIED ON_QA
Wei Sun 2016-10-19 06:12:23 UTC Target Release --- 3.3.1
CC wsun
Xia Zhao 2016-10-19 11:40:47 UTC Status ON_QA VERIFIED
Target Release 3.3.1 ---
Luke Meyer 2016-10-21 20:46:45 UTC Doc Text Feature:
The EFK deployer now configures terminationGracePeriodSeconds for Elasticsearch and Fluentd pods.

Reason:
We observed that Elasticsearch in particular would sometimes end up in a state where it did not remove its node.lock file at shutdown. When Elasticsearch shuts down properly, this file is deleted, but if shutdown takes too long, OpenShift hard-kills the pod after 30 seconds by default. If node.lock is not removed from persistent storage, then when the instance is started again Elasticsearch treats the data directory as locked and starts with a fresh data directory, effectively losing all its data.

Result:
The explicit terminationGracePeriodSeconds gives both Fluentd and Elasticsearch more time to flush data and terminate properly, so this situation should occur less often. It cannot be completely eliminated; for example, if ES runs into an out-of-memory situation, it may hang indefinitely and still end up being killed, leaving the node.lock file behind. But the extended termination time should make normal shutdown scenarios safer.
Doc Type If docs needed, set a value Enhancement
errata-xmlrpc 2016-10-25 10:50:28 UTC Status VERIFIED RELEASE_PENDING
Jeff Cantrill 2016-10-25 13:37:05 UTC Target Release --- 3.4.0
errata-xmlrpc 2016-10-27 15:43:15 UTC Status RELEASE_PENDING CLOSED
Resolution --- ERRATA
Last Closed 2016-10-27 11:43:15 UTC
Dan McPherson 2017-03-08 18:43:11 UTC Target Release 3.4.0
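
Illustration of the Doc Text above: a minimal Python sketch of what setting terminationGracePeriodSeconds on a pod template amounts to. It is not the deployer's actual code. The template contents, the helper names (with_grace_period, effective_grace_period) and the 600-second value are assumptions made for illustration; only the field name and the 30-second default come from the Doc Text.

```python
import copy

# Default hard-kill timeout mentioned in the Doc Text.
DEFAULT_GRACE_SECONDS = 30


def with_grace_period(pod_template, seconds):
    """Return a copy of pod_template with terminationGracePeriodSeconds set.

    Hypothetical helper: the real deployer may set this field differently.
    """
    if seconds <= 0:
        raise ValueError("grace period must be a positive number of seconds")
    patched = copy.deepcopy(pod_template)  # leave the caller's template untouched
    patched.setdefault("spec", {})["terminationGracePeriodSeconds"] = seconds
    return patched


def effective_grace_period(pod_template):
    """Grace period that applies to the pod, falling back to the default."""
    spec = pod_template.get("spec", {})
    return spec.get("terminationGracePeriodSeconds", DEFAULT_GRACE_SECONDS)


# Hypothetical Elasticsearch pod template, reduced to the relevant fields.
es_template = {
    "metadata": {"labels": {"component": "es"}},
    "spec": {"containers": [{"name": "elasticsearch"}]},
}

# 600 is an assumed value for illustration, not the value the deployer uses.
patched = with_grace_period(es_template, 600)

print(effective_grace_period(es_template))  # falls back to the default
print(effective_grace_period(patched))      # uses the explicit value
```

A longer grace period only delays the hard kill. As the Doc Text notes, a hung process is still killed once the period expires.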
