Bug 1506855

Summary:	Logging inside the ES pod can result in filling the local docker volume on the pod's host
Product:	OpenShift Container Platform	Reporter:	Peter Portante <pportant>
Component:	RFE	Assignee:	Jeff Cantrill <jcantril>
Status:	CLOSED DUPLICATE	QA Contact:	Xiaoli Tian <xtian>
Severity:	high	Docs Contact:
Priority:	high
Version:	3.7.0	CC:	anli, aos-bugs, jokerman, mmccomas, pweil, rmeggins
Target Milestone:	---
Target Release:	3.8.0
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-04-27 20:18:55 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Peter Portante 2017-10-27 02:00:27 UTC

Since 3.4 we have been directing Elasticsearch to log only to a file local to the ES pod.  This behavior was implemented to avoid a feedback loop in which logs from the Elasticsearch pod itself caused additional logs to be generated.

However, we recently observed a problem that results from this setup.  If an ES pod runs out of disk space on its PV, the Elasticsearch Java process can generate a large volume of logs for every operation that fails because of a lack of disk space.  Because those logs are written locally in the pod rather than to the PV, they filled the local disk holding the docker volumes on the infrastructure node where the ES pod was running.

We propose three steps to address this problem:

1. Log to a location on the persistent volume
2. Use a log4j file handler that will:
   rotate the file by size instead of by date,
   compress the rotated local file, and
   cap the total number of log files kept
3. Provide defaults for the log files: a 10 MB size cap per file, keeping at most 10 files
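
The rotation policy from step 2, with the defaults from step 3, can be sketched as follows. This is only an illustration in Python using the standard library's RotatingFileHandler, not the log4j configuration itself. The file name es.log, the logger name, and the helpers gzip_rotator and make_capped_logger are hypothetical and do not come from the product.

```python
import gzip
import logging
import logging.handlers
import os
import shutil


def gzip_rotator(source, dest):
    """Compress the just-closed log file into dest, then remove the original."""
    with open(source, "rb") as src, gzip.open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(source)


def make_capped_logger(log_dir, max_bytes=10 * 1024 * 1024, backup_count=10,
                       name="es-sketch"):
    """Return a logger that rotates by size, gzips backups, and caps their count.

    log_dir stands in for a directory on the persistent volume (assumption).
    Defaults mirror the proposal: 10 MB per file, at most 10 rotated files.
    """
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "es.log"),  # hypothetical file name
        maxBytes=max_bytes,               # rotate by size, not by date
        backupCount=backup_count,         # older backups beyond this are deleted
    )
    handler.namer = lambda default_name: default_name + ".gz"
    handler.rotator = gzip_rotator        # compress each rotated file
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False              # keep records out of the root logger
    return logger
```

With these defaults, disk use stays bounded at roughly one active file of up to 10 MB plus 10 compressed backups, no matter how many failed operations are logged.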

Comment 1 Jeff Cantrill 2017-10-27 13:15:09 UTC

Captured in card https://trello.com/c/KqWiDOHT/ as an RFE.

Comment 2 Jeff Cantrill 2017-10-27 13:39:48 UTC

Targeting 3.8, as this can be worked around manually by editing the configmap and pointing the log location to the PV.

Comment 4 Jeff Cantrill 2018-04-27 20:18:55 UTC


*** This bug has been marked as a duplicate of bug 1568361 ***