Bug 1826861
Summary: | DiskPressure due to 80 GB /var/lib/fluentd | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Periklis Tsirakidis <periklis> |
Component: | Logging | Assignee: | Periklis Tsirakidis <periklis> |
Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.4 | CC: | aelganzo, aos-bugs, kgarriso, scuppett |
Target Milestone: | --- | ||
Target Release: | 4.4.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
At high incoming log rates, Fluentd could flood the node's filesystem because the buffer queues were not limited.
Consequence:
Sustained disk pressure could eventually crash the node, causing its applications to be rescheduled.
Fix:
The Fluentd buffer queue per output is limited to a fixed number of chunks (default 32).
Result:
Node disk pressure due to Fluentd buffers should be avoided with this fix.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2020-05-18 13:35:02 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1780698 | ||
Bug Blocks: | 1833226 |
Description
Periklis Tsirakidis
2020-04-22 16:27:41 UTC
For some reason the bot did not link the PR to this BZ: https://github.com/openshift/cluster-logging-operator/pull/491

Verified in clusterlogging.4.4.0-202005072005. With ES turned off, the directory size stopped increasing once it reached 257M. After ES was turned back on, the directory size decreased.

56M /var/lib/fluentd/
56M /var/lib/fluentd/clo_default_output_es
0 /var/lib/fluentd/retry_clo_default_output_es
121M /var/lib/fluentd/
121M /var/lib/fluentd/clo_default_output_es
0 /var/lib/fluentd/retry_clo_default_output_es
257M /var/lib/fluentd/
257M /var/lib/fluentd/clo_default_output_es
0 /var/lib/fluentd/retry_clo_default_output_es
Fri May 8 06:46:21 EDT 2020
257M /var/lib/fluentd/
257M /var/lib/fluentd/clo_default_output_es
0 /var/lib/fluentd/retry_clo_default_output_es
Fri May 8 06:51:23 EDT 2020
16M /var/lib/fluentd/
16M /var/lib/fluentd/clo_default_output_es
0 /var/lib/fluentd/retry_clo_default_output_es

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2133
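
The verification above can be approximated with a small script. This is only a sketch: the queue limit of 32 chunks comes from the Doc Text, but the 8 MiB chunk size is an assumption that this report does not state, and the helper names (`buffer_ceiling_bytes`, `directory_size_bytes`, `within_ceiling`) are hypothetical. Under that assumption the ceiling works out to 32 × 8 MiB = 256 MiB, which is close to the 257M plateau reported, though the match may be coincidental.

```python
import os

# From the Doc Text: the buffer queue per output is limited to 32 chunks by default.
QUEUE_LIMIT_CHUNKS = 32
# ASSUMPTION: 8 MiB per chunk. This value is not stated in the bug report.
ASSUMED_CHUNK_BYTES = 8 * 1024 * 1024


def buffer_ceiling_bytes(queue_limit=QUEUE_LIMIT_CHUNKS,
                         chunk_bytes=ASSUMED_CHUNK_BYTES):
    """Upper bound on queued buffer data for a single output."""
    return queue_limit * chunk_bytes


def directory_size_bytes(path):
    """Sum the apparent sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # Buffer files can be flushed and removed while we walk.
                continue
    return total


def within_ceiling(path, slack_bytes=ASSUMED_CHUNK_BYTES):
    """True if the directory stays within the ceiling plus one chunk of slack."""
    return directory_size_bytes(path) <= buffer_ceiling_bytes() + slack_bytes


if __name__ == "__main__":
    buffer_dir = "/var/lib/fluentd/clo_default_output_es"
    print("ceiling (MiB):", buffer_ceiling_bytes() // (1024 * 1024))
    if os.path.isdir(buffer_dir):
        print("current (MiB):", directory_size_bytes(buffer_dir) // (1024 * 1024))
        print("within ceiling:", within_ceiling(buffer_dir))
```

The one-chunk slack reflects that a chunk still being filled may sit outside the queue; whether Fluentd counts it that way in this release is also an assumption. Note that this sums apparent file sizes, so it can differ slightly from what `du` reports.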