Bug 1833226 - DiskPressure due to 80 GB /var/lib/fluentd
Summary: DiskPressure due to 80 GB /var/lib/fluentd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.3.z
Assignee: Periklis Tsirakidis
QA Contact: Anping Li
URL:
Whiteboard:
Depends On: 1826861
Blocks: 1824427
Reported: 2020-05-08 06:11 UTC by OpenShift BugZilla Robot
Modified: 2020-05-27 17:01 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: At high incoming log rates, Fluentd could flood the node's filesystem because its buffer queues were not limited. Consequence: The resulting disk pressure could eventually crash the node, causing its applications to be rescheduled. Fix: The Fluentd buffer queue per output is limited to a fixed number of chunks (default 32). Result: Node disk pressure due to Fluentd buffers should be avoided by this fix.
Clone Of:
Environment:
Last Closed: 2020-05-27 17:00:46 UTC
Target Upstream Version:
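
As a rough illustration of why the fix in the Doc Text bounds disk usage: once each output's queue is capped at a fixed number of chunks, the worst-case buffer footprint is the product of the number of outputs, the queue limit, and the chunk size. The sketch below is only back-of-the-envelope arithmetic. The helper name `max_buffer_bytes`, the 8 MiB chunk size, and the count of 3 outputs are assumptions for illustration and do not come from this bug; only the default of 32 chunks does.

```python
MIB = 1024 * 1024


def max_buffer_bytes(outputs: int, chunk_size_bytes: int, queue_limit: int = 32) -> int:
    """Upper bound on disk space held by queued buffer chunks across all outputs.

    queue_limit defaults to 32, the per-output chunk limit named in the Doc Text.
    outputs and chunk_size_bytes are hypothetical inputs supplied by the caller.
    """
    if outputs < 0 or chunk_size_bytes < 0 or queue_limit < 0:
        raise ValueError("all arguments must be non-negative")
    return outputs * queue_limit * chunk_size_bytes


# Hypothetical cluster: 3 outputs, 8 MiB per chunk (both values are assumptions).
bound = max_buffer_bytes(outputs=3, chunk_size_bytes=8 * MIB)
print(f"worst-case queued buffer size: {bound / MIB:.0f} MiB")
```

With no queue limit there is no such bound, which matches the unbounded growth of /var/lib/fluentd reported in the summary.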




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-logging-operator pull 510 | 0 | None | closed | Bug 1833226: Fix disk pressure for fluentd buffer chunks | 2021-02-11 14:47:36 UTC
Red Hat Product Errata | RHBA-2020:2184 | 0 | None | None | None | 2020-05-27 17:01:01 UTC

Description OpenShift BugZilla Robot 2020-05-08 06:11:07 UTC
This is a clone of Bug #1826861. This is the description of that bug:
This bug was initially created as a copy of Bug #1780698

I am copying this bug because: 



Description of problem:
I have a cluster that was on 4.1.23 (upgraded continuously from about 4.1.4).

The upgrade to either 4.1.24 or 4.1.25 fails with a download error:

info: An upgrade is in progress. Unable to apply 4.1.25: could not download the update

Updates:

VERSION IMAGE
4.1.25  quay.io/openshift-release-dev/ocp-release@sha256:5f824fa3b3c44c6a78a5fc6a77a82edc47cf2b495bb6b2b31e3e0a4d3d77684b
4.1.24  quay.io/openshift-release-dev/ocp-release@sha256:6f87fb66dfa907db03981e69474ea3069deab66358c18d965f6331bd727ff23f

oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.23    True        True          44h     Unable to apply 4.1.25: could not download the update

All Cluster Operators show on 4.1.23.

oc adm must-gather is at https://drive.google.com/open?id=18mqD6BpEwAQbApb1_5MD9j-cMBckBIA6

I can provide a kubeconfig as well to poke around there.

Comment 1 Jeff Cantrill 2020-05-11 18:28:58 UTC
Awaiting cherry-pick approval from patch manager

Comment 4 Anping Li 2020-05-15 04:24:59 UTC
Verified on clusterlogging.4.3.20-202005141057

Comment 6 errata-xmlrpc 2020-05-27 17:00:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2184

