Bug 1894634 - Fluent stops sending logs even though logging stack seems functional
Summary: Fluent stops sending logs even though logging stack seems functional
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Jeff Cantrill
QA Contact: Giriyamma
Docs Contact: Rolfe Dlugy-Hegwer
URL:
Whiteboard: logging-core
Depends On:
Blocks: 1894639
 
Reported: 2020-11-04 16:44 UTC by Jeff Cantrill
Modified: 2021-02-24 11:22 UTC (History)
3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
* Previously, Fluent stopped sending logs even though the logging stack appeared functional. If an endpoint went down and the maximum backoff time was too long, logs were not shipped for an extended period, even after the endpoint came back up. The current release fixes this issue by lowering the maximum backoff time, so logs are shipped sooner. (link:https://bugzilla.redhat.com/show_bug.cgi?id=1894634[*BZ#1894634*])
Clone Of:
: 1894639 (view as bug list)
Environment:
Last Closed: 2021-02-24 11:21:19 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-logging-operator pull 799 0 None closed Bug 1894634: lower default max_retry_wait 2021-02-08 23:50:26 UTC
Red Hat Product Errata RHBA-2021:0652 0 None None None 2021-02-24 11:22:12 UTC

Description Jeff Cantrill 2020-11-04 16:44:37 UTC
Description of problem:
Fluent appears functional but is not shipping logs, and it has a log entry of:

failed to write data into buffer by buffer overflow action=:block

Once the pods are restarted, logs start flowing again.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

failed to write data into buffer by buffer overflow action=:block
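
The `action=:block` in this message maps to fluentd's buffer `overflow_action` parameter: when the buffer is full, the emitting thread blocks instead of dropping data or raising an exception. A minimal sketch of an output configured this way (the match pattern, buffer path, endpoint, and size limit are illustrative, not the operator's actual settings):

```
# Illustrative fluentd output; values here are examples only,
# not the defaults shipped by the cluster-logging-operator.
<match kubernetes.**>
  @type forward
  <buffer>
    @type file
    path /var/lib/fluentd/buffer
    overflow_action block    # a full buffer blocks the emitting thread and logs:
                             # "failed to write data into buffer by buffer overflow action=:block"
    total_limit_size 256m
  </buffer>
  <server>
    name target
    host collector.example.com
    port 24224
  </server>
</match>
```

With `block`, a full buffer stops ingestion until chunks flush; if flushes keep failing against a down endpoint, the pipeline stalls while the process itself still looks healthy, which matches the symptom reported above.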

Comment 1 Jeff Cantrill 2020-11-04 16:55:22 UTC
This is modifiable via changes to the ClusterLogging API: https://issues.redhat.com/browse/LOG-742


Comment 2 Jeff Cantrill 2020-11-18 20:24:51 UTC
Reopening to lower the latest defaults
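
The defaults being lowered are fluentd's retry/backoff settings for flushing buffer chunks. The linked pull request's `max_retry_wait` corresponds to `retry_max_interval` in fluentd v1's `<buffer>` section. The sketch below uses fluentd's parameter names with illustrative values (not the exact numbers chosen by the fix):

```
# Illustrative retry tuning; values are examples, not the fix's exact defaults.
<buffer>
  retry_type exponential_backoff  # wait roughly doubles after each failed flush
  retry_wait 1s                   # initial wait before the first retry
  retry_max_interval 60s          # cap on the backoff; lowering this cap means
                                  # retries resume quickly once the endpoint is back
</buffer>
```

With exponential backoff and a large cap, the wait between retries grows until it reaches the cap, so after an outage fluentd may sit idle for up to that interval before trying again; lowering the cap bounds that delay.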

Comment 4 Giriyamma 2020-11-30 05:44:56 UTC
Not seeing the log entry 'failed to write data into buffer by buffer overflow action=:block' in the fluentd pods; not able to reproduce the issue.
Moving to 'Verified'.

cluster: 4.7.0-0.nightly-2020-11-29-133728 
clusterlogging.4.7.0-202011261728.p0
elasticsearch-operator.4.7.0-202011282020.p0

Comment 11 errata-xmlrpc 2021-02-24 11:21:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Errata Advisory for Openshift Logging 5.0.0), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0652

