Back to bug 1541429

Who When What Removed Added
Mike Fiedler 2018-02-02 14:38:03 UTC Whiteboard aos-scalability-39
Mike Fiedler 2018-02-02 14:40:23 UTC Summary logging-fluentd gets wedged in a state where retries do not succeed after bulk index queue full logging-fluentd gets wedged in a state where retries do not succeed after bulk index queue full error
Rich Megginson 2018-02-02 15:37:55 UTC CC pportant
Jeff Cantrill 2018-02-02 18:45:20 UTC Status NEW ASSIGNED
Target Release --- 3.9.0
Assignee jcantril rmeggins
Rich Megginson 2018-02-03 23:20:07 UTC CC mifiedle
Flags needinfo?(mifiedle)
Mike Fiedler 2018-02-08 15:58:55 UTC Flags needinfo?(mifiedle)
Jeff Cantrill 2018-02-08 21:45:47 UTC CC jcantril
Flags needinfo?(mifiedle)
Mike Fiedler 2018-02-09 01:42:16 UTC Flags needinfo?(mifiedle)
Mike Fiedler 2018-02-16 02:42:47 UTC Keywords TestBlocker
Jeff Cantrill 2018-02-26 15:53:26 UTC Target Release 3.9.0 3.9.z
Xiaoli Tian 2018-02-27 08:37:16 UTC CC xtian
Flags needinfo?(mifiedle)
Mike Fiedler 2018-02-27 13:01:04 UTC Flags needinfo?(mifiedle)
Anping Li 2018-03-09 15:32:15 UTC QA Contact anli mifiedle
Mike Fiedler 2018-03-21 19:12:25 UTC Keywords TestBlocker
Jeff Cantrill 2018-04-25 19:22:13 UTC Status ASSIGNED ON_QA
Flags needinfo?(mifiedle)
Mike Fiedler 2018-04-27 01:27:29 UTC Flags needinfo?(mifiedle) needinfo?(jcantril)
Mike Fiedler 2018-04-27 14:29:44 UTC Status ON_QA ASSIGNED
Jeff Cantrill 2018-05-11 14:40:09 UTC Status ASSIGNED MODIFIED
Flags needinfo?(jcantril)
Rich Megginson 2018-05-18 14:25:56 UTC Doc Text Cause: When fluentd submits a bulk index request to Elasticsearch and the request is rejected, fluentd will wait, then resubmit the entire bulk request, even if some of the individual operations in the bulk request succeeded. If the request is rejected again, fluentd will back off exponentially until it is retrying only once every 5 minutes.

Consequence: If Elasticsearch is very busy, the user will notice that new records do not show up in Elasticsearch.

Fix: fluentd will go through the bulk index error response and process each individual response. If the response was successful, or a duplicate, fluentd will discard the record. If the response was a "hard" error, fluentd will store the record in a "dead letter queue" - a file that the user will need to examine to determine what can be done about the "bad" records. If the response was a "soft" error, fluentd will resubmit only that record to Elasticsearch.

Result: fluentd does not overwhelm Elasticsearch with retries of operations which have already succeeded, reducing the processing load. fluentd does not keep retrying operations which will never succeed, thus keeping the other records flowing into Elasticsearch.
Doc Type If docs needed, set a value Bug Fix
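The per-item handling described in the Doc Text above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual fluentd (Ruby) change: the function name `triage_bulk_response` and the choice of which HTTP statuses count as "soft" or "duplicate" are assumptions made for the example. The only thing taken from Elasticsearch itself is the general shape of a bulk response, where `items` holds one single-key entry per submitted operation, in request order, each carrying a `status`.

```python
from typing import Any, Dict, List, Tuple

# Assumption for illustration: statuses treated as retryable ("soft") errors.
# 429 is what Elasticsearch returns when the bulk queue is full.
SOFT_STATUSES = {429, 503}
# Assumption for illustration: 409 (version conflict) is treated as a duplicate.
DUPLICATE_STATUS = 409


def triage_bulk_response(
    records: List[Any], response: Dict[str, Any]
) -> Tuple[List[Any], List[Any]]:
    """Split submitted records into (retry, dead_letter) lists.

    Records whose item succeeded or was a duplicate are dropped,
    soft errors are queued for retry, everything else is treated
    as a hard error and sent to the dead letter list.
    """
    retry: List[Any] = []
    dead_letter: List[Any] = []
    # Bulk response items come back in the same order as the request.
    for record, item in zip(records, response["items"]):
        # Each item is a single-key dict keyed by the operation type,
        # for example {"index": {"status": 201, ...}}.
        result = next(iter(item.values()))
        status = result["status"]
        if 200 <= status < 300 or status == DUPLICATE_STATUS:
            continue  # already stored, nothing more to do
        if status in SOFT_STATUSES:
            retry.append(record)
        else:
            dead_letter.append(record)
    return retry, dead_letter


if __name__ == "__main__":
    sent = ["rec-a", "rec-b", "rec-c", "rec-d"]
    resp = {
        "errors": True,
        "items": [
            {"index": {"status": 201}},
            {"create": {"status": 409}},
            {"index": {"status": 429}},
            {"index": {"status": 400}},
        ],
    }
    to_retry, to_dead_letter = triage_bulk_response(sent, resp)
    print("retry:", to_retry)
    print("dead letter:", to_dead_letter)
```

With this split, only the records in the retry list are resubmitted, so operations that already succeeded are not sent again and records that can never succeed stop blocking the rest.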
Jatan Malde 2018-06-07 14:46:13 UTC CC jmalde
Jatan Malde 2018-06-11 08:33:52 UTC Flags needinfo?(rmeggins)
Rich Megginson 2018-06-11 15:29:28 UTC Flags needinfo?(rmeggins)
Rich Megginson 2018-06-11 17:50:28 UTC Fixed In Version logging-fluentd-container-v3.9.30-1
Samuel Munilla 2018-06-11 17:52:37 UTC CC smunilla
Rich Megginson 2018-06-11 17:53:14 UTC Status MODIFIED CLOSED
Resolution --- CURRENTRELEASE
Last Closed 2018-06-11 13:53:14 UTC
Jatan Malde 2018-06-29 12:40:02 UTC Status CLOSED ASSIGNED
Resolution CURRENTRELEASE ---
Flags needinfo?(rmeggins)
Keywords Reopened
Mike Fiedler 2018-06-29 14:00:46 UTC Status ASSIGNED CLOSED
Resolution --- CURRENTRELEASE
Last Closed 2018-06-11 13:53:14 UTC 2018-06-29 10:00:46 UTC
Jatan Malde 2018-06-29 14:17:58 UTC Blocks 1596716
Daein Park 2018-07-17 05:04:30 UTC CC dapark
Jatan Malde 2018-09-27 16:47:48 UTC Flags needinfo?(rmeggins)
