Bug 1448951 - Fluentd stack traces complaining about undefined method 'status' for nil:NilClass
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.4.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.5.z
Assignee: Rich Megginson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-08 18:56 UTC by Peter Portante
Modified: 2019-10-23 02:46 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-14 19:11:28 UTC
Target Upstream Version:
Embargoed:


Links
System: Github
ID: https://github.com/elastic/elasticsearch-ruby/issues/428
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2017-10-06 23:59:10 UTC

Description Peter Portante 2017-05-08 18:56:13 UTC
Filed issue https://github.com/elastic/elasticsearch-ruby/issues/428 upstream.

This is probably hiding connection issues with Elasticsearch.

Comment 1 Rich Megginson 2017-06-16 17:04:17 UTC
I think this is either related to, or a dup of, https://bugzilla.redhat.com/show_bug.cgi?id=1399388

Comment 2 Jeff Cantrill 2017-09-13 17:54:44 UTC
Rich,

Is this resolved with our 3.6.1 changes to address some perf issues and dropped messages possibly?

Comment 3 Rich Megginson 2017-09-13 18:06:06 UTC
(In reply to Jeff Cantrill from comment #2)
> Rich,
> 
> Is this resolved with our 3.6.1 changes to address some perf issues and
> dropped messages possibly?

I don't know because it is incredibly difficult to reproduce this problem.

Comment 4 Jeff Cantrill 2017-10-06 15:07:18 UTC

*** This bug has been marked as a duplicate of bug 1489533 ***

Comment 5 Peter Portante 2017-10-06 23:59:10 UTC
(In reply to Jeff Cantrill from comment #4)
> 
> *** This bug has been marked as a duplicate of bug 1489533 ***

Earlier, in comment 1, it was asserted that this might be a duplicate of, or at least related to, a different bug:

  https://bugzilla.redhat.com/show_bug.cgi?id=1399388

    Failed to ship logs by "Cannot get new connection from pool." to
    AWS Elasticsearch after start logging-fluentd pod for a while

  Resolution: change fluentd config to use:

    reload_connections false
    reload_on_failure false

This bug had been closed as a duplicate of:

  https://bugzilla.redhat.com/show_bug.cgi?id=1489533

  logging-fluentd needs to periodically reconnect to logging-mux
  or elasticsearch to help balance sessions

But it does not appear to be related to either bug described above. Instead, this bug is likely related to using the proper Ruby API stack to talk to Elasticsearch 2.x. We might be able to close this bug as resolved by the work that switched to the proper Ruby gems.

Comment 6 Rich Megginson 2017-10-07 01:30:07 UTC
I think if we fix https://bugzilla.redhat.com/show_bug.cgi?id=1489533 by making the reload behavior work for our case, then this bug goes away.

Then, there may still be some underlying bug in elasticsearch-ruby, or in fluent-plugin-elasticsearch, related to connection reload, but we won't hit it because we won't be using that mechanism.

Otherwise, what is our resolution to this bug?  Are we going to submit a PR to fix https://github.com/elastic/elasticsearch-ruby/issues/428?

Comment 7 Peter Portante 2017-10-09 14:24:01 UTC
(In reply to Rich Megginson from comment #6)
> I think if we fix https://bugzilla.redhat.com/show_bug.cgi?id=1489533 by
> making the reload behavior work for our case, then this bug goes away.

Not sure I follow. This BZ references an upstream issue, which refers to the use of these gems:

  fluentd-0.12.29
  fluent-plugin-elasticsearch-1.9.1
  elasticsearch-api-1.0.18
  elasticsearch-transport-1.0.18

I believe this stack works for our 3.2 and 3.3 product versions, since they are based on Elasticsearch 1.5.x, but won't work properly for 3.4 and later.

Since we have already corrected this stack to use the proper versions intended for Elasticsearch 2.x with releases 3.4.z and later, I am guessing this bug can be closed as a duplicate of whatever BZ drove those original changes.

> Then, there may still be some underlying bug in elasticsearch-ruby, or in
> fluent-plugin-elasticsearch, related to connection reload, but we won't hit
> it because we won't be using that mechanism.

FWIW, the code path where we hit this bug does not appear to be on a reload case.  This looks like a simple bug in the error handling logic, where it expects a "response" object when a "ServerError" is raised, but no object is present.
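
To illustrate the kind of failure described above, here is a minimal Ruby sketch. It is not the actual elasticsearch-ruby or fluent-plugin-elasticsearch code: ServerError, Response, perform_request, and both describe_failure_* helpers are hypothetical stand-ins, and the sketch assumes the error is raised before any response object exists (for example, on a refused connection).

```ruby
# Hypothetical stand-ins; these are NOT the real elasticsearch-ruby classes.
class ServerError < StandardError; end

Response = Struct.new(:status, :body)

# Simulates a request that fails before any response object exists,
# then hands the (nil) response and the error to the caller's block.
def perform_request
  response = nil
  begin
    raise ServerError, 'connection failed'
  rescue ServerError => e
    yield response, e
  end
end

# Buggy pattern: assumes a response is always present when handling the error.
def describe_failure_unsafe(response, error)
  "HTTP #{response.status}: #{error.message}"
end

# Guarded pattern: checks for nil before reading the status.
def describe_failure_safe(response, error)
  status = response.nil? ? 'no response' : "HTTP #{response.status}"
  "#{status}: #{error.message}"
end

# The unsafe handler calls .status on nil, which raises NoMethodError
# and masks the original connection error.
unsafe_result = begin
  perform_request { |resp, err| describe_failure_unsafe(resp, err) }
rescue NoMethodError => e
  e.class.name
end

# The guarded handler keeps the original error message visible.
safe_result = perform_request { |resp, err| describe_failure_safe(resp, err) }

# When a response does exist, the guarded handler still reports its status.
with_response = describe_failure_safe(Response.new(503, ''), ServerError.new('unavailable'))
```

In this sketch the unsafe handler replaces the real connection failure with a NoMethodError, which would be consistent with the stack trace hiding connection issues with Elasticsearch, while the guarded handler preserves the original message.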

