Bug 1616352 - logging-fluentd needs to periodically reconnect to logging-mux or elasticsearch to help balance sessions
Summary: logging-fluentd needs to periodically reconnect to logging-mux or elasticsear...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.10.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.10.z
Assignee: Jeff Cantrill
QA Contact: Mike Fiedler
URL:
Whiteboard:
Depends On: 1489533
Blocks: 1616354
Reported: 2018-08-15 16:45 UTC by Rich Megginson
Modified: 2018-09-22 04:56 UTC (History)

Fixed In Version: openshift3/logging-fluentd:v3.10.34-1
Doc Type: Enhancement
Doc Text:
Feature: Fluentd now reconnects to Elasticsearch every 100 operations by default.
Reason: If one Elasticsearch node starts before the others in the cluster, the load balancer in the Elasticsearch service connects to that node only, and so does every Fluentd instance that connects to Elasticsearch.
Result: Because Fluentd reconnects periodically, the load balancer can spread the load evenly among all of the Elasticsearch nodes in the cluster.
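The effect described in the Doc Text can be illustrated with a toy model. This is only a sketch under stated assumptions: the `simulate` function, the round-robin choice of backend on each reconnect, and the client/backend counts are all hypothetical and are not taken from Fluentd or the OpenShift service implementation. It assumes that only backend 0 is up when every client first connects.

```shell
#!/usr/bin/env bash
# Toy model (hypothetical): count operations handled per Elasticsearch backend.
# Usage: simulate <clients> <backends> <ops_per_client> <reload_after>
# reload_after=0 means a client never reconnects.
simulate() {
  local clients=$1 backends=$2 ops=$3 reload_after=$4
  local -a load
  local b c op current next=0
  for ((b = 0; b < backends; b++)); do load[b]=0; done
  for ((c = 0; c < clients; c++)); do
    current=0   # assumption: only backend 0 is up at first connect
    for ((op = 1; op <= ops; op++)); do
      load[current]=$(( load[current] + 1 ))
      # On reconnect, the (assumed round-robin) balancer picks the next backend.
      if (( reload_after > 0 && op % reload_after == 0 )); then
        current=$(( next % backends ))
        next=$(( next + 1 ))
      fi
    done
  done
  echo "${load[*]}"
}

simulate 4 3 1000 0     # no reconnect: every operation lands on backend 0
simulate 4 3 1000 100   # reconnect every 100 operations: load is shared
```

Without reconnects, all traffic stays on the backend that was up first. With periodic reconnects, every backend receives a share of the operations.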
Clone Of: 1489533
: 1616354 (view as bug list)
Environment:
Last Closed: 2018-09-22 04:55:14 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | uken fluent-plugin-elasticsearch pull 459 | 0 | None | None | None | 2018-08-15 16:45:46 UTC
Red Hat Product Errata | RHBA-2018:2660 | 0 | None | None | None | 2018-09-22 04:56:04 UTC

Comment 3 Anping Li 2018-09-11 13:48:35 UTC
@mifiedle, could you help test this bug and point out the steps? I have started 10 ocp-logtest pods with rate 600 on a two-node cluster, but I haven't watched the sessions on the scaled mux pods.

Comment 4 Mike Fiedler 2018-09-11 19:02:36 UTC
The easiest way to test is with the ss utility, but unfortunately our ES pod image does not have it. I copy the utility there and run it.

1. From a Linux system that has ss (from the iproute package):

oc cp /usr/sbin/ss <pod>:/tmp/ss -c elasticsearch

2. oc rsh into the ES pod

3. Run the following, which shows all client IPs with connections to ES:

/tmp/ss -tnp | grep 9200 | awk '{print $5}' | cut -f4 -d":" | sort -u

4. Send log traffic for a while and repeat step 3. The list of connected clients should be different.

5. Check on all ES servers; each should have roughly an equal number of clients.
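To make the comparison in step 5 easier, the pipeline from step 3 can be extended to count connections per client. The following is a sketch, not part of the original test procedure: the function name `es_clients` is hypothetical, and it assumes the usual `ss -tn` column layout (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port) read from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ss -tn` output on stdin and print
# "<connection count> <client IP>" for established connections
# whose local port is the ES port (default 9200).
es_clients() {
  local port=${1:-9200}
  awk -v port="$port" '
    $1 == "ESTAB" {
      local_addr = $4; peer = $5
      # Keep only sockets whose local port matches the ES port.
      if (local_addr !~ (":" port "$")) next
      sub(/:[0-9]+$/, "", peer)     # drop the peer port
      gsub(/^\[|\]$/, "", peer)     # drop IPv6 brackets, if any
      sub(/^::ffff:/, "", peer)     # unwrap IPv4-mapped IPv6 addresses
      count[peer]++
    }
    END { for (ip in count) print count[ip], ip }
  ' | sort -k2
}
```

For example, inside the ES pod: /tmp/ss -tn | es_clients. Running it on each ES pod gives the per-pod client lists to compare.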

Comment 5 Anping Li 2018-09-13 06:24:04 UTC
Verified with logging:v3.10.45. The ES pods had roughly equal numbers of connections.

Comment 7 errata-xmlrpc 2018-09-22 04:55:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2660