Bug 1492188

Summary: Add keepalive, max_retry_wait parameters to fluentd secure_forward configuration
Product: [oVirt] ovirt-engine-metrics Reporter: Rich Megginson <rmeggins>
Component: Generic Assignee: Shirly Radco <sradco>
Status: CLOSED CURRENTRELEASE QA Contact: Lukas Svaty <lsvaty>
Severity: medium Docs Contact:
Priority: unspecified    
Version: unspecified CC: bugs, lveyde
Target Milestone: ovirt-4.1.7 Keywords: ZStream
Target Release: 1.0.7.1 Flags: rule-engine: ovirt-4.1+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-metrics-1.0.7.1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-11-13 12:26:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Metrics RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1475135, 1493030    

Description Rich Megginson 2017-09-15 16:58:38 UTC
Description of problem:
When fluentd secure_forward connects to a remote secure_forward listener through a proxy, the proxy cannot rebalance the connections until they reconnect. The `keepalive` parameter can be used to force fluentd secure_forward to reconnect periodically, which helps with load balancing.

https://github.com/tagomoris/fluent-plugin-secure-forward#secureforwardoutput

I suggest using a value of 5 minutes (300 seconds), and we can tune from there.

I guess this would need to be added to your ansible playbooks/templates for fluentd.
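As a rough sketch of what the rendered fluentd output section might look like with this setting: the match pattern, host names, port, shared key, and certificate path below are placeholders I made up for illustration, not values from the actual ovirt-engine-metrics templates.

```
<match **>
  @type secure_forward
  # Placeholder identity and credentials, for illustration only
  self_hostname ovirt-host.example.com
  shared_key example_shared_key
  secure yes
  ca_cert_path /path/to/ca_cert.pem

  # Drop and re-establish the connection every 300 seconds (5 minutes)
  # so the proxy gets a chance to rebalance
  keepalive 300

  <server>
    # Placeholder address of the proxy in front of the remote listener
    host proxy.example.com
    port 24284
  </server>
</match>
```

The 300 is only a starting point; it can be tuned once we see how the proxy distributes the connections.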

Comment 1 Rich Megginson 2017-09-25 02:05:51 UTC
Also set `max_retry_wait 300`
In my testing of ovirt -> logging, if there are connection problems, the ovirt fluentd will keep exponentially backing off until it is waiting several hours between retries.  I recommend a max_retry_wait of 300 (seconds), which caps the wait between retries at 5 minutes.
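Roughly, the relevant part of the secure_forward output section would then look something like the sketch below. The match pattern is an assumption and the connection settings are elided; only the two tuning parameters discussed in this bug are shown.

```
<match **>
  @type secure_forward
  # ... existing connection settings (server, shared_key, certificates) ...

  # Reconnect periodically so the proxy can rebalance connections
  keepalive 300

  # Cap the exponential retry backoff at 300 seconds
  max_retry_wait 300
</match>
```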

Comment 2 Lukas Svaty 2017-10-19 15:44:57 UTC
missing in ovirt-engine-metrics-1.0.7-1.el7ev.noarch

Comment 3 Sandro Bonazzola 2017-10-21 06:14:31 UTC
Lukas, note that this was fixed in 1.0.7.1, not 1.0.7-1.

Comment 4 Lukas Svaty 2017-10-25 10:21:43 UTC
[root@/ ]# grep -R max_retry /etc/fluentd/
/etc/fluentd/config.d/30-source-forward.conf:  max_retry_wait 300s
/etc/fluentd/config.d/30-source-forward.conf:  max_retry_wait 300s

verified in ovirt-engine-metrics-1.0.7.1-1.el7ev.noarch