Description of problem: Version-Release number of selected component (if applicable): v4.1.17-201909171057-ose-logging-fluentd How reproducible: Always Steps to Reproduce: 1. Set clusterloging managementState: Unmanaged 2. Set Env to send logs to remote rsyslog 3. Check the fluentd logs Actual results: route to host - connect(2) for 172.30.230.143:9200 (Errno::EHOSTUNREACH)" 2019-09-20 09:52:29 +0000 [warn]: suppressed same stacktrace 2019-09-20 09:53:35 +0000 [warn]: [elasticsearch-apps] retry succeeded. chunk_id="592f902cdca8efe5bcc99fc24b39d0b8" 2019-09-20 10:01:54 +0000 [warn]: [elasticsearch-apps] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=107.87953163499697 slow_flush_log_threshold=20.0 plugin_id="elasticsearch-apps" 2019-09-20 10:01:54 +0000 [warn]: [elasticsearch-apps] buffer flush took longer time than slow_flush_log_threshold: elapsed_time=107.8780552820026 slow_flush_log_threshold=20.0 plugin_id="elasticsearch-apps" 2019-09-20 10:02:42 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:02:42 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:02:42 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:08:06 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:08:06 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:08:06 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:12:56 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:12:56 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:12:56 +0000 [warn]: out:syslog: invalid facility value DEVICE; reset to default local0 2019-09-20 10:18:40 +0000 [warn]: out:syslog: connection error by 172.30.19.219:514 :Broken pipe 2019-09-20 10:18:40 +0000 [warn]: failed to flush the buffer. retry_time=0 next_retry_seconds=2019-09-20 10:18:41 +0000 chunk="592f95447e27a4dfaf485dff1144f968" error_class=Errno::EPIPE error="Broken pipe" 2019-09-20 10:18:40 +0000 [warn]: /etc/fluent/plugin/out_syslog_buffered.rb:176:in `write' 2019-09-20 10:18:40 +0000 [warn]: /etc/fluent/plugin/out_syslog_buffered.rb:176:in `send_to_syslog' 2019-09-20 10:18:40 +0000 [warn]: /etc/fluent/plugin/out_syslog_buffered.rb:90:in `block in write' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/event.rb:323:in `each' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/event.rb:323:in `block in each' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/plugin/buffer/memory_chunk.rb:80:in `open' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/plugin/buffer/memory_chunk.rb:80:in `open' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/event.rb:322:in `each' 2019-09-20 10:18:40 +0000 [warn]: /etc/fluent/plugin/out_syslog_buffered.rb:89:in `write' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/compat/output.rb:131:in `write' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/plugin/output.rb:1125:in `try_flush' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/plugin/output.rb:1425:in `flush_thread_run' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/plugin/output.rb:454:in `block (2 levels) in start' 2019-09-20 10:18:40 +0000 [warn]: /opt/rh/rh-ruby25/root/usr/local/share/gems/gems/fluentd-1.5.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create' Expected results: Additional info:
I don't think this is a regression, or at least, it isn't a regression in our code. Apparently, it is quite common to get non-integer values for the SYSLOG_FACILITY field - on Fedora 30: >sudo journalctl -F SYSLOG_FACILITY | sort 0 1 10 24 3 4 5 9 AGENTS AUDIT BRIDGE BT CONCHECK CORE DEVICE etc. You can see that the troublesome DEVICE is in the list. I think the fix for this will be to change the error log level from `warn` to `debug`
I'm going to target this bz to 4.3, and clone for 4.2, 4.1, and 3.11
@anli - can you confirm that forward isn't working to send logs to rsyslog? If it isn't, then I think this is unrelated to warning message: https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/out_syslog.rb#L88 begin @packet.facility = Integer(record['systemd']['u']['SYSLOG_FACILITY']) rescue log.warn "out:syslog: invalid facility value #{record['systemd']['u']['SYSLOG_FACILITY']}; reset to default #{@facility}" @packet.facility = @facility end That is, it will issue the message, then continue with the default i.e. it should still be forwarding logs, despite the error message.
Yes, It is not related to the message. the message can be sent to rsyslog in 4.1.18 and in 4.2. How could I deal with the bug in 4.2 and 4.3?
(In reply to Anping Li from comment #4) > Yes, It is not related to the message. the message can be sent to rsyslog in > 4.1.18 and in 4.2. How could I deal with the bug in 4.2 and 4.3? I intend to clone/backport this fix to 4.2, 4.1, and 3.11. I've already cloned this bz to 4.2: https://bugzilla.redhat.com/show_bug.cgi?id=1756072 Once this https://bugzilla.redhat.com/show_bug.cgi?id=1753936 bz is verified, I can merge the 4.2 fix, which will move https://bugzilla.redhat.com/show_bug.cgi?id=1756072 to MODIFIED, and then I will backport the fix to 4.1 and clone the bz for 4.1, etc.
Verified in openshift/ose-logging-fluentd@sha256:d5852e035184bf7f12e26d7fbd6340737bdabd54bc6d34af8375aabeaa50e3d2
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062