Description of problem:
After deploying the logging 3.6 stack on OCP 3.6, fluentd can't collect log entries due to the following error:

2017-08-16 04:52:01 -0400 [error]: Exception emitting record: "\x92" from ASCII-8BIT to UTF-8
2017-08-16 04:52:01 -0400 [warn]: emit transaction failed: error_class=Encoding::UndefinedConversionError error="\"\\x92\" from ASCII-8BIT to UTF-8" tag="journal.system"

The issue reproduces regardless of whether this parameter is specified in the inventory:
openshift_logging_fluentd_journal_read_from_head=true

[error]: Exception emitting record: "\x92" from ASCII-8BIT to UTF-8

Version-Release number of selected component (if applicable):
logging-fluentd v3.6.173.0.5-6 95dede9f3cb2 9 hours ago 235.1 MB

# openshift version
openshift v3.6.173.0.5
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

How reproducible:
Always

Steps to Reproduce:
1. Deploy the logging 3.6 stack on OCP 3.6 with the attached inventory file
2. Wait until the EFK pods are running
3. Check the fluentd logs

Actual results:
Can't collect log entries due to the fluentd error

Expected results:
fluentd should work

Additional info:
full log of fluentd attached
inventory file of logging deployment attached
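For anyone unfamiliar with this Ruby error class, here is a minimal sketch of how the same exception arises outside fluentd. It assumes nothing about where the byte comes from in the journal; the helper names try_utf8 and scrub_to_utf8 are hypothetical and not part of fluentd or any plugin.

```ruby
# Hypothetical helper: try to transcode a string to UTF-8.
# Returns the converted string, or the exception object if conversion fails.
def try_utf8(bytes)
  bytes.encode("UTF-8")
rescue Encoding::UndefinedConversionError => e
  e
end

# Hypothetical helper: relabel the bytes as UTF-8 and replace any byte
# sequences that are not valid UTF-8 with a placeholder.
def scrub_to_utf8(bytes, replacement = "?")
  bytes.dup.force_encoding("UTF-8").scrub(replacement)
end

# String#b returns a copy tagged ASCII-8BIT (binary). Ruby has no mapping
# from a binary byte >= 0x80 to a UTF-8 character, so encode raises.
raw = "It\x92s".b

result = try_utf8(raw)
puts result.class            # the exception class seen in the fluentd log
puts scrub_to_utf8(raw)      # lossy workaround: invalid byte replaced
```

Scrubbing is lossy, so this only shows the mechanics of the error, not a recommended fix for this bug.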
Created attachment 1314021 [details] inventory file used for logging deployment
Created attachment 1314027 [details] fluentd log
This bz is currently blocking logging tests on OCP 3.6.0 envs.
The fluentd image in brew looks suspiciously small compared to the one freshly built from the branch rhaos-3.6-rhel-7. Perhaps an incorrect build got pushed to brew?

brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd v3.6 95dede9f3cb2 15 hours ago 235.1 MB
local-reg:5000/openshift/logging-fluentd <none> 0ac973960bdb 4 hours ago 360.6 MB
Can we log into the system? I want to look at the journal and see if I can find which record is causing this problem.
*** Bug 1482532 has been marked as a duplicate of this bug. ***
Installing logging with openshift_logging_image_version=v3.6.173.0.5 - this problem is seen. Installing logging with openshift_logging_image_version=v3.6.171 - this problem is NOT seen.
(In reply to Mike Fiedler from comment #11)
> Installing logging with openshift_logging_image_version=v3.6.173.0.5 - this
> problem is seen.
>
> Installing logging with openshift_logging_image_version=v3.6.171 - this
> problem is NOT seen.

Right. Switching the buffer_type from "memory" to "file" happened after the version was bumped to v3.6.171.
This is easy to reproduce with flexy. I'm thinking that the fluentd dependencies are conflicting - they are not up to date - once the 3.6 puddle is rebuilt I can build and test a new fluentd image.
(In reply to Rich Megginson from comment #14)
> This is easy to reproduce with flexy.
>
> I'm thinking that the fluentd dependencies are conflicting - they are not up
> to date - once the 3.6 puddle is rebuilt I can build and test a new fluentd
> image.

This did not help :-( Now resorting to debugging the ruby code . . .
The bug was introduced in logging-fluentd:v3.6.173.0.5-6 - logging-fluentd:v3.6.173.0.5-5 and earlier work. These are the commits between -5 and -6:

http://pkgs.devel.redhat.com/cgit/rpms/logging-fluentd-docker/log/?h=rhaos-3.6-rhel-7

Impl fluentd file buffer.
remove USE_MUX_CLIENT; mux service always check for k8s metadata
fluentd 0.12.39; k8s filter 0.28.0; viaq 0.0.5
The error doesn't appear to be related to systemd input or elasticsearch output. I tried fluentd secure_forward with file buffer -> mux es with file buffer, and I see the conversion error in both fluentd and mux. So it must have something to do with the file buffer, but I just don't know what it could be.
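As a general illustration only, and not a diagnosis of this bug: in Ruby, a string that is written to disk and read back as raw bytes comes back tagged ASCII-8BIT, whatever encoding it had before. The sketch below shows that effect with a plain temp file; it does not use fluentd's buffer code, and the helper name disk_round_trip is hypothetical.

```ruby
require "tempfile"

# Hypothetical helper: write a string to a temp file and read it back as
# raw bytes. This only mimics "data went through a file", not fluentd's
# actual file buffer implementation.
def disk_round_trip(text)
  Tempfile.create("buffer-sketch") do |f|
    f.binmode
    f.write(text)
    f.flush
    File.binread(f.path)   # binread always returns an ASCII-8BIT string
  end
end

original = "It\u2019s"                  # valid UTF-8 with one non-ASCII character
restored = disk_round_trip(original)

puts original.encoding
puts restored.encoding

begin
  restored.encode("UTF-8")              # binary -> UTF-8 has no mapping for bytes >= 0x80
rescue Encoding::UndefinedConversionError => e
  puts e.class
end

# The bytes themselves are intact; only the encoding label was lost.
puts restored.force_encoding("UTF-8") == original
```

Whether something like this is what happens on the file buffer path in this image is an open question at this point in the investigation.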
Thanks, fluentd is working again. Checked with the fluentd:v3.6.173.0.5-10 image: log entries are collected and show up in Kibana. Set to verified.

Image verified with:
logging-fluentd v3.6.173.0.5-10 58ab4badc0b7 6 hours ago 235.1 MB
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3049