Bug 1816770 - Fluentd inserts \\'\\' into some log messages ingested from container logs
Summary: Fluentd inserts \\'\\' into some log messages ingested from container logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.5
Hardware: All
OS: All
unspecified
medium
Target Milestone: ---
: 4.5.0
Assignee: Sergey Yedrikov
QA Contact: Qiaoling Tang
URL:
Whiteboard:
Depends On:
Blocks: 1816836
TreeView+ depends on / blocked
 
Reported: 2020-03-24 17:09 UTC by Sergey Yedrikov
Modified: 2020-08-04 18:06 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Incorrect line separator configured in Fluentd for container log lines split across multiple lines in CRI-O logs. Consequence: Fluentd inserts \\'\\' into some log messages ingested from container logs. Fix: Configure the correct (empty) line separator for Fluentd. Result: Fluentd no longer inserts \\'\\' into some log messages ingested from container logs.
Clone Of:
: 1816836 1816837 (view as bug list)
Environment:
Last Closed: 2020-08-04 18:06:56 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-logging-operator pull 430 0 None closed Bug 1816770: Fluentd inserts \'\' into some log messages ingested from container logs 2020-08-27 10:10:07 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-08-04 18:06:58 UTC

Description Sergey Yedrikov 2020-03-24 17:09:14 UTC
Description of problem:

CRI-O sometimes splits log lines that come out of containers into two lines in its own log:

2020-03-22T04:25:26.859339384+00:00 stdout F loader seq - 8edc9164e37340bb8b5608e5fea61cc8 - 0000104795 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2020-03-22T04:25:26.859339384+00:00 stdout P loader seq 
2020-03-22T04:25:26.859419323+00:00 stdout F - 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

F stands for a Full and P stands for a Partial container log line. Partial log lines are continued on the next line of CRI-O log.

Fluentd reassembles partial CRI-O log lines into full ones with the help of the https://github.com/fluent-plugins-nursery/fluent-plugin-concat plugin: https://github.com/openshift/cluster-logging-operator/blob/master/pkg/generators/forwarding/fluentd/templates.go#L69:L75 . The separator for the two parts of the reassembled line is

separator \'\'

and Fluentd inserts that into the log messages that it forwards to Elastic; for example:

{"_index":".operations.2020.03.22","_type":"com.redhat.viaq.common","_id":"ZDY5ZjUwYTEtNGVlNi00MjQzLTgxZjgtNWMzOWIzNDYxNDIw","_score":2.0310788,"_source":{"pipeline_metadata":{"collector":{"received_at":"2020-03-22T04:25:26.859419323+00:00"}},"message":"loader seq \\'\\'- 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}}

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always, with enough logs generated by a container to trigger a split line output from CRI-o

Steps to Reproduce:
1. Run the following job
apiVersion: batch/v1
kind: Job
metadata:
  name: logging-load-driver
spec:
  template:
    spec:
      containers:
      - name: logging-load-driver
        image: quay.io/syedriko/logging-load-driver:latest
        command: ["/loader", "--msgpersec", "100", "--payload-gen", "fixed", "--distribution", "fixed", "100", "--report", "stderr", "--total-size", "10"]
      restartPolicy: Never
  backoffLimit: 1

, which wraps the loader from https://github.com/ViaQ/logging-load-driver

2. In Kibana, search for "message: \\'\\'"


Actual results:
loader seq \\'\\'- 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Expected results:
loader seq - 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Additional info:

Comment 1 Sergey Yedrikov 2020-03-24 20:39:17 UTC
More details on how to repro this issue:

Once the logging-load-driver job has completed

2. Create a route for Elasticsearch as described in https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-elasticsearch.html#cluster-logging-elasticsearch-exposing_cluster-logging-elasticsearch.

3. Install https://github.com/taskrabbit/elasticsearch-dump as described in https://github.com/taskrabbit/elasticsearch-dump#installing.

4. Dump the logs generated by the logging-load-driver job, substituting the kubernetes.pod_name with the pod name of the completed job and your authorization token from "oc whoami -t":

./node_modules/elasticdump/bin/elasticdump --input=https://elasticsearch-openshift-logging.apps-crc.testing/.operations.* --output=$ --searchBody='{"_source": ["pipeline_metadata.collector.received_at", "message"], "query":{"bool": { "must": [{ "term": { "kubernetes.pod_name": "logging-load-driver-phwcz"}}]}}}' --headers='{"Authorization": "Bearer 1u9oYgxIhzQOYxoHYfjPXEROshmanQNjROCz3jNZ-Yg"}' > out

5. Run (this is not a joke) 
$ grep \\\\\\\\\'\\\\\\\\\' out
You're looking for entries likes this (note "loader seq \\'\\'"):

{"_index":".operations.2020.03.22","_type":"com.redhat.viaq.common","_id":"YmI2YTAyZTktYjY5OC00ZTQxLTg3YzQtNmQzODA0NmZiZGMz","_score":2.0310788,"_source":{"pipeline_metadata":{"collector":{"received_at":"2020-03-22T04:25:00.058292+00:00"}},"message":"loader seq \\'\\'- 8edc9164e37340bb8b5608e5fea61cc8 - 0000102090 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}}

whereas normal ones look like this

{"_index":".operations.2020.03.22","_type":"com.redhat.viaq.common","_id":"OTIwNDI1ZmYtODdlYy00ZjhiLWIzZDEtOWU5N2ZmMjEzMjg2","_score":2.0310788,"_source":{"pipeline_metadata":{"collector":{"received_at":"2020-03-22T04:11:40.262683+00:00"}},"message":"loader seq - 8edc9164e37340bb8b5608e5fea61cc8 - 0000021977 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}}

Comment 4 Qiaoling Tang 2020-03-27 09:05:26 UTC
Verified with quay.io/openshift/origin-cluster-logging-operator@sha256:8bd2ace2cec60113512bf5f2ff621ae2bbc3b16d7de00b844ea10e09f0b9cfd1

Comment 6 errata-xmlrpc 2020-08-04 18:06:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409


Note You need to log in before you can comment on or make changes to this bug.