Bug 1816770

Summary: Fluentd inserts \\'\\' into some log messages ingested from container logs
Product: OpenShift Container Platform Reporter: Sergey Yedrikov <syedriko>
Component: LoggingAssignee: Sergey Yedrikov <syedriko>
Status: CLOSED ERRATA QA Contact: Qiaoling Tang <qitang>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.5CC: aos-bugs
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Incorrect line separator configured in Fluentd for container log lines split across multiple lines in CRI-O logs. Consequence: Fluentd inserts \\'\\' into some log messages ingested from container logs. Fix: Configure the correct (empty) line separator for Fluentd. Result: Fluentd no longer inserts \\'\\' into some log messages ingested from container logs.
Story Points: ---
Clone Of:
: 1816836 1816837 (view as bug list) Environment:
Last Closed: 2020-08-04 18:06:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1816836    

Description Sergey Yedrikov 2020-03-24 17:09:14 UTC
Description of problem:

CRI-O sometimes splits log lines that come out of containers into two lines in its own log:

2020-03-22T04:25:26.859339384+00:00 stdout F loader seq - 8edc9164e37340bb8b5608e5fea61cc8 - 0000104795 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2020-03-22T04:25:26.859339384+00:00 stdout P loader seq 
2020-03-22T04:25:26.859419323+00:00 stdout F - 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

F stands for a Full and P stands for a Partial container log line. Partial log lines are continued on the next line of CRI-O log.

Fluentd reassembles partial CRI-O log lines into full ones with the help of the https://github.com/fluent-plugins-nursery/fluent-plugin-concat plugin: https://github.com/openshift/cluster-logging-operator/blob/master/pkg/generators/forwarding/fluentd/templates.go#L69:L75 . The separator for the two parts of the reassembled line is

separator \'\'

and Fluentd inserts that into the log messages that it forwards to Elastic; for example:

{"_index":".operations.2020.03.22","_type":"com.redhat.viaq.common","_id":"ZDY5ZjUwYTEtNGVlNi00MjQzLTgxZjgtNWMzOWIzNDYxNDIw","_score":2.0310788,"_source":{"pipeline_metadata":{"collector":{"received_at":"2020-03-22T04:25:26.859419323+00:00"}},"message":"loader seq \\'\\'- 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}}

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always, with enough logs generated by a container to trigger a split line output from CRI-o

Steps to Reproduce:
1. Run the following job
apiVersion: batch/v1
kind: Job
metadata:
  name: logging-load-driver
spec:
  template:
    spec:
      containers:
      - name: logging-load-driver
        image: quay.io/syedriko/logging-load-driver:latest
        command: ["/loader", "--msgpersec", "100", "--payload-gen", "fixed", "--distribution", "fixed", "100", "--report", "stderr", "--total-size", "10"]
      restartPolicy: Never
  backoffLimit: 1

, which wraps the loader from https://github.com/ViaQ/logging-load-driver

2. In Kibana, search for "message: \\'\\'"


Actual results:
loader seq \\'\\'- 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Expected results:
loader seq - 8edc9164e37340bb8b5608e5fea61cc8 - 0000104796 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Additional info:

Comment 1 Sergey Yedrikov 2020-03-24 20:39:17 UTC
More details on how to repro this issue:

Once the logging-load-driver job has completed

2. Create a route for Elasticsearch as described in https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-elasticsearch.html#cluster-logging-elasticsearch-exposing_cluster-logging-elasticsearch.

3. Install https://github.com/taskrabbit/elasticsearch-dump as described in https://github.com/taskrabbit/elasticsearch-dump#installing.

4. Dump the logs generated by the logging-load-driver job, substituting the kubernetes.pod_name with the pod name of the completed job and your authorization token from "oc whoami -t":

./node_modules/elasticdump/bin/elasticdump --input=https://elasticsearch-openshift-logging.apps-crc.testing/.operations.* --output=$ --searchBody='{"_source": ["pipeline_metadata.collector.received_at", "message"], "query":{"bool": { "must": [{ "term": { "kubernetes.pod_name": "logging-load-driver-phwcz"}}]}}}' --headers='{"Authorization": "Bearer 1u9oYgxIhzQOYxoHYfjPXEROshmanQNjROCz3jNZ-Yg"}' > out

5. Run (this is not a joke) 
$ grep \\\\\\\\\'\\\\\\\\\' out
You're looking for entries likes this (note "loader seq \\'\\'"):

{"_index":".operations.2020.03.22","_type":"com.redhat.viaq.common","_id":"YmI2YTAyZTktYjY5OC00ZTQxLTg3YzQtNmQzODA0NmZiZGMz","_score":2.0310788,"_source":{"pipeline_metadata":{"collector":{"received_at":"2020-03-22T04:25:00.058292+00:00"}},"message":"loader seq \\'\\'- 8edc9164e37340bb8b5608e5fea61cc8 - 0000102090 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}}

whereas normal ones look like this

{"_index":".operations.2020.03.22","_type":"com.redhat.viaq.common","_id":"OTIwNDI1ZmYtODdlYy00ZjhiLWIzZDEtOWU5N2ZmMjEzMjg2","_score":2.0310788,"_source":{"pipeline_metadata":{"collector":{"received_at":"2020-03-22T04:11:40.262683+00:00"}},"message":"loader seq - 8edc9164e37340bb8b5608e5fea61cc8 - 0000021977 - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}}

Comment 4 Qiaoling Tang 2020-03-27 09:05:26 UTC
Verified with quay.io/openshift/origin-cluster-logging-operator@sha256:8bd2ace2cec60113512bf5f2ff621ae2bbc3b16d7de00b844ea10e09f0b9cfd1

Comment 6 errata-xmlrpc 2020-08-04 18:06:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409