Bug 1632361

Summary: [3.10] Fluentd cannot handle S2I Logs
Product: OpenShift Container Platform
Reporter: Rich Megginson <rmeggins>
Component: Logging
Assignee: Rich Megginson <rmeggins>
Status: CLOSED ERRATA
QA Contact: Anping Li <anli>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.10.0
CC: anli, aos-bugs, nalentor, qitang, rmeggins, stwalter
Target Milestone: ---
Target Release: 3.10.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: logging-fluentd-container-v3.10.51-1
Doc Type: Bug Fix
Doc Text:
Cause: When using docker with the journald log driver, all container logs, including system and plain docker container logs, are logged to the journal and read by fluentd.
Consequence: Fluentd does not know how to handle these non-kubernetes container logs and throws exceptions.
Fix: Treat non-kubernetes container logs as logs from other system services, e.g. send them to the .operations.* index.
Result: Logs from non-kubernetes containers are indexed correctly and do not cause any errors.
Story Points: ---
Clone Of: 1632130
: 1632364 (view as bug list)
Environment:
Last Closed: 2018-11-11 16:39:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1632130
Bug Blocks: 1632364

Description Rich Megginson 2018-09-24 16:05:13 UTC
+++ This bug was initially created as a clone of Bug #1632130 +++

Description of problem:
Fluentd does not handle logs from "s2i" containers.
Fluentd expects all container logs read from the journal to have a CONTAINER_NAME value with a "k8s_" prefix; it does not know how to handle the "s2i_" prefix used by S2I build containers.
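The journald log driver records the docker container name in the CONTAINER_NAME journal field. A minimal sketch of the kind of prefix-based parsing involved (the function name and return shape are illustrative, not the actual viaq_data_model plugin code):

```ruby
# Illustrative sketch, NOT the real plugin code. Kubernetes container
# names from the journald log driver follow the layout:
#   k8s_<container>_<pod>_<namespace>_<pod-uid>_<attempt>
# S2I build containers instead use names like:
#   s2i_<image-ref-with-underscores>_<hash>
# which do not fit that layout at all.
def parse_container_name(name)
  # Anything without the "k8s_" prefix carries no pod metadata here.
  return nil unless name.start_with?('k8s_')
  parts = name.split('_')
  { container: parts[1], pod: parts[2], namespace: parts[3] }
end
```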

Version-Release number of selected component (if applicable): OCP 3.9.27

How reproducible:
Always: create an S2I pod and look for its logs in Kibana.


Steps to Reproduce:
1. Create a project/namespace.
2. Create a couple of S2I pods in the project along with other pods; you can use the console and pick from the catalog.
3. Then log in to Kibana as the project owner or as a user with the cluster-admin role.

Actual results:
Logs are visible for some pods, but not for S2I pods.

Expected results:
Logs should be visible for all pods.


Additional info:

Excerpt from our cluster's logs:

2018-09-19 07:48:17 -0400 [warn]: dump an error event: error_class=TypeError error="no implicit conversion of nil into String" location="/usr/share/gems/gems/fluent-plugin-viaq_data_model-0.0.14/lib/fluent/plugin/filter_viaq_data_model.rb:413:in `parse'" tag="journal.container._openshift_" time=1536076266 record={"PRIORITY"=>"6", "_TRANSPORT"=>"journal", "_PID"=>"10183", "_UID"=>"0", "_GID"=>"0", "_COMM"=>"dockerd-current", "_EXE"=>"/usr/bin/dockerd-current", "_CMDLINE"=>"/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --selinux-enabled --log-driver=journald --signature-verification=False --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker--vg-docker--pool --storage-opt dm.use_deferred_removal=true --mtu=8951 --add-registry registry.access.redhat.com --add-registry registry.access.redhat.com", "_CAP_EFFECTIVE"=>"1fffffffff", "_SYSTEMD_CGROUP"=>"/system.slice/docker.service", "_SYSTEMD_UNIT"=>"docker.service", "_SYSTEMD_SLICE"=>"system.slice", "_SELINUX_CONTEXT"=>"system_u:system_r:container_runtime_t:s0", "_BOOT_ID"=>"500dce08e3ef491b9107b3d389cd2966", "_MACHINE_ID"=>"cec1cf01289d4a53b9488a651ddca907", "_HOSTNAME"=>"node9.rhpds.internal", "CONTAINER_ID"=>"448aa02d4d4a", "CONTAINER_ID_FULL"=>"448aa02d4d4abf1a6a07590fb8a400196501a49eaf38173de0b89dca4f564c03", "CONTAINER_NAME"=>"s2i_registry_access_redhat_com_jboss_eap_7_eap70_openshift_sha256_7b4d8986212601403ca07b320fd5dadb9f624298251620b8f6e2b55f993d9124_d50b4d88", "MESSAGE"=>"[INFO] Downloading: http://nexus.opentlc-shared.svc.cluster.local:8081/repository/maven-all-public/org/jboss/spec/jboss-javaee-7.0/1.0.3.Final-redhat-2/jboss-javaee-7.0-1.0.3.Final-redhat-2.pom", "_SOURCE_REALTIME_TIMESTAMP"=>"1536076266567114", 
"pipeline_metadata"=>{"collector"=>{"ipaddr4"=>"10.1.21.57", "ipaddr6"=>"fe80::9889:7aff:fe35:4a94", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2018-09-19T11:48:17.193205+00:00", "version"=>"0.12.43 1.6.0"}}}
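The exception above comes from concatenating a nil value into a string. A minimal standalone reproduction of that failure mode (the regex and the "project." index prefix are illustrative; the real parse lives in filter_viaq_data_model.rb):

```ruby
# Minimal reproduction of the failure mode seen in the warning above
# (illustrative; the real code is in fluent-plugin-viaq_data_model's
# filter_viaq_data_model.rb).
record = { 'CONTAINER_NAME' =>
  's2i_registry_access_redhat_com_jboss_eap_7_eap70_openshift_sha256_d50b4d88' }

# Extract the namespace assuming the k8s_ layout; for an "s2i_" name
# the regex does not match and the capture comes back nil.
namespace = record['CONTAINER_NAME'][/\Ak8s_[^_]+_[^_]+_([^_]+)_/, 1]

error = begin
  'project.' + namespace  # raises when namespace is nil
  nil
rescue TypeError => e
  e.message               # same error_class/message as the fluentd warning
end
```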

Comment 1 Rich Megginson 2018-09-25 17:31:45 UTC
PR merged upstream - waiting for next automated sync/build

Comment 3 Rich Megginson 2018-09-25 18:04:53 UTC
Setting to MODIFIED as per https://mojo.redhat.com/docs/DOC-1178565 "How do I get my Bugzilla Bug to VERIFIED?"

Comment 5 Qiaoling Tang 2018-10-16 06:29:01 UTC
Verified with logging-fluentd-v3.10.56-1

Steps to verify:
1. Set journald as the docker log driver, then deploy logging.
2. Create some S2I pods under the 'openshift' namespace so that the container name contains the string '_openshift_'.
71000ebe481d        docker-registry.default.svc:5000/openshift/eap-app@sha256:bf5bd314b964c64cd28fb423a1e142d460468d10104c859e37bd691659ae0b7c       "/usr/local/s2i/run"     5 minutes ago       Up 4 minutes                            k8s_eap-app_eap-app-1-rcwgq_openshift_1a830019-d106-11e8-956b-42010af0000a_0
4fce4bce5e8d        registry.reg-aws.openshift.com:443/openshift3/ose-pod:v3.10.56                                                                   "/usr/bin/pod"           5 minutes ago       Up 5 minutes                            k8s_POD_eap-app-1-rcwgq_openshift_1a830019-d106-11e8-956b-42010af0000a_0
3. View logs in Kibana: the log documents contain "kubernetes.namespace_name: openshift", "kubernetes.pod_name: eap-app-1-build" and "kubernetes.container_name: sti-build". No logs are lost.
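The fix described in the Doc Text amounts to a routing decision on the name prefix. A sketch of that decision (the return values are illustrative labels, not the exact tags fluentd uses internally):

```ruby
# Sketch of the fixed routing (labels illustrative): journal records
# whose CONTAINER_NAME does not carry the "k8s_" prefix are treated like
# any other system service log, so they land in the .operations.* index
# instead of raising an exception during parsing.
def route(record)
  name = record['CONTAINER_NAME'].to_s
  if name.start_with?('k8s_')
    :project_index     # kubernetes container log: project.<namespace>.*
  else
    :operations_index  # s2i_, plain docker, and system logs: .operations.*
  end
end
```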

Comment 7 errata-xmlrpc 2018-11-11 16:39:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2709