Bug 1632648 - [3.11] Record is missing kubernetes field when use '--log-driver journald' in /etc/sysconfig/docker
Summary: [3.11] Record is missing kubernetes field when use '--log-driver journald' i...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: 3.11.z
Assignee: Rich Megginson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks: 1638899 1638900 1638902
TreeView+ depends on / blocked
 
Reported: 2018-09-25 10:03 UTC by Anping Li
Modified: 2019-03-08 10:26 UTC (History)
5 users (show)

Fixed In Version: openshift3/ose-logging-fluentd:v3.11.23-1
Doc Type: Bug Fix
Doc Text:
Cause: When using docker with log-driver journald, the setting in /etc/sysconfig/docker has changed to use --log-driver journald instead of --log-driver=journald Consequence: Fluentd cannot detect that journald is being used, so assumes json-file, and cannot read any kubernetes metadata because it doesn't look for the journald CONTAINER_NAME field. The result is lots of fluentd errors. Fix: Change the way Fluentd detects the docker log driver so that it looks for --log-driver journald in addition to --log-driver=journald. Result: Fluentd can detect the docker log driver, and can correctly process Kubernetes container logs.
Clone Of:
: 1638899 (view as bug list)
Environment:
Last Closed: 2018-11-20 03:10:43 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin-aggregated-logging pull 1365 0 'None' closed [release-3.11] Bug 1632648 - Record is missing kubernetes field when Docker log driver is journald 2020-05-07 14:17:55 UTC
Red Hat Product Errata RHBA-2018:3537 0 None None None 2018-11-20 03:11:19 UTC

Description Anping Li 2018-09-25 10:03:31 UTC
Description of problem:
When the Docker log driver is journald. the fluentd couldn't get the kubernetes fields for the container logs. all container records are sent to .orphaned index


Version-Release number of selected component (if applicable):
openshift3/logging-fluentd/images/v3.10.45-2
openshift3/ose-logging-fluentd/images/v3.11.14-1

How reproducible:
Always

Steps to Reproduce:
1. Deploy Openshift using jouranld log driver
 [OSEv3:vars]
   openshift_docker_log_driver=journald
2. Deploy logging and deploy applications in project

3. List the index
 https://localhost:9200/_cat/shards
.operations.2018.09.25 0 p STARTED 1118247  1.2gb 10.129.0.16 logging-es-data-master-dxqfct3g
.orphaned.2018.09.25   0 p STARTED   12632 11.4mb 10.129.0.16 logging-es-data-master-dxqfct3g
.kibana                0 p STARTED       1  3.2kb 10.129.0.16 logging-es-data-master-dxqfct3g
.searchguard           0 p STARTED       5 32.2kb 10.129.0.16 logging-es-data-master-dxqfct3g

4. Check the fluentd debug logs

2018-09-25 05:43:40 -0400 [error]: plugin/filter_viaq_data_model.rb:398:block in add_elasticsearch_index_name_field: record cannot use elasticsearch index name type project_full: record is missing kubernetes field: {"CONTAINER_TAG"=>"360c189e376f", "systemd"=>{"t"=>{"BOOT_ID"=>"2d2638ef904a43578373ae9374bdc19f", "CAP_EFFECTIVE"=>"1fffffffff", "CMDLINE"=>"/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver journald --signature-verification=False --storage-driver overlay2 --mtu=8951 --add-registry registry.redhat.io --block-registry registry.hacker.com --insecure-registry brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888 --insecure-registry virt-openshift-05.lab.eng.nay.redhat.com:5000 --insecure-registry virt-openshift-05.lab.eng.nay.redhat.com:5001 --insecure-registry registry.reg-aws.openshift.com:443 --insecure-registry asb-registry.usersys.redhat.com:5000 --add-registry registry.access.redhat.com", "COMM"=>"dockerd-current", "EXE"=>"/usr/bin/dockerd-current", "GID"=>"0", "MACHINE_ID"=>"c8bb65be446e4c10a95db25ee0c9565d", "PID"=>"4071", "SELINUX_CONTEXT"=>"system_u:system_r:container_runtime_t:s0", "SYSTEMD_CGROUP"=>"/system.slice/docker.service", "SYSTEMD_SLICE"=>"system.slice", "SYSTEMD_UNIT"=>"docker.service", "TRANSPORT"=>"journal", "UID"=>"0"}}, "level"=>"info", "message"=>"2018-09-25 09:43:40,139 - SVTLogger - INFO - centos-logtest-76krn : 23916 : kTwY7XX9N ftazV4wq2 IusGtcLAu 7hr1kIzF8 9SozNh3tW mKUPLIrer cVjcYfuRz CKdSu73ef 4TxEPnjs0 eGz55XNc9 lqqFfEKAg 3h7DvZnmb TXO2sV521 az2NbHaUA 6ha50o5pP MVzKooeDR iMIozQxC4 jtAYFJMrX KU4e7tssG OZdppkafl ", "hostname"=>"ip-172-18-4-174.ec2.internal", "pipeline_metadata"=>{"collector"=>{"ipaddr4"=>"10.129.0.18", "ipaddr6"=>"fe80::c0ac:94ff:fe05:6fd4", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2018-09-25T09:43:40.934757+00:00", "version"=>"0.12.43 1.6.0"}}, "@timestamp"=>"2018-09-25T09:43:40.139658+00:00"}
2018-09-25 05:43:41 -0400 [error]: plugin/filter_viaq_data_model.rb:398:block in add_elasticsearch_index_name_field: record cannot use elasticsearch index name type project_full: record is missing kubernetes field: {"CONTAINER_TAG"=>"360c189e376f", "systemd"=>{"t"=>{"BOOT_ID"=>"2d2638ef904a43578373ae9374bdc19f", "CAP_EFFECTIVE"=>"1fffffffff", "CMDLINE"=>"/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver journald --signature-verification=False --storage-driver overlay2 --mtu=8951 --add-registry registry.redhat.io --block-registry registry.hacker.com --insecure-registry brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888 --insecure-registry virt-openshift-05.lab.eng.nay.redhat.com:5000 --insecure-registry virt-openshift-05.lab.eng.nay.redhat.com:5001 --insecure-registry registry.reg-aws.openshift.com:443 --insecure-registry asb-registry.usersys.redhat.com:5000 --add-registry registry.access.redhat.com", "COMM"=>"dockerd-current", "EXE"=>"/usr/bin/dockerd-current", "GID"=>"0", "MACHINE_ID"=>"c8bb65be446e4c10a95db25ee0c9565d", "PID"=>"4071", "SELINUX_CONTEXT"=>"system_u:system_r:container_runtime_t:s0", "SYSTEMD_CGROUP"=>"/system.slice/docker.service", "SYSTEMD_SLICE"=>"system.slice", "SYSTEMD_UNIT"=>"docker.service", "TRANSPORT"=>"journal", "UID"=>"0"}}, "level"=>"info", "message"=>"2018-09-25 09:43:41,140 - SVTLogger - INFO - centos-logtest-76krn : 23917 : kTwY7XX9N ftazV4wq2 IusGtcLAu 7hr1kIzF8 9SozNh3tW mKUPLIrer cVjcYfuRz CKdSu73ef 4TxEPnjs0 eGz55XNc9 lqqFfEKAg 3h7DvZnmb TXO2sV521 az2NbHaUA 6ha50o5pP MVzKooeDR iMIozQxC4 jtAYFJMrX KU4e7tssG OZdppkafl ", "hostname"=>"ip-172-18-4-174.ec2.internal", "pipeline_metadata"=>{"collector"=>{"ipaddr4"=>"10.129.0.18", "ipaddr6"=>"fe80::c0ac:94ff:fe05:6fd4", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2018-09-25T09:43:41.980706+00:00", "version"=>"0.12.43 1.6.0"}}, "@timestamp"=>"2018-09-25T09:43:41.140979+00:00"}
Actual results:
No project.${projectnane}.${uuid}.${date} index in Elastcisearch

Expected results:
The container logs are hosted in indices named project.${projectnane}.${uuid}.${date}

Additional info:

Comment 1 Jeff Cantrill 2018-09-25 12:33:25 UTC
(In reply to Anping Li from comment #0)

> 
> Version-Release number of selected component (if applicable):
> openshift3/logging-fluentd/images/v3.10.45-2
> openshift3/ose-logging-fluentd/images/v3.11.14-1

Which version are you reporting the issue against?

Comment 2 Anping Li 2018-09-25 12:47:22 UTC
@jeff,  In v3.11,  if confirmed , the bug should be cloned to v3.10.

Comment 3 Rich Megginson 2018-09-25 13:10:52 UTC
Is this a regression?

Comment 4 Anping Li 2018-09-25 13:17:27 UTC
Yes, It work prior.

Comment 5 Rich Megginson 2018-09-25 15:03:51 UTC
I cannot reproduce.  I'm using the latest available 3.11 packages/images:

# docker images|grep fluentd
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-logging-fluentd           v3.11               98f5f5f31f7d        36 hours ago        286 MB

NOTE WELL: I do not have the 3.10 image you have listed above:

openshift3/logging-fluentd/images/v3.10.45-2

I'm not sure why there would be a 3.10 and a 3.11 fluentd image on the system at the same time.  That seems suspicious to me.

Please reproduce, and please leave the system up and running for me to log into and examine.

This is what I have:

green open project.test-project.f6e6beb3-c0d2-11e8-8af8-5452005eafb9.2018.09.25 jbfxcnnXTSOxjBVg9THz7A 1 0   304 0 403.9kb 403.9kb


# oc exec -c elasticsearch $espod -- es_util --qu
ery=project.test-project.*/_search?q=kubernetes.namespace_name:test-project | jq
 . |more

  "hits": {
    "total": 374,
...
        "_index": "project.test-project.f6e6beb3-c0d2-11e8-8af8-5452005eafb9.201

Comment 6 Jeff Cantrill 2018-09-25 15:24:20 UTC
Moving to 3.11.z stream:

* see #c5 - unreproducible
* we recommend users to use JSON driver
* default cluster deployments use JSON driver
* Logging CI regularly tests this driver with no failures at present

This may have to be moved into the first z stream release if we are able to further isolate the issue.

Comment 8 Rich Megginson 2018-09-26 13:59:03 UTC
grr - the format of /etc/sysconfig/docker changed - instead of

OPTIONS=' --selinux-enabled   --log-driver=journald     --signature-verification=False'

it now uses

OPTIONS=' --selinux-enabled   --log-driver journald     --signature-verification=False'

Note the space instead of = after "--log-driver" :-(

Comment 9 openshift-github-bot 2018-09-26 16:18:31 UTC
Commits pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/b2915fb5d0f425000e1926e0a5aaa6b2b14bdea7
Bug 1632648 - Record is missing kubernetes field when Docker log driver is journald

https://bugzilla.redhat.com/show_bug.cgi?id=1632648
In /etc/sysconfig/docker, the log-driver may be specified
as --log-driver=journald or --log-driver journald delimited
by one or more whitespace

https://github.com/openshift/origin-aggregated-logging/commit/3ad88ef01ee1ee7f2ae1be30f87792fd9e3c0e48
Merge pull request #1364 from richm/bug-1632648

Bug 1632648 - Record is missing kubernetes field when Docker log driver is journald

Comment 10 Rich Megginson 2018-09-26 16:21:37 UTC
Workarounds:
- change the /etc/sysconfig/docker on each node - change it to use --log-driver=journald
- oc edit ds/logging-fluentd - add env USE_JOURNAL true e.g.

oc set env ds/logging-fluentd USE_JOURNAL=true

Comment 13 Qiaoling Tang 2018-11-02 06:00:32 UTC
Verified on logging-fluentd-v3.11.36-1.

Comment 15 errata-xmlrpc 2018-11-20 03:10:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3537


Note You need to log in before you can comment on or make changes to this bug.