Bug 1439504 - Kibana show raw json in "message" field for pods messages in json format
Summary: Kibana show raw json in "message" field for pods messages in json format
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.5.0
Hardware: Unspecified
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: 3.5.z
Assignee: Rich Megginson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks: 1459385
 
Reported: 2017-04-06 06:38 UTC by Peng Li
Modified: 2020-12-14 08:27 UTC (History)
CC: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When moving to use the journald log driver instead of the json-file log driver, the code that parses the journald k8s records was not preserving the "message" field created by the k8s meta filter plugin. Consequence: The raw json message was put back into the "message" field. Fix: Preserve the "message" field created when the k8s meta filter parses the JSON blob. Result: The parsed "message" field is preserved.
Clone Of:
Cloned As: 1459385
Environment:
Last Closed: 2017-10-25 13:00:48 UTC
Target Upstream Version:
Embargoed:


Attachments
kibana screenshot (263.38 KB, image/png)
2017-04-06 06:38 UTC, Peng Li


Links
Red Hat Product Errata RHBA-2017:3049 (normal, SHIPPED_LIVE): OpenShift Container Platform 3.6, 3.5, and 3.4 bug fix and enhancement update (last updated 2017-10-25 15:57:15 UTC)

Description Peng Li 2017-04-06 06:38:45 UTC
Created attachment 1269179 [details]
kibana screenshot

Description of problem:
For pod messages emitted in JSON format, Kibana shows the raw JSON in the "message" field. We should see the contents of the intended 'message' field, not the raw log data.


How reproducible:
always

Steps to Reproduce:
1. Deploy logging 3.5.0 using ansible and check that all pods are running
# oc get pod -n logging
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-01k6p       1/1       Running   0          34m
logging-es-y8jdd28k-1-d9h7q   1/1       Running   0          34m
logging-fluentd-1n24g         1/1       Running   0          34m
logging-kibana-1-dlq8l        2/2       Running   0          34m

2. Create a test project and a test pod
$ oc new-project penli1
$ oc new-app --docker-image=docker.io/chunyunchen/log4j2-ex:latest
$ oc logs log4j2-ex-1-jccw1
(...)
{"@version":1,"@timestamp":"2017-04-06T01:14:07.000-0400","sequence":230,"loggerClassName":"org.jboss.logmanager.Logger","loggerName":"stdout","level":"INFO","threadName":"EJB default - 3","message":"timer: Hello from control: 1491455647000","threadId":121,"mdc":{},"ndc":"","log-handler":"CONSOLE"}
(...)

3. Access Kibana in a browser and check a message; the raw JSON message is there.

{
  "_index": "project.penli1.b6c9f50d-1a86-11e7-9c00-0ec2080bfa90.2017.04.06",
  "_type": "com.redhat.viaq.common",
  "_id": "AVtBrOdwQ1k_WKF8mEc1",
  "_score": null,
  "_source": {
    "@version": 1,
    "@timestamp": "2017-04-06T05:11:29.004993+00:00",
    "sequence": 55,
    "loggerClassName": "org.jboss.logmanager.Logger",
    "loggerName": "stdout",
    "level": "6",
    "threadName": "EJB default - 5",
    "message": "{\"@version\":1,\"@timestamp\":\"2017-04-06T01:11:29.002-0400\",\"sequence\":55,\"loggerClassName\":\"org.jboss.logmanager.Logger\",\"loggerName\":\"stdout\",\"level\":\"INFO\",\"threadName\":\"EJB default - 5\",\"message\":\"timer: Hello from control: 1491455489002\",\"threadId\":123,\"mdc\":{},\"ndc\":\"\",\"log-handler\":\"CONSOLE\"}",
    "threadId": 123,
    "log-handler": "CONSOLE",
    "docker": {
      "container_id": "88e8beaa2284c7306423b862490b7c5bd0a5471bd157803116698094d5bd3525"
    },

Actual results:
Raw JSON is displayed in the "message" field.

Expected results:
Should see the contents of the intended 'message' field, not the raw log data in 'message'.

Version:
logging-kibana             3.5.0               7432ef245533        7 days ago          342.4 MB
logging-curator            3.5.0               6613f25bd818        7 days ago          211.1 MB
logging-elasticsearch      3.5.0               cc9a0c53d70e        7 days ago          399.4 MB
logging-fluentd            3.5.0               1b0f125d46a5        7 days ago          232.5 MB
logging-auth-proxy         3.5.0               139f7943475e        11 weeks ago        220 MB

Additional info:
screenshot attached.

Comment 1 Peter Portante 2017-04-06 12:34:23 UTC
I don't think this is a bug, as there is no facility present to process log4j data payloads in log messages, is there?

Comment 2 Rich Megginson 2017-04-06 14:45:47 UTC
This behavior is controlled by the `preserve_json_log` setting in the kubernetes metadata filter plugin: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/blob/master/lib/fluent/plugin/filter_kubernetes_metadata.rb#L40

By default it is `true` which means the `message` field JSON blob will be preserved.  origin logging does not set this value: https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/configs.d/openshift/filter-k8s-meta.conf

which means we use the default `true`.
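
For illustration, a minimal sketch of what overriding that default could look like in a fluentd filter stanza (the match pattern is illustrative, not the exact origin configuration):

<filter kubernetes.**>
  @type kubernetes_metadata
  # when false, the plugin still parses the JSON blob and merges its keys
  # into the record, but does not keep the raw JSON in the original field
  preserve_json_log false
</filter>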

Note that this is the way it has always worked, so this isn't anything new, and it isn't a regression.

If you all agree, I would like to close this as NOTABUG, or perhaps move to a doc issue.

Comment 4 Rich Megginson 2017-04-06 15:30:03 UTC
(In reply to Peter Portante from comment #3)
> I think it is already trying to parse JSON, no? 
> https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/blob/
> master/lib/fluent/plugin/filter_kubernetes_metadata.rb#L327

Yes, it is.  From https://bugzilla.redhat.com/show_bug.cgi?id=1439504#c0:

  "_source": {
    "@version": 1,
    "@timestamp": "2017-04-06T05:11:29.004993+00:00",
    "sequence": 55,
    "loggerClassName": "org.jboss.logmanager.Logger",
    "loggerName": "stdout",
    "level": "6",
    "threadName": "EJB default - 5",
    "message": "{\"@version\":1,\"@timestamp\":\"2017-04-06T01:11:29.002-0400\",\"sequence\":55,\"loggerClassName\":\"org.jboss.logmanager.Logger\",\"loggerName\":\"stdout\",\"level\":\"INFO\",\"threadName\":\"EJB default - 5\",\"message\":\"timer: Hello from control: 1491455489002\",\"threadId\":123,\"mdc\":{},\"ndc\":\"\",\"log-handler\":\"CONSOLE\"}",

Fields such as "sequence", "loggerClassName", etc. are being added by the k8s meta plugin, by parsing the "message" field JSON blob, and adding the fields found there as top level fields in the log record.

It has always worked this way.

Comment 5 Peter Portante 2017-04-06 15:44:35 UTC
Huh, so then why are we not using https://github.com/ViaQ/elasticsearch-templates/blob/master/namespaces/pipeline_metadata.yml#L64 to store the original field?

Comment 6 Rich Megginson 2017-04-06 15:54:07 UTC
(In reply to Peter Portante from comment #5)
> Huh, so then why are we not using
> https://github.com/ViaQ/elasticsearch-templates/blob/master/namespaces/
> pipeline_metadata.yml#L64 to store the original field?

Sounds like a good RFE for 3.6:
* before the k8s plugin, save the "message" field in "pipeline_metadata.original_raw_message"
* in the k8s plugin, set `preserve_json_log false`

That would mean the above record would look like this:

  "_source": {
    "@version": 1,
    "@timestamp": "2017-04-06T05:11:29.004993+00:00",
    "sequence": 55,
    "loggerClassName": "org.jboss.logmanager.Logger",
    "loggerName": "stdout",
    "level": "6",
    "threadName": "EJB default - 5",
    "message": "timer: Hello from control: 1491455489002"
    "pipeline_metadata":{
        "original_raw_message": "{\"@version\":1,\"@timestamp\":\"2017-04-06T01:11:29.002-0400\",\"sequence\":55,\"loggerClassName\":\"org.jboss.logmanager.Logger\",\"loggerName\":\"stdout\",\"level\":\"INFO\",\"threadName\":\"EJB default - 5\",\"message\":\"timer: Hello from control: 1491455489002\",\"threadId\":123,\"mdc\":{},\"ndc\":\"\",\"log-handler\":\"CONSOLE\"}",
...
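
A rough fluentd sketch of that RFE, assuming a record_transformer filter runs before the k8s meta filter (the match patterns and the flattened original_raw_message key are illustrative, not the actual origin-aggregated-logging configuration):

# stash the unparsed payload before the k8s meta filter runs
<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    original_raw_message ${record["MESSAGE"] || record["log"]}
  </record>
</filter>

# then let the k8s meta filter parse the JSON blob without keeping the raw copy
<filter kubernetes.**>
  @type kubernetes_metadata
  preserve_json_log false
</filter>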

Comment 7 Peter Portante 2017-04-06 16:18:47 UTC
yes, that would be the ideal behavior.

Comment 8 Rich Megginson 2017-04-07 01:32:09 UTC
(In reply to Peter Portante from comment #7)
> yes, that would be the ideal behavior.

https://trello.com/c/GRzsA2tX/470-parse-json-valued-log-message-field-and-put-original-unparsed-message-in-pipeline-metadata-logging-cda

Comment 9 Peng Li 2017-04-07 05:45:01 UTC
@pportant @rmeggins

It's weird, but we can see the contents of the intended 'message' field in 3.3, 3.4, and our early 3.5 testing; I've attached a screenshot with 3.4.1 logging for your reference. Though we never documented this, previous versions of Logging had this capability.

And we have a test case, created in May 2016, for this behaviour.

Comment 14 Rich Megginson 2017-04-07 15:37:59 UTC
OK. I think this is a regression, caused by the default in 3.5 being switched from the json-file log driver to the journald log driver.

Comment 15 openshift-github-bot 2017-04-09 01:53:42 UTC
Commit pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/9a6274451d850c4d3e42ba35781b607659255c02
Bug 1439504 - Kibana show raw json in "message" field for pods messages in json format

https://bugzilla.redhat.com/show_bug.cgi?id=1439504
The problem is that the k8s record transformer filter was wiping
out the "message" field added by the k8s meta filter when reading
logs from the journal.  The fix is to preserve that field if present.

Do not set openshift_release in CI test if OS_VERSION is unset.
Instead, let ansible figure out the version from the installed software.
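
A hedged sketch of the shape of that fix, assuming a record_transformer stanza that normalizes the journald/json-file fields into "message" (not the literal contents of the changed config file):

<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    # prefer a "message" value already produced by the k8s meta filter's
    # JSON parsing; otherwise fall back to the raw journald or json-file field
    message ${record["message"] || record["MESSAGE"] || record["log"]}
  </record>
</filter>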

Comment 16 Rich Megginson 2017-04-10 17:33:05 UTC
Do we need this fix in 3.5?

Comment 17 Peng Li 2017-04-10 23:23:42 UTC
(In reply to Rich Megginson from comment #16)
> Do we need this fix in 3.5?

This is a low-importance bug; logs can still be collected. I think 3.5.1 would be enough; no need to push it to 3.5.0.

Comment 18 Rich Megginson 2017-04-10 23:41:08 UTC
Then we still need a 3.5.x fix, which means the fix needs to be backported to release-1.5.

Comment 19 Rich Megginson 2017-04-10 23:43:43 UTC
https://github.com/openshift/origin-aggregated-logging/pull/375
PR for release-1.5

Comment 23 Xia Zhao 2017-06-05 10:01:43 UTC
Verified with 3.4.1 logging images; the issue was fixed with these images:

openshift3/logging-deployer           3.4.1               b0b9b462f3e1 
openshift3/logging-elasticsearch      3.4.1               0b64a8a567d5

but the original issue still reproduces with the v3.5 images, so the target release needs to be adjusted to 3.4.1:

openshift3/logging-elasticsearch      v3.5                800c0bbdf62d

Comment 24 Xia Zhao 2017-06-05 10:21:13 UTC
@Rich,

Also, I want to double check: this issue was originally reported against 3.5, and we never hit it on 3.4, so I'm confused that the images built out here are 3.4 ones instead of 3.5 ones. Could you please take a further look? Thanks!

Comment 25 Samuel Munilla 2017-06-05 17:39:21 UTC
Target release changed per xiazhao's suggestion :)

Comment 26 Rich Megginson 2017-06-05 17:48:13 UTC
(In reply to Samuel Munilla from comment #25)
> Target release changed per xiazhao's suggestion :)

This bug needs to be fixed in 3.4.x and 3.5.x.  This bz has already been acked for aos-3.5.x.  How should this bug be tracked?  For example, this fix updates the fluentd image.  Will there be an errata to release these images?  Can we have a single errata for both 3.4.x and 3.5.x images?

Comment 28 Wei Sun 2017-06-07 02:07:43 UTC
Hi Rich and Samuel, do you think we could change this bug's target release to 3.5.1 and clone a new bug for 3.4.x?

Comment 29 Xia Zhao 2017-06-07 08:44:22 UTC
The issue still exists with the v3.5 images on OCP 3.5; waiting for new 3.5 images to be built out:

openshift3/logging-elasticsearch      v3.5                800c0bbdf62d

Comment 30 Xia Zhao 2017-06-07 09:38:48 UTC
As a supplement to comment #29: the test was done on this openshift env:

# openshift version
openshift v3.5.5.23
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Comment 31 Rich Megginson 2017-06-07 22:11:37 UTC
smunilla - how do I make sure this bz is included in the next RHOSE image errata for 3.5.x?

Comment 32 Xia Zhao 2017-06-08 08:44:50 UTC
Test failed with openshift v3.5.5.24; it still shows the raw JSON in the "message" field for JSON-format pod logs like this:

{"@version":1,"@timestamp":"2017-06-08T04:38:17.001-0400","sequence":252,"loggerClassName":"org.jboss.logmanager.Logger","loggerName":"stdout","level":"INFO","threadName":"EJB default - 5","message":"timer: Hello from control: 1496911097001","threadId":145,"mdc":{},"ndc":"","log-handler":"CONSOLE"}

The expected result is to show only the content of the "message" field here:

timer: Hello from control: 1496911097001

Test env:
# openshift version
openshift v3.5.5.24
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Images tested with:
logging-elasticsearch   v3.5                6fed373197a3        16 hours ago        399.5 MB

Ansible version for logging components deployment:
openshift-ansible-playbooks-3.5.78-1.git.0.f7be576.el7.noarch

Comment 34 Peter Portante 2017-06-08 15:14:39 UTC
Does the same test work on 3.6?

The ruby code (https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/blob/master/lib/fluent/plugin/filter_kubernetes_metadata.rb#L209) seems to look for one of two fields, "MESSAGE" if use_journal is true, or "log" if use_journal is false.

The commit from 3.6 (https://github.com/openshift/origin-aggregated-logging/commit/9a6274451d850c4d3e42ba35781b607659255c02#diff-ba72fe0942c954035e59ed5405a66c97R13) seems to show a normalization of the "MESSAGE" and "log" fields into the "message" field.

So I am not sure how this is working at all.  I'll file an issue upstream as well.
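
For reference, the field the plugin reads from is governed by its use_journal option; a minimal sketch (the match pattern is illustrative):

<filter kubernetes.**>
  @type kubernetes_metadata
  # true: the plugin reads the container payload from the journald "MESSAGE" field
  # false: it reads the json-file log driver's "log" field
  use_journal true
</filter>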

Comment 35 Rich Megginson 2017-06-08 15:24:06 UTC
Please stop trying to test on 3.5

We have not yet built the 3.5 images

I will update this bug and put it on ON_QA when dev is ready for testing

Comment 36 Xia Zhao 2017-06-13 08:28:46 UTC
(In reply to Peter Portante from comment #34)
> Does the same test work on 3.6?

The 3.6 test on kibana UI was blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1458652

Comment 37 Xia Zhao 2017-07-18 09:10:04 UTC
(In reply to Peter Portante from comment #34)
> Does the same test work on 3.6?
> 
> The ruby code
> (https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/blob/
> master/lib/fluent/plugin/filter_kubernetes_metadata.rb#L209) seems to look
> for one of two fields, "MESSAGE" if use_journal is true, or "log" if
> use_journal is false.
> 
> The commit from 3.6
> (https://github.com/openshift/origin-aggregated-logging/commit/
> 9a6274451d850c4d3e42ba35781b607659255c02#diff-
> ba72fe0942c954035e59ed5405a66c97R13) seems to show a normalization of the
> "MESSAGE" and "log" fields into the "message" field.
> 
> So I am not sure how this is working at all.  I'll file an issue upstream as
> well.

Hi Peter,

Tested with the v3.6 logging stack, and the Kibana UI succeeded in showing the exact message instead of the raw JSON in the "message" field, so no issue happens here.

Test env:
# openshift version
openshift v3.6.152.0
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

The image tested with:
logging-kibana          v3.6                4832b09b1c65        2 hours ago         342.4 MB

Comment 38 Rich Megginson 2017-09-08 16:03:50 UTC
fixed in logging-fluentd-docker 3.5.0-28
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=595280

Comment 39 Anping Li 2017-09-11 08:35:15 UTC
No such issue with openshift3/logging-fluentd/images/3.5.0-28, so moving the bug to verified.

Comment 41 errata-xmlrpc 2017-10-25 13:00:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3049

