Bug 1569825 - JSON payload processing of the log message payload if abused can cause logging to slow to a crawl
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.6.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.9.z
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-20 03:25 UTC by Peter Portante
Modified: 2018-06-20 15:06 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Allow users to disable JSON payload parsing Reason: Parsing each log message into JSON and attaching it to the final payload is an expensive operation Result: Fluentd can be configured to disable parsing of message payloads. This is the initial configuration change to deprecating the feature from the fluent-plugin-kubernetes_metadata_filter
Clone Of:
Environment:
Last Closed: 2018-06-06 15:46:20 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github fabric8io fluent-plugin-kubernetes_metadata_filter pull 122 0 None closed Turn off JSON payload merges by default 2020-11-23 17:07:07 UTC
Github openshift origin-aggregated-logging pull 1109 0 None closed bug 1569825. Turn off JSON parsing by default 2020-11-23 17:07:07 UTC
Red Hat Product Errata RHBA-2018:1796 0 None None None 2018-06-06 15:47:23 UTC

Description Peter Portante 2018-04-20 03:25:12 UTC
"JSON payload processing of the log message payload if abused can cause logging to slow to a crawl"

The problem stems from a feature of the k8s metadata fluentd filter [0] which will look for a JSON payload in the message field and load the fields found in the JSON document as fields of the log entry itself.

If the JSON payload is poorly structured, such that each log message can contribute a unique field name, Elasticsearch spends all of its time in "cluster state transitions" while it propagates the new fields to all the members of the cluster tracking that index.  You can see this with INFO messages in the logs like, "".

We should either turn this feature off by default, or engineer a way to ensure that the gratuitous field generation offered by Elasticsearch does not result in unique fields being generated.

[0] See "merge_json_log" description at https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter
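
To illustrate the failure mode described above, here is a minimal sketch in Python. It is a simplified model of the merge behaviour, not the plugin's actual code: the function names, the `req_N` keys, and the record shape are all hypothetical. It only shows how lifting JSON keys into the log entry lets a workload create one new field per message.

```python
import json


def merge_json_log(record, merge=True):
    """Simplified model: if merging is on and record['message'] holds a
    JSON object, copy its keys onto the log entry itself."""
    merged = dict(record)
    if not merge:
        return merged
    try:
        payload = json.loads(record.get("message", ""))
    except (ValueError, TypeError):
        # Not JSON (or not a string): leave the entry unchanged.
        return merged
    if isinstance(payload, dict):
        merged.update(payload)
    return merged


def distinct_fields(records):
    """Collect every top-level field name seen across the records."""
    fields = set()
    for rec in records:
        fields.update(rec.keys())
    return fields


# Hypothetical abusive workload: every message carries a unique key.
logs = [{"message": json.dumps({f"req_{i}": "x"})} for i in range(1000)]

with_merge = distinct_fields(merge_json_log(r) for r in logs)
without_merge = distinct_fields(merge_json_log(r, merge=False) for r in logs)

print(len(with_merge), len(without_merge))
```

With merging on, the set of field names grows with the number of messages; with merging off, only `message` is ever indexed, so no new fields need to be propagated.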

Comment 1 Rich Megginson 2018-04-20 15:20:57 UTC
Workaround:

1) edit the fluentd configmap e.g. oc edit cm/logging-fluentd

2) look for this line:

      @include configs.d/openshift/filter-k8s-meta.conf

3) replace it with this - be sure to preserve the indentation:

      <filter kubernetes.**>
        type kubernetes_metadata
        merge_json_log false
        kubernetes_url "#{ENV['K8S_HOST_URL']}"
        cache_size "#{ENV['K8S_METADATA_CACHE_SIZE'] || '1000'}"
        watch "#{ENV['K8S_METADATA_WATCH'] || 'false'}"
        bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
        ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        include_namespace_metadata true
        use_journal "#{ENV['USE_JOURNAL'] || 'false'}"
        container_name_to_kubernetes_regexp '^(?<name_prefix>[^_]+)_(?<container_name>[^\._]+)(\.(?<container_hash>[^_]+))?_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_[^_]+_[^_]+$'
      </filter>

That is, add `merge_json_log false` to the 3.6 kubernetes_metadata filter configuration.  This is from https://github.com/openshift/origin-aggregated-logging/blob/release-3.6/fluentd/configs.d/openshift/filter-k8s-meta.conf - I'm assuming this is 3.6 because the bug was filed against version 3.6.0

Comment 2 Rich Megginson 2018-04-20 15:22:37 UTC
sorry, one more step - restart fluentd

4) oc delete pod -l component=fluentd

or scale down and then back up

oc label node -l logging-infra-fluentd=true --overwrite logging-infra-fluentd=false

then wait for all fluentd pods to terminate

then

oc label node -l logging-infra-fluentd=false --overwrite logging-infra-fluentd=true

Comment 3 openshift-github-bot 2018-04-28 01:46:26 UTC
Commits pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/8be71b5f3a5bb7f7d99d43309fdfb7aaab884e22
bug 1569825. Deprecate merge_json_payload

https://github.com/openshift/origin-aggregated-logging/commit/dec1ed51474f1db4ad5dee2f3d27181660dba26d
Merge pull request #1109 from jcantrill/1569825_disable_json_parsing

bug 1569825. Turn off JSON parsing by default

Comment 5 Jeff Cantrill 2018-05-02 14:48:25 UTC
This change provides the ability to disable JSON parsing by setting an environment variable.  The default is to remain on in order to avoid surprising consumers who depend on the functionality.
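
As a rough sketch of how an environment-variable toggle with a default-on value behaves, the Python below mirrors the `ENV['NAME'] || 'default'` pattern used in the filter configuration in Comment 1. The variable name `MERGE_JSON_LOG` is an assumption for illustration (this comment does not name the variable), and the string-to-boolean conversion is deliberately simplified.

```python
import os


def resolve(name, default, env=None):
    """Return the variable's value, or the default only when it is unset
    (the same outcome as Ruby's ENV['NAME'] || 'default')."""
    env = os.environ if env is None else env
    return env.get(name, default)


def merge_enabled(env=None):
    # MERGE_JSON_LOG is an assumed name; 'true' models "default remains on".
    return resolve("MERGE_JSON_LOG", "true", env) == "true"


print(merge_enabled({}))
print(merge_enabled({"MERGE_JSON_LOG": "false"}))
```

Under these assumptions, deployments that never set the variable keep the existing behaviour, and only those that explicitly set it to `false` stop merging.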

Comment 7 Anping Li 2018-05-31 03:38:01 UTC
The JSON payload merge is turned off by default in logging-fluentd/images/v3.9.30-2

Comment 9 errata-xmlrpc 2018-06-06 15:46:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1796

Comment 10 openshift-github-bot 2018-06-20 15:06:22 UTC
Commits pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/4481abeb2c3faedd363258ca0db199993bb7b091
bug 1569825. Deprecate merge_json_payload

https://github.com/openshift/origin-aggregated-logging/commit/b6147ebc993083f1982872b702be3d465d40898e
Merge pull request #1132 from openshift-cherrypick-robot/cherry-pick-1109-to-es5.x

[es5.x] bug 1569825. Turn off JSON parsing by default

