Bug 1590920 - Consider increasing OOTB memory limit for logging-fluentd to 768Mi or 1Gi
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.10.0
Hardware: x86_64 Linux
Priority: unspecified   Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Jeff Cantrill
QA Contact: Mike Fiedler
Depends On:
Blocks: 1600258
Reported: 2018-06-13 12:22 EDT by Mike Fiedler
Modified: 2018-10-11 03:21 EDT
CC List: 4 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Increase the out-of-the-box memory limit for fluentd. Reason: More memory is required to support Prometheus metrics and retry queues in order to avoid periodic restarts from OOM. Result: Fluentd pods avoid OOM restarts.
Story Points: ---
Clone Of:
: 1600258 (view as bug list)
Environment:
Last Closed: 2018-10-11 03:20:33 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Days 1-2 of fluentd memory long run (86.92 KB, image/png), 2018-06-25 10:55 EDT, Mike Fiedler
Days 3-6 of fluentd memory (82.50 KB, image/png), 2018-06-25 10:56 EDT, Mike Fiedler
Day 6-8 of fluentd memory (83.55 KB, image/png), 2018-06-27 07:56 EDT, Mike Fiedler


External Trackers
Github openshift/openshift-ansible/pull/9165 (last updated 2018-07-11 15:19 EDT)
Red Hat Product Errata RHBA-2018:2652 (last updated 2018-10-11 03:21 EDT)

Description Mike Fiedler 2018-06-13 12:22:02 EDT
Description of problem:

See https://bugzilla.redhat.com/show_bug.cgi?id=1584665 for details and graphs.  Especially comments 18-21.

In 3.10, logging-fluentd uses more memory than it did for the same workload in 3.9.   For the purposes of that bz, the workload was 500 1K messages/second/node.

There is no single correct value: fluentd memory consumption varies with message rate, message size, number of namespaces, and Elasticsearch bulk rejects that cause buffering, and likely other factors as well.

For the SVT workloads of 500-750 messages/second/node with 1K-2K messages, fluentd RSS memory usage falls in the range of 450-600 MB.   The current 512MB limit causes fluentd oom_kill restarts after a period of time.

Consider giving fluentd some more headroom over the 3.9 limit since it is known that usage has increased.  Perhaps 768Mi or 1Gi.
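
For anyone who wants to raise the limit ahead of a new default, a minimal sketch of the override; the openshift_logging_fluentd_memory_limit inventory variable and the exact namespace are assumed here, and 1Gi is only an example value:

# Ansible inventory override (re-run the openshift-logging playbook afterwards):
openshift_logging_fluentd_memory_limit=1Gi

# Or patch the running daemonset directly:
oc -n openshift-logging set resources daemonset/logging-fluentd --limits=memory=1Gi --requests=memory=1Gi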


Version-Release number of selected component (if applicable): 3.10.0-0.64.0
Comment 1 Peter Portante 2018-06-13 17:56:25 EDT
Can we verify the memory usage is stable over a period of a week or more before considering this change?

Past observations have seen OOM kills with 1 Gi limits on starter clusters.

However, raising the limits to such a level is really not healthy, as it cuts deep into node resources, making the collector too invasive.

Getting the collector to stay bounded to 256 MB of memory at most would be ideal, and obviously smaller is better.
Comment 2 Rich Megginson 2018-06-13 18:16:18 EDT
(In reply to Peter Portante from comment #1)
> Can we verify the memory usage is stable over a period of a week or more
> before considering this change?
> 
> Past observations have seen OOM kills with 1 Gi limits on starter clusters.
> 
> However, raising the limits to such a level is really not healthy, as it
> cuts deep into node resources, making the collector too invasive.
> 
> Getting the collector to stay bounded to 256 MB of memory at most would be
> ideal, and obviously smaller is better.

How do we do that?  Decrease the queue/buffer sizes?
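
(For reference, a rough sketch of what shrinking the collector's buffering could look like; BUFFER_QUEUE_LIMIT and BUFFER_SIZE_LIMIT are assumed to be the tunables exposed by the logging-fluentd image, and the values are purely illustrative:)

# Reduce fluentd's in-memory buffering via the daemonset environment:
oc -n openshift-logging set env daemonset/logging-fluentd BUFFER_QUEUE_LIMIT=16 BUFFER_SIZE_LIMIT=1m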
Comment 3 Peter Portante 2018-06-13 21:17:26 EDT
(In reply to Rich Megginson from comment #2)
> How do we do that?  Decrease the queue/buffer sizes?

One option is to move to a different collector technology, such as rsyslog (preferred, given we already have it packaged for our products), filebeat (Go-based, but needs packaging work), or fluentbit (C-based, but very new, untested, and not yet packaged for our products).
Comment 4 Mike Fiedler 2018-06-14 14:30:29 EDT
I have started a long run to address the request in comment 1 for a week's worth of data.   I will report periodic data in this bz.
Comment 5 Jeff Cantrill 2018-06-14 14:35:47 EDT
Targeting 3.10.z as this is not a blocker and the limit can be tweaked during deployment.  Mike will soak test to confirm a value.
Comment 6 Mike Fiedler 2018-06-20 08:39:19 EDT
My long-run cluster ran rock solid for 5 days (6/14-6/19), with logging-fluentd memory usage fluctuating between 445MB and 455MB.   There was no sign of growth outside of that range.  Shortly after midnight last night (6/20), the logging-fluentd pods started getting errors that the ES cluster is unreachable, and their buffer files filled up (process memory increased to 815MB as well).

The ES pod shows Ready and there are no errors in either the ES pod log or in logging-es.log.   logging-fluentd log has:

2018-06-20 00:09:59 +0000 [warn]: Could not push logs to Elasticsearch, resetting connection and trying again. getaddrinfo: Name or service not known (SocketError)
2018-06-20 00:12:02 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-06-20 00:04:17 +0000 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Can not reach Elasticsearch cluster ({:host=>\"logging-es\", :port=>9200, :scheme=>\"https\", :user=>\"
fluentd\", :password=>\"obfuscated\"})!" plugin_id="elasticsearch-apps"

----------------------

I can still curl the ES svc IP, though.   Example:

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
logging-es   ClusterIP   172.27.80.112   <none>        9200/TCP   5d

# curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key https://172.27.80.112:9200/_cat/indices?v | grep logtest
green  open   project.logtest14.1f88933f-6fe9-11e8-936f-026aa585187c.2018.06.20   1   0       5421            0     11.3mb         11.3mb 
green  open   project.logtest3.1cacd73e-6fe9-11e8-936f-026aa585187c.2018.06.19    1   0    2238483            0      4.1gb          4.1gb 

<snip>

No bulk rejects:

host        bulk.completed bulk.rejected bulk.queue bulk.active bulk.queueSize 
172.23.0.14        8495957             0          0           0             50 
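
(The bulk statistics above would typically come from the Elasticsearch _cat/thread_pool API; a sketch of the query, reusing the cert paths and service IP from the earlier curl and assuming the column names match this ES version:)

# curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key "https://172.27.80.112:9200/_cat/thread_pool?v&h=host,bulk.completed,bulk.rejected,bulk.queue,bulk.active,bulk.queueSize"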


Ideas, Peter/Jeff/Rich?
Comment 7 Mike Fiedler 2018-06-20 10:04:08 EDT
curling with the logging-es service DNS name works as well:

# curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices?v | grep logtest
green  open   project.logtest14.1f88933f-6fe9-11e8-936f-026aa585187c.2018.06.20   1   0       5421            0     11.3mb         11.3mb 
green  open   project.logtest3.1cacd73e-6fe9-11e8-936f-026aa585187c.2018.06.19    1   0    2238483            0      4.1gb          4.1gb
Comment 9 Rich Megginson 2018-06-20 10:30:16 EDT
(In reply to Mike Fiedler from comment #7)
> curling with the logging-es service DNS name works as well:
> 
> # curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key
> https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices?v |
> grep logtest
> green  open  
> project.logtest14.1f88933f-6fe9-11e8-936f-026aa585187c.2018.06.20   1   0   
> 5421            0     11.3mb         11.3mb 
> green  open  
> project.logtest3.1cacd73e-6fe9-11e8-936f-026aa585187c.2018.06.19    1   0   
> 2238483            0      4.1gb          4.1gb

Does it work if you do not use -k but instead use --cacert ./admin-ca ?
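
(That variant would look something like the following, assuming admin-ca was extracted alongside the other certs:)

# curl --connect-timeout 2 -s --cacert ./admin-ca --cert ./admin-cert --key ./admin-key "https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices?v" | grep logtest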
Comment 10 Mike Fiedler 2018-06-20 12:20:38 EDT
I see master API, controller, and etcd restarts.  I'm guessing this is not a logging problem.  Digging through those pod logs.
Comment 11 Mike Fiedler 2018-06-25 10:55 EDT
Created attachment 1454381 [details]
Days 1-2 of fluentd memory long run

Long run of 100 messages/second per node for 6 days.   This image shows the Grafana memory usage.

Initially there is high memory usage and fluctuation before it settles into a steady state.   The initial fluctuation is where a higher OOTB limit would be beneficial.
Comment 12 Mike Fiedler 2018-06-25 10:56 EDT
Created attachment 1454382 [details]
Days 3-6 of fluentd memory

Days 3-6 - steady fluentd memory usage
Comment 13 Rich Megginson 2018-06-25 16:46:21 EDT
Odd - so the first 8 hours saw a lot of memory growth, and a lot of memory churn, before finally tapering off and holding steady at around 310MB.

Was the logging rate, message size, etc. steady the entire time?

What log driver was used - json-file or journald?

What was elasticsearch doing during this time?  Do you have corresponding graphs of elasticsearch cpu and memory usage?
Comment 15 Mike Fiedler 2018-06-27 07:56 EDT
Created attachment 1455027 [details]
Day 6-8 of fluentd memory

Final fluentd graph after 3 more days.   There is a very small uptick in RSS usage at one point, but for the most part it is very steady.   I am wrapping up this run.
Comment 16 Peter Portante 2018-06-29 08:41:37 EDT
Did namespaces, pods, and containers get created and removed during this time period?
Comment 17 Jeff Cantrill 2018-06-29 12:13:34 EDT
(In reply to Peter Portante from comment #16)
> Did namespaces, pods, and containers get created and removed during this
> time period?

Is that relevant as long as the test is consistent between pre and post runs?
Comment 19 openshift-github-bot 2018-07-13 02:12:58 EDT
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/3430907e62a697916fc65f4f43b5bcd5fc5b1f4b
bug 1590920. Bump fluent default memory to 756M

https://github.com/openshift/openshift-ansible/commit/3f3c6743cdc6f71959cad628ac8c620dba2a4ab4
Merge pull request #9165 from jcantrill/1590920

 bug 1590920. Bump fluent default memory to 756M
Comment 20 Anping Li 2018-08-06 06:08:35 EDT
The fix is in ose-ansible/images/v3.11.0-0.11.0.0, so moving to VERIFIED.
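
(A quick way to confirm the limit that actually landed on a cluster; the jsonpath assumes the single-container logging-fluentd daemonset layout used throughout this bz:)

# oc -n openshift-logging get daemonset logging-fluentd -o jsonpath='{.spec.template.spec.containers[0].resources.limits.memory}'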
Comment 22 errata-xmlrpc 2018-10-11 03:20:33 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
