Description of problem:

See https://bugzilla.redhat.com/show_bug.cgi?id=1584665 for details and graphs, especially comments 18-21. In 3.10, logging-fluentd uses more memory than the same workload in 3.9. For the purposes of that bz, the workload was 500 1K messages/second/node.

There is no single correct value: fluentd memory consumption varies with message rate, message size, number of namespaces, and Elasticsearch bulk rejects that cause buffering, and likely other factors as well. For the SVT workloads of 500-750 messages/second/node with 1K-2K messages, fluentd RSS memory usage falls in the range of 450-600 MB. The current 512MB limit causes fluentd oom_kill restarts after a period of time.

Consider giving fluentd more headroom over the 3.9 limit, since it is known that usage has increased. Perhaps 768Mi or 1Gi.

Version-Release number of selected component (if applicable):
3.10.0-0.64.0
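For anyone hitting this before a new default ships, a minimal sketch of raising the limit; the openshift_logging_fluentd_memory_limit variable and the daemonset/namespace names are assumptions based on a standard openshift-ansible logging deployment and may differ in your environment.

# Option A (hypothetical inventory snippet) - set the limit at install/upgrade time:
#   [OSEv3:vars]
#   openshift_logging_fluentd_memory_limit=768Mi

# Option B - patch the running daemonset directly; pods pick it up as they restart:
oc -n openshift-logging set resources daemonset/logging-fluentd \
    --requests=memory=768Mi --limits=memory=768Mi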
Can we verify the memory usage is stable over a period of a week or more before considering this change?

Past observations have seen OOM kills with 1 Gi limits on starter clusters.

However, raising the limits to such a level is really not healthy, as it cuts deep into node resources, making the collector too invasive.

Getting the collector to stay bounded to 256 MB of memory at most would be ideal, and obviously smaller is better.
(In reply to Peter Portante from comment #1)
> Can we verify the memory usage is stable over a period of a week or more
> before considering this change?
>
> Past observations have seen OOM kills with 1 Gi limits on starter clusters.
>
> However, raising the limits to such a level is really not healthy, as it
> cuts deep into node resources, making the collector too invasive.
>
> Getting the collector to stay bounded to 256 MB of memory at most would be
> ideal, and obviously smaller is better.

How do we do that? Decrease the queue/buffer sizes?
(In reply to Rich Megginson from comment #2)
> How do we do that? Decrease the queue/buffer sizes?

One option is to move to a different collector technology, following the likes of rsyslog (preferred, given we already have it packaged for our products), filebeat (Go based, but needs packaging work), or fluentbit (C based, but really new, untested, and not packaged for our products yet).
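Regarding the queue/buffer question in comment 2, a rough sketch of what shrinking the buffers could look like. The BUFFER_QUEUE_LIMIT/BUFFER_SIZE_LIMIT environment variables and their defaults are assumptions about the logging-fluentd image; worst-case buffer memory is roughly queue limit x chunk size per output plugin (e.g. 32 x 8m ≈ 256MB) before Ruby overhead.

# Hypothetical tuning: halve both the chunk size and queue length so each output
# can hold at most ~64MB of buffered data instead of ~256MB. The trade-off is
# earlier back-pressure and more frequent flushes when Elasticsearch is slow.
oc -n openshift-logging set env daemonset/logging-fluentd \
    BUFFER_QUEUE_LIMIT=16 BUFFER_SIZE_LIMIT=4m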
I have started a long run to address the request in comment 1 for a week's worth of data. Will report periodic data in this bz.
Targeting 3.10.z as this is not a blocker and the limit can be tweaked during deployment. Mike to soak test to confirm a value.
My long-run cluster ran rock solid for 5 days (6/14-6/19) with logging-fluentd memory usage fluctuating between 445MB and 455MB. No sign of growth outside of that range.

Shortly after midnight last night (6/20), the logging-fluentd pods started getting errors that the ES cluster is unreachable, and their buffer files filled up (process memory increased to 815MB as well). The ES pod shows Ready and there are no errors in either the ES pod log or in logging-es.log.

logging-fluentd log has:

2018-06-20 00:09:59 +0000 [warn]: Could not push logs to Elasticsearch, resetting connection and trying again. getaddrinfo: Name or service not known (SocketError)
2018-06-20 00:12:02 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2018-06-20 00:04:17 +0000 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Can not reach Elasticsearch cluster ({:host=>\"logging-es\", :port=>9200, :scheme=>\"https\", :user=>\"fluentd\", :password=>\"obfuscated\"})!" plugin_id="elasticsearch-apps"

----------------------

I can still curl the ES svc IP, though. Example:

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
logging-es   ClusterIP   172.27.80.112   <none>        9200/TCP   5d

# curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key https://172.27.80.112:9200/_cat/indices?v | grep logtest
green open project.logtest14.1f88933f-6fe9-11e8-936f-026aa585187c.2018.06.20 1 0    5421 0 11.3mb 11.3mb
green open project.logtest3.1cacd73e-6fe9-11e8-936f-026aa585187c.2018.06.19  1 0 2238483 0  4.1gb  4.1gb
<snip>

No bulk rejects:

host          bulk.completed   bulk.rejected   bulk.queue   bulk.active   bulk.queueSize
172.23.0.14   8495957          0               0            0             50

Ideas, Peter/Jeff/Rich?
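A sketch of how the getaddrinfo failure might be narrowed down from inside an affected collector pod; the component=fluentd label and the /etc/fluent/keys/* secret mount paths are assumptions about a standard deployment, and getent/curl are assumed to be present in the image.

# Grab one fluentd pod and check whether it can resolve and reach logging-es.
POD=$(oc get pods -n openshift-logging -l component=fluentd -o name | head -1)

# DNS resolution of the service name from inside the pod:
oc exec -n openshift-logging "${POD#*/}" -- getent hosts logging-es

# TLS request using the pod's own client certificates:
oc exec -n openshift-logging "${POD#*/}" -- curl -s \
    --cacert /etc/fluent/keys/ca \
    --cert /etc/fluent/keys/cert --key /etc/fluent/keys/key \
    https://logging-es:9200/_cat/health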
Curling with the logging-es service DNS name works as well:

# curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices?v | grep logtest
green open project.logtest14.1f88933f-6fe9-11e8-936f-026aa585187c.2018.06.20 1 0    5421 0 11.3mb 11.3mb
green open project.logtest3.1cacd73e-6fe9-11e8-936f-026aa585187c.2018.06.19  1 0 2238483 0  4.1gb  4.1gb
(In reply to Mike Fiedler from comment #7)
> Curling with the logging-es service DNS name works as well:
>
> # curl --connect-timeout 2 -s -k --cert ./admin-cert --key ./admin-key
> https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices?v |
> grep logtest
> green open project.logtest14.1f88933f-6fe9-11e8-936f-026aa585187c.2018.06.20 1 0 5421 0 11.3mb 11.3mb
> green open project.logtest3.1cacd73e-6fe9-11e8-936f-026aa585187c.2018.06.19 1 0 2238483 0 4.1gb 4.1gb

Does it work if you do not use -k but instead use --cacert ./admin-ca ?
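For reference, the suggested check would look roughly like the earlier command with -k swapped for --cacert (assuming admin-ca sits alongside the admin cert/key already in use):

curl --connect-timeout 2 -s --cacert ./admin-ca \
    --cert ./admin-cert --key ./admin-key \
    "https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices?v" | grep logtest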
I see master API, controller and etcd restarts. Guessing this is not logging. Digging through those pod logs.
Created attachment 1454381 [details]
Days 1-2 of fluentd memory long run

Long run of 100 messages/second per node for 6 days. This image is the Grafana memory usage graph. Initially there is high memory usage and fluctuation before it settles into a steady state. The initial fluctuation is where a higher out-of-the-box limit would be beneficial.
Created attachment 1454382 [details]
Days 3-6 of fluentd memory

Days 3-6 - steady fluentd memory usage.
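For anyone wanting to reproduce this kind of data without Grafana, a hypothetical sampling loop (not part of the original run); it assumes the collector pods carry the component=fluentd label and that ps/awk are available in the image - adjust namespace, label, and interval for your cluster.

# Record the summed fluentd RSS of each collector pod every 5 minutes.
while true; do
  for pod in $(oc get pods -n openshift-logging -l component=fluentd -o name); do
    rss_kb=$(oc exec -n openshift-logging "${pod#*/}" -- ps -eo rss=,args= \
             | awk '/fluentd/ {sum+=$1} END {print sum}')
    echo "$(date -u +%FT%TZ) ${pod} rss=${rss_kb}KB"
  done
  sleep 300
done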
Odd - so the first 8 hours saw a lot of memory growth and churn before finally tapering off and holding steady at around 310MB. Was the logging rate, message size, etc. steady the entire time? Which log driver was used - json-file or journald? What was Elasticsearch doing during this time? Do you have corresponding graphs of Elasticsearch CPU and memory usage?
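For the log-driver question, a quick check might look like the following; Docker as the runtime and a USE_JOURNAL toggle on the daemonset are both assumptions about this environment.

# On a node: which log driver is Docker actually using?
docker info --format '{{.LoggingDriver}}'

# On the cluster: is the collector configured for journald input?
oc -n openshift-logging set env daemonset/logging-fluentd --list | grep -i journal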
Created attachment 1455027 [details]
Days 6-8 of fluentd memory

Final fluentd graph after 3 more days. There is a very small uptick in RSS usage at one point, but for the most part it is very steady. I am wrapping up this run.
Did namespaces, pods, and containers get created and removed during this time period?
(In reply to Peter Portante from comment #16)
> Did namespaces, pods, and containers get created and removed during this
> time period?

Is that relevant as long as the test is consistent between pre and post runs?
https://github.com/openshift/openshift-ansible/pull/9165
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/3430907e62a697916fc65f4f43b5bcd5fc5b1f4b
bug 1590920. Bump fluent default memory to 756M

https://github.com/openshift/openshift-ansible/commit/3f3c6743cdc6f71959cad628ac8c620dba2a4ab4
Merge pull request #9165 from jcantrill/1590920

bug 1590920. Bump fluent default memory to 756M
The fix is in ose-ansible/images/v3.11.0-0.11.0.0, so moving to VERIFIED.
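A quick spot check of the shipped default on a freshly deployed cluster might look like this (the jsonpath assumes the collector is the first container in the daemonset spec):

oc -n openshift-logging get daemonset/logging-fluentd \
    -o jsonpath='{.spec.template.spec.containers[0].resources.limits.memory}{"\n"}'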
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652