Bug 1466005
Summary: | [DOCS] When Fluentd logger is unable to keep up with high amounts of logs, the cpu and memory limits are configurable. | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Noriko Hosoi <nhosoi> |
Component: | Documentation | Assignee: | Brandi Munilla <bmcelvee> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Junqi Zhao <juzhao> |
Severity: | medium | Docs Contact: | Vikram Goyal <vigoyal> |
Priority: | medium | ||
Version: | 3.6.0 | CC: | aos-bugs, a, bmcelvee, jcantril, jokerman, mmccomas, nhosoi, pportant, rhowe, rmeggins, xiazhao |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1445053 | Environment: | |
Last Closed: | 2017-08-09 20:33:51 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1445053 | ||
Bug Blocks: |
Description
Noriko Hosoi
2017-06-28 16:46:04 UTC
Hi Noriko and Xia,

I opened PR 4837 [1] with the new Tune Buffer Chunk Limit section. Please review when you get a chance. Thanks!

[1] https://github.com/openshift/openshift-docs/pull/4837

@Brandi, the documentation is OK, but IMHO it could be clearer. It currently says:

    The memory limit is used to calculate the Fluentd buffer_queue_limit by dividing the resource memory limit by the number of output multiplied by the buffer_chunk_size

From a user's perspective, it would be clearer to describe it like this:

    buffer_queue_limit = resource memory limit / (number of output * buffer_chunk_size)

Also, I think we should describe "number of output" more clearly; I think "number of elasticsearch pods output" is better.

@Noriko, what do you think?

@Junqi, @Brandi, sorry for this change at the last moment.

Fluentd does not use memory for the buffer queue as of OCP 3.6. It switches to file buffering to reduce memory usage and prevent data loss. On the Fluentd and Mux pods, a permanent volume /var/lib/fluentd is supposed to be prepared, e.g., by pvc or hostmount. That area is then used for the file buffers. The buffer_type and buffer_path are configured in the fluentd config files as follows:

    $ egrep "buffer_type|buffer_path" *.conf
    es-copy-config.conf: buffer_type file
    es-copy-config.conf: buffer_path '/var/lib/fluentd/buffer-es-copy-config'
    es-ops-copy-config.conf: buffer_type file
    es-ops-copy-config.conf: buffer_path '/var/lib/fluentd/buffer-es-ops-copy-config'
    output-es-config.conf: buffer_type file
    output-es-config.conf: buffer_path '/var/lib/fluentd/buffer-output-es-config'
    output-es-ops-config.conf: buffer_type file
    output-es-ops-config.conf: buffer_path '/var/lib/fluentd/buffer-output-es-ops-config'

Fluentd's buffer_chunk_limit is determined by the environment variable BUFFER_SIZE_LIMIT. The file buffer size per output is determined by the environment variable FILE_BUFFER_LIMIT. The permanent volume size has to be larger than FILE_BUFFER_LIMIT times the number of outputs.
For instance, if fluentd outputs logs to 2 Elasticsearch instances, the pod has to have more disk space than (FILE_BUFFER_LIMIT * 2). Fluentd's buffer_queue_limit is calculated as (FILE_BUFFER_LIMIT / BUFFER_SIZE_LIMIT).

@Brandi, we have to change this documentation based on Comment 5.

@Junqi, @Noriko, I updated the PR to reflect the changes requested in Comment 5. Please review for accuracy. Thanks!

(In reply to Brandi from comment #7)
> @Junqi, @Noriko,
>
> I updated the PR to reflect the changes requested Comment 5. Please review
> for accuracy.
>
> Thanks!

Ahhh, so sorry, @Brandi. I should have updated this bug the day before yesterday... The feature described in #c5 was not merged to 3.6, but deferred to 3.6.1. (;_;)

Could you please back out the fluentd changes in https://github.com/openshift/openshift-docs/pull/4837? Please keep the changes you made (I've reviewed them and added some comments) for the 3.6.1 release. Let me reset the status to ASSIGNED again (sorry...).

Since the previous version was reviewed by @Junqi (see #c4), I think we could just ack it. But if you could give me one more chance, I'd appreciate it. Thanks!

Thank you @Noriko! I saved the original changes in a separate file just in case we'll need them in the future.

And thank you for your review. I'll have the PR ready for another review in just a few minutes.

(In reply to Brandi from comment #9)
> Thank you @Noriko! I saved the original changes in a separate file just in
> case we'll need them in the future.
>
> And thank you for your review. I'll have the PR ready for another review in
> just a few minutes.

@Noriko, @Brandi

Is the PR still https://github.com/openshift/openshift-docs/pull/4837 ?
I think the documentation in PR 4837 describes the feature "Use `file` buffer instead of `memory` buffer for fluentd" (https://trello.com/c/XpreI533/509-5-use-file-buffer-instead-of-memory-buffer-for-fluentdloggingepic-ois-agl-perf), which, from Comment 5, is deferred to 3.6.1.

From Comment 8, I think the description should be like Comment 0, and I have some suggestions of my own; see Comment 4.

@Noriko, am I right?

(In reply to Junqi Zhao from comment #10)
> (In reply to Brandi from comment #9)
> > Thank you @Noriko! I saved the original changes in a separate file just in
> > case we'll need them in the future.
> >
> > And thank you for your review. I'll have the PR ready for another review in
> > just a few minutes.
>
> @Noriko, @Brandi
>
> Is The PR still https://github.com/openshift/openshift-docs/pull/4837 ?
> I think the documentation in PR 4837 is description for feature "Use `file`
> buffer instead of `memory` buffer for
> fluentd",(https://trello.com/c/XpreI533/509-5-use-file-buffer-instead-of-
> memory-buffer-for-fluentdloggingepic-ois-agl-perf), from Comment 5, it's
> deferred to 3.6.1
>
> From Comment 8, I think the description should be like Comment 0, and there
> are some advices from my side, see Comment 4.
>
> @Noriko, am I right?

Yes, you are right, @Junqi. The doc should not include the "file buffer" at all for 3.6... We have to wait for 3.6.1 to use the file buffer version of the doc. Thanks!

The documentation is wrong: "tune-buffer-chunk-limit" should not be in this file. Please see Comment 4 and Comment 11. Your original file is right; the description should match the content in Comment 0.

Documentation is fine, set it to VERIFIED

(In reply to Junqi Zhao from comment #16)
> Documentation is fine, set it to VERIFIED

+1

Thank you, @Brandi. Thank you, @Junqi.
Commit pushed to master at https://github.com/openshift/openshift-docs

https://github.com/openshift/openshift-docs/commit/e6198aea7bb54ce2be336dcaed460985ecce3292
Bug 1466005 Add Tune Buffer Chunk Limit

Thank you @Noriko and @Junqi!

Link to documentation on Customer Portal: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.6/html-single/cluster_administration/#tune-buffer-chunk-limit
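
For readers following the thread, the two sizing rules discussed above (the memory-based formula suggested in Comment 4 and the file-buffer formulas from Comment 5) can be sketched as plain arithmetic. This is only an illustration, not the actual Fluentd or OpenShift logging code: the function names are hypothetical, all sizes are assumed to already be converted to bytes, rounding down to a whole number of chunks is an assumption, and the sample values are made up for the example rather than taken from any default.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def memory_buffer_queue_limit(memory_limit, num_outputs, chunk_size):
    """Memory-based rule from Comment 4:
    buffer_queue_limit = memory limit / (number of outputs * buffer_chunk_size).
    Rounding down is an assumption made for this sketch."""
    if num_outputs <= 0 or chunk_size <= 0:
        raise ValueError("num_outputs and chunk_size must be positive")
    return memory_limit // (num_outputs * chunk_size)


def file_buffer_queue_limit(file_buffer_limit, buffer_size_limit):
    """File-buffer rule from Comment 5:
    buffer_queue_limit = FILE_BUFFER_LIMIT / BUFFER_SIZE_LIMIT.
    Rounding down is an assumption made for this sketch."""
    if buffer_size_limit <= 0:
        raise ValueError("buffer_size_limit must be positive")
    return file_buffer_limit // buffer_size_limit


def volume_size_lower_bound(file_buffer_limit, num_outputs):
    """Per Comment 5 the volume must be LARGER than this value:
    FILE_BUFFER_LIMIT * number of outputs."""
    return file_buffer_limit * num_outputs


# Hypothetical sample values, chosen only to make the arithmetic easy to follow.
memory_limit = 512 * MIB       # pod memory limit
file_buffer_limit = 256 * MIB  # FILE_BUFFER_LIMIT
chunk_size = 8 * MIB           # buffer_chunk_size / BUFFER_SIZE_LIMIT
outputs = 2                    # e.g. two Elasticsearch outputs

print(memory_buffer_queue_limit(memory_limit, outputs, chunk_size))
print(file_buffer_queue_limit(file_buffer_limit, chunk_size))
print(volume_size_lower_bound(file_buffer_limit, outputs) // MIB, "MiB")
```

Note that only the memory-based rule applies to the 3.6 documentation; per the discussion above, the file-buffer rules were deferred to 3.6.1.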