It appears that the master API service in 3.7 causes the fluentd k8s metadata plugin to kill fluentd.

Debugging this on starter-ca-central-1, where we have the masters running OCP 3.7 and the compute and infra nodes running OCP 3.6, the fluentd processes are not able to make progress collecting logs due to their inability to stay up and running. They restart every 5 seconds (within the container). When we comment out the use of the k8s-metadata plugin, which talks to the master API service, the fluentd process stays running without restarting.
(In reply to Peter Portante from comment #0)
> It appears that the master API service in 3.7 causes the fluentd k8s
> metadata plugin to kill fluentd.
>
> Debugging this on starter-ca-central-1, where we have the masters running
> OCP 3.7 and the compute and infra nodes running OCP 3.6, the fluentd
> processes are not able to make progress collecting logs due to their
> inability to stay up and running. They restart every 5 seconds (within the
> container). When we comment out the use of the k8s-metadata plugin, which
> talks to the master API service, the fluentd process stays running without
> restarting.

Are there any logs or error messages from fluentd that would indicate why it is restarting?
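For anyone following along, one way to pull those logs, including from the previous crashed container instance, is oc logs with --previous (the pod name below is hypothetical):

# oc logs logging-fluentd-xxxxx
# oc logs --previous logging-fluentd-xxxxx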
Turns out the fluentd process was being almost immediately OOM killed. It had a memory limit of 512Mi, but was trying to use close to 900 Mi at initial startup. This was due to the execution of the two "watcher" threads, which make the watch pods and watch namespaces API calls. Without the watchers present, fluentd only consumes around 250 Mi of memory. We turned off the watcher threads via the "watch false" parameter to the k8s metadata plugin.
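Note that an OOM kill won't show up in the fluentd logs themselves; it can be confirmed from the container status instead. A sketch (pod name hypothetical):

# oc get pod logging-fluentd-xxxxx \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
OOMKilled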
Commits pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/0a51850877d576b72ad0cf8e664af912b2dd88f8
bug 1503404. Turn of api watch to reduce fluentd memory

https://github.com/openshift/origin-aggregated-logging/commit/877d84296ce113fbafca6177612741054ed5a584
Merge pull request #751 from jcantrill/1503404_turn_of_watch

Automatic merge from submit-queue.

bug 1503404. Turn of api watch to reduce fluentd memory

This PR:
* Turns off watch completely of pod and namespaces
* Allows configuration of watch enablement via env var
* Allows configuration of cache size via env var
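With this merged, the watch and cache behavior should be tunable without editing the config file, via the env vars the PR introduces (names taken from the generated config shown further down; the daemonset name may differ per deployment). A sketch:

# oc set env daemonset/logging-fluentd \
    K8S_METADATA_WATCH=false \
    K8S_METADATA_CACHE_SIZE=1000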
The fix isn't in the latest logging-fluentd:v3.7.0-0.189.0.0. Waiting for a new image.

# oc get pod logging-fluentd-ctgv6 -o json | grep image
        "image": "openshift3/logging-fluentd:v3.7.0-0.189.0.0",

# oc exec logging-fluentd-ctgv6 cat /etc/fluent/configs.d/openshift/filter-k8s-meta.conf
  ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  include_namespace_id true
  use_journal "#{ENV['USE_JOURNAL'] || 'false'}"
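A quick way to tell whether a given image carries the fix is to look for the new watch/cache_size lines in the generated filter config; on 0.189 this prints nothing:

# oc exec logging-fluentd-ctgv6 -- grep -E 'watch|cache_size' \
    /etc/fluent/configs.d/openshift/filter-k8s-meta.conf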
Not sure what happened with the 189 build - strange - the fix is in logging-fluentd:v3.7.0-0.190.0.0
The fix (the new cache_size/watch parameters, plus include_namespace_metadata true) is in v3.7.0-0.190.0.0, and memory usage is lower than with v3.7.0-0.189.0.0.

For v3.7.0-0.190.0.0:

sh-4.2# free -m
              total        used        free      shared  buff/cache   available
Mem:           3790         778        2167           0         844        2739
Swap:             0           0           0

sh-4.2# cat /etc/fluent/configs.d/openshift/filter-k8s-meta.conf
<filter kubernetes.**>
  type kubernetes_metadata
  kubernetes_url "#{ENV['K8S_HOST_URL']}"
  cache_size "#{ENV['K8S_METADATA_CACHE_SIZE'] || '1000'}"
  watch "#{ENV['K8S_METADATA_WATCH'] || 'false'}"
  bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
  ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  include_namespace_metadata true
  use_journal "#{ENV['USE_JOURNAL'] || 'false'}"
  container_name_to_kubernetes_regexp '^(?<name_prefix>[^_]+)_(?<container_name>[^\._]+)(\.(?<container_hash>[^_]+))?_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_[^_]+_[^_]+$'
</filter>

For v3.7.0-0.189.0.0:

sh-4.2# cat /etc/fluent/configs.d/openshift/filter-k8s-meta.conf
<filter kubernetes.**>
  type kubernetes_metadata
  kubernetes_url "#{ENV['K8S_HOST_URL']}"
  bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
  ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  include_namespace_id true
  use_journal "#{ENV['USE_JOURNAL'] || 'false'}"
  container_name_to_kubernetes_regexp '^(?<name_prefix>[^_]+)_(?<container_name>[^\._]+)(\.(?<container_hash>[^_]+))?_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_[^_]+_[^_]+$'
</filter>

sh-4.2# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        806M        2.1G        836K        840M        2.6G
Swap:            0B          0B          0B
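Note that free inside the pod reports node-wide memory, not fluentd itself; assuming procps is available in the image, the process RSS can be checked directly (pod name hypothetical):

# oc exec logging-fluentd-xxxxx -- ps -e -o rss,cmd | grep fluentd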
When using mux as the crossroads, the fix is in the mux pod, so moving the bug to VERIFIED.

# oc rsh logging-fluentd-nwhpw
sh-4.2# cat /etc/fluent/configs.d/openshift/filter-k8s-meta.conf
<filter kubernetes.**>
  @type k8s_meta_filter_for_mux_client
  use_journal "#{ENV['USE_JOURNAL'] || 'false'}"
</filter>
sh-4.2# exit
exit

[root@openshift-181 ~]# oc rsh logging-mux-1-pljxb
sh-4.2# cat /etc/fluent/configs.d/openshift/filter-k8s-meta.conf
<filter kubernetes.**>
  type kubernetes_metadata
  kubernetes_url "#{ENV['K8S_HOST_URL']}"
  cache_size "#{ENV['K8S_METADATA_CACHE_SIZE'] || '1000'}"
  watch "#{ENV['K8S_METADATA_WATCH'] || 'false'}"
  bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
  ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  include_namespace_metadata true
  use_journal "#{ENV['USE_JOURNAL'] || 'false'}"
  container_name_to_kubernetes_regexp '^(?<name_prefix>[^_]+)_(?<container_name>[^\._]+)(\.(?<container_hash>[^_]+))?_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_[^_]+_[^_]+$'
</filter>
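This matches the mux topology: the kubernetes_metadata filter (with its cache and API watch cost) runs only in the central mux pod, while the per-node collectors use the lightweight mux-client filter. A quick sanity check that a collector no longer queries the API directly (pod name from above):

# oc exec logging-fluentd-nwhpw -- grep kubernetes_url \
    /etc/fluent/configs.d/openshift/filter-k8s-meta.conf \
    || echo "no direct API lookup on the collector"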
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188