+++ This bug was initially created as a clone of Bug #1684210 +++

+++ This bug was initially created as a clone of Bug #1684048 +++

Description of problem:

In https://access.redhat.com/errata/RHBA-2018:3748 (https://access.redhat.com/errata/RHBA-2018:3750 for 3.10 and https://access.redhat.com/errata/RHBA-2018:3743 for 3.11) we introduced a change that affects how `fluentd` handles its own logging. Instead of writing logs to STDOUT, `fluentd` now writes to `/var/log/fluentd/fluentd.log` by default.

Environment variables to control the log file location, log file size and log file age (log rotation) have been made available to help control the new functionality:

LOGGING_FILE_PATH
LOGGING_FILE_AGE
LOGGING_FILE_SIZE

The problem is that LOGGING_FILE_AGE and LOGGING_FILE_SIZE don't appear to work: `/var/log/fluentd/fluentd.log` keeps growing indefinitely and never gets rotated:

# oc get ds logging-fluentd -o yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: 2019-02-27T22:54:11Z
  generation: 1
  labels:
    component: fluentd
    logging-infra: fluentd
    provider: openshift
  name: logging-fluentd
  namespace: logging
  resourceVersion: "90526"
  selfLink: /apis/extensions/v1beta1/namespaces/logging/daemonsets/logging-fluentd
  uid: 98b50d1a-3ae2-11e9-9e92-fa163e58e771
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      component: fluentd
      provider: openshift
  template:
    metadata:
      creationTimestamp: null
      labels:
        component: fluentd
        logging-infra: fluentd
        provider: openshift
      name: fluentd-elasticsearch
    spec:
      containers:
      - env:
        - name: K8S_HOST_URL
          value: https://kubernetes.default.svc.cluster.local
        - name: ES_HOST
          value: logging-es
        - name: ES_PORT
          value: "9200"
        - name: ES_CLIENT_CERT
          value: /etc/fluent/keys/cert
        - name: ES_CLIENT_KEY
          value: /etc/fluent/keys/key
        - name: ES_CA
          value: /etc/fluent/keys/ca
        - name: OPS_HOST
          value: logging-es
        - name: OPS_PORT
          value: "9200"
        - name: OPS_CLIENT_CERT
          value: /etc/fluent/keys/ops-cert
        - name: OPS_CLIENT_KEY
          value: /etc/fluent/keys/ops-key
        - name: OPS_CA
          value: /etc/fluent/keys/ops-ca
        - name: JOURNAL_SOURCE
        - name: JOURNAL_READ_FROM_HEAD
        - name: BUFFER_QUEUE_LIMIT
          value: "32"
        - name: BUFFER_SIZE_LIMIT
          value: 8m
        - name: FLUENTD_CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: fluentd-elasticsearch
              divisor: "0"
              resource: limits.cpu
        - name: FLUENTD_MEMORY_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: fluentd-elasticsearch
              divisor: "0"
              resource: limits.memory
        - name: FILE_BUFFER_LIMIT
          value: 256Mi
        image: registry.access.redhat.com/openshift3/logging-fluentd:v3.9.68
        imagePullPolicy: IfNotPresent
        name: fluentd-elasticsearch
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 512Mi
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /run/log/journal
          name: runlogjournal
        - mountPath: /var/log
          name: varlog
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /etc/fluent/configs.d/user
          name: config
          readOnly: true
        - mountPath: /etc/fluent/keys
          name: certs
          readOnly: true
        - mountPath: /etc/docker-hostname
          name: dockerhostname
          readOnly: true
        - mountPath: /etc/localtime
          name: localtime
          readOnly: true
        - mountPath: /etc/sysconfig/docker
          name: dockercfg
          readOnly: true
        - mountPath: /etc/docker
          name: dockerdaemoncfg
          readOnly: true
        - mountPath: /etc/origin/node
          name: originnodecfg
          readOnly: true
        - mountPath: /var/lib/fluentd
          name: filebufferstorage
      dnsPolicy: ClusterFirst
      nodeSelector:
        logging-infra-fluentd: "true"
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: aggregated-logging-fluentd
      serviceAccountName: aggregated-logging-fluentd
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /run/log/journal
          type: ""
        name: runlogjournal
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - configMap:
          defaultMode: 420
          name: logging-fluentd
        name: config
      - name: certs
        secret:
          defaultMode: 420
          secretName: logging-fluentd
      - hostPath:
          path: /etc/hostname
          type: ""
        name: dockerhostname
      - hostPath:
          path: /etc/localtime
          type: ""
        name: localtime
      - hostPath:
          path: /etc/sysconfig/docker
          type: ""
        name: dockercfg
      - hostPath:
          path: /etc/origin/node
          type: ""
        name: originnodecfg
      - hostPath:
          path: /etc/docker
          type: ""
        name: dockerdaemoncfg
      - hostPath:
          path: /var/lib/fluentd
          type: ""
        name: filebufferstorage
  templateGeneration: 1
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

No change was applied to the DaemonSet, which means the default values from https://github.com/openshift/origin-aggregated-logging/tree/master/fluentd#configuration should apply.

# ls -al /var/log/fluentd/fluentd.log
-rw-r--r--. 1 root root 21993032 Feb 28 05:15 /var/log/fluentd/fluentd.log

Version-Release number of selected component (if applicable):

 + registry.access.redhat.com/openshift3/logging-fluentd:v3.9.68

How reproducible:

 + Always

Steps to Reproduce:
1. Install an OpenShift Container Platform 3.9 cluster with the latest available version
2. Deploy the Aggregated Logging stack
3. Check whether /var/log/fluentd/fluentd.log is created and rotated when it reaches a specific size (default value is 1 MB)

Actual results:

# ls -al /var/log/fluentd/fluentd.log
-rw-r--r--. 1 root root 21993032 Feb 28 05:15 /var/log/fluentd/fluentd.log

The file is not rotated, even at a size of about 20 MB.

Expected results:

/var/log/fluentd/fluentd.log is rotated when it reaches 1 MB (the default), or the value set in `LOGGING_FILE_SIZE` if one is configured.

Additional info:

Not tested or verified, but the issue potentially also affects OpenShift Container Platform 3.10 and 3.11.

--- Additional comment from Rich Megginson on 2019-02-28 16:15:25 UTC ---

Could not reproduce with 3.11 - seems to be working fine. Will try 3.9

--- Additional comment from Rich Megginson on 2019-02-28 17:16:44 UTC ---

3.9 is definitely not working - but logging code is the same between 3.9 and 3.11 - strange . . .

--- Additional comment from Rich Megginson on 2019-02-28 17:50:38 UTC ---

https://github.com/openshift/origin-aggregated-logging/pull/1529

--- Additional comment from Rich Megginson on 2019-03-08 17:10:54 UTC ---

https://github.com/openshift/origin-aggregated-logging/pull/1544
https://github.com/openshift/origin-aggregated-logging/pull/1545

Note - although we have not seen this issue in 3.11 to my knowledge, the image building code is still wrong and should be fixed
Verified in ose-logging-fluentd/images/v3.11.98-2
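For anyone re-checking this on a node, the size condition from the reproduction steps can be scripted instead of eyeballing `ls -al`. The following is only a minimal sketch and not part of the fix: the function name `rotation_overdue`, the `slack` factor and the 1024000-byte limit are assumptions made for illustration (the report itself only states a default of 1 MB).

```python
import os

# Assumption: 1024000 bytes stands in for the "1 MB" default named in the report.
ASSUMED_DEFAULT_LIMIT_BYTES = 1024000


def rotation_overdue(path, limit_bytes=ASSUMED_DEFAULT_LIMIT_BYTES, slack=2.0):
    """Return True if the file at `path` is larger than slack * limit_bytes.

    `slack` is a hypothetical tolerance: a log may briefly exceed the
    configured limit before rotation, so only a clear overshoot is flagged.
    """
    size = os.path.getsize(path)
    return size > limit_bytes * slack
```

Called as `rotation_overdue("/var/log/fluentd/fluentd.log")` on an affected node, a file of the size shown in the description (21993032 bytes) would be flagged, since it is far above twice the assumed limit.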
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0636