Bug 1686941 - Newly introduced log-rotation in fluentd not working
Summary: Newly introduced log-rotation in fluentd not working
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.11.z
Assignee: Rich Megginson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On: 1684048 1684210
Blocks:
Reported: 2019-03-08 17:14 UTC by Rich Megginson
Modified: 2019-04-11 05:38 UTC (History)

Fixed In Version: ose-logging-fluentd:v3.11.97-1
Doc Type: Bug Fix
Doc Text:
Cause: The files that implement the new log rotation functionality were not being copied to the correct fluentd directory.
Consequence: Fluentd was not using log rotation and its log files were not being rotated.
Fix: Change the container build to inspect the fluentd gem to find out where to install the files.
Result: The files that implement log rotation are copied to the correct directory for fluentd to use.
Clone Of: 1684210
Environment:
Last Closed: 2019-04-11 05:38:40 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift origin-aggregated-logging pull 1545 | None | closed | Bug 1686941 - Newly introduced log-rotation in fluentd not working | 2019-11-19 10:05:54 UTC
Red Hat Product Errata | RHBA-2019:0636 | None | None | None | 2019-04-11 05:38:45 UTC

Description Rich Megginson 2019-03-08 17:14:43 UTC
+++ This bug was initially created as a clone of Bug #1684210 +++

+++ This bug was initially created as a clone of Bug #1684048 +++

Description of problem:

In https://access.redhat.com/errata/RHBA-2018:3748 (https://access.redhat.com/errata/RHBA-2018:3750 for 3.10 and https://access.redhat.com/errata/RHBA-2018:3743 for 3.11) we introduced a change that affects how `fluentd` handles its own logging.

Instead of writing its logs to STDOUT, `fluentd` now writes to `/var/log/fluentd/fluentd.log` by default.

The following environment variables were added to control the log file location, the log file size and the log file age (log rotation):

LOGGING_FILE_PATH
LOGGING_FILE_AGE
LOGGING_FILE_SIZE

The problem is that LOGGING_FILE_AGE and LOGGING_FILE_SIZE do not appear to work. `/var/log/fluentd/fluentd.log` keeps growing without bound and is never rotated:

# oc get ds logging-fluentd -o yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: 2019-02-27T22:54:11Z
  generation: 1
  labels:
    component: fluentd
    logging-infra: fluentd
    provider: openshift
  name: logging-fluentd
  namespace: logging
  resourceVersion: "90526"
  selfLink: /apis/extensions/v1beta1/namespaces/logging/daemonsets/logging-fluentd
  uid: 98b50d1a-3ae2-11e9-9e92-fa163e58e771
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      component: fluentd
      provider: openshift
  template:
    metadata:
      creationTimestamp: null
      labels:
        component: fluentd
        logging-infra: fluentd
        provider: openshift
      name: fluentd-elasticsearch
    spec:
      containers:
      - env:
        - name: K8S_HOST_URL
          value: https://kubernetes.default.svc.cluster.local
        - name: ES_HOST
          value: logging-es
        - name: ES_PORT
          value: "9200"
        - name: ES_CLIENT_CERT
          value: /etc/fluent/keys/cert
        - name: ES_CLIENT_KEY
          value: /etc/fluent/keys/key
        - name: ES_CA
          value: /etc/fluent/keys/ca
        - name: OPS_HOST
          value: logging-es
        - name: OPS_PORT
          value: "9200"
        - name: OPS_CLIENT_CERT
          value: /etc/fluent/keys/ops-cert
        - name: OPS_CLIENT_KEY
          value: /etc/fluent/keys/ops-key
        - name: OPS_CA
          value: /etc/fluent/keys/ops-ca
        - name: JOURNAL_SOURCE
        - name: JOURNAL_READ_FROM_HEAD
        - name: BUFFER_QUEUE_LIMIT
          value: "32"
        - name: BUFFER_SIZE_LIMIT
          value: 8m
        - name: FLUENTD_CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: fluentd-elasticsearch
              divisor: "0"
              resource: limits.cpu
        - name: FLUENTD_MEMORY_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: fluentd-elasticsearch
              divisor: "0"
              resource: limits.memory
        - name: FILE_BUFFER_LIMIT
          value: 256Mi
        image: registry.access.redhat.com/openshift3/logging-fluentd:v3.9.68
        imagePullPolicy: IfNotPresent
        name: fluentd-elasticsearch
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 512Mi
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /run/log/journal
          name: runlogjournal
        - mountPath: /var/log
          name: varlog
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /etc/fluent/configs.d/user
          name: config
          readOnly: true
        - mountPath: /etc/fluent/keys
          name: certs
          readOnly: true
        - mountPath: /etc/docker-hostname
          name: dockerhostname
          readOnly: true
        - mountPath: /etc/localtime
          name: localtime
          readOnly: true
        - mountPath: /etc/sysconfig/docker
          name: dockercfg
          readOnly: true
        - mountPath: /etc/docker
          name: dockerdaemoncfg
          readOnly: true
        - mountPath: /etc/origin/node
          name: originnodecfg
          readOnly: true
        - mountPath: /var/lib/fluentd
          name: filebufferstorage
      dnsPolicy: ClusterFirst
      nodeSelector:
        logging-infra-fluentd: "true"
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: aggregated-logging-fluentd
      serviceAccountName: aggregated-logging-fluentd
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /run/log/journal
          type: ""
        name: runlogjournal
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - configMap:
          defaultMode: 420
          name: logging-fluentd
        name: config
      - name: certs
        secret:
          defaultMode: 420
          secretName: logging-fluentd
      - hostPath:
          path: /etc/hostname
          type: ""
        name: dockerhostname
      - hostPath:
          path: /etc/localtime
          type: ""
        name: localtime
      - hostPath:
          path: /etc/sysconfig/docker
          type: ""
        name: dockercfg
      - hostPath:
          path: /etc/origin/node
          type: ""
        name: originnodecfg
      - hostPath:
          path: /etc/docker
          type: ""
        name: dockerdaemoncfg
      - hostPath:
          path: /var/lib/fluentd
          type: ""
        name: filebufferstorage
  templateGeneration: 1
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

No changes were applied to the DaemonSet, so the default values documented at https://github.com/openshift/origin-aggregated-logging/tree/master/fluentd#configuration should apply.
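
One way to confirm that the DaemonSet really does not override the rotation settings is to search its definition for the `LOGGING_FILE_*` variables. The following is a minimal sketch only: the helper name `list_logging_overrides` is hypothetical, and it does a plain text match on the YAML rather than parsing it.

```shell
#!/bin/sh
# Hypothetical helper: read a DaemonSet definition on stdin and print the
# names of any LOGGING_FILE_* variables that appear in it, one per line.
# Prints nothing when no such variable is present.
list_logging_overrides() {
    # "|| true" keeps the pipeline quiet when grep finds no match.
    { grep -o 'LOGGING_FILE_[A-Z]*' || true; } | sort -u
}

# Usage against a live cluster (needs oc and access to the logging namespace):
#   oc get ds logging-fluentd -n logging -o yaml | list_logging_overrides
```

For the DaemonSet shown above this prints nothing, which is consistent with the defaults being in effect.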

# ls -al /var/log/fluentd/fluentd.log
-rw-r--r--. 1 root root 21993032 Feb 28 05:15 /var/log/fluentd/fluentd.log 

Version-Release number of selected component (if applicable):

 + registry.access.redhat.com/openshift3/logging-fluentd:v3.9.68

How reproducible:

 + Always

Steps to Reproduce:
1. Install an OpenShift Container Platform 3.9 cluster at the latest available version
2. Deploy the Aggregated Logging stack
3. Check whether /var/log/fluentd/fluentd.log is created and then rotated when it reaches the configured size (default value is 1 MB)

Actual results:

# ls -al /var/log/fluentd/fluentd.log
-rw-r--r--. 1 root root 21993032 Feb 28 05:15 /var/log/fluentd/fluentd.log 

The file has not been rotated even though it has grown to about 22 MB (21993032 bytes).

Expected results:

/var/log/fluentd/fluentd.log is rotated when it reaches 1 MB (the default) or the value set in `LOGGING_FILE_SIZE`.
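
The expected behaviour can be checked on a node with a small script that compares the current log file size against the limit. This is a sketch under assumptions: the helper name `exceeds_limit` is hypothetical, and 1024000 bytes is assumed as the byte value behind the "1 MB" default, which should be checked against the fluentd README.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the file is larger than the limit,
# 1 if it is not, and 2 if the file does not exist.
# $1 = log file path, $2 = size limit in bytes.
exceeds_limit() {
    [ -f "$1" ] || return 2
    # tr strips the padding that some wc implementations add.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -gt "$2" ]
}

LOG_FILE="${LOGGING_FILE_PATH:-/var/log/fluentd/fluentd.log}"
# Assumed default; override with LOGGING_FILE_SIZE if yours differs.
LIMIT="${LOGGING_FILE_SIZE:-1024000}"

if exceeds_limit "$LOG_FILE" "$LIMIT"; then
    echo "$LOG_FILE is larger than $LIMIT bytes: rotation does not appear to be working"
fi
```

A file that stays well above the limit across several checks, as in the actual results above, indicates that rotation is not taking place.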

Additional info:

Not tested or verified, but the issue potentially also affects OpenShift Container Platform 3.10 and 3.11.

--- Additional comment from Rich Megginson on 2019-02-28 16:15:25 UTC ---

Could not reproduce with 3.11 - seems to be working fine.  Will try 3.9

--- Additional comment from Rich Megginson on 2019-02-28 17:16:44 UTC ---

3.9 is definitely not working - but logging code is the same between 3.9 and 3.11 - strange . . .

--- Additional comment from Rich Megginson on 2019-02-28 17:50:38 UTC ---

https://github.com/openshift/origin-aggregated-logging/pull/1529

--- Additional comment from Rich Megginson on 2019-03-08 17:10:54 UTC ---

https://github.com/openshift/origin-aggregated-logging/pull/1544

Comment 1 Rich Megginson 2019-03-08 17:20:49 UTC
https://github.com/openshift/origin-aggregated-logging/pull/1545

Note - although we have not seen this issue in 3.11 to my knowledge, the image building code is still wrong and should be fixed
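
For illustration, the approach named in the Doc Text (inspecting the fluentd gem to find out where to install the files) could look roughly like the sketch below. It is not taken from the pull request. It assumes the target is the gem's `lib` directory, identified by the presence of `fluent/log.rb`, and the helper name `fluentd_lib_dir` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a list of installed gem file paths on stdin and
# print the lib directory that contains fluent/log.rb (first match only).
fluentd_lib_dir() {
    sed -n 's|/lib/fluent/log\.rb$|/lib|p' | head -n 1
}

# Usage during an image build (needs rubygems and an installed fluentd gem):
#   FLUENTD_LIB=$(gem contents fluentd | fluentd_lib_dir)
#   echo "install the log rotation files under: $FLUENTD_LIB"
```

Deriving the directory from the installed gem avoids hard-coding a path that can differ between image versions, which is one plausible reason the same logging code behaved differently on 3.9 and 3.11.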

Comment 3 Qiaoling Tang 2019-03-22 07:11:27 UTC
Verified in ose-logging-fluentd/images/v3.11.98-2

Comment 5 errata-xmlrpc 2019-04-11 05:38:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0636

