Bug 1851694

Summary: Fluentd stuck in `Init:CrashLoopBackOff` when log forwarding is enabled and only Fluentd is deployed.
Product: OpenShift Container Platform
Reporter: Qiaoling Tang <qitang>
Component: Logging
Assignee: Periklis Tsirakidis <periklis>
Status: CLOSED DUPLICATE
QA Contact: Anping Li <anli>
Severity: medium
Priority: unspecified
Version: 4.5
CC: aos-bugs, periklis
Target Milestone: ---
Keywords: Regression
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2020-06-30 06:01:47 UTC
Type: Bug

Description Qiaoling Tang 2020-06-28 09:32:25 UTC
Description of problem:
Deploy a log receiver, then create a ClusterLogging instance with log forwarding enabled and only the Fluentd collector deployed; the Fluentd pods never reach the Running state:

$ oc get pod
NAME                                       READY   STATUS                  RESTARTS   AGE
cluster-logging-operator-74cc99dfd-drts4   1/1     Running                 0          5h33m
fluentd-6mlbv                              0/1     Init:CrashLoopBackOff   6          10m
fluentd-7bzpz                              0/1     Init:CrashLoopBackOff   6          10m
fluentd-7c77w                              0/1     Init:CrashLoopBackOff   6          10m
fluentd-8zd7l                              0/1     Init:Error              7          10m
fluentd-czq9n                              0/1     Init:Error              7          10m
fluentd-fhp9v                              0/1     Init:CrashLoopBackOff   6          10m
fluentd-g47c9                              0/1     Init:CrashLoopBackOff   6          10m
fluentd-ht252                              0/1     Init:CrashLoopBackOff   6          10m
fluentd-vv4l5                              0/1     Init:CrashLoopBackOff   6          10m
fluentdserver-578777544c-b5nwq             1/1     Running                 0          11m
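
For more detail on why the pods are stuck, the failing init container can be inspected; a minimal check (pod name taken from the listing above, the init container name has to be read from the describe output):

$ oc describe pod fluentd-6mlbv -n openshift-logging
$ oc logs fluentd-6mlbv -n openshift-logging -c <init-container-name>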

$ oc get clusterlogging -oyaml
  spec:
    collection:
      logs:
        fluentd: {}
        type: fluentd
    managementState: Managed
  status:
    collection:
      logs:
        fluentdStatus:
          clusterCondition:
            fluentd-6mlbv:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-7bzpz:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-7c77w:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-8zd7l:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-czq9n:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-fhp9v:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-g47c9:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-ht252:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
            fluentd-vv4l5:
            - lastTransitionTime: "2020-06-28T08:27:28Z"
              reason: PodInitializing
              status: "True"
              type: ContainerWaiting
          daemonSet: fluentd
          nodes:
            fluentd-6mlbv: ip-10-0-137-205.us-east-2.compute.internal
            fluentd-7bzpz: ip-10-0-153-51.us-east-2.compute.internal
            fluentd-7c77w: ip-10-0-142-205.us-east-2.compute.internal
            fluentd-8zd7l: ip-10-0-201-66.us-east-2.compute.internal
            fluentd-czq9n: ip-10-0-203-83.us-east-2.compute.internal
            fluentd-fhp9v: ip-10-0-183-222.us-east-2.compute.internal
            fluentd-g47c9: ip-10-0-162-142.us-east-2.compute.internal
            fluentd-ht252: ip-10-0-161-84.us-east-2.compute.internal
            fluentd-vv4l5: ip-10-0-192-161.us-east-2.compute.internal
          pods:
            failed: []
            notReady:
            - fluentd-6mlbv
            - fluentd-7bzpz
            - fluentd-7c77w
            - fluentd-8zd7l
            - fluentd-czq9n
            - fluentd-fhp9v
            - fluentd-g47c9
            - fluentd-ht252
            - fluentd-vv4l5
            ready: []
    curation: {}
    logStore: {}
    visualization: {}
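
The daemonset referenced in the status above can also be checked directly:

$ oc get daemonset fluentd -n openshift-logging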

$ oc get logforwarding -oyaml
  spec:
    outputs:
    - endpoint: fluentdserver.openshift-logging.svc:24224
      insecure: true
      name: fluentd-created-by-user
      type: forward
    pipelines:
    - inputSource: logs.app
      name: app-pipeline
      outputRefs:
      - fluentd-created-by-user
    - inputSource: logs.infra
      name: infra-pipeline
      outputRefs:
      - fluentd-created-by-user
    - inputSource: logs.audit
      name: audit-pipeline
      outputRefs:
      - fluentd-created-by-user
  status:
    lastUpdated: "2020-06-28T08:15:36Z"
    reason: ResourceName
    state: Accepted
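
The collector configuration generated for this pipeline can be reviewed as well; a minimal check, assuming the rendered fluent.conf lives in a ConfigMap named fluentd in openshift-logging (name not verified here):

$ oc get configmap fluentd -n openshift-logging -o yaml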


Version-Release number of selected component (if applicable):
$ oc get csv
NAME                                           DISPLAY                  VERSION                 REPLACES   PHASE
clusterlogging.4.5.0-202006271533.p0           Cluster Logging          4.5.0-202006271533.p0              Succeeded
elasticsearch-operator.4.5.0-202006261904.p0   Elasticsearch Operator   4.5.0-202006261904.p0              Succeeded


How reproducible:
Always

Steps to Reproduce:
1. Deploy a log receiver.
2. Create a LogForwarding instance (see the example manifest after these steps).
3. Create a ClusterLogging instance with:
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  annotations:
    clusterlogging.openshift.io/logforwardingtechpreview: enabled
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  collection:
    logs:
      type: "fluentd"
      fluentd: {}
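
For reference, the LogForwarding instance created in step 2 matches the spec shown in the `oc get logforwarding -oyaml` output above; a minimal manifest sketch, assuming the tech-preview logging.openshift.io/v1alpha1 API and the same instance name/namespace as the ClusterLogging resource:

apiVersion: "logging.openshift.io/v1alpha1"
kind: "LogForwarding"
metadata:
  # name and namespace are assumed to match the ClusterLogging instance
  name: "instance"
  namespace: "openshift-logging"
spec:
  # spec copied from the oc get logforwarding -oyaml output above
  outputs:
  - endpoint: fluentdserver.openshift-logging.svc:24224
    insecure: true
    name: fluentd-created-by-user
    type: forward
  pipelines:
  - inputSource: logs.app
    name: app-pipeline
    outputRefs:
    - fluentd-created-by-user
  - inputSource: logs.infra
    name: infra-pipeline
    outputRefs:
    - fluentd-created-by-user
  - inputSource: logs.audit
    name: audit-pipeline
    outputRefs:
    - fluentd-created-by-user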


Actual results:
The Fluentd pods are stuck in Init:CrashLoopBackOff and never become ready.

Expected results:
The Fluentd pods start successfully and reach the Running state.

Additional info:
This issue does not occur in 4.6.

Comment 1 Periklis Tsirakidis 2020-06-30 06:01:47 UTC
@Qiaoling Tang

This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1850076. The fix is awaiting verification in the parent BZ for 4.6: https://bugzilla.redhat.com/show_bug.cgi?id=1849188

*** This bug has been marked as a duplicate of bug 1850076 ***