Created attachment 1797206
fluentd conf file

Description of problem:
We are not able to send logs to an external syslog server over TLS by using a ClusterLogForwarder instance, while the same configuration worked and delivered logs with the 5.0.x operator. When I switched to TCP (without TLS), I received the buffered logs, including those from the TLS pipeline; I can tell them apart because I used a different label in the ClusterLogForwarder file for the TLS configuration.

Version-Release number of selected component (if applicable):
[root@bastion rsyslog]# oc version
Client Version: 4.8.0-rc.1
Server Version: 4.8.0-rc.1
Kubernetes Version: v1.21.0-rc.0+766a5fe
[root@bastion rsyslog]#
[root@bastion ~]# oc get csv
NAME                              DISPLAY                            VERSION    REPLACES   PHASE
cluster-logging.5.1.0-53          Cluster Logging                    5.1.0-53              Succeeded
elasticsearch-operator.5.1.0-74   OpenShift Elasticsearch Operator   5.1.0-74              Succeeded

How reproducible:

Configuration of external syslog server:
global(
  DefaultNetstreamDriverCAFile="/root/log-forward-test/rsyslog/tls/ca.pem"
  DefaultNetstreamDriverCertFile="/root/log-forward-test/rsyslog/tls/server.crt"
  DefaultNetstreamDriverKeyFile="/root/log-forward-test/rsyslog/tls/server.key"
)
module(
  load="imtcp"
  StreamDriver.Name = "gtls"
  StreamDriver.Mode = "1"
  StreamDriver.AuthMode = "anon"
)
input(
  type="imtcp"
  port="6514"
)

Same logs from all fluentd pods:
[root@bastion rsyslog.d]# oc get po
NAME                                            READY   STATUS             RESTARTS   AGE
cluster-logging-operator-65467484b9-jbhwt       1/1     Running            0          25h
elasticsearch-cdm-znqpbf3d-1-567f57d66f-75s5w   2/2     Running            0          25h
elasticsearch-cdm-znqpbf3d-2-bcd665fb-qkltr     2/2     Running            0          25h
elasticsearch-cdm-znqpbf3d-3-5558df5b79-j8dw9   2/2     Running            0          25h
elasticsearch-im-app-27087210-8ddgl             0/1     Completed          0          12m
elasticsearch-im-audit-27087210-wl862           0/1     Completed          0          12m
elasticsearch-im-infra-27087210-4brk4           0/1     Completed          0          12m
fluentd-5wmr5                                   1/1     Running            0          51m
fluentd-l9p7l                                   1/1     Running            0          51m
fluentd-m9vsz                                   1/1     Running            0          50m
fluentd-mgwmw                                   1/1     Running            0          51m
fluentd-pchjs                                   1/1     Running            0          51m
fluentd-wt4md                                   1/1     Running            0          50m
kibana-57968d8769-9jz2k                         2/2     Running            0          25h
stress-test-cpustresstest-cd4b5699f-62zp5       0/1     ImagePullBackOff   0          8h
[root@bastion rsyslog.d]# oc logs fluentd-5wmr5
Setting each total_size_limit for 3 buffers to 2126773248 bytes
Setting queued_chunks_limit_size for each buffer to 253
Setting chunk_limit_size for each buffer to 8388608
[root@bastion rsyslog.d]# oc logs fluentd-pchjs
Setting each total_size_limit for 3 buffers to 2126773248 bytes
Setting queued_chunks_limit_size for each buffer to 253
Setting chunk_limit_size for each buffer to 8388608

[root@bastion rsyslog]# oc describe secret tls-secret
Name:         tls-secret
Namespace:    openshift-logging
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
ca-bundle.crt:  3099 bytes
tls.crt:        1277 bytes
tls.key:        1704 bytes
[root@bastion rsyslog]#

TLS - CLF file
----------------------
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: rsyslog-west
    type: syslog
    syslog:
      rfc: RFC5424
      severity: informational
    url: 'tls://192.168.79.1:6514'
    secret:
      name: tls-secret
  pipelines:
  - name: syslog-west
    inputRefs:
    - infrastructure
    - application
    - audit
    outputRefs:
    - rsyslog-west
    - default
    labels:
      syslog: westtls
------------------------

TCP - CLF file
-------------------
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: rsyslog-west
    type: syslog
    syslog:
      rfc: RFC5424
      severity: informational
    url: 'udp://192.168.79.1:514'
  pipelines:
  - name: syslog-west
    inputRefs:
    - infrastructure
    outputRefs:
    - rsyslog-west
    - default
    labels:
      syslog: west
------------------------

Logs received when I switched to the TCP ClusterLogForwarder
---------------
Jul 2 10:43:28 master-2.m13lp83ocp.lnxne.boe fluentd
kind:Event#011apiVersion:audit.k8s.io/v1#011level:info#011auditID:7a4dabae-85f5-4ecd-8c64-faa8e27c2300#011stage:ResponseComplete#011requestURI:/apis/local.storage.openshift.io/v1/namespaces/openshift-local-storage/localvolumes/lv-mon#011verb:get#011user:{"username"=>"system:serviceaccount:openshift-local-storage:local-storage-admin", "uid"=>"7140fda0-0b09-49fa-bba7-06b3db04c3d8", "groups"=>["system:serviceaccounts", "system:serviceaccounts:openshift-local-storage", "system:authenticated"], "extra"=>{"authentication.kubernetes.io/pod-name"=>["lv-mon-local-diskmaker-x5njt"], "authentication.kubernetes.io/pod-uid"=>["a756a6b0-b9cc-49c3-8251-9eb0fe6857be"]}}#011sourceIPs:["192.168.79.20"]#011userAgent:diskmaker/v0.0.0 (linux/s390x) kubernetes/$Format#011objectRef:{"resource"=>"localvolumes", "namespace"=>"openshift-local-storage", "name"=>"lv-mon", "apiGroup"=>"local.storage.openshift.io", "apiVersion"=>"v1"}#011responseStatus:{"code"=>200}#011requestReceivedTimestamp:2021-07-02T10:29:26.473737Z#011stageTimestamp:2021-07-02T10:29:26.493566Z#011annotations:{"authorization.k8s.io/decision"=>"allow", "authorization.k8s.io/reason"=>"RBAC: allowed by RoleBinding \"local-storage-operator.4.6.0-202103010126.p0-local-s-785d857cbd/openshift-local-storage\" of Role \"local-storage-operator.4.6.0-202103010126.p0-local-s-785d857cbd\" to ServiceAccount \"local-storage-admin/openshift-local-storage\""}#011k8s_audit_level:Metadata#011message:#011hostname:master-2.m13lp83ocp.lnxne.boe#011pipeline_metadata:{"collector"=>{"ipaddr4"=>"192.168.79.23", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2021-07-02T10:29:26.500461+00:00", "version"=>"1.7.4 1.6.0"}}#011@timestamp:2021-07-02T10:29:26.473737+00:00#011viaq_index_name:audit-write#011viaq_msg_id:NWNkNGM3NzQtMjc2ZC00ZjZmLTk1YmMtNTlkNjcxNDNlMzlm#011openshift:{"labels"=>{"syslog"=>"westtls"}} . . . . . . 
Jul 2 10:44:08 worker-1.m13lp83ocp.lnxne.boe fluentd _STREAM_ID:f508be6ad5694ed7ad246dd65ecba6da#011_SYSTEMD_INVOCATION_ID:b2b57df83bc541a099b53663118139bc#011systemd:{"t"=>{"BOOT_ID"=>"e6a07c4fb99542aea11274f36341278e", "CAP_EFFECTIVE"=>"ffffffffff", "CMDLINE"=>"kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --runtime-cgroups=/system.slice/crio.service --node-labels=node-role.kubernetes.io/worker,node.openshift.io/os_id=rhcos --node-ip=192.168.79.25 --minimum-container-ttl-duration=6m0s --volume-plugin-dir=/etc/kubernetes/kubelet-plugins/volume/exec --cloud-provider= --pod-infra-container-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2c2a1c5f73eb01d39a5584338c1988098d71b1bc51ca2d002e80751419180f54 --system-reserved=cpu=500m,memory=1Gi --v=2", "COMM"=>"kubelet", "EXE"=>"/usr/bin/kubelet", "GID"=>"0", "MACHINE_ID"=>"43f75d2b3b62419b8367ba46d2a2a007", "PID"=>"1678", "SELINUX_CONTEXT"=>"system_u:system_r:container_runtime_t:s0", "STREAM_ID"=>"f508be6ad5694ed7ad246dd65ecba6da", "SYSTEMD_CGROUP"=>"/system.slice/kubelet.service", "SYSTEMD_INVOCATION_ID"=>"b2b57df83bc541a099b53663118139bc", "SYSTEMD_SLICE"=>"system.slice", "SYSTEMD_UNIT"=>"kubelet.service", "TRANSPORT"=>"stdout", "UID"=>"0"}, "u"=>{"SYSLOG_FACILITY"=>"3", "SYSLOG_IDENTIFIER"=>"hyperkube"}}#011level:info#011message:W0702 10:44:06.861641 1678 conversion.go:111] Could not get instant cpu stats: cumulative stats decrease#011hostname:worker-1.m13lp83ocp.lnxne.boe#011pipeline_metadata:{"collector"=>{"ipaddr4"=>"192.168.79.25", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2021-07-02T10:44:07.259577+00:00", "version"=>"1.7.4 
1.6.0"}}#011@timestamp:2021-07-02T10:44:06.866710+00:00#011viaq_index_name:infra-write#011viaq_msg_id:NTk5YmM3MzktNWZhOC00ZmQ3LTkxNDEtN2UyYzM2ZWFiZWQw#011openshift:{"labels"=>{"syslog"=>"west"}}
----------------

The attached fluentd.conf file was extracted with the command below.
oc extract cm/fluentd

Additional info:
Please let me know if must-gather data is required.
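
One way to check the rsyslog TLS listener independently of fluentd is a small client like the sketch below. It is not part of the original report. It builds one RFC 5424 line, frames it with octet counting (the framing RFC 5425 specifies for syslog over TLS, which rsyslog's imtcp is expected to accept by default), and sends it to the host and port from the TLS ClusterLogForwarder above. The function names, the app name "clf-tls-check", the hostname "bastion", and the local CA file path "ca-bundle.crt" are assumptions for illustration. Hostname checking is disabled on the assumption that the server certificate may not carry an IP SAN for 192.168.79.1; the certificate chain is still verified against the CA file.

```python
import socket
import ssl
from datetime import datetime, timezone


def rfc5424_message(hostname, app_name, msg, facility=1, severity=6, timestamp=None):
    """Build one RFC 5424 syslog line. PRI is facility * 8 + severity."""
    pri = facility * 8 + severity
    ts = (timestamp or datetime.now(timezone.utc)).isoformat()
    # PROCID, MSGID and STRUCTURED-DATA are left as the nil value "-".
    return f"<{pri}>1 {ts} {hostname} {app_name} - - - {msg}"


def frame(message):
    """Octet-counting framing: '<byte length> <message>'."""
    payload = message.encode("utf-8")
    return str(len(payload)).encode("ascii") + b" " + payload


def send_over_tls(host, port, ca_file, message, timeout=5.0):
    """Send one framed message over TLS and return the negotiated TLS version."""
    context = ssl.create_default_context(cafile=ca_file)
    # Assumption: the target is addressed by IP and the certificate may lack
    # an IP SAN, so only the chain is verified, not the hostname.
    context.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw) as tls:
            tls.sendall(frame(message))
            return tls.version()


if __name__ == "__main__":
    # Host and port come from the TLS ClusterLogForwarder in this report;
    # the CA file path is a hypothetical local copy of the secret's CA bundle.
    line = rfc5424_message("bastion", "clf-tls-check", "test message over TLS")
    print(send_over_tls("192.168.79.1", 6514, "ca-bundle.crt", line))
```

If this client's message shows up on the rsyslog side while fluentd's do not, the receiver configuration is probably fine and the problem is more likely on the collector side.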
From a discussion with Anping: There is a potential fix in flight: https://github.com/openshift/cluster-logging-operator/pull/1083/files
Closing this: issues against 5.x should be opened as Bug tickets at https://issues.redhat.com/browse/LOG. Please use Bugzilla only if the affected version is 4.6.z.