Description of problem:
Deploy logging using rsyslog as the log collector, then check the logs in Kibana: the pipeline_metadata.collector.ipaddr4 and pipeline_metadata.collector.ipaddr6 values are not correct. The value of pipeline_metadata.collector.ipaddr4 is the node's IP, and pipeline_metadata.collector.ipaddr6 is ::ffff:node-ip.

"pipeline_metadata": {
    "collector": {
        "original_raw_message": "{\"message\": \"MERGE_JSON_LOG=true\", \"level\": \"debug\",\" Layer1\": \"layer1 0\", \"layer2\": {\"name\":\"Layer2 1\", \"tips\":\"decide by PRESERVE_JSON_LOG\"}, \"StringNumber\":\"10\", \"Number\": 10,\"foo.bar\":\"dotstring\",\"{foobar}\":\"bracestring\",\"[foobar]\":\"bracket string\", \"foo:bar\":\"colonstring\", \"empty1\":\"\", \"empty2\":{}}",
        "name": "rsyslog",
        "inputname": "imfile",
        "received_at": "2019-08-01T03:13:43.675936+00:00",
        "ipaddr4": "10.0.173.53",
        "ipaddr6": "::ffff:10.0.173.53",
        "version": "8.37.0-9.el7 "
    }
},

$ oc get pod -owide
NAME                                            READY   STATUS      RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
cluster-logging-operator-5c4ddbd945-ngrkq       1/1     Running     0          73m     10.131.0.16   ip-10-0-173-53.eu-west-2.compute.internal    <none>           <none>
curator-1564629000-5l56x                        0/1     Completed   0          8m32s   10.129.2.19   ip-10-0-139-143.eu-west-2.compute.internal   <none>           <none>
elasticsearch-cdm-r5mrwvfm-1-6c69fd7654-chz27   2/2     Running     0          56m     10.129.2.12   ip-10-0-139-143.eu-west-2.compute.internal   <none>           <none>
elasticsearch-cdm-r5mrwvfm-2-6889bb955-bkt5t    2/2     Running     0          71m     10.128.2.13   ip-10-0-152-86.eu-west-2.compute.internal    <none>           <none>
kibana-5cbd5cc9c9-4pxpg                         2/2     Running     0          56m     10.128.2.14   ip-10-0-152-86.eu-west-2.compute.internal    <none>           <none>
rsyslog-2lz5b                                   2/2     Running     0          13m     10.128.0.50   ip-10-0-157-116.eu-west-2.compute.internal   <none>           <none>
rsyslog-654vx                                   2/2     Running     0          13m     10.129.2.18   ip-10-0-139-143.eu-west-2.compute.internal   <none>           <none>
rsyslog-cvgb4                                   2/2     Running     0          13m     10.131.0.29   ip-10-0-173-53.eu-west-2.compute.internal    <none>           <none>
rsyslog-nl5n7                                   2/2     Running     0          13m     10.130.0.52   ip-10-0-161-167.eu-west-2.compute.internal   <none>           <none>
rsyslog-s5pl5                                   2/2     Running     0          13m     10.129.0.44   ip-10-0-132-129.eu-west-2.compute.internal   <none>           <none>
rsyslog-tq5np                                   2/2     Running     0          13m     10.128.2.19   ip-10-0-152-86.eu-west-2.compute.internal    <none>           <none>

$ oc get node -owide
NAME                                         STATUS   ROLES    AGE    VERSION             INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                    KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-132-129.eu-west-2.compute.internal   Ready    master   135m   v1.14.0+c569285e9   10.0.132.129   <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev
ip-10-0-139-143.eu-west-2.compute.internal   Ready    worker   127m   v1.14.0+c569285e9   10.0.139.143   <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev
ip-10-0-152-86.eu-west-2.compute.internal    Ready    worker   127m   v1.14.0+c569285e9   10.0.152.86    <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev
ip-10-0-157-116.eu-west-2.compute.internal   Ready    master   135m   v1.14.0+c569285e9   10.0.157.116   <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev
ip-10-0-161-167.eu-west-2.compute.internal   Ready    master   134m   v1.14.0+c569285e9   10.0.161.167   <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev
ip-10-0-173-53.eu-west-2.compute.internal    Ready    worker   127m   v1.14.0+c569285e9   10.0.173.53    <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev

Version-Release number of selected component (if applicable):
ose-logging-rsyslog-v4.2.0-201907311819
ose-cluster-logging-operator-v4.2.0-201907311819

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-07-31-162901   True        False         82m     Cluster version is 4.2.0-0.nightly-2019-07-31-162901

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging using rsyslog as the log collector.
2. Check the logs in Kibana.

Actual results:

Expected results:

Additional info:
"pipeline_metadata": {
    "collector": {
        "ipaddr4": "10.0.173.53",

$ oc get node -owide
NAME                                        STATUS   ROLES    AGE    VERSION             INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                    KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-173-53.eu-west-2.compute.internal   Ready    worker   127m   v1.14.0+c569285e9   10.0.173.53   <none>        Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa)   4.18.0-80.7.1.el8_0.x86_64   cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev

As you can see, the ipaddr4 field is reporting the INTERNAL-IP of the node. This value comes from the Kubernetes Downward API via the NODE_IPV4 environment variable:

- name: NODE_IPV4
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: status.hostIP

which is specified in the operator here:
https://github.com/openshift/cluster-logging-operator/blob/master/pkg/k8shandler/rsyslog.go#L349

{Name: "NODE_IPV4", ValueFrom: &v1.EnvVarSource{FieldRef: &v1.ObjectFieldSelector{FieldPath: "status.hostIP"}}},

The Downward API is described here:
https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#capabilities-of-the-downward-api

* status.hostIP - the node's IP, available since v1.7.0-alpha.1

I thought that since rsyslog is a daemonset, it would be most useful to report the IP address of the node where it is running. Which IP address are you expecting to be reported?
Note that fluentd sets NODE_IPV4 the same way:
https://github.com/openshift/cluster-logging-operator/blob/master/pkg/k8shandler/fluentd.go#L233
but then ignores it:
https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/run.sh#L71
So perhaps the bug should instead be "the fluentd pipeline_metadata.collector.ipaddr4 and pipeline_metadata.collector.ipaddr6 are not correct".
We discussed this at the logging scrum today. Since we cannot get the IPv6 address of the node, we should simply omit that field.

For rsyslog, this means we should set IPADDR6="" in the cluster-logging-operator files/rsyslog/rsyslog.sh, then, in 65-viaq-formatting.conf, do this:

if strlen(`echo $IPADDR6`) > 0 then {
    set $!pipeline_metadata!collector!ipaddr6 = `echo $IPADDR6`;
}

That way, if we figure out how to get the node IPv6 addr, we can change it in rsyslog.sh, and rsyslog will just work.

For fluentd, it is much trickier. We'll have to change the viaq plugin to also conditionally set ipaddr6:
https://github.com/ViaQ/fluent-plugin-viaq_data_model/blob/master/lib/fluent/plugin/filter_viaq_data_model.rb#L370
which is set here:
https://github.com/ViaQ/fluent-plugin-viaq_data_model/blob/master/lib/fluent/plugin/filter_viaq_data_model.rb#L206
from the environment variable. The environment variable is set here:
https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/run.sh#L72

So we'll need to change run.sh to set IPADDR6="" like rsyslog. Then, change the plugin at line 206 like this:

if ENV['IPADDR6'] && ENV['IPADDR6'].length > 0
  @ipaddr6 = ENV['IPADDR6']
else
  @ipaddr6 = nil
end

then around line 370 like this:

if @ipaddr6
  record['pipeline_metadata'][pipeline_type]['ipaddr6'] = @ipaddr6
end
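The fluentd-side change above can be sketched as a small standalone Ruby snippet. This is only an illustration of the proposed conditional logic, not the actual plugin code; the helper names (read_ipaddr6, add_collector_ipaddr6) are hypothetical, while the record layout mirrors pipeline_metadata as shown in this bug:

    # Read IPADDR6 from the environment; treat unset or empty ("") the same,
    # matching the proposal to set IPADDR6="" in run.sh when unknown.
    def read_ipaddr6(env = ENV)
      v = env['IPADDR6']
      (v && v.length > 0) ? v : nil
    end

    # Hypothetical helper: only add the ipaddr6 field when we actually
    # have a value, so the field is omitted rather than set to a bogus
    # ::ffff:<node-ip> mapped address.
    def add_collector_ipaddr6(record, pipeline_type, ipaddr6)
      return record if ipaddr6.nil?
      record['pipeline_metadata'] ||= {}
      record['pipeline_metadata'][pipeline_type] ||= {}
      record['pipeline_metadata'][pipeline_type]['ipaddr6'] = ipaddr6
      record
    end

With IPADDR6 unset or empty, the record is left untouched and Kibana simply shows no ipaddr6 field; if we later learn how to obtain the node's real IPv6 address, only the environment variable needs to change.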
Do we want to include the fix for fluentd in this bug or open a separate bug?
(In reply to Noriko Hosoi from comment #7)
> Do we want to include the fix for fluentd in this bug or open a separate bug?

I would include it in the fix for this bz.
Verified with ose-logging-rsyslog/images/v4.2.0-201908091819 and ose-logging-fluentd-v4.2.0-201908091819
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062