Bug 1735548
Summary: | The pipeline_metadata.collector.ipaddr6 is not correct when using rsyslog as log collector. | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Qiaoling Tang <qitang> |
Component: | Logging | Assignee: | Jeff Cantrill <jcantril> |
Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.2.0 | CC: | aos-bugs, nhosoi, rmeggins |
Target Milestone: | --- | ||
Target Release: | 4.3.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-01-23 11:04:29 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Qiaoling Tang
2019-08-01 05:05:33 UTC
"pipeline_metadata": { "collector": { "ipaddr4": "10.0.173.53", $ oc get node -owide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIMEip-10-0-173-53.eu-west-2.compute.internal Ready worker 127m v1.14.0+c569285e9 10.0.173.53 <none> Red Hat Enterprise Linux CoreOS 42.80.20190731.2 (Ootpa) 4.18.0-80.7.1.el8_0.x86_64 cri-o://1.14.10-0.5.dev.rhaos4.2.gitcf4220b.el8-dev As you can see, the ipaddr4 field is reporting the INTERNAL-IP of the node. This value comes from the Kubernetes Downward API NODE_IPV4: - name: NODE_IPV4 valueFrom: fieldRef: apiVersion: v1 fieldPath: status.hostIP which is specified in the operator here: https://github.com/openshift/cluster-logging-operator/blob/master/pkg/k8shandler/rsyslog.go#L349 {Name: "NODE_IPV4", ValueFrom: &v1.EnvVarSource{FieldRef: &v1.ObjectFieldSelector{FieldPath: "status.hostIP"}}}, the downward api is described here: https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#capabilities-of-the-downward-api * status.hostIP - the node’s IP, available since v1.7.0-alpha.1 I thought since rsyslog is a daemonset, it would be most useful to report the IP address of the node where it is running. Which IP address are you expecting to be reported? note that fluentd does the same thing with NODE_IPV4: https://github.com/openshift/cluster-logging-operator/blob/master/pkg/k8shandler/fluentd.go#L233 but it ignores it https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/run.sh#L71 So perhaps the bug should be "the fluentd pipeline_metadata.collector.ipaddr4 and pipeline_metadata.collector.ipaddr6 are not correct" We discussed this at the logging scrum today. Since we cannot get the IPv6 address of the node, we should simply omit that field. 
For rsyslog, this means we should set IPADDR6="" in the cluster-logging-operator files/rsyslog/rsyslog.sh. Then, in 65-viaq-formatting.conf, do this:

```
if strlen(`echo $IPADDR6`) > 0 then {
    set $!pipeline_metadata!collector!ipaddr6 = `echo $IPADDR6`;
}
```

That way, if we figure out how to get the node IPv6 addr, we can change it in rsyslog.sh, and rsyslog will just work.

For fluentd, it is much trickier. We'll have to change the viaq plugin to also conditionally set ipaddr6:
https://github.com/ViaQ/fluent-plugin-viaq_data_model/blob/master/lib/fluent/plugin/filter_viaq_data_model.rb#L370
which is set here from the environment variable:
https://github.com/ViaQ/fluent-plugin-viaq_data_model/blob/master/lib/fluent/plugin/filter_viaq_data_model.rb#L206
The environment variable is set here:
https://github.com/openshift/origin-aggregated-logging/blob/master/fluentd/run.sh#L72

So we'll need to change run.sh to set IPADDR6="" like rsyslog. Then, change the plugin at line 206 like this:

```ruby
if ENV['IPADDR6'] && ENV['IPADDR6'].length > 0
  @ipaddr6 = ENV['IPADDR6']
else
  @ipaddr6 = nil
end
```

then around line 370 like this:

```ruby
if @ipaddr6
  record['pipeline_metadata'][pipeline_type]['ipaddr6'] = @ipaddr6
end
```

Do we want to include the fix for fluentd in this bug or open a separate bug?

(In reply to Noriko Hosoi from comment #7)
> Do we want to include the fix for fluentd in this bug or open a separate bug?

I would include it in the fix for this bz.

Verified with ose-logging-rsyslog/images/v4.2.0-201908091819 and ose-logging-fluentd-v4.2.0-201908091819.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
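Putting the proposed fluentd change together, here is a self-contained sketch of the omit-when-empty behavior (the helper method names are hypothetical, not the plugin's API; only the IPADDR6 env var and the record shape come from the discussion above):

```ruby
# Sketch of the proposed fix: treat an empty IPADDR6 as "not set", and only
# add the ipaddr6 field to pipeline_metadata when a real value exists.
# resolve_ipaddr6 / add_collector_ipaddr6 are illustrative names.
def resolve_ipaddr6(env = ENV)
  v = env['IPADDR6']
  v && !v.empty? ? v : nil
end

def add_collector_ipaddr6(record, pipeline_type, ipaddr6)
  # Field is added only when ipaddr6 is non-nil; otherwise it is omitted.
  record['pipeline_metadata'][pipeline_type]['ipaddr6'] = ipaddr6 if ipaddr6
  record
end

# With IPADDR6="" (as run.sh would set it), the field is omitted entirely.
record = { 'pipeline_metadata' => { 'collector' => {} } }
add_collector_ipaddr6(record, 'collector', resolve_ipaddr6('IPADDR6' => ''))
```

Omitting the field, rather than emitting an empty string, keeps downstream consumers (e.g. Elasticsearch mappings) from indexing a bogus empty IP value.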