1911477 – Using legacy Log Forwarding is not sending logs to the internal Elasticsearch


Summary: Using legacy Log Forwarding is not sending logs to the internal Elasticsearch

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Logging
Sub Component:
Version:	4.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	4.5.z
Assignee:	Jeff Cantrill
QA Contact:	Anping Li
Docs Contact:
URL:
Whiteboard:	logging-core
Depends On:	1928949
Blocks:

Reported:	2020-12-29 16:42 UTC by Oscar Casal Sanchez
Modified:	2024-06-13 23:49 UTC
CC List:	10 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	* Previously, if you enabled legacy log forwarding, logs were not sent to managed storage. This issue occurred because the generated log forwarding configuration improperly chose between either log forwarding or legacy log forwarding. The current release fixes this issue. If the `ClusterLogging` CR defines a `logstore`, logs are sent to managed storage. Additionally, if legacy log forwarding is enabled, logs are sent to legacy log forwarding regardless of whether managed storage is enabled. (link:https://bugzilla.redhat.com/show_bug.cgi?id=1911477[1911477])
Clone Of:
Clones:	1921263
Environment:
Last Closed:	2021-03-25 12:31:38 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift cluster-logging-operator pull 868	None	open	Bug 1911477: Write to default logstore when legacy forwarding enabled	2021-02-15 08:34:20 UTC
Red Hat Knowledge Base (Solution)	5768901	None	None	None	2021-02-03 01:43:36 UTC
Red Hat Product Errata	RHBA-2021:0842	None	None	None	2021-03-25 12:31:42 UTC

Description Oscar Casal Sanchez 2020-12-29 16:42:21 UTC

[Description of problem]
In previous versions, from 3.x up to and including 4.3, when secure_forward was configured following the documentation [1], logs were sent both to the internal Elasticsearch and to the external instance.

Now, in 4.5, logs are sent only to the external instance and no longer to the internal Elasticsearch.

The documentation has not changed [2], so the behavior is expected to be the same as before:

- Sending logs to the internal Elasticsearch
- Sending logs to the external instance configured in the secure-forward configmap

One thing has changed in the configuration generated for fluentd. In 4.3, after configuring secure_forward following the documentation, the fluentd configuration looks like this:

~~~
$ oc rsh <fluentd pod> cat /etc/fluent/fluent.conf
...
<label @_LOGS_APP>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_APP_PIPELINE
        </store>

                <store>
                @type relabel
                @label @_LEGACY_SECUREFORWARD
        </store>

</match>
</label>
<label @_LOGS_INFRA>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_INFRA_PIPELINE
        </store>

                <store>
                @type relabel
                @label @_LEGACY_SECUREFORWARD
        </store>

</match>
</label>

# Relabel specific pipelines to multiple, outputs (e.g. ES, kafka stores)

<label @CLO_DEFAULT_APP_PIPELINE>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_OUTPUT_ES
        </store>
</match>
</label>

<label @CLO_DEFAULT_INFRA_PIPELINE>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_OUTPUT_ES
        </store>
</match>
</label>
...
~~~

As shown above, 4.3 sends to both CLO_DEFAULT and LEGACY_SECUREFORWARD. The configuration generated in OCP 4.5 after configuring secure_forward, however, looks like this:

~~~
$ oc rsh <fluentd pod> cat /etc/fluent/fluent.conf
...
<label @_LOGS_APP>
  <match **>
    @type copy


    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>

  </match>
</label>
<label @_LOGS_AUDIT>
  <match **>
    @type copy


    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>

  </match>
</label>
<label @_LOGS_INFRA>
  <match **>
    @type copy


    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>

  </match>
</label>
...
~~~

As shown, every source label is relabeled only to "_LEGACY_SECUREFORWARD"; the relabel to CLO_DEFAULT_XXX is missing.
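
The comparison above can also be done mechanically. The following is a minimal sketch, not part of the operator or of the original report: a hypothetical helper, `labels_missing_default`, that reads a fluent.conf on standard input and prints each `@_LOGS_*` source label whose block has no relabel to a `@CLO_DEFAULT_*` pipeline. It assumes the block layout seen in the two listings (one `<label ...>` line per source label, one `@label` line per store).

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not an OpenShift tool):
# read a fluent.conf on stdin and print every @_LOGS_* source label
# whose block contains no relabel to a @CLO_DEFAULT_* pipeline.
labels_missing_default() {
  awk '
    # Print the label collected so far if it never referenced CLO_DEFAULT.
    function report() {
      if (name != "" && !has_default) print name
      name = ""; has_default = 0
    }
    # A new source label starts: close the previous one and record the name.
    /<label[ \t]+@_LOGS_[A-Z]+>/ {
      report()
      name = $0
      sub(/^.*<label[ \t]+@/, "", name)   # drop everything before the name
      sub(/>.*$/, "", name)               # drop the closing bracket
      next
    }
    # Any other label, or a closing tag, ends the current block.
    /<label[ \t]/ || /<\/label>/ { report(); next }
    # Inside a source label, note a relabel to the default pipeline.
    name != "" && /@label[ \t]+@CLO_DEFAULT_/ { has_default = 1 }
    END { report() }
  '
}
```

Fed the 4.5 configuration from this report, the sketch would list `_LOGS_APP`, `_LOGS_AUDIT` and `_LOGS_INFRA`; fed the 4.3 configuration, it would print nothing. The input can be the output of the `oc rsh <fluentd pod> cat /etc/fluent/fluent.conf` command used above, saved to a file or piped in.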


[Version-Release number of selected component (if applicable):]

Version used for OCP 4.5
~~~
$ oc version
Client Version: 4.5.23
Server Version: 4.5.23
$ oc get csv -n openshift-logging
NAME                                           DISPLAY                  VERSION                 REPLACES   PHASE
clusterlogging.4.5.0-202012120433.p0           Cluster Logging          4.5.0-202012120433.p0              Failed
elasticsearch-operator.4.5.0-202012120433.p0   Elasticsearch Operator   4.5.0-202012120433.p0              Succeeded
~~~

Version used for OCP 4.3:
~~~
$ oc version
Client Version: 4.3.38
Server Version: 4.3.40
Kubernetes Version: v1.16.2+853223d
$ oc get csv -n openshift-logging
NAME                                            DISPLAY                  VERSION                  REPLACES   PHASE
clusterlogging.4.3.40-202010141211.p0           Cluster Logging          4.3.40-202010141211.p0              Succeeded
elasticsearch-operator.4.3.40-202010141211.p0   Elasticsearch Operator   4.3.40-202010141211.p0              Succeeded
~~~

[How reproducible]
Always


[Steps to Reproduce]
1. Install Cluster Logging
2. Configure secure_forward
3. Logs are not sent to the internal Elasticsearch

[Actual results]
Logs are not sent to the internal Elasticsearch

[Expected results]
Logs should be sent to the internal Elasticsearch and, at the same time, to the external instance configured in the secure-forward configmap.


We are aware that this feature is deprecated, but the 4.4 and 4.5 documentation says the same as the 4.3 documentation, and this worked in 4.3 and earlier versions, including 3.x. It is therefore expected to keep working, with logs sent in parallel to the internal Elasticsearch and to the configured external instance.


[1] https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-external.html
[2] https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-external.html#cluster-logging-collector-fluentd_cluster-logging-external

Comment 6 weiguo fan 2021-01-28 04:09:54 UTC

Hi, Team,

We verified that we can work around the issue with the following steps.
Could Red Hat support this as an official workaround until the problem is fixed?

Step1: Set the clusterlogging's spec.managementState to "Unmanaged".

       $ oc patch clusterlogging instance -n openshift-logging --type='json' -p='[{"op": "replace", "path": "/spec/managementState", "value":"Unmanaged"}]'

Step2: Edit the fluentd configmap as follows.

~~~~~~~~~~~~~~~~~~~~~~~~~~
$ oc edit configmap fluentd -n openshift-logging
....
    <label @_LOGS_APP>
      <match **>
        @type copy
        <store>                             <====== Add these lines
          @type relabel                     <======
          @label @CLO_DEFAULT_APP_PIPELINE  <======
        </store>                            <======
        <store>
          @type relabel
          @label @_LEGACY_SYSLOG
        </store>
      </match>
    </label>
    <label @_LOGS_AUDIT>
      <match **>
        @type copy

        <store>
          @type relabel
          @label @_LEGACY_SYSLOG
        </store>
      </match>
    </label>
    <label @_LOGS_INFRA>
      <match **>
        @type copy
        <store>                              <====== Add these lines
          @type relabel                      <======
          @label @CLO_DEFAULT_INFRA_PIPELINE <======
        </store>                             <======
        <store>
          @type relabel
          @label @_LEGACY_SYSLOG
        </store>
      </match>
    </label>
...
~~~~~~~~~~~~~~~~~~~~~~~~~

Step3: Delete all existing fluentd Pods to restart fluentd.

       $ oc delete pods -n openshift-logging -l component=fluentd
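
The manual edit in Step2 could also be scripted. The following is a rough sketch under stated assumptions, not a supported tool and not part of the original comment: a hypothetical helper, `add_default_store`, that reads a fluent.conf on standard input and writes a copy in which the `@_LOGS_APP` and `@_LOGS_INFRA` blocks gain a relabel to the default pipeline. It assumes `@type copy` sits on its own line inside each label, leaves `@_LOGS_AUDIT` untouched as in the edit above, and is not idempotent (running it twice adds the store twice).

```shell
#!/bin/sh
# Hypothetical helper (illustration only): read fluent.conf on stdin and
# print a copy in which the @_LOGS_APP and @_LOGS_INFRA blocks also
# relabel to the matching @CLO_DEFAULT_*_PIPELINE.
add_default_store() {
  awk '
    # Track which source label we are in; only APP and INFRA are edited.
    /<label[ \t]/ || /<\/label>/ { kind = "" }
    /<label[ \t]+@_LOGS_APP>/    { kind = "APP" }
    /<label[ \t]+@_LOGS_INFRA>/  { kind = "INFRA" }
    { print }
    # Right after "@type copy", emit the extra store with the same indent.
    kind != "" && /^[ \t]*@type[ \t]+copy[ \t]*$/ {
      indent = $0
      sub(/@type.*$/, "", indent)
      print indent "<store>"
      print indent "  @type relabel"
      print indent "  @label @CLO_DEFAULT_" kind "_PIPELINE"
      print indent "</store>"
      kind = ""   # insert at most once per label block
    }
  '
}
```

A cautious way to use it would be to run it on a saved copy of the configuration and compare the result with the original (for example with `diff`) before changing the configmap, and only after Step1 has set the management state to "Unmanaged".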

Comment 7 Masaki Furuta ( RH ) 2021-01-28 07:38:11 UTC

(In reply to weiguo fan from comment #6)
Adding needinfo at NEC's request, to ask the engineering team to confirm whether NEC's workaround is suitable from the Red Hat engineering team's point of view.

/Masaki

Comment 10 Jeff Cantrill 2021-01-29 14:44:46 UTC

(In reply to Masaki Furuta from comment #7)
> (In reply to weiguo fan from comment #6)
> Adding needinfo at NEC's request, to ask the engineering team to confirm
> whether NEC's workaround is suitable from the Red Hat engineering team's
> point of view.

Yes.  This is exactly the same change as referenced in the associated fix.

Comment 18 Anping Li 2021-03-16 05:48:42 UTC

Verified on
clusterserviceversion.operators.coreos.com/clusterlogging.4.5.0-202103150243.p0
clusterserviceversion.operators.coreos.com/elasticsearch-operator.4.5.0-202103150243.p0

Comment 20 errata-xmlrpc 2021-03-25 12:31:38 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.36 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0842
