Bug 1911477 - Using legacy Log Forwarding is not sending logs to the internal Elasticsearch
Summary: Using legacy Log Forwarding is not sending logs to the internal Elasticsearch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.5.z
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard: logging-core
Depends On: 1928949
Blocks:
Reported: 2020-12-29 16:42 UTC by Oscar Casal Sanchez
Modified: 2024-06-13 23:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
* Previously, if you enabled legacy log forwarding, logs were not sent to managed storage. This issue occurred because the generated log forwarding configuration improperly chose between either log forwarding or legacy log forwarding. The current release fixes this issue. If the `ClusterLogging` CR defines a `logstore`, logs are sent to managed storage. Additionally, if legacy log forwarding is enabled, logs are sent to legacy log forwarding regardless of whether managed storage is enabled. (link:https://bugzilla.redhat.com/show_bug.cgi?id=1911477[*1911477*])
Clone Of:
: 1921263 (view as bug list)
Environment:
Last Closed: 2021-03-25 12:31:38 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-logging-operator pull 868 | 0 | None | open | Bug 1911477: Write to default logstore when legacy forwarding enabled | 2021-02-15 08:34:20 UTC
Red Hat Knowledge Base (Solution) | 5768901 | 0 | None | None | None | 2021-02-03 01:43:36 UTC
Red Hat Product Errata | RHBA-2021:0842 | 0 | None | None | None | 2021-03-25 12:31:42 UTC

Description Oscar Casal Sanchez 2020-12-29 16:42:21 UTC
[Description of problem]
In previous versions, from 3.x up to 4.3, when secure_forward was configured following the documentation [1], logs were sent both to the internal Elasticsearch and to the external instance.

In 4.5, logs are sent only to the external instance and no longer to the internal Elasticsearch.

The documentation has not changed [2], so the behavior is expected to be the same as before:

- Sending logs to the internal Elasticsearch
- Sending logs to the external instance configured in the secure-forward configmap

One thing has changed in the configuration generated for fluentd. In 4.3 the fluentd configuration after configuring secure_forward following the documentation is like this:

~~~
$ oc rsh <fluentd pod> cat /etc/fluent/fluent.conf
...
<label @_LOGS_APP>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_APP_PIPELINE
        </store>

                <store>
                @type relabel
                @label @_LEGACY_SECUREFORWARD
        </store>

</match>
</label>
<label @_LOGS_INFRA>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_INFRA_PIPELINE
        </store>

                <store>
                @type relabel
                @label @_LEGACY_SECUREFORWARD
        </store>

</match>
</label>

# Relabel specific pipelines to multiple, outputs (e.g. ES, kafka stores)

<label @CLO_DEFAULT_APP_PIPELINE>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_OUTPUT_ES
        </store>
</match>
</label>

<label @CLO_DEFAULT_INFRA_PIPELINE>
        <match **>
        @type copy

                <store>
                @type relabel
                @label @CLO_DEFAULT_OUTPUT_ES
        </store>
</match>
</label>
...
~~~

As shown above, 4.3 relabels logs to both CLO_DEFAULT_* and _LEGACY_SECUREFORWARD. In OCP 4.5, the configuration generated after configuring secure forward looks like this:

~~~
$ oc rsh <fluentd pod> cat /etc/fluent/fluent.conf
...
<label @_LOGS_APP>
  <match **>
    @type copy


    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>

  </match>
</label>
<label @_LOGS_AUDIT>
  <match **>
    @type copy


    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>

  </match>
</label>
<label @_LOGS_INFRA>
  <match **>
    @type copy


    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>

  </match>
</label>
...
~~~

As shown, the 4.5 configuration relabels only to _LEGACY_SECUREFORWARD; the relabel to CLO_DEFAULT_XXX is missing.
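The comparison above can be automated. The following is a minimal sketch, not part of the product: it assumes the fluent.conf text has already been saved locally (for example from the `oc rsh` command shown above) and that every `<label @_LOGS_*>` block is closed with `</label>`. The function names `relabel_targets` and `missing_default_pipeline` are hypothetical. Note that `_LOGS_AUDIT` may legitimately have no CLO_DEFAULT_* target, so a hit for it is not necessarily a symptom of this bug.

```python
import re

# One <label @_LOGS_XXX> ... </label> block (non-greedy, spans lines).
LABEL_BLOCK = re.compile(r"<label\s+@(_LOGS_\w+)>(.*?)</label>", re.DOTALL)
# The target of each "@label @NAME" line inside a <store>.
RELABEL_TARGET = re.compile(r"@label\s+@(\w+)")


def relabel_targets(conf_text):
    """Map each _LOGS_* source label to the labels it relabels to."""
    return {
        name: RELABEL_TARGET.findall(body)
        for name, body in LABEL_BLOCK.findall(conf_text)
    }


def missing_default_pipeline(conf_text):
    """Return the _LOGS_* labels that have no CLO_DEFAULT_* relabel target."""
    return sorted(
        name
        for name, targets in relabel_targets(conf_text).items()
        if not any(t.startswith("CLO_DEFAULT_") for t in targets)
    )


# Reduced sample in the shape of the 4.5 configuration quoted above.
SAMPLE_45 = """
<label @_LOGS_APP>
  <match **>
    @type copy
    <store>
      @type relabel
      @label @_LEGACY_SECUREFORWARD
    </store>
  </match>
</label>
"""

print(missing_default_pipeline(SAMPLE_45))  # prints ['_LOGS_APP']
```

An empty list for `_LOGS_APP` and `_LOGS_INFRA` would indicate that both sources are still relabeled to the default pipeline, as in the 4.3 output.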


[Version-Release number of selected component (if applicable):]

Version used for OCP 4.5:
~~~
$ oc version
Client Version: 4.5.23
Server Version: 4.5.23
$ oc get csv -n openshift-logging
NAME                                           DISPLAY                  VERSION                 REPLACES   PHASE
clusterlogging.4.5.0-202012120433.p0           Cluster Logging          4.5.0-202012120433.p0              Failed
elasticsearch-operator.4.5.0-202012120433.p0   Elasticsearch Operator   4.5.0-202012120433.p0              Succeeded
~~~

Version used for OCP 4.3:
~~~
$ oc version
Client Version: 4.3.38
Server Version: 4.3.40
Kubernetes Version: v1.16.2+853223d
$ oc get csv -n openshift-logging
NAME                                            DISPLAY                  VERSION                  REPLACES   PHASE
clusterlogging.4.3.40-202010141211.p0           Cluster Logging          4.3.40-202010141211.p0              Succeeded
elasticsearch-operator.4.3.40-202010141211.p0   Elasticsearch Operator   4.3.40-202010141211.p0              Succeeded
~~~

[How reproducible]
Always


[Steps to Reproduce]
1. Install Cluster Logging
2. Configure secure_forward
3. Check whether logs arrive in the internal Elasticsearch

[Actual results]
Logs are not sent to the internal Elasticsearch

[Expected results]
Logs should be sent to the internal Elasticsearch at the same time as to the external instance configured in the secure-forward configmap


We are aware that this feature is deprecated, but the documentation for 4.4 and 4.5 says the same as for 4.3, and it worked in 4.3 and earlier versions, including 3.x. It is therefore expected to keep working, with logs sent in parallel to the internal Elasticsearch and to the configured external instance.


[1] https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-external.html
[2] https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-external.html#cluster-logging-collector-fluentd_cluster-logging-external

Comment 6 weiguo fan 2021-01-28 04:09:54 UTC
Hi, Team,

We verified that we can work around the issue with the following steps.
Could Red Hat support this as an official workaround until the problem is fixed?

Step1: Set the clusterlogging's spec.managementState to "Unmanaged".

       $ oc patch clusterlogging instance -n openshift-logging --type='json' -p='[{"op": "replace", "path": "/spec/managementState", "value":"Unmanaged"}]'

Step2: Edit the fluentd configmap as follows.

~~~~~~~~~~~~~~~~~~~~~~~~~~
$ oc edit configmap fluentd -n openshift-logging
....
    <label @_LOGS_APP>
      <match **>
        @type copy
        <store>                             <====== Add these lines
          @type relabel                     <======
          @label @CLO_DEFAULT_APP_PIPELINE  <======
        </store>                            <======
        <store>
          @type relabel
          @label @_LEGACY_SYSLOG
        </store>
      </match>
    </label>
    <label @_LOGS_AUDIT>
      <match **>
        @type copy

        <store>
          @type relabel
          @label @_LEGACY_SYSLOG
        </store>
      </match>
    </label>
    <label @_LOGS_INFRA>
      <match **>
        @type copy
        <store>                              <====== Add these lines
          @type relabel                      <======
          @label @CLO_DEFAULT_INFRA_PIPELINE <======
        </store>                             <======
        <store>
          @type relabel
          @label @_LEGACY_SYSLOG
        </store>
      </match>
    </label>
...
~~~~~~~~~~~~~~~~~~~~~~~~~

Step3: Delete all existing fluentd Pods to restart fluentd.

       $ oc delete pods -n openshift-logging -l component=fluentd
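The Step2 edit can also be expressed as a text transformation, which may help when the same change has to be reviewed or repeated. This is a minimal sketch under stated assumptions: it operates only on configuration text that has already been exported, it assumes each targeted `<label>` block contains an `@type copy` line as in the configmap above, and the names `DEFAULT_PIPELINES` and `add_default_stores` are hypothetical. It does not talk to the cluster, and Step1 (managementState "Unmanaged") is still needed before any manual change to the configmap.

```python
import re

# Source labels that should also feed the default (internal) pipeline.
# Audit is left out, matching the workaround shown above.
DEFAULT_PIPELINES = {
    "_LOGS_APP": "CLO_DEFAULT_APP_PIPELINE",
    "_LOGS_INFRA": "CLO_DEFAULT_INFRA_PIPELINE",
}


def add_default_stores(conf_text):
    """Insert a relabel <store> for the default pipeline right after
    '@type copy' in each listed label block, unless already present."""
    for source, pipeline in DEFAULT_PIPELINES.items():
        block = re.compile(
            r"(<label\s+@" + re.escape(source) + r">.*?@type copy[ \t]*\n)"
            r"(.*?</label>)",
            re.DOTALL,
        )

        def patch(match, pipeline=pipeline):
            head, tail = match.group(1), match.group(2)
            if "@label @" + pipeline in tail:
                return match.group(0)  # already patched, leave unchanged
            store = (
                "<store>\n"
                "  @type relabel\n"
                "  @label @" + pipeline + "\n"
                "</store>\n"
            )
            return head + store + tail

        # Patch only the first block with this source label.
        conf_text = block.sub(patch, conf_text, count=1)
    return conf_text
```

The inserted lines are not indented to match the surrounding block; only the content matters for this illustration. Running the function twice gives the same result as running it once, because blocks that already contain the default pipeline label are left unchanged.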

Comment 7 Masaki Furuta ( RH ) 2021-01-28 07:38:11 UTC
(In reply to weiguo fan from comment #6)
Adding needinfo: NEC would like the engineering team to double-check whether NEC's workaround is suitable from the RH engineering team's point of view.

/Masaki

Comment 10 Jeff Cantrill 2021-01-29 14:44:46 UTC
(In reply to Masaki Furuta from comment #7)
> (In reply to weiguo fan from comment #6)
> Adding needinfo: NEC would like the engineering team to double-check whether
> NEC's workaround is suitable from the RH engineering team's point of view.

Yes.  This is exactly the same change as referenced in the associated fix.
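
For readers who want the intended behavior in compact form, the rule stated in this bug's Doc Text (managed storage whenever a `logstore` is defined, plus legacy forwarding whenever it is enabled) can be sketched as below. This is an illustration only and not the operator's code; the function name `store_labels` and its parameters are hypothetical, and only the label names are taken from the configurations quoted in this report.

```python
def store_labels(source, has_logstore, legacy_labels):
    """Return the relabel targets for one log source ("APP" or "INFRA").

    has_logstore:  True if the ClusterLogging CR defines a logstore.
    legacy_labels: labels of the enabled legacy outputs, for example
                   ["_LEGACY_SECUREFORWARD"] or ["_LEGACY_SYSLOG"].
    """
    targets = []
    if has_logstore:
        # Managed storage is selected independently of legacy forwarding.
        targets.append("CLO_DEFAULT_%s_PIPELINE" % source)
    # Legacy outputs are added whether or not managed storage is enabled.
    targets.extend(legacy_labels)
    return targets


print(store_labels("APP", True, ["_LEGACY_SECUREFORWARD"]))
# prints ['CLO_DEFAULT_APP_PIPELINE', '_LEGACY_SECUREFORWARD']
```

The behavior reported in this bug corresponds to the first branch being skipped once a legacy output was configured.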

Comment 18 Anping Li 2021-03-16 05:48:42 UTC
Verified on
clusterserviceversion.operators.coreos.com/clusterlogging.4.5.0-202103150243.p0
clusterserviceversion.operators.coreos.com/elasticsearch-operator.4.5.0-202103150243.p0

Comment 20 errata-xmlrpc 2021-03-25 12:31:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.36 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0842

