Bug 1924138 - Fluentd giving error when sending logs to one external Elasticsearch using PKI user authentication
Status: CLOSED DUPLICATE of bug 1899334
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jeff Cantrill
QA Contact: Anping Li
Reported: 2021-02-02 17:04 UTC by Oscar Casal Sanchez
Modified: 2024-03-25 18:05 UTC
CC List: 1 user

Last Closed: 2021-02-02 21:56:55 UTC




Links
Red Hat Knowledge Base (Solution) 5820131 (last updated 2021-02-19 19:42:42 UTC)

Description Oscar Casal Sanchez 2021-02-02 17:04:39 UTC
[Description of problem]
After configuring the Logging stack using the ClusterLogForwarder API to send logs to an external Elasticsearch that uses PKI user authentication [1], the following error appears in the fluentd pods:


~~~
2021-01-22 08:17:06 +0000 [warn]: [elasticsearch_onprem_secure] failed to flush the buffer. retry_time=10 next_retry_seconds=2021-01-22 08:22:02 +0000 chunk="5b68f908f8fc670c5f84991151a12440" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch.example.com\", :port=>9200, :scheme=>\"https\", :user=>\"fluentd\", :password=>\"obfuscated\"}): [401] {\"error\":{\"root_cause\":[{\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [fluentd] for REST request [/_bulk]\",\"header\":{\"WWW-Authenticate\":[\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\",\"Bearer realm=\\\"security\\\"\",\"ApiKey\"]}}],\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [fluentd] for REST request [/_bulk]\",\"header\":{\"WWW-Authenticate\":[\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\",\"Bearer realm=\\\"security\\\"\",\"ApiKey\"]}},\"status\":401}"
~~~


[Version-Release number of selected component (if applicable):]

OCP 4.6
clusterlogging.4.6.0-202011221454.p0


[How reproducible]

Always

Steps to Reproduce:
1. Deploy an Elasticsearch using PKI user authentication [1] 
2. Configure the ClusterLogForwarder to send the logs to the external Elasticsearch, creating a secret that provides the CA, tls.crt, and tls.key (a sketch of creating such a secret follows the YAML below):

~~~
spec:
  outputs:
  - name: elasticsearch-secure
    secret:
      name: external-tls-secret
    type: elasticsearch
    url: https://elasticsearch.example.com:9200
  pipelines:
  - inputRefs:
    - application
    - audit
    labels:
      logs: application
    name: application-logs
    outputRefs:
    - elasticsearch-secure
  - inputRefs:
    - infrastructure
    labels:
      logs: audit-infra
    name: infrastructure-audit-logs
    outputRefs:
    - elasticsearch-secure
~~~
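
For reference, such a secret could be created along these lines (a sketch; the local file names ca.crt, client.crt, and client.key are illustrative, while the key names ca-bundle.crt, tls.crt, and tls.key match the paths the collector mounts, as seen in the fluentd configuration in step 5):

~~~
$ oc create secret generic external-tls-secret \
    --from-file=ca-bundle.crt=ca.crt \
    --from-file=tls.crt=client.crt \
    --from-file=tls.key=client.key \
    -n openshift-logging
~~~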

3. Check the fluentd pod logs for the error showing that fluentd is not able to authenticate to the Elasticsearch server:

~~~
2021-01-22 08:17:06 +0000 [warn]: [elasticsearch_onprem_secure] failed to flush the buffer. retry_time=10 next_retry_seconds=2021-01-22 08:22:02 +0000 chunk="5b68f908f8fc670c5f84991151a12440" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch.example.com\", :port=>9200, :scheme=>\"https\", :user=>\"fluentd\", :password=>\"obfuscated\"}): [401] {\"error\":{\"root_cause\":[{\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [fluentd] for REST request [/_bulk]\",\"header\":{\"WWW-Authenticate\":[\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\",\"Bearer realm=\\\"security\\\"\",\"ApiKey\"]}}],\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [fluentd] for REST request [/_bulk]\",\"header\":{\"WWW-Authenticate\":[\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\",\"Bearer realm=\\\"security\\\"\",\"ApiKey\"]}},\"status\":401}"
~~~

4. Check that it is possible to reach the external Elasticsearch, and even push data, using curl from inside a fluentd pod with the provided certificates:

~~~
$ oc rsh <fluentd pod>
$ server=elasticsearch.example.com:9200
$ cd /var/run/ocp-collector/secrets/external-tls-secret/
$ curl https://$server/_cat/health?v --key tls.key --cacert ca-bundle.crt --cert tls.crt
epoch      timestamp cluster      status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1612262302 10:38:22  azch-cluster green           3         3    450 225    0    0        0             0   
$ curl -XPUT https://$server/test -H 'Content-Type: application/json' -d '{
   "settings" : {
     "number_of_shards" : 3,
     "number_of_replicas" : 1
   }
}' --key tls.key --cacert ca-bundle.crt --cert tls.crt
(...)
{"acknowledged":true,"shards_acknowledged":true,"index":"test"}
~~~
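
Conversely, the 401 can be reproduced from the same pod by adding the default basic-auth credentials to the otherwise-working request, which mirrors what the generated fluentd configuration does (a sketch; the expected response is an assumption based on the fluentd error above):

~~~
$ curl -u fluentd:changeme https://$server/_cat/health?v \
    --key tls.key --cacert ca-bundle.crt --cert tls.crt
# expected (assumption): the same [401] security_exception
# "unable to authenticate user [fluentd]" seen in the fluentd logs
~~~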

5. Check the fluentd configuration and observe that a default user and password exist in the definition:

~~~
<label @ELASTICSEARCH__SECURE>
  <match retry_retry_elasticsearch_secure>
    @type copy
    <store>
      @type elasticsearch
      @id retry_elasticsearch_secure
      host elasticsearch.example.com
      port 9200
      verify_es_version_at_startup false
      scheme https
      ssl_version TLSv1_2
      target_index_key viaq_index_name
      id_key viaq_msg_id
      remove_keys viaq_index_name
      user fluentd     <--------------- This 
      password changeme <------------- This
      client_key '/var/run/ocp-collector/secrets/external-tls-secret/tls.key'
      client_cert '/var/run/ocp-collector/secrets/external-tls-secret/tls.crt'
      ca_file '/var/run/ocp-collector/secrets/external-tls-secret/ca-bundle.crt'
      type_name _doc
      http_backend typhoeus
      write_operation create
      reload_connections 'true'
      # https://github.com/uken/fluent-plugin-elasticsearch#reload-after
      reload_after '200'
      # https://github.com/uken/fluent-plugin-elasticsearch#sniffer-class-name
      sniffer_class_name 'Fluent::Plugin::ElasticsearchSimpleSniffer'
~~~


The Elasticsearch documentation [1] states:

  "You can use a combination of PKI and username/password authentication. For example, you can enable SSL/TLS on the transport layer and define a PKI realm to require transport clients to authenticate with X.509 certificates, while still authenticating HTTP traffic using username and password credentials."

Since "user fluentd" and "password changeme" are added by default, the client ends up combining PKI with username/password authentication, and the request fails.
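
For reference, a PKI setup like the one described in [1] typically looks like this in elasticsearch.yml (a minimal sketch, assuming Elasticsearch 7.x; the certificate paths and realm name are illustrative):

~~~
xpack.security.enabled: true

# TLS on the HTTP layer; ask clients to present a certificate
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.key: certs/es.key
xpack.security.http.ssl.certificate: certs/es.crt
xpack.security.http.ssl.certificate_authorities: certs/ca.crt
xpack.security.http.ssl.client_authentication: optional

# PKI realm: authenticate clients by their X.509 certificate
xpack.security.authc.realms.pki.pki1.order: 1
~~~

With a configuration like this, a request that carries a Basic Authorization header appears to be authenticated against the username/password realms rather than the PKI realm, which matches the 401 seen here when fluentd sends the non-existent fluentd/changeme credentials alongside its client certificate.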


WORKAROUND:

- Move the CLO to Unmanaged (a sketch of this step follows the workaround)
- Delete the user fluentd and password changeme entries from the fluentd configmap:

~~~
$ oc edit cm fluentd -n openshift-logging
<label @ELASTICSEARCH__SECURE>
  <match retry_retry_elasticsearch_secure>
    @type copy
    <store>
      @type elasticsearch
      @id retry_elasticsearch_secure
      host elasticsearch.example.com
      port 9200
(...)
      user fluentd     <--------------- delete this line 
      password changeme <------------- delete this line
      client_key '/var/run/ocp-collector/secrets/external-tls-secret/tls.key'
      client_cert '/var/run/ocp-collector/secrets/external-tls-secret/tls.crt'
      ca_file '/var/run/ocp-collector/secrets/external-tls-secret/ca-bundle.crt'
(...)
~~~
- Delete the fluentd pods

  $ oc delete pods -l component=fluentd

After doing this, fluentd is able to send the logs to the external Elasticsearch.
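
For the first workaround step, the management state can be switched with a patch like the following (a sketch, assuming the ClusterLogging instance is named "instance"):

~~~
$ oc patch clusterlogging instance -n openshift-logging \
    --type merge -p '{"spec":{"managementState":"Unmanaged"}}'
~~~

While the CLO is Unmanaged, the operator stops reconciling the fluentd configmap, which is what allows the manual edit above to persist.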


[Expected results:]

It should work without needing to delete the user fluentd and password changeme lines. When these are generated and the external Elasticsearch uses PKI user authentication, it fails because the client tries to use both PKI and user (fluentd)/password (changeme) authentication.

Take into consideration that if PKI is used and a user/password is also defined, the client will try to use both and it will fail, as quoted above from the Elasticsearch documentation.


[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/pki-realm.html

Comment 1 Jeff Cantrill 2021-02-02 21:56:55 UTC

*** This bug has been marked as a duplicate of bug 1899334 ***

