Bug 1904380

Summary: Forwarding logs to Kafka using Chained certificates fails with error "state=error: certificate verify failed (unable to get local issuer certificate)"
Product: OpenShift Container Platform
Reporter: Oscar Casal Sanchez <ocasalsa>
Component: Logging
Assignee: Sergey Yedrikov <syedriko>
Status: CLOSED ERRATA
QA Contact: Anping Li <anli>
Severity: medium
Docs Contact:
Priority: high
Version: 4.6
CC: anli, aos-bugs, jcantril, jdelft, jwennerberg, kpelc, periklis, syedriko
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: logging-core
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, forwarding logs to Kafka using chained certificates failed with error "state=error: certificate verify failed (unable to get local issuer certificate)." Logs could not be forwarded to a Kafka broker with a certificate signed by an intermediate CA. This happened because fluentd Kafka plugin could only handle a single CA certificate supplied in the ca-bundle.crt entry of the corresponding secret. The current release fixes this issue. It enables the fluentd Kafka plugin to handle multiple CA certificates supplied in the ca-bundle.crt entry of the corresponding secret. Now, logs can be forwarded to a Kafka broker with a certificate signed by an intermediate CA.
Story Points: ---
Clone Of:
Related bug: 1939693 (view as bug list)
Environment:
Last Closed: 2021-04-20 19:20:20 UTC
Type: Bug
Attachments:
- Kafka config and certificates (flags: none)
- Kafka in-cluster certs (flags: none)

Description Oscar Casal Sanchez 2020-12-04 09:16:28 UTC
[Description of problem]
When log forwarding to Kafka is configured to use TLS with a chained certificate (Root CA + Intermediate certificate), the following error appears in the fluentd pods:

~~~
  2020-11-27 12:26:36 +0000 [warn]: suppressed same stacktrace
2020-11-27 12:26:36 +0000 [warn]: Send exception occurred: SSL_connect returned=1 errno=0 state=error: certificate verify failed (unable to get local issuer certificate)
2020-11-27 12:26:36 +0000 [warn]: Exception Backtrace : /usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/ssl_socket_with_timeout.rb:69:in `connect_nonblock'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/ssl_socket_with_timeout.rb:69:in `initialize'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/connection.rb:130:in `new'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/connection.rb:130:in `open'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/connection.rb:101:in `block in send_request'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/instrumenter.rb:23:in `instrument'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/connection.rb:100:in `send_request'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/broker.rb:200:in `send_request'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/broker.rb:44:in `fetch_metadata'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/cluster.rb:427:in `block in fetch_cluster_info'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/cluster.rb:422:in `each'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/cluster.rb:422:in `fetch_cluster_info'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/cluster.rb:402:in `cluster_info'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/cluster.rb:102:in `refresh_metadata!'
/usr/local/share/gems/gems/ruby-kafka-1.1.0/lib/kafka/cluster.rb:56:in `add_target_topics'
/usr/local/share/gems/gems/fluent-plugin-kafka-0.13.1/lib/fluent/plugin/kafka_producer_ext.rb:91:in `initialize'
/usr/local/share/gems/gems/fluent-plugin-kafka-0.13.1/lib/fluent/plugin/kafka_producer_ext.rb:60:in `new'
/usr/local/share/gems/gems/fluent-plugin-kafka-0.13.1/lib/fluent/plugin/kafka_producer_ext.rb:60:in `topic_producer'
/usr/local/share/gems/gems/fluent-plugin-kafka-0.13.1/lib/fluent/plugin/out_kafka2.rb:232:in `write'
/usr/local/share/gems/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:1125:in `try_flush'
/usr/local/share/gems/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:1431:in `flush_thread_run'
/usr/local/share/gems/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:461:in `block (2 levels) in start'
/usr/local/share/gems/gems/fluentd-1.7.4/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-11-27 12:26:36 +0000 [warn]: failed to flush the buffer. retry_time=6 next_retry_seconds=2020-11-27 12:27:11 +0000 chunk="5b5020a1f38cef2ace97184cd6d81ae9" error_class=OpenSSL::SSL::SSLError error="SSL_connect returned=1 errno=0 state=error: certificate verify failed (unable to get local issuer certificate)"
~~~

[Version-Release number of selected component (if applicable):]
- OCP 4.6
- Configure Logging to send logs to Kafka using TLS and chained certificates (Root CA + Intermediate CA)

[How reproducible]
Always in customer environment


Steps to Reproduce:
1. Deploy 4.6
2. Configure Logging to send logs to Kafka brokers using TLS and chained certificates (Root CA + Intermediate CA)
3. Note that all tests done with curl and openssl using the CA + cert + key succeeded, which indicates that the CA file and certificates themselves are valid (see the verification sketch after this list)
4. Check fluentd logs to see the error in the description part
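
For reference, a chain check along the following lines succeeds against the same bundle that fluentd rejects. This is only a sketch; the broker host, port, and file names are placeholders, not taken from this environment:

~~~
# Verify the broker certificate against the chained CA bundle (root CA + intermediate CA).
openssl s_client -connect kafka.example.com:9093 \
  -CAfile ca-bundle.crt \
  -cert tls.crt -key tls.key </dev/null
# "Verify return code: 0 (ok)" in the output means the chain validates outside fluentd.
~~~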

It seems that a bug exists where the read_ssl_file function doesn't support chained certificates. The corresponding fluent-plugin-kafka issue is:

  - https://github.com/fluent/fluent-plugin-kafka/issues/287

Its description reads:

"This is not mentioned explicitly in README.md, but read_ssl_file function doesn't support chained certificates. In *NIX environment this is very common that *.pem file contains multiple certificates concatenated together (eg. root CA + intermediate CA)."

And this is exactly how we configure the ca-bundle that gives fluentd the CA it must use: we put the chained Root CA + Intermediate CA into the ca-bundle.crt entry of the secret.
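
For illustration, here is a minimal sketch of the kind of setup being described; the broker URL, topic, and secret/file names are placeholders, and the secret keys follow the ca-bundle.crt/tls.crt/tls.key convention mentioned in this bug:

~~~
# Build the chained CA bundle (intermediate CA + root CA) and sanity-check it.
cat intermediate-ca.crt root-ca.crt > ca-bundle.crt
grep -c 'BEGIN CERTIFICATE' ca-bundle.crt   # expect 2 or more certificates

# Create the secret that the Kafka output references.
oc create secret generic kafka-secret -n openshift-logging \
  --from-file=ca-bundle.crt=ca-bundle.crt \
  --from-file=tls.crt=tls.crt \
  --from-file=tls.key=tls.key

# Point a ClusterLogForwarder output at the TLS listener of the broker.
oc apply -f - <<'EOF'
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: kafka-tls
      type: kafka
      url: tls://kafka.example.com:9093/app-topic
      secret:
        name: kafka-secret
  pipelines:
    - name: app-to-kafka
      inputRefs:
        - application
      outputRefs:
        - kafka-tls
EOF
~~~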


[Actual results]
Fluentd is not able to send logs to Kafka using TLS and chained certificates

[Expected results]
Fluentd is able to send logs to Kafka using TLS and chained certificates

Comment 12 Sergey Yedrikov 2021-03-30 15:02:54 UTC
@anli I can give you all the info for a test case if you need one.

Comment 13 Sergey Yedrikov 2021-03-31 14:00:14 UTC
Created attachment 1768094 [details]
Kafka config and certificates

Comment 14 Sergey Yedrikov 2021-03-31 14:18:06 UTC
@anli The test case:

Unpack the latest release from https://kafka.apache.org and the bz_1904380_testcase.tar.gz attachment.

In kafka_2.13-2.7.0/config/server.properties, you care about this part:

listeners=PLAINTEXT://:9092,SSL://:9093
ssl.keystore.location=/home/syedriko/bz/1904380/pki/server_root_ca/server_intermediate_ca/server.pkcs12
ssl.keystore.password=server
ssl.truststore.location=/home/syedriko/bz/1904380/pki/client_root_ca/client_intermediate_ca/client_ca_bundle.jks
ssl.truststore.password=client_ca_bundle

Fix the paths so they point to where you unpacked the attachment.

Similarly, in kafka_2.13-2.7.0/client-ssl.properties, fix the paths to the certs.
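
If it helps, a typical client-ssl.properties for this kind of setup looks roughly like the following; the paths, file names, and passwords are placeholders and should be replaced with the ones from the unpacked attachment:

~~~
# Write a minimal Kafka client SSL config (values are placeholders).
cat > client-ssl.properties <<'EOF'
security.protocol=SSL
ssl.truststore.location=/path/to/pki/server_ca_bundle.jks
ssl.truststore.password=changeit
ssl.keystore.location=/path/to/pki/client.pkcs12
ssl.keystore.password=changeit
EOF
~~~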

Give the example in https://kafka.apache.org/quickstart a go; just run the producer and consumer over TLS:

kafka-console-producer.sh --bootstrap-server localhost:9093 --topic test --producer.config client-ssl.properties
kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic test --consumer.config client-ssl.properties

Comment 15 Sergey Yedrikov 2021-03-31 14:32:40 UTC
Created attachment 1768104 [details]
Kafka in-cluster certs

Comment 16 Sergey Yedrikov 2021-03-31 14:37:13 UTC
@anli The certs in the bz_1904380_testcase.tar.gz attachment are for running Kafka on the localhost. I added another attachment with in-cluster certs, with different SANs:

[syedriko@localhost ~]$ diff ~/bz/1904380/pki_clo/server_root_ca/server_intermediate_ca/server.conf ~/bz/1904380/pki/server_root_ca/server_intermediate_ca/server.conf
17c17
< subjectAltName = DNS.1:kafka.openshift-logging.svc.cluster.local
---
> subjectAltName = IP.1:127.0.0.1,IP.2:0:0:0:0:0:0:0:1,DNS.1:localhost
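
To confirm which SAN a given server certificate carries, something along these lines can be used (the file name is a placeholder; the -ext option needs OpenSSL 1.1.1 or newer):

~~~
# Print the subjectAltName extension of a server certificate.
openssl x509 -in server.crt -noout -ext subjectAltName
~~~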

Comment 17 Oscar Casal Sanchez 2021-04-07 07:24:07 UTC
Hello,

Do we have any news about this? Any issues doing the QA? Or are we missing something?

Best regards,
Oscar

Comment 18 Anping Li 2021-04-07 10:58:03 UTC
Verified on clusterlogging.4.6.0-202104030104.p0

Comment 19 Rolfe Dlugy-Hegwer 2021-04-12 09:14:51 UTC
Background information for release note:

Cause:
The fluentd Kafka plugin can only handle a single CA certificate supplied in the ca-bundle.crt entry of the corresponding secret.

Consequence:
Logs can not be forwarded to a Kafka broker with a certificate signed by an intermediate CA.

Fix:
Enable the fluentd Kafka plugin to handle multiple CA certificates supplied in the ca-bundle.crt entry of the corresponding secret.

Result:
Logs can be forwarded to a Kafka broker with a certificate signed by an intermediate CA.

Comment 24 errata-xmlrpc 2021-04-20 19:20:20 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.25 extras update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1155