Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1507407

Summary: Review/Harden elasticsearch client security
Product: [oVirt] ovirt-engine-metrics
Reporter: Yedidyah Bar David <didi>
Component: Generic
Assignee: Yedidyah Bar David <didi>
Status: CLOSED WONTFIX
QA Contact: Lukas Svaty <lsvaty>
Severity: medium
Docs Contact:
Priority: medium
Version: unspecified
CC: bugs, didi, pportant, rmeggins, sradco
Target Milestone: ---
Keywords: CodeChange, Security
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-03-12 13:09:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1548154, 1554878, 1593646, 1599589, 1615348, 1637156
Bug Blocks:

Description Yedidyah Bar David 2017-10-30 07:37:58 UTC
Description of problem:

In bug 1506458 we add a plugin to fluentd to send data directly to elasticsearch. The current code/plan seems to me less than ideal, security-wise.

Examples:

1. We turn off ssl_verify. Not sure why, perhaps because of some internal OpenShift issues. We need to understand this and probably work it out so that we can enable verification. Turning it off seems to allow an attacker on the same network to impersonate the server and make the fluentd clients send data to the attacker instead of the real server. This isn't an issue if clients and server(s) are in the same internal controlled/secured network.

2. It seems that we generate a private key on the server machine and then copy it to clients. Standard practice is to generate a key + CSR on the client, have a CA sign the CSR to get a cert, then provide the cert and the CA cert to the server. We need to analyze the process and either make sure it's good enough as-is or fix it.
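
To illustrate item 1, here is a minimal sketch of what turning verification off means for a TLS client. It uses Python's standard `ssl` module purely as an illustration; it is not the fluentd plugin's code, and the function name `build_client_context` and its parameters are made up for this example.

```python
import ssl
from typing import Optional


def build_client_context(verify: bool, ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a TLS client context with server verification on or off.

    Hypothetical helper for illustration only.
    """
    # create_default_context() requires a valid certificate chain
    # (CERT_REQUIRED) and enables hostname checking.
    ctx = ssl.create_default_context(cafile=ca_file)
    if not verify:
        # Hostname checking has to be switched off before the chain check
        # can be disabled. After this, any server certificate is accepted.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


verified = build_client_context(verify=True)
unverified = build_client_context(verify=False)
```

With the unverified context the client has no way to tell the real server from an impersonator on the same network, which is the concern raised in item 1.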

Comment 1 Shirly Radco 2017-10-30 07:40:32 UTC
Rich, can you please give more details about this issue?

Comment 2 Rich Megginson 2017-10-30 14:25:27 UTC
(In reply to Yedidyah Bar David from comment #0)
> Description of problem:
> 
> In bug 1506458 we add a plugin to fluentd to send data directly to
> elasticsearch. The current code/plan seems to me less than ideal,
> security-wise.
> 
> Examples:
> 
> 1. We turn off ssl_verify. Not sure why, perhaps some internal OpenShift
> issues. We need to understand and probably work this out so that we can
> enable. It seems to allow an attacker on the same network to try to
> impersonate the server, and make the fluentd clients send data to itself
> instead of the real server. This isn't an issue if clients and server(s) are
> in the same internal controlled/secured network.

This is only a temporary workaround, until we figure out how best to expose common logging to clients. oVirt is using client cert authentication from fluentd directly to Elasticsearch. Elasticsearch only has cn/subjectAltNames that resolve inside the OpenShift cluster. Therefore, when fluentd goes to validate the hostname of Elasticsearch against the cn/subjectAltNames in the Elasticsearch SSL server cert, it fails, because it cannot resolve any of those hostnames. We have to set `ssl_verify false` to disable this hostname checking.

In the future, we will have the ability to specify an external FQDN for Elasticsearch to use in its SSL server cert.
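
The hostname check can be sketched roughly as follows. In general, a TLS client compares the name it used to connect with the names listed in the server certificate. This is a simplified Python illustration, not the full matching rules and not the code fluentd runs; the function `hostname_matches` and all hostnames below are hypothetical.

```python
from typing import Iterable


def hostname_matches(hostname: str, san_dns_names: Iterable[str]) -> bool:
    """Simplified check: does hostname match any DNS name in the cert?"""
    host = hostname.lower().rstrip(".")
    for name in san_dns_names:
        pattern = name.lower().rstrip(".")
        if pattern == host:
            return True
        # A leading "*." wildcard covers exactly one left-most label.
        if pattern.startswith("*."):
            _, _, rest = host.partition(".")
            if rest and rest == pattern[2:]:
                return True
    return False


# Hypothetical names, for illustration only.
internal_sans = ["logging-es", "logging-es.logging.svc.cluster.local"]

inside_cluster = hostname_matches("logging-es", internal_sans)
outside_cluster = hostname_matches("es.example.com", internal_sans)
```

A client outside the cluster connects using an external name that is not in the certificate, so the check fails. Adding an external FQDN to the server cert would let it pass without disabling verification.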

> 
> 2. It seems that we generate a private key on the server machine and then
> copy it to clients. Standard practices are to generate a key + CSR on the
> client, sign the CSR by a CA to get a cert, then provide the cert and the CA
> cert to the server. We need to analyze the process and either make sure it's
> good enough as-is or fix.

This is only a temporary workaround, until we figure out how best to expose common logging to clients.  Yes, the procedure you describe is part of it.  We would also need to create one or more external fluentd system accounts in OpenShift, and assign them ACLs in Elasticsearch/SearchGuard.  In the meantime, we are using the pre-created openshift cluster internal fluentd cert/key/system account.

Comment 8 Yedidyah Bar David 2018-04-01 09:15:41 UTC
Moving to 4.2.4 for now, not sure when dep bugs are scheduled to be fixed.

Comment 11 Sandro Bonazzola 2018-09-20 14:13:14 UTC
What's the status here?

Comment 12 Yedidyah Bar David 2018-10-02 10:16:36 UTC
(In reply to Rich Megginson from comment #2)
> (In reply to Yedidyah Bar David from comment #0)
> > 2. It seems that we generate a private key on the server machine and then
> > copy it to clients. Standard practices are to generate a key + CSR on the
> > client, sign the CSR by a CA to get a cert, then provide the cert and the CA
> > cert to the server. We need to analyze the process and either make sure it's
> > good enough as-is or fix.
> 
> This is only a temporary workaround, until we figure out how best to expose
> common logging to clients.  Yes, the procedure you describe is part of it. 
> We would also need to create one or more external fluentd system accounts in
> OpenShift, and assign them ACLs in Elasticsearch/SearchGuard.  In the
> meantime, we are using the pre-created openshift cluster internal fluentd
> cert/key/system account.

What's the status of this one? Do we have a bug for it? A fix? Thanks!

Comment 13 Rich Megginson 2018-10-02 15:00:24 UTC
(In reply to Yedidyah Bar David from comment #12)
> (In reply to Rich Megginson from comment #2)
> > (In reply to Yedidyah Bar David from comment #0)
> > > 2. It seems that we generate a private key on the server machine and then
> > > copy it to clients. Standard practices are to generate a key + CSR on the
> > > client, sign the CSR by a CA to get a cert, then provide the cert and the CA
> > > cert to the server. We need to analyze the process and either make sure it's
> > > good enough as-is or fix.
> > 
> > This is only a temporary workaround, until we figure out how best to expose
> > common logging to clients.  Yes, the procedure you describe is part of it. 
> > We would also need to create one or more external fluentd system accounts in
> > OpenShift, and assign them ACLs in Elasticsearch/SearchGuard.  In the
> > meantime, we are using the pre-created openshift cluster internal fluentd
> > cert/key/system account.
> 
> What's the status of this one? Do we have a bug for it? A fix? Thanks!

It is not fixed.  We have a trello card/jira issue for it.

Comment 14 Yedidyah Bar David 2018-10-03 05:44:54 UTC
(In reply to Rich Megginson from comment #13)
> It is not fixed.  We have a trello card/jira issue for it.

Can you please add a link here, so that it's easier to track the status? Thanks.

Comment 15 Rich Megginson 2018-10-03 14:14:08 UTC
https://jira.coreos.com/browse/LOG-196

Comment 16 Yaniv Lavi 2018-10-08 07:24:06 UTC
(In reply to Rich Megginson from comment #15)
> https://jira.coreos.com/browse/LOG-196

Do you have a Bugzilla link for this issue?
We can't access this one.

Comment 17 Rich Megginson 2018-10-08 19:08:21 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1637156

Comment 18 Sandro Bonazzola 2018-10-15 05:44:42 UTC
Moving to 4.2.8, since the required functionality is not yet in OpenShift.

Comment 19 Sandro Bonazzola 2018-10-24 08:04:25 UTC
Moving to 4.3, since OpenShift still lacks functionality needed for this.

Comment 20 Sandro Bonazzola 2019-02-18 07:54:56 UTC
Moving to 4.3.2, as this was not identified as a blocker for 4.3.1.

Comment 21 Sandro Bonazzola 2019-03-12 13:09:00 UTC
We no longer use fluentd, so this bug no longer seems relevant.
If this needs further investigation with the current rsyslog-based implementation, please open a different bug.