1809197 – CoreDNS Metrics exposed over insecure channel

Summary: CoreDNS Metrics exposed over insecure channel

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	4.5.0
Assignee:	Stephen Greene
QA Contact:	Hongan Li
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-03-02 15:00 UTC by Pawel Krupa
Modified:	2022-08-04 22:39 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: The CoreDNS Prometheus metrics integration was not set up properly. Consequence: CoreDNS metrics were exposed over an insecure channel within the cluster. Fix: TLS components and a kube-rbac-proxy sidecar were added to secure the CoreDNS metrics endpoint. Result: CoreDNS metrics are exposed over a secure channel.
Clone Of:
Environment:
Last Closed:	2020-07-13 17:17:25 UTC
Target Upstream Version:
Embargoed:

Attachments
prometheus scrape targets - dns section (35.33 KB, image/png) 2020-03-03 16:48 UTC, Pawel Krupa

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-dns-operator pull 163	0	None	closed	Bug 1809197: Secure CoreDNS metrics	2020-11-11 21:46:07 UTC
Red Hat Product Errata	RHBA-2020:2409	0	None	None	None	2020-07-13 17:17:44 UTC

Description Pawel Krupa 2020-03-02 15:00:53 UTC

Description of problem:
Metrics endpoint is not using TLS to encrypt traffic.

Version-Release number of selected component (if applicable):
4.4 (possibly also earlier versions)

How reproducible:
Always

Steps to Reproduce:
1. Start a cluster
2. Go to the Prometheus UI
3. Check the connection scheme (HTTP or HTTPS) of the scrape targets for this component

Actual results:
Metrics are exposed over an HTTP connection

Expected results:
Metrics are exposed over an HTTPS connection

Additional info:
The API server operator's ServiceMonitor definition can serve as a template for fixing this issue: https://github.com/openshift/cluster-openshift-apiserver-operator/blob/master/manifests/0000_90_openshift-apiserver-operator_03_servicemonitor.yaml

Comment 1 Dan Mace 2020-03-03 13:57:09 UTC

https://github.com/openshift/cluster-dns-operator/blob/master/manifests/0000_90_dns-operator_02_servicemonitor.yaml

Was this bug generated by some boilerplate process? It refers to "the component" and the reproducer steps seem totally non-specific to the DNS operator.

Comment 2 Pawel Krupa 2020-03-03 15:58:23 UTC

Sorry for not clarifying. This is about the openshift-dns/dns-default component.

Comment 3 Dan Mace 2020-03-03 16:03:20 UTC

Yes, so am I. What exactly leads you to believe that metrics are served insecurely? The CoreDNS pods expose TCP port 9153, serving a TLS endpoint secured by a serving signer service certificate which Prometheus is configured to use.

Comment 4 Pawel Krupa 2020-03-03 16:48:28 UTC

Created attachment 1667246
prometheus scrape targets - dns section

Based on the Prometheus scrape targets page, all DNS endpoints are scraped over HTTP rather than HTTPS, which is an insecure channel. A screenshot from a 4.3 cluster is attached to this BZ.
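
For anyone who wants to check this without reading the targets page by eye, here is a minimal sketch in Python. It assumes you have already fetched the JSON body of Prometheus' GET /api/v1/targets endpoint and parsed it into a dict. The function name insecure_targets, the reliance on a "namespace" target label, and the sample data (including the IPs and the "job" label value) are illustrative assumptions, not something taken from this cluster.

```python
from urllib.parse import urlsplit


def insecure_targets(targets_response, namespace="openshift-dns"):
    """Return scrape URLs in `namespace` that do not use HTTPS.

    `targets_response` is the parsed JSON body of Prometheus'
    GET /api/v1/targets endpoint. Filtering on a "namespace" label
    is an assumption about how the targets are labelled.
    """
    insecure = []
    for target in targets_response.get("data", {}).get("activeTargets", []):
        labels = target.get("labels", {})
        if labels.get("namespace") != namespace:
            continue
        url = target.get("scrapeUrl", "")
        if urlsplit(url).scheme != "https":
            insecure.append(url)
    return insecure


# Hand-written sample shaped like the API response; values are illustrative.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://10.128.0.4:9153/metrics",
             "labels": {"namespace": "openshift-dns", "job": "dns-default"}},
            {"scrapeUrl": "https://10.128.0.5:9154/metrics",
             "labels": {"namespace": "openshift-dns", "job": "dns-default"}},
        ]
    },
}

print(insecure_targets(sample))  # ['http://10.128.0.4:9153/metrics']
```

An empty list for the openshift-dns namespace would correspond to the expected result in the description (all DNS targets scraped over HTTPS).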

Comment 5 Dan Mace 2020-03-03 18:46:23 UTC

Thanks, I see my confusion now: I was looking at the DNS operator and not CoreDNS itself, which indeed looks misconfigured somehow despite the TLS config present throughout the relevant resources. Going to move this to 4.5 for now unless someone can justify the blocker status (given this has probably been an issue since 4.1).

Comment 6 Pawel Krupa 2020-04-06 11:41:54 UTC

After fixing, please remove your component from the exclusion list in the e2e tests at https://github.com/openshift/origin/blob/master/test/extended/prometheus/prometheus.go#L253-L268

Comment 9 Hongan Li 2020-04-26 04:02:10 UTC

Verified with 4.5.0-0.nightly-2020-04-25-170442; the issue has been fixed.

$ oc -n openshift-dns get pod -owide
NAME                READY   STATUS    RESTARTS   AGE    IP           NODE                                     NOMINATED NODE   READINESS GATES
dns-default-wp7p6   3/3     Running   0          111m   10.128.0.4   hongli-pl442-mld8x-master-1              <none>           <none>
<---snip--->

Go to the Prometheus UI and check the targets, which now appear as below:

https://10.128.0.4:9154/metrics

Comment 10 Miciah Dashiel Butler Masters 2020-04-29 16:00:57 UTC

> After fixing please remove your component from an exclusion list in e2e tests

For the record, that was done with this PR: https://github.com/openshift/origin/pull/24904

Comment 12 errata-xmlrpc 2020-07-13 17:17:25 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409
