Bug 1809200

Summary: Metrics exposed over insecure channel
Product: OpenShift Container Platform Reporter: Pawel Krupa <pkrupa>
Component: OLM Assignee: Daniel Sover <dsover>
OLM sub component: OLM QA Contact: Bruno Andrade <bandrade>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: dsover, ecordell
Version: 4.4   
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1813065 Environment:
Last Closed: 2020-07-13 17:17:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1813065, 1814013    

Description Pawel Krupa 2020-03-02 15:04:00 UTC
Description of problem:
The metrics endpoint does not use TLS to encrypt traffic.

Version-Release number of selected component (if applicable):
4.4 (possibly also earlier versions)

How reproducible:
Always

Steps to Reproduce:
1. Start a cluster
2. Go to the Prometheus UI
3. Check the connection scheme (HTTP or HTTPS) for this component

Actual results:
Metrics are exposed over an HTTP connection

Expected results:
Metrics are exposed over an HTTPS connection
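
The scheme check in step 3 could also be scripted rather than done by eye. The following is a minimal sketch only: it assumes the scrape URLs have already been collected one per line, and the function name and the endpoints shown are hypothetical.

```shell
# Print every scrape URL read from stdin that does not use HTTPS.
# Input: one URL per line (for example, copied from the Targets page).
insecure_targets() {
    # Drop lines starting with "https://" and blank lines; keep the rest.
    # "|| true" keeps the exit status at 0 when nothing is insecure.
    grep -v -e '^https://' -e '^[[:space:]]*$' || true
}

# Hypothetical endpoints, for illustration only.
printf '%s\n' \
    'https://10.0.0.1:8443/metrics' \
    'http://10.0.0.2:8081/metrics' | insecure_targets
# prints: http://10.0.0.2:8081/metrics
```

Against a live cluster, the URLs could come from the Prometheus `/api/v1/targets` endpoint, which reports a `scrapeUrl` for each active target.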

Additional info:
The API server operator's ServiceMonitor definition can be used as a template for fixing this issue: https://github.com/openshift/cluster-openshift-apiserver-operator/blob/master/manifests/0000_90_openshift-apiserver-operator_03_servicemonitor.yaml

Comment 1 Evan Cordell 2020-03-03 14:37:54 UTC
This requires further investigation.

OLM configures its metrics with certs if provided:

https://github.com/operator-framework/operator-lifecycle-manager/blob/master/cmd/olm/main.go#L112-L120

On OCP, we rely on the service-ca-signer to provide certs for metrics:

https://github.com/operator-framework/operator-lifecycle-manager/blob/master/manifests/0000_50_olm_07-olm-operator.deployment.yaml#L33-L36
https://github.com/operator-framework/operator-lifecycle-manager/blob/master/manifests/0000_50_olm_07-olm-operator.deployment.yaml#L69-L77

The same certs are configured for the ServiceMonitor:

https://github.com/operator-framework/operator-lifecycle-manager/blob/master/manifests/0000_90_olm_00-service-monitor.yaml#L48-L52

We will need to investigate whether there is an issue with this configuration or with the service-ca-signer.
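
To illustrate why the cert mount matters, here is a rough sketch of the "use certs if provided" pattern. It is not OLM code: the helper name is hypothetical, and it assumes a fallback to plain HTTP when either file is missing, which this comment does not state.

```shell
# Pick the scheme for a metrics listener: HTTPS only when both a
# certificate file and a key file exist and are non-empty, else HTTP.
# Hypothetical helper for illustration; not taken from OLM.
metrics_scheme() {
    cert_file=${1:-}
    key_file=${2:-}
    if [ -s "$cert_file" ] && [ -s "$key_file" ]; then
        echo https
    else
        echo http
    fi
}

# Demo with throwaway files standing in for the mounted cert and key.
tmp=$(mktemp -d)
printf 'dummy' > "$tmp/tls.crt"
metrics_scheme "$tmp/tls.crt" "$tmp/tls.key"   # key missing, prints: http
printf 'dummy' > "$tmp/tls.key"
metrics_scheme "$tmp/tls.crt" "$tmp/tls.key"   # both present, prints: https
rm -rf "$tmp"
```

Under that assumption, a missing or empty secret would quietly leave the endpoint on HTTP, so the mounted files are worth checking first.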

Comment 2 Pawel Krupa 2020-03-03 15:55:31 UTC
This is about marketplace-operator. Sorry for the confusion. Could you redirect this to the correct sub-team?

Comment 6 Bruno Andrade 2020-03-16 14:47:34 UTC
Confirmed that Prometheus is exposed with a self-signed certificate. Marking as VERIFIED.

Cluster Version: 4.5.0-0.nightly-2020-03-15-220309

curl --insecure -v https://prometheus-k8s-openshift-monitoring.apps.bandrade-806.qe.devcluster.openshift.com  2>&1 | awk 'BEGIN { cert=0 } /^\* SSL connection/ { cert=1 } /^\*/ { if (cert) print }'
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.apps.bandrade-806.qe.devcluster.openshift.com
*  start date: Mar 16 04:41:48 2020 GMT
*  expire date: Mar 16 04:41:49 2022 GMT
*  issuer: CN=ingress-operator@1584333705
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55c1014bb580)
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* Connection #0 to host prometheus-k8s-openshift-monitoring.apps.bandrade-806.qe.devcluster.openshift.com left intact
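
For repeated checks, output like the above could be reduced to the few fields that matter. This is only a sketch: it assumes the `curl -v` line format shown in this comment (other curl versions may word these lines differently), and the function name is hypothetical.

```shell
# Summarize TLS details from `curl -v` output read on stdin.
# Prints the negotiated protocol/cipher, the certificate issuer, and
# the verify result, when those lines are present.
tls_summary() {
    sed -n \
        -e 's/^\* SSL connection using \(.*\)$/protocol: \1/p' \
        -e 's/^\* *issuer: \(.*\)$/issuer: \1/p' \
        -e 's/^\* *SSL certificate verify result: \(.*\)$/verify: \1/p'
}

# Demo using lines copied from the output above.
tls_summary <<'EOF'
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
*  issuer: CN=ingress-operator@1584333705
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
EOF
```

On a live cluster the input would be `curl --insecure -v <url> 2>&1 | tls_summary`, since curl writes its verbose log to stderr.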

Comment 8 errata-xmlrpc 2020-07-13 17:17:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409