Bug 1835483 - No TLS certs available for HTTPS metrics
Summary: No TLS certs available for HTTPS metrics
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.z
Assignee: Jack Ottofaro
QA Contact: liujia
URL:
Whiteboard:
Depends On: 1834568
Blocks:
Reported: 2020-05-13 22:05 UTC by OpenShift BugZilla Robot
Modified: 2020-06-02 11:19 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The cluster-version operator should serve metrics over HTTPS. To do that, it needs a TLS key and certificate.
Consequence: Without a TLS key and certificate, cluster-version operators that expect them to be in place will crash loop.
Fix: Add a service annotation in 4.4.z (this bug), so that the 4.4 monitoring operator creates the TLS key and certificate.
Result: When an update from a future 4.4.z release to 4.5 is initiated, the incoming 4.5 cluster-version operator will have the TLS key and certificate that it needs to start serving metrics over HTTPS.
Clone Of:
Environment:
Last Closed: 2020-06-02 11:18:33 UTC
Target Upstream Version:



Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-version-operator pull 367 0 None closed Bug 1835483: Add cert to service to support https 2021-01-13 03:22:16 UTC
Red Hat Product Errata RHBA-2020:2310 0 None None None 2020-06-02 11:19:09 UTC

Description OpenShift BugZilla Robot 2020-05-13 22:05:54 UTC
+++ This bug was initially created as a clone of Bug #1834568 +++

Description of problem:

The fix for https://bugzilla.redhat.com/show_bug.cgi?id=1809195 adds an HTTPS metrics target. The PR for that bug fails the CI upgrade test because the 0001_00_cluster-version-operator_03_service manifest, which carries the service.beta.openshift.io/serving-cert-secret-name annotation, is applied too late. As a result, the monitoring tools have not created the secret when the new CVO starts, and the CVO container fails to start.

The solution, which this bug addresses, is to first ship a z-stream release, e.g. 4.y.z, with the service annotation. A subsequent z-stream release, e.g. 4.y.(z+n), will contain the remaining changes for https://bugzilla.redhat.com/show_bug.cgi?id=1809195.

When 4.y.(z+n) enters Cincinnati, we will ensure that it and all subsequent 4.y releases are never offered as an update from a 4.y.(z-m) release. So you have to go from an early z-stream (which has nothing about HTTPS) to a middle-ground z-stream (which has only the annotation) to a late z-stream (which has both the annotation and the remaining code changes). We also need to backport the annotation to 4.(y-1) and block all 4.(y-1) -> 4.y updates from 4.(y-1) releases that predate the annotation, so that you cannot go straight to 4.y.(z+n) or later.
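
The early / middle-ground / late rule above can be sketched as a small predicate over z-stream numbers within one 4.y stream. This is only an illustration of the rule, not how Cincinnati is implemented; the function name and the two z-stream numbers are hypothetical and were picked for the example.

```python
# Hypothetical z-stream numbers, chosen only for illustration.
ANNOTATION_Z = 6  # first 4.y.z that carries the service annotation ("middle ground")
HTTPS_Z = 9       # first 4.y.z that carries the HTTPS metrics code ("late")


def edge_allowed(src_z: int, dst_z: int) -> bool:
    """Return True if an update from 4.y.<src_z> to 4.y.<dst_z> should be offered."""
    # Only forward updates are considered in this sketch.
    if dst_z <= src_z:
        return False
    # A late release needs the secret to exist already, so the source must
    # already carry the annotation (middle ground or later).
    if dst_z >= HTTPS_Z and src_z < ANNOTATION_Z:
        return False
    return True
```

With these assumed numbers, 4.y.3 -> 4.y.6 and 4.y.6 -> 4.y.9 are allowed, while 4.y.3 -> 4.y.9 is blocked, which forces the stop at a middle-ground release.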

--- Additional comment from wking on 2020-05-12 23:51:44 UTC ---

Test plan is:

1. Launch the cluster.
2. Ensure that a secret named cluster-version-operator-serving-cert exists in the openshift-cluster-version namespace.
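
A minimal sketch of automating step 2, assuming `oc` is on the PATH and logged in to the cluster. The helper names are hypothetical. The check for the `tls.crt` and `tls.key` keys is an assumption based on the standard layout of `kubernetes.io/tls` secrets; the test plan itself only requires that the secret exists.

```python
import json
import subprocess

SECRET_NAME = "cluster-version-operator-serving-cert"
NAMESPACE = "openshift-cluster-version"


def secret_is_usable(secret: dict) -> bool:
    """Check that a secret object (as parsed JSON) looks like a populated TLS secret."""
    data = secret.get("data") or {}
    return (
        secret.get("type") == "kubernetes.io/tls"
        and bool(data.get("tls.crt"))  # assumed key, standard for TLS secrets
        and bool(data.get("tls.key"))  # assumed key, standard for TLS secrets
    )


def fetch_secret(oc: str = "oc"):
    """Return the secret as a dict, or None if `oc get secret` fails (e.g. NotFound)."""
    result = subprocess.run(
        [oc, "get", "secret", SECRET_NAME, "-n", NAMESPACE, "-o", "json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return json.loads(result.stdout)
```

Against a live cluster, something like `secret = fetch_secret(); ok = secret is not None and secret_is_usable(secret)` would cover the test plan.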

Comment 3 liujia 2020-05-25 06:28:19 UTC
Version:
4.4.0-0.nightly-2020-05-24-193742

Fresh installation:
# ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-05-24-193742   True        False         2m24s   Cluster version is 4.4.0-0.nightly-2020-05-24-193742
# ./oc get secrets cluster-version-operator-serving-cert -n openshift-cluster-version
NAME                                    TYPE                DATA   AGE
cluster-version-operator-serving-cert   kubernetes.io/tls   2      15m
# ./oc get service/cluster-version-operator -o json -n openshift-cluster-version|jq .metadata.annotations
{
  "exclude.release.openshift.io/internal-openshift-hosted": "true",
  "service.alpha.openshift.io/serving-cert-signed-by": "openshift-service-serving-signer@1590377080",
  "service.beta.openshift.io/serving-cert-secret-name": "cluster-version-operator-serving-cert",
  "service.beta.openshift.io/serving-cert-signed-by": "openshift-service-serving-signer@1590377080"
}
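
For scripted verification, the annotation check can be done on the `jq .metadata.annotations` output shown above. This is a rough sketch; the function name is hypothetical and the input is assumed to be the JSON text printed by that command.

```python
import json

ANNOTATION = "service.beta.openshift.io/serving-cert-secret-name"


def requested_secret_name(annotations_json: str):
    """Return the secret name requested by the serving-cert annotation, or None."""
    # jq prints `null` when the service has no annotations at all.
    annotations = json.loads(annotations_json) or {}
    return annotations.get(ANNOTATION)
```

For the output above, this returns "cluster-version-operator-serving-cert", matching the secret listed by `oc get secrets`.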

Upgrade from old v4.4 to latest v4.4:
Before upgrade:
# ./oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.5     True        False         5m24s   Cluster version is 4.4.5
# ./oc get secrets cluster-version-operator-serving-cert -n openshift-cluster-version
Error from server (NotFound): secrets "cluster-version-operator-serving-cert" not found

After upgrade:
# ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-05-24-193742   True        False         79m     Cluster version is 4.4.0-0.nightly-2020-05-24-193742
# ./oc get secrets cluster-version-operator-serving-cert -n openshift-cluster-version
NAME                                    TYPE                DATA   AGE
cluster-version-operator-serving-cert   kubernetes.io/tls   2      120m

Comment 5 errata-xmlrpc 2020-06-02 11:18:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2310

