Bug 1879248 - OLM Cert Dir for Webhooks does not align SDK/Kubebuilder
Summary: OLM Cert Dir for Webhooks does not align SDK/Kubebuilder
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.7.0
Assignee: Alexander Greene
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks: 1891940 1920665
Reported: 2020-09-15 19:05 UTC by Alexander Greene
Modified: 2021-02-24 15:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: OLM's support for Admission Webhook Configurations reused the CA Cert generation code used when deploying API Servers. The mount directory used by this code placed the cert information at the following locations:
* /apiserver.local.config/certificates/apiserver.crt
* /apiserver.local.config/certificates/apiserver.key
Consequence: Admission Webhooks built using Kubebuilder or the Operator SDK expect the CA Certs to be mounted at different locations:
* /tmp/k8s-webhook-server/serving-certs/tls.crt
* /tmp/k8s-webhook-server/serving-certs/tls.key
Because of this mismatch, the webhooks failed to run.
Fix: OLM now mounts the webhook CA Certs at the default locations expected by webhooks built with Kubebuilder or the Operator SDK.
Result: Webhooks built with Kubebuilder or the Operator SDK can now be deployed by OLM.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:18:03 UTC
Target Upstream Version:
Embargoed:


Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Github | operator-framework operator-lifecycle-manager pull 1808 | 0 | None | closed | Bug 1879248: OLM mounts CA Certs where Kubebuilder expects | 2021-02-11 15:23:32 UTC |
| Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:18:40 UTC |

Description Alexander Greene 2020-09-15 19:05:15 UTC
Description of problem:

Kubebuilder and the SDK generate webhooks with the expectation that the CA Certs will be available at the following locations:
* /tmp/k8s-webhook-server/serving-certs/tls.crt
* /tmp/k8s-webhook-server/serving-certs/tls.key


OLM currently mounts the certs at different locations, which the operator author must configure the webhook server to use:
* /apiserver.local.config/certificates/apiserver.crt
* /apiserver.local.config/certificates/apiserver.key


As a result, operators built with the SDK must override the default webhook server settings:
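```
import ctrl "sigs.k8s.io/controller-runtime"

// Settings that match where OLM mounts the webhook certs, replacing the
// Kubebuilder/SDK defaults.
const (
	WebhookPort     = 4343
	WebhookCertDir  = "/apiserver.local.config/certificates"
	WebhookCertName = "apiserver.crt"
	WebhookKeyName  = "apiserver.key"
)

// SetupWebhookWithManager registers the webhook for WebhookTest and points
// the manager's webhook server at the OLM cert locations.
func (r *WebhookTest) SetupWebhookWithManager(mgr ctrl.Manager) error {
	bldr := ctrl.NewWebhookManagedBy(mgr).
		For(r)

	// Specify OLM CA Info
	srv := mgr.GetWebhookServer()
	srv.CertDir = WebhookCertDir
	srv.CertName = WebhookCertName
	srv.KeyName = WebhookKeyName
	srv.Port = WebhookPort

	return bldr.Complete()
}
```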

OLM should support webhooks built with the SDK out of the box.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always

Steps to Reproduce:
1. Using OLM, install an operator that includes a webhook built with the SDK

Actual results:
The operator fails to install, and the webhook pod crashes because it cannot find the certs in the default Kubebuilder/SDK webhook cert location.

Expected results:
The operator is installed and the webhook works.

Additional info:

Comment 4 yhui 2020-10-20 03:44:38 UTC
The release-4.7.0 nightly image is available now, and I'm verifying the bug.

Comment 5 yhui 2020-10-20 15:53:19 UTC
Version:
[hui@localhost 1020]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-10-17-034503   True        False         10h     Cluster version is 4.7.0-0.nightly-2020-10-17-034503
[hui@localhost 1020]$ oc exec olm-operator-69b864f866-6sjj4 -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.16.1
git commit: e2c0f2c47573ec5dfc509502881fa3dd8eb7bae9

Test procedure:
1. Prepare an operator image whose operator includes a webhook. Thanks to Alex for providing the image:
   quay.io/agreene/webhook-operator-index:revert-olm-certs

2. Create the catalogsource using the index image.
[hui@localhost 1020]$ cat catsrc.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: webhook-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: Webhook Operator Catalog
  image: quay.io/agreene/webhook-operator-index:revert-olm-certs
  sourceType: grpc
[hui@localhost 1020]$ oc create -f catsrc.yaml 
catalogsource.operators.coreos.com/webhook-operator-catalog created
[hui@localhost 1020]$ oc get catalogsource -n openshift-marketplace
NAME                       DISPLAY                    TYPE   PUBLISHER      AGE
certified-operators        Certified Operators        grpc   Red Hat        10h
community-operators        Community Operators        grpc   Red Hat        10h
qe-app-registry            Production Operators       grpc   OpenShift QE   10h
redhat-marketplace         Red Hat Marketplace        grpc   Red Hat        10h
redhat-operators           Red Hat Operators          grpc   Red Hat        10h
webhook-operator-catalog   Webhook Operator Catalog   grpc                  20s
[hui@localhost 1020]$ oc get pods -n  openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-pmfzn               1/1     Running   0          93m
community-operators-424bw               1/1     Running   0          10h
marketplace-operator-678cc6846b-vcxhw   1/1     Running   0          10h
qe-app-registry-dxk7q                   1/1     Running   0          6h38m
redhat-marketplace-p4vgh                1/1     Running   0          93m
redhat-operators-m4rg2                  1/1     Running   0          34m
webhook-operator-catalog-lwdcz          1/1     Running   0          52s

3. Create the subscription.
[hui@localhost 1020]$ cat sub.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: webhook-operator-subscription
  namespace: openshift-operators
spec:
  channel: "alpha"
  installPlanApproval: Automatic
  name: webhook-operator
  source: webhook-operator-catalog
  sourceNamespace: openshift-marketplace
[hui@localhost 1020]$ oc create -f sub.yaml 
subscription.operators.coreos.com/webhook-operator-subscription created

4. Check that the operator pod has been installed successfully.
[hui@localhost 1020]$ oc get sub -n openshift-operators
NAME                            PACKAGE            SOURCE                     CHANNEL
webhook-operator-subscription   webhook-operator   webhook-operator-catalog   alpha
[hui@localhost 1020]$ oc get csv -n openshift-operators
NAME                      DISPLAY            VERSION   REPLACES   PHASE
webhook-operator.v0.0.1   Webhook Operator   0.0.1                Succeeded
[hui@localhost 1020]$ oc get pods -n openshift-operators
NAME                                        READY   STATUS    RESTARTS   AGE
webhook-operator-webhook-659fb6b776-fz9wh   2/2     Running   0          20m9s

Verified the bug on 4.7.0.

Comment 9 errata-xmlrpc 2021-02-24 15:18:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

