Bug 1920665 - OLM Cert Dir for Webhooks does not align SDK/Kubebuilder
Summary: OLM Cert Dir for Webhooks does not align SDK/Kubebuilder
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.5.z
Assignee: Evan Cordell
QA Contact: Jian Zhang
URL:
Whiteboard:
Duplicates: 1904070
Depends On: 1879248 1891940
Blocks:
Reported: 2021-01-26 20:14 UTC by Nick Hale
Modified: 2021-01-26 20:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1891940
Environment:
Last Closed: 2021-01-26 20:21:43 UTC
Target Upstream Version:
Embargoed:


Description Nick Hale 2021-01-26 20:14:38 UTC
+++ This bug was initially created as a clone of Bug #1891940 +++

+++ This bug was initially created as a clone of Bug #1879248 +++

Description of problem:

Kubebuilder and the SDK generate webhooks that expect the TLS serving certificate and key to be available at the following locations:
* /tmp/k8s-webhook-server/serving-certs/tls.crt
* /tmp/k8s-webhook-server/serving-certs/tls.key


OLM currently provides the certs at different locations, so the operator author must override the defaults with:
* /apiserver.local.config/certificates/apiserver.crt
* /apiserver.local.config/certificates/apiserver.key


As a result, operators built with the SDK must override the default webhook server settings:
```
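import ctrl "sigs.k8s.io/controller-runtime"

// Port and certificate location for a webhook whose certs are provided by OLM.
const (
	WebhookPort     = 4343
	WebhookCertDir  = "/apiserver.local.config/certificates"
	WebhookCertName = "apiserver.crt"
	WebhookKeyName  = "apiserver.key"
)

// SetupWebhookWithManager registers the webhook for WebhookTest and points
// the manager's webhook server at the certificates provided by OLM.
func (r *WebhookTest) SetupWebhookWithManager(mgr ctrl.Manager) error {
	bldr := ctrl.NewWebhookManagedBy(mgr).
		For(r)

	// Override the default cert location and port with the OLM values.
	srv := mgr.GetWebhookServer()
	srv.CertDir = WebhookCertDir
	srv.CertName = WebhookCertName
	srv.KeyName = WebhookKeyName
	srv.Port = WebhookPort

	return bldr.Complete()
}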
```
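
An operator that has to run both with and without OLM might avoid hard-coding either location by probing for the certs at startup. The following is only a minimal sketch of that idea, not something taken from the SDK or OLM: `certLocation`, `knownLocations`, `pickCertLocation`, and `fileExists` are hypothetical names, and the paths are the ones listed above. The existence check is passed in as a callback so the lookup can be exercised without touching the filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path"
)

// certLocation describes where a serving cert/key pair is expected.
// Hypothetical helper type, used only for this sketch.
type certLocation struct {
	Dir      string
	CertName string
	KeyName  string
}

// knownLocations lists the OLM location first, then the
// Kubebuilder/SDK default, using the paths from this report.
var knownLocations = []certLocation{
	{Dir: "/apiserver.local.config/certificates", CertName: "apiserver.crt", KeyName: "apiserver.key"},
	{Dir: "/tmp/k8s-webhook-server/serving-certs", CertName: "tls.crt", KeyName: "tls.key"},
}

// pickCertLocation returns the first candidate for which both the
// certificate and the key exist according to the exists callback.
func pickCertLocation(candidates []certLocation, exists func(string) bool) (certLocation, bool) {
	for _, c := range candidates {
		if exists(path.Join(c.Dir, c.CertName)) && exists(path.Join(c.Dir, c.KeyName)) {
			return c, true
		}
	}
	return certLocation{}, false
}

// fileExists reports whether p exists and is not a directory.
func fileExists(p string) bool {
	info, err := os.Stat(p)
	return err == nil && !info.IsDir()
}

func main() {
	loc, ok := pickCertLocation(knownLocations, fileExists)
	if !ok {
		fmt.Println("no webhook serving certs found in any known location")
		return
	}
	fmt.Println("using webhook certs from", loc.Dir)
}
```

The selected `Dir`, `CertName`, and `KeyName` could then be assigned to the webhook server fields in place of the constants in the snippet above.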

OLM should support webhooks built with the SDK out of the box.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Always

Steps to Reproduce:
1. Use OLM to install an operator that includes a webhook built with the SDK

Actual results:
The operator fails to install, and the webhook pod crashes because it cannot find the certs in the default Kubebuilder/SDK webhook cert location.

Expected results:
The operator is installed and the webhook works.

Additional info:

--- Additional comment from bluddy on 2020-10-02 17:11:52 UTC ---

Checking in for "every bug, every sprint": no updates to report since this was filed. A workaround exists until a patch is implemented.

--- Additional comment from aos-team-art-private on 2020-10-14 16:33:40 UTC ---

Elliott changed bug status from MODIFIED to ON_QA.

--- Additional comment from nhale on 2020-10-19 15:22:31 UTC ---

Awaiting QE verification.

--- Additional comment from yhui on 2020-10-20 03:44:38 UTC ---

The release-4.7.0 nightly image is now available, and I'm verifying the bug.

--- Additional comment from yhui on 2020-10-20 15:53:19 UTC ---

Version:
[hui@localhost 1020]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-10-17-034503   True        False         10h     Cluster version is 4.7.0-0.nightly-2020-10-17-034503
[hui@localhost 1020]$ oc exec olm-operator-69b864f866-6sjj4 -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.16.1
git commit: e2c0f2c47573ec5dfc509502881fa3dd8eb7bae9

Test procedure:
1. Prepare an operator image whose operator includes a webhook. Thanks to Alex for providing the image.
   quay.io/agreene/webhook-operator-index:revert-olm-certs

2. Create the catalogsource using the index image.
[hui@localhost 1020]$ cat catsrc.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: webhook-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: Webhook Operator Catalog
  image: quay.io/agreene/webhook-operator-index:revert-olm-certs
  sourceType: grpc
[hui@localhost 1020]$ oc create -f catsrc.yaml 
catalogsource.operators.coreos.com/webhook-operator-catalog created
[hui@localhost 1020]$ oc get catalogsource -n openshift-marketplace
NAME                       DISPLAY                    TYPE   PUBLISHER      AGE
certified-operators        Certified Operators        grpc   Red Hat        10h
community-operators        Community Operators        grpc   Red Hat        10h
qe-app-registry            Production Operators       grpc   OpenShift QE   10h
redhat-marketplace         Red Hat Marketplace        grpc   Red Hat        10h
redhat-operators           Red Hat Operators          grpc   Red Hat        10h
webhook-operator-catalog   Webhook Operator Catalog   grpc                  20s
[hui@localhost 1020]$ oc get pods -n  openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-pmfzn               1/1     Running   0          93m
community-operators-424bw               1/1     Running   0          10h
marketplace-operator-678cc6846b-vcxhw   1/1     Running   0          10h
qe-app-registry-dxk7q                   1/1     Running   0          6h38m
redhat-marketplace-p4vgh                1/1     Running   0          93m
redhat-operators-m4rg2                  1/1     Running   0          34m
webhook-operator-catalog-lwdcz          1/1     Running   0          52s

3. Create the subscription.
[hui@localhost 1020]$ cat sub.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: webhook-operator-subscription
  namespace: openshift-operators
spec:
  channel: "alpha"
  installPlanApproval: Automatic
  name: webhook-operator
  source: webhook-operator-catalog
  sourceNamespace: openshift-marketplace
[hui@localhost 1020]$ oc create -f sub.yaml 
subscription.operators.coreos.com/webhook-operator-subscription created

4. Check the pod has been installed successfully.
[hui@localhost 1020]$ oc get sub -n openshift-operators
NAME                            PACKAGE            SOURCE                     CHANNEL
webhook-operator-subscription   webhook-operator   webhook-operator-catalog   alpha
[hui@localhost 1020]$ oc get csv -n openshift-operators
NAME                      DISPLAY            VERSION   REPLACES   PHASE
webhook-operator.v0.0.1   Webhook Operator   0.0.1                Succeeded
[hui@localhost 1020]$ oc get pods -n openshift-operators
NAME                                        READY   STATUS    RESTARTS   AGE
webhook-operator-webhook-659fb6b776-fz9wh   2/2     Running   0          20m9s

Verified the bug on 4.7.0.

--- Additional comment from Vu Dinh on 2020-11-12 16:28:49 UTC ---

The PR has been approved and has passed CI. It is waiting for cherry-pick approval.

--- Additional comment from PnT Account Manager on 2020-11-16 16:16:14 UTC ---

Employee 'yhui' has left the company.

--- Additional comment from Vu Dinh on 2020-12-14 19:24:20 UTC ---

The PR is ready and waiting for cherry-pick approval.

Comment 1 Nick Hale 2021-01-26 20:21:43 UTC
After some deliberation with the team, we've come to the conclusion that this is a backwards compatibility and/or docs issue with the SDK.

- https://github.com/operator-framework/operator-sdk/issues/4439
- https://github.com/operator-framework/operator-sdk/issues/4438

It makes sense for OLM to switch to the new location -- while still supporting the old one -- in master and for new releases, but any attempt to backport a patch would amount to supporting forward compatibility with the SDK, which doesn't seem feasible overall.

Additionally, this low-priority bug has two distinct workarounds that operator authors can apply themselves.

Comment 2 Nick Hale 2021-01-26 20:28:29 UTC
*** Bug 1904070 has been marked as a duplicate of this bug. ***
