Bug 1807920

Summary: [RFE] should provide a way to add quay CA in OperatorSource
Product: OpenShift Container Platform Reporter: Jian Zhang <jiazha>
Component: OLM Assignee: Evan Cordell <ecordell>
OLM sub component: OLM QA Contact: Jian Zhang <jiazha>
Status: CLOSED WONTFIX Docs Contact:
Severity: high    
Priority: high CC: agreene, bandrade, dmesser, kuiwang, scolange, tbuskey, xjiang
Version: 4.4 Keywords: RFE
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-02-28 13:54:10 UTC Type: Bug

Description Jian Zhang 2020-02-27 13:29:44 UTC
Description of problem:
When an OperatorSource points at a new Quay registry, it fails with the "x509: certificate signed by unknown authority" error.

Version-Release number of selected component (if applicable):
4.x

How reproducible:
always

Steps to Reproduce:
1. Create a quay registry on the cluster by following this doc:
https://access.redhat.com/documentation/en-us/red_hat_quay/3/html-single/deploy_red_hat_quay_on_openshift/index#red_hat_quay_database

2. Create an OperatorSource to use this quay registry.
mac:~ jianzhang$ cat operatorsource-quay.yaml 
apiVersion: operators.coreos.com/v1
kind: OperatorSource
metadata:
  name: quay-applications
  namespace: openshift-marketplace
spec:
  authorizationToken:
    secretName: marketplacesecret
  endpoint: https://quay-enterprise-quay-enterprise.apps.preserve-xjiang42aws0224.qe.devcluster.openshift.com/cnr
  registryNamespace: admin
  type: appregistry


3. Check the OperatorSource status.

Actual results:
Got the x509 errors:
mac:~ jianzhang$ oc get operatorsource
NAME                TYPE          ENDPOINT                                                                                                REGISTRY   DISPLAYNAME   PUBLISHER   STATUS        MESSAGE                                                                                                                                                                                    AGE
quay-applications   appregistry   https://quay-enterprise-quay-enterprise.apps.preserve-xjiang42aws0224.qe.devcluster.openshift.com/cnr   admin                                Configuring   Get https://quay-enterprise-quay-enterprise.apps.preserve-xjiang42aws0224.qe.devcluster.openshift.com/cnr/api/v1/packages?namespace=admin: x509: certificate signed by unknown authority   3m56s


Expected results:
I'm not sure if this is the right component. Maybe the CVO is the right component, since it automatically injects the cert into the `marketplace-trusted-ca` ConfigMap. We have a workaround: stop the CVO, or set the `marketplace-trusted-ca` ConfigMap to unmanaged, and then add the cert of the Quay registry to this ConfigMap manually.
But I don't think that is an official solution. We should provide a supported way to add the CA of the Quay registry in the OperatorSource.


Additional info:
The workaround is as follows:
1, Stop the CVO
mac:~ jianzhang$ oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator

Or set the "marketplace-trusted-ca" ConfigMap to unmanaged.
mac:~ jianzhang$ oc patch clusterversion version --type=merge -p '{"spec": {"overrides":[{"kind": "ConfigMap", "name": "marketplace-trusted-ca", "namespace": "openshift-marketplace", "unmanaged": true, "group": "apps"}]}}'
clusterversion.config.openshift.io/version patched

2, Remove the `config.openshift.io/inject-trusted-cabundle: "true"` label from the marketplace-trusted-ca ConfigMap
3, Edit the marketplace-trusted-ca ConfigMap and add the Quay certificate
mac:~ jianzhang$ oc edit cm marketplace-trusted-ca 
...
    # private quay.io registry
    -----BEGIN CERTIFICATE-----
    MIIDazCCAlOgAwIBAgIIZ02K0qLpUoowDQYJKoZIhvcNAQELBQAwJjEkMCIGA1UE
...
kind: ConfigMap
metadata:
  creationTimestamp: "2020-02-25T08:02:15Z"
  name: marketplace-trusted-ca
...

5, Recreate the marketplace operator pods and the OperatorSource; the error still appears at first.
mac:~ jianzhang$ oc delete pods marketplace-operator-6bffffc9c6-q5k6x
mac:~ jianzhang$ oc get pods
NAME                                    READY   STATUS    RESTARTS   AGE
marketplace-operator-6bffffc9c6-2hlrc   1/1     Running   0          11m

6, It works!
mac:~ jianzhang$ oc get cm
NAME                        DATA   AGE
marketplace-operator-lock   0      60m
marketplace-trusted-ca      1      6h32m
mac:~ jianzhang$ oc get pods
NAME                                    READY   STATUS    RESTARTS   AGE
marketplace-operator-6bffffc9c6-2hlrc   1/1     Running   1          60m
quay-applications-65865cfcc4-w7ncc      1/1     Running   0          89s
mac:~ jianzhang$ oc get operatorsource
NAME                TYPE          ENDPOINT                                                                                                REGISTRY   DISPLAYNAME   PUBLISHER   STATUS      MESSAGE                                       AGE
quay-applications   appregistry   https://quay-enterprise-quay-enterprise.apps.preserve-xjiang42aws0224.qe.devcluster.openshift.com/cnr   admin                                Succeeded   The object has been successfully reconciled   59m
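
Step 3 above pastes the certificate into the ConfigMap by hand. As a rough sketch only, the bundle text could be prepared locally first and then pasted in with `oc edit cm marketplace-trusted-ca`. The helper names `append_ca` and `count_certs`, and the file names in the usage note, are hypothetical and not part of this report. The duplicate check is a cheap heuristic on the first base64 line, and the PEM file is assumed to end with a newline.

```shell
#!/bin/sh
# Hypothetical local helpers (not from the report); they only touch local files.

# count_certs FILE: print how many PEM certificates FILE contains.
count_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# append_ca BUNDLE CERT LABEL: append CERT to BUNDLE under a "# LABEL" comment,
# unless a certificate with the same first base64 line is already present.
append_ca() {
    bundle=$1
    cert=$2
    label=$3
    # First non-delimiter line of the PEM file, used as a cheap duplicate check.
    first_line=$(grep -v -- '-----' "$cert" | head -n 1)
    if [ -n "$first_line" ] && [ -f "$bundle" ] && grep -qF -- "$first_line" "$bundle"; then
        return 0
    fi
    {
        printf '# %s\n' "$label"
        cat "$cert"
    } >> "$bundle"
}
```

For example, `append_ca bundle.pem quay-ca.crt "private quay.io registry"` followed by `count_certs bundle.pem` shows how many certificates the prepared bundle holds before it is pasted into the ConfigMap.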

Comment 1 Daniel Messer 2020-02-27 22:02:38 UTC
Hi Jian, thanks for this RFE. We are actively looking to decommission appregistry usage in OCP 4.5, so we will not invest in any additional features. The general concept of providing a pull secret alongside the catalog is something we are working on here: https://issues.redhat.com/browse/RFE-537. It'll be for CatalogSources, which is our go-forward way of representing catalogs on clusters.

Hope this clarifies.

PS: This RFE Jira project would also be the preferred place to file future RFEs.

Comment 2 Jian Zhang 2020-02-28 01:21:37 UTC
Hi, Daniel

Thanks for your explanation! I'm not sure if CA support should be added to OLM or to the CVO.
If https://issues.redhat.com/browse/RFE-537 can cover this scenario, I'd be glad to use it to track this bug.
If not, do I need to open another RFE in JIRA?

Comment 3 Daniel Messer 2020-02-28 10:39:52 UTC
@Jian - I think we want to push people towards hosting their own catalogs not on appregistry but as bundles and index images on their own container registries. That's why we are concentrating on making custom CA support available there and NOT for appregistry. We are looking to deprecate OperatorSource and appregistry support.