Bug 1818851 - Catalog-operator crashed when a CatalogSource object doesn't have the `address` and `image` fields
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.3.z
Assignee: Evan Cordell
QA Contact: yhui
URL:
Whiteboard:
Depends On: 1818850
Blocks:
Reported: 2020-03-30 14:30 UTC by Nick Hale
Modified: 2020-05-20 13:48 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1818850
Environment:
Last Closed: 2020-05-20 13:47:53 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | operator-framework operator-lifecycle-manager pull 1486 | 0 | None | closed | [release-4.3] Bug 1818851: Prevent nil pointer dereference | 2021-02-04 21:56:51 UTC
Github | operator-framework operator-lifecycle-manager pull 1508 | 0 | None | closed | Bug 1818851: feat(catalogs): add spec validation for sourcetypes | 2021-02-04 21:56:52 UTC
Red Hat Product Errata | RHBA-2020:2129 | 0 | None | None | None | 2020-05-20 13:48:08 UTC

Comment 3 yhui 2020-05-06 09:59:11 UTC
1, Create an OCP 4.3 cluster with the fixed PR.

[hui@localhost ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-05-04-051714   True        False         6m38s   Cluster version is 4.3.0-0.nightly-2020-05-04-051714
[hui@localhost ~]$ oc exec catalog-operator-6bdc7ccfd5-tbjfv  -- olm --version
OLM version: 0.13.0
git commit: 502b8a003c8b635b33657162b85bd297971a1dc4

2, Check the default CatalogSources and the OLM pods; they work well.

[hui@localhost ~]$ oc get pods
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-6bdc7ccfd5-tbjfv   1/1     Running   0          16m
olm-operator-5844d8dd67-htmrp       1/1     Running   0          16m
packageserver-5f4bdf4cdd-cccgk      1/1     Running   0          16m
packageserver-5f4bdf4cdd-jndl4      1/1     Running   0          16m
[hui@localhost ~]$ oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-784f4f7c97-zzjkw    1/1     Running   0          18m
community-operators-57d6b5b4-p4vcd      1/1     Running   0          18m
marketplace-operator-5d8c98d6df-l8pwq   1/1     Running   0          19m
redhat-operators-5b858bddc4-rrcvv       1/1     Running   0          18m
[hui@localhost ~]$ oc get packagemanifest
NAME                                         CATALOG               AGE
aqua-operator-certified                      Certified Operators   19m
sriov-network-operator                       Red Hat Operators     18m
federatorai                                  Community Operators   18m

3, Create a CatalogSource object (grpc) without image or address.

[hui@localhost ~]$ cat cs.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: bug-no-image
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  displayName: Jian Operators
  publisher: jian
[hui@localhost ~]$ oc create -f cs.yaml 
catalogsource.operators.coreos.com/bug-no-image created
[hui@localhost ~]$ oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-784f4f7c97-zzjkw    1/1     Running   0          21m
community-operators-57d6b5b4-p4vcd      1/1     Running   0          21m
marketplace-operator-5d8c98d6df-l8pwq   1/1     Running   0          21m
redhat-operators-5b858bddc4-rrcvv       1/1     Running   0          21m
[hui@localhost ~]$  oc get catalogsource -n openshift-marketplace
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
bug-no-image          Jian Operators        grpc   jian        46s
certified-operators   Certified Operators   grpc   Red Hat     21m
community-operators   Community Operators   grpc   Red Hat     21m
redhat-operators      Red Hat Operators     grpc   Red Hat     21m
[hui@localhost ~]$ oc get catalogsource -n openshift-marketplace bug-no-image -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: "2020-05-06T06:52:48Z"
  generation: 1
  name: bug-no-image
  namespace: openshift-marketplace
  resourceVersion: "21174"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-marketplace/catalogsources/bug-no-image
  uid: 57868882-9aea-4136-91d7-1e48987f632a
spec:
  displayName: Jian Operators
  publisher: jian
  sourceType: grpc
status:
  message: no reconciler for source type grpc
  reason: RegistryServerError
[hui@localhost ~]$ oc get pods
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-6bdc7ccfd5-tbjfv   1/1     Running   0          28m
olm-operator-5844d8dd67-htmrp       1/1     Running   0          28m
packageserver-5f4bdf4cdd-cccgk      1/1     Running   0          27m
packageserver-5f4bdf4cdd-jndl4      1/1     Running   0          27m

The OLM pods work well, but the error message reported for the catalog source (message: no reconciler for source type grpc) is unclear and hard to understand.



4, Create a CatalogSource object (configmap) without image or address.

[hui@localhost ~]$ cat cs-configmap.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: bug-no-image-cm
  namespace: openshift-marketplace
spec:
  sourceType: configmap
  displayName: Jian Operators
  publisher: jian
[hui@localhost ~]$ oc create -f cs-configmap.yaml 
catalogsource.operators.coreos.com/bug-no-image-cm created
[hui@localhost ~]$ oc get catalogsource -n openshift-marketplace bug-no-image-cm -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: "2020-05-06T07:17:13Z"
  generation: 1
  name: bug-no-image-cm
  namespace: openshift-marketplace
  resourceVersion: "27579"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-marketplace/catalogsources/bug-no-image-cm
  uid: 7637fd04-6739-4aba-885e-67839f5f2d8c
spec:
  displayName: Jian Operators
  publisher: jian
  sourceType: configmap
status:
  message: 'failed to get catalog config map : configmap "" not found'
  reason: ConfigMapError
[hui@localhost ~]$ oc get pods
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-6bdc7ccfd5-tbjfv   1/1     Running   0          46m
olm-operator-5844d8dd67-htmrp       1/1     Running   0          46m
packageserver-5f4bdf4cdd-cccgk      1/1     Running   0          46m
packageserver-5f4bdf4cdd-jndl4      1/1     Running   0          46m

The OLM pods work well, but the error message reported for the catalog source (message: 'failed to get catalog config map : configmap "" not found') is unclear and hard to understand.



5, Create a CatalogSource object without image, address, and sourceType.

[hui@localhost ~]$ cat cs-grpc-cm.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: bug-empty
  namespace: openshift-marketplace
spec:
  displayName: Jian Operators
  publisher: jian
[hui@localhost ~]$ oc create -f cs-grpc-cm.yaml 
The CatalogSource "bug-empty" is invalid: spec.sourceType: Required value

The message looks good to me.

6, Install an operator from the console, for example, etcd. However, no csv or pod is created.

[hui@localhost ~]$ oc get sub -A
NAMESPACE   NAME   PACKAGE   SOURCE                CHANNEL
default     etcd   etcd      community-operators   singlenamespace-alpha
[hui@localhost ~]$ oc get sub etcd -n default -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2020-05-06T07:25:57Z"
  generation: 1
  name: etcd
  namespace: default
  resourceVersion: "29871"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/default/subscriptions/etcd
  uid: a757fb4b-3225-403c-8854-fda5232bc717
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4
[hui@localhost ~]$ oc get csv -n default 
No resources found in default namespace.
[hui@localhost ~]$ oc get pods -n default 
No resources found in default namespace.


7. Delete the catalogsource objects (grpc and configmap) created in steps 3 and 4.
[hui@localhost ~]$ oc delete -f cs.yaml 
catalogsource.operators.coreos.com "bug-no-image" deleted
[hui@localhost ~]$ oc delete -f cs-configmap.yaml 
catalogsource.operators.coreos.com "bug-no-image-cm" deleted

8. Install an operator on the console, for example, etcd. It works well.
[hui@localhost ~]$ oc get sub -A
NAMESPACE   NAME   PACKAGE   SOURCE                CHANNEL
default     etcd   etcd      community-operators   singlenamespace-alpha
[hui@localhost ~]$ oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded
[hui@localhost ~]$ oc get pod -n default
NAME                             READY   STATUS    RESTARTS   AGE
etcd-operator-65c7948765-lfhsl   3/3     Running   0          77s



In summary, there are still two issues. 
1. The error message reported is unclear. 
2. The csv and ip for the etcd sub cannot be created because of the invalid catalogsources (grpc or configmap without address or image). After the invalid catalogsources are deleted, the etcd csv and ip are created successfully.

Comment 6 Ben Luddy 2020-05-07 19:26:26 UTC
> 1. The error message reported is unclear. 

Backported a change from 4.4 that improves the status messages. Thanks!

> 2. The csv and ip for the sub etcd can not be created because of the incorrect catalogsource (grpc or configmap without address or image). Delete the incorrect catalogsource, the etcd csv and ip works well.

This is expected behavior. Resources should not be generated for a subscription if the catalog operator is unable to communicate with all relevant catalog sources. This prevents situations where an out-of-date (and potentially vulnerable) operator is installed as a dependency because the catalog source containing a newer version is not healthy.
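
The rule described above can be illustrated with a minimal Go sketch. This is not OLM's actual code: the `CatalogHealth` type and the `unhealthyCatalogs` and `canResolve` helpers are hypothetical names introduced only for illustration, and the assumption is that every listed catalog source counts as "relevant" to the subscription.

```go
package main

import "fmt"

// CatalogHealth is a hypothetical, simplified record of whether the catalog
// operator can currently communicate with one catalog source.
type CatalogHealth struct {
	Name    string
	Healthy bool
}

// unhealthyCatalogs returns the names of all catalog sources marked unhealthy.
func unhealthyCatalogs(health []CatalogHealth) []string {
	var names []string
	for _, h := range health {
		if !h.Healthy {
			names = append(names, h.Name)
		}
	}
	return names
}

// canResolve reports whether resources may be generated for a subscription:
// only when no relevant catalog source is unhealthy (assumed rule).
func canResolve(health []CatalogHealth) bool {
	return len(unhealthyCatalogs(health)) == 0
}

func main() {
	health := []CatalogHealth{
		{Name: "community-operators", Healthy: true},
		{Name: "bug-no-image", Healthy: false},
	}
	fmt.Println("can resolve:", canResolve(health))
	fmt.Println("unhealthy:", unhealthyCatalogs(health))
}
```

Under this sketch, a single unhealthy catalog source is enough to hold back resolution, which matches the rationale given above about not installing an out-of-date dependency.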

Comment 10 yhui 2020-05-13 02:26:34 UTC
1, Create an OCP 4.3 cluster with the fixed PR.

[root@preserve-olm-env ~]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-05-12-070750   True        False         14h     Cluster version is 4.3.0-0.nightly-2020-05-12-070750
[root@preserve-olm-env ~]# oc exec olm-operator-5f5ff4fd94-h8lbc -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.13.0
git commit: 1702292171a9eef82ea43d8392cde3fb65455d95

2, Check the default CatalogSources and the OLM pods; they work well.

[root@preserve-olm-env ~]# oc get pods -n openshift-operator-lifecycle-manager
NAME                               READY   STATUS    RESTARTS   AGE
catalog-operator-55f6b8555-8mzm6   1/1     Running   0          15h
olm-operator-5f5ff4fd94-h8lbc      1/1     Running   0          15h
packageserver-56d9b9ccff-cg6lp     1/1     Running   0          15h
packageserver-56d9b9ccff-qbgxw     1/1     Running   0          15h
[root@preserve-olm-env ~]# oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-8568576984-zbcs8    1/1     Running   0          15h
community-operators-8b657497d-br9n2     1/1     Running   0          15h
marketplace-operator-7c4bfd4d55-8fsgf   1/1     Running   0          15h
qe-app-registry-cb78d6784-h8pp7         1/1     Running   0          71m
redhat-operators-7b98459d9-9bh9c        1/1     Running   0          10h
[root@preserve-olm-env ~]# oc get packagemanifest
NAME                                         CATALOG               AGE
ibm-spectrum-scale-csi                       Certified Operators   15h
ibm-block-csi-operator                       Certified Operators   15h
metering                                     Community Operators   15h
...

3, Create a CatalogSource object (grpc) without image or address.

[root@preserve-olm-env bug-1818851]# cat cs.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: bug-no-image
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  displayName: Jian Operators
  publisher: jian
[root@preserve-olm-env bug-1818851]# oc create -f cs.yaml 
catalogsource.operators.coreos.com/bug-no-image created

[root@preserve-olm-env bug-1818851]# oc get pods -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-8568576984-zbcs8    1/1     Running   0          15h
community-operators-8b657497d-br9n2     1/1     Running   0          15h
marketplace-operator-7c4bfd4d55-8fsgf   1/1     Running   0          15h
qe-app-registry-cb78d6784-h8pp7         1/1     Running   0          75m
redhat-operators-7b98459d9-9bh9c        1/1     Running   0          10h

[root@preserve-olm-env bug-1818851]# oc get catsrc -n openshift-marketplace
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
bug-no-image          Jian Operators        grpc   jian        45s
certified-operators   Certified Operators   grpc   Red Hat     15h
community-operators   Community Operators   grpc   Red Hat     15h
qe-app-registry                             grpc               15h
redhat-operators      Red Hat Operators     grpc   Red Hat     15h

[root@preserve-olm-env bug-1818851]# oc get catalogsource -n openshift-marketplace bug-no-image -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: "2020-05-13T01:50:00Z"
  generation: 1
  name: bug-no-image
  namespace: openshift-marketplace
  resourceVersion: "264183"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-marketplace/catalogsources/bug-no-image
  uid: 634c53ef-faf7-48a1-8939-3e767fd5df29
spec:
  displayName: Jian Operators
  publisher: jian
  sourceType: grpc
status:
  message: 'image and address unset: at least one must be set for sourcetype: grpc'
  reason: SpecInvalidError

[root@preserve-olm-env bug-1818851]# oc get pods -n openshift-operator-lifecycle-manager
NAME                               READY   STATUS    RESTARTS   AGE
catalog-operator-55f6b8555-8mzm6   1/1     Running   0          15h
olm-operator-5f5ff4fd94-h8lbc      1/1     Running   0          15h
packageserver-56d9b9ccff-cg6lp     1/1     Running   0          15h
packageserver-56d9b9ccff-qbgxw     1/1     Running   0          15h


The OLM pods work well. And the error message reported looks good to me.


4, Create a CatalogSource object (configmap) without image or address.

[root@preserve-olm-env bug-1818851]# cat cs-configmap.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: bug-no-image-cm
  namespace: openshift-marketplace
spec:
  sourceType: configmap
  displayName: Jian Operators
  publisher: jian
[root@preserve-olm-env bug-1818851]# oc create -f cs-configmap.yaml 
catalogsource.operators.coreos.com/bug-no-image-cm created

[root@preserve-olm-env bug-1818851]# oc get catalogsource -n openshift-marketplace bug-no-image-cm -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: "2020-05-13T01:56:59Z"
  generation: 1
  name: bug-no-image-cm
  namespace: openshift-marketplace
  resourceVersion: "266074"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-marketplace/catalogsources/bug-no-image-cm
  uid: 2bcf4c09-dde4-4c9b-9908-8234d6cc3426
spec:
  displayName: Jian Operators
  publisher: jian
  sourceType: configmap
status:
  message: 'configmap name unset: must be set for sourcetype: configmap'
  reason: SpecInvalidError

[root@preserve-olm-env bug-1818851]# oc get pods -n openshift-operator-lifecycle-manager
NAME                               READY   STATUS    RESTARTS   AGE
catalog-operator-55f6b8555-8mzm6   1/1     Running   0          15h
olm-operator-5f5ff4fd94-h8lbc      1/1     Running   0          15h
packageserver-56d9b9ccff-cg6lp     1/1     Running   0          15h
packageserver-56d9b9ccff-qbgxw     1/1     Running   0          15h


The OLM pods work well. And the error message reported looks good to me.
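
For reference, the per-sourcetype check that yields the two status messages in steps 3 and 4 could look roughly like the Go sketch below. This is an illustration under assumptions, not the code from the linked pull requests: the `CatalogSourceSpec` struct is a trimmed-down stand-in, `validateSpec` is a hypothetical helper name, and only the two error strings are taken from the output above.

```go
package main

import (
	"errors"
	"fmt"
)

// CatalogSourceSpec is a hypothetical, trimmed-down stand-in for the real
// CatalogSource spec; only the fields relevant to this bug are kept.
type CatalogSourceSpec struct {
	SourceType string
	Image      string
	Address    string
	ConfigMap  string
}

// validateSpec is a hypothetical helper. It returns an error whose text
// matches the status messages shown in steps 3 and 4, or nil if the spec
// passes these two checks. Other source types are not checked here.
func validateSpec(spec CatalogSourceSpec) error {
	switch spec.SourceType {
	case "grpc":
		// A grpc source needs an image to run or an address to dial.
		if spec.Image == "" && spec.Address == "" {
			return errors.New("image and address unset: at least one must be set for sourcetype: grpc")
		}
	case "configmap":
		// A configmap source needs the name of the backing ConfigMap.
		if spec.ConfigMap == "" {
			return errors.New("configmap name unset: must be set for sourcetype: configmap")
		}
	}
	return nil
}

func main() {
	specs := []CatalogSourceSpec{
		{SourceType: "grpc"},
		{SourceType: "configmap"},
		{SourceType: "grpc", Address: "example:50051"}, // placeholder address
	}
	for _, s := range specs {
		if err := validateSpec(s); err != nil {
			fmt.Println("SpecInvalidError:", err)
			continue
		}
		fmt.Println("spec ok for sourcetype:", s.SourceType)
	}
}
```

Validating the spec before any reconciler touches it means an incomplete object ends up with a descriptive status instead of reaching code that assumes the fields are present.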



5, Create a CatalogSource object without image, address, and sourceType.

[root@preserve-olm-env bug-1818851]# cat cs-grpc-cm.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: bug-empty
  namespace: openshift-marketplace
spec:
  displayName: Jian Operators
  publisher: jian
[root@preserve-olm-env bug-1818851]# oc create -f cs-grpc-cm.yaml 
The CatalogSource "bug-empty" is invalid: spec.sourceType: Required value

The message looks good to me.

6, Install an operator on the console, for example, etcd. The csv, ip and pod can be created successfully.

[root@preserve-olm-env bug-1818851]# oc get sub
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha

[root@preserve-olm-env bug-1818851]# oc get sub etcd -n default -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2020-05-13T02:20:46Z"
  generation: 1
  name: etcd
  namespace: default
  resourceVersion: "272588"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/default/subscriptions/etcd
  uid: 6b067b0b-ca4d-4420-a666-ca308ac06323
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4
status:
  catalogHealth:
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: bug-no-image
      namespace: openshift-marketplace
      resourceVersion: "264183"
      uid: 634c53ef-faf7-48a1-8939-3e767fd5df29
    healthy: false
    lastUpdated: "2020-05-13T02:20:46Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: bug-no-image-cm
      namespace: openshift-marketplace
      resourceVersion: "266074"
      uid: 2bcf4c09-dde4-4c9b-9908-8234d6cc3426
    healthy: false
    lastUpdated: "2020-05-13T02:20:46Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: certified-operators
      namespace: openshift-marketplace
      resourceVersion: "245551"
      uid: 2d0d5d40-0800-4468-b10d-e5dd32e6f2b7
    healthy: true
    lastUpdated: "2020-05-13T02:20:46Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: community-operators
      namespace: openshift-marketplace
      resourceVersion: "245553"
      uid: 57d7d2c8-4be1-4ddd-b4a8-0705f765f16b
    healthy: true
    lastUpdated: "2020-05-13T02:20:46Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: qe-app-registry
      namespace: openshift-marketplace
      resourceVersion: "245552"
      uid: 27d6c7e2-854b-4214-a4b4-05406b7f4ddc
    healthy: true
    lastUpdated: "2020-05-13T02:20:46Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: redhat-operators
      namespace: openshift-marketplace
      resourceVersion: "245554"
      uid: a798f22b-2914-4f6c-98a3-6658d03a36ba
    healthy: true
    lastUpdated: "2020-05-13T02:20:46Z"
  conditions:
  - lastTransitionTime: "2020-05-13T02:20:46Z"
    reason: UnhealthyCatalogSourceFound
    status: "True"
    type: CatalogSourcesUnhealthy
  currentCSV: etcdoperator.v0.9.4
  installPlanRef:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-gnq7m
    namespace: default
    resourceVersion: "272512"
    uid: 07ca6192-c093-4064-84d3-ff158df88daa
  installedCSV: etcdoperator.v0.9.4
  installplan:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-gnq7m
    uuid: 07ca6192-c093-4064-84d3-ff158df88daa
  lastUpdated: "2020-05-13T02:20:50Z"
  state: AtLatestKnown
[root@preserve-olm-env bug-1818851]# oc get csv
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded
[root@preserve-olm-env bug-1818851]# oc get pods
NAME                             READY   STATUS    RESTARTS   AGE
etcd-operator-644b4f8577-jsj8l   3/3     Running   0          91s
[root@preserve-olm-env bug-1818851]# oc get ip
NAME            CSV                   APPROVAL    APPROVED
install-gnq7m   etcdoperator.v0.9.4   Automatic   true

It looks good to me. Marking the bug as verified.

Comment 12 errata-xmlrpc 2020-05-20 13:47:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2129

