Bug 1683303 - [Marketplace] cannot fetch the packages from the CatalogSourceConfig
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.1.0
Assignee: Kevin Rizza
QA Contact: Jian Zhang
 
Reported: 2019-02-26 14:58 UTC by Jian Zhang
Modified: 2019-06-04 10:44 UTC (History)

Last Closed: 2019-06-04 10:44:39 UTC

Links
System                   ID               Priority   Status   Summary   Last Updated
Red Hat Product Errata   RHBA-2019:0758   None       None     None      2019-06-04 10:44:44 UTC

Description Jian Zhang 2019-02-26 14:58:59 UTC
Description of problem:
I installed Couchbase successfully, but got the error below when I went on to install MongoDB.
[core@ip-10-0-134-56 ~]$  oc logs -f catalog-operator-6c6d74b786-5ltgv  |grep mongodb
E0226 10:27:49.662135       1 queueinformer_operator.go:155] Sync "openshift-operators" failed: {mongodb-enterprise preview mongodboperator.v0.3.2 {installed-certified-openshift-operators openshift-operators}} not found: rpc error: code = Unknown desc = no bundle found for csv mongodboperator.v0.3.2

Version-Release number of selected component (if applicable):
Operator registry image:
               io.openshift.build.commit.id=0531400c661ef7088d71b86ff5f52892f9407a1a
               io.openshift.build.commit.url=https://github.com/operator-framework/operator-registry/commit/0531400c661ef7088d71b86ff5f52892f9407a1a
               io.openshift.build.source-location=https://github.com/operator-framework/operator-registry


How reproducible:
Often

Steps to Reproduce:
1. Install OCP 4.0.
2. Install Couchbase from Operator Hub in the web console.
3. Install MongoDB from Operator Hub in the web console.

Actual results:
No InstallPlan or CSV was generated. The catalog-operator logged the following error:
E0226 10:27:49.662135       1 queueinformer_operator.go:155] Sync "openshift-operators" failed: {mongodb-enterprise preview mongodboperator.v0.3.2 {installed-certified-openshift-operators openshift-operators}} not found: rpc error: code = Unknown desc = no bundle found for csv mongodboperator.v0.3.2


Expected results:
The "mongodb-enterprise" package should be fetched successfully.

Additional info:
1. The CatalogSourceConfig already points to the "mongodb-enterprise" package.
[core@ip-10-0-134-56 ~]$ oc get csc installed-certified-openshift-operators -o yaml
apiVersion: marketplace.redhat.com/v1alpha1
kind: CatalogSourceConfig
metadata:
  creationTimestamp: 2019-02-26T06:02:58Z
  finalizers:
  - finalizer.catalogsourceconfigs.marketplace.redhat.com
  generation: 1
  name: installed-certified-openshift-operators
  namespace: openshift-marketplace
  resourceVersion: "203734"
  selfLink: /apis/marketplace.redhat.com/v1alpha1/namespaces/openshift-marketplace/catalogsourceconfigs/installed-certified-openshift-operators
  uid: 2a7a2d94-398c-11e9-b509-02f3b7fd9150
spec:
  csDisplayName: Certified Operators
  csPublisher: Certified
  packages: couchbase-enterprise,mongodb-enterprise
  targetNamespace: openshift-operators
status:
  currentPhase:
    lastTransitionTime: 2019-02-26T06:02:58Z
    lastUpdateTime: 2019-02-26T06:02:58Z
    phase:
      message: The object has been successfully reconciled
      name: Succeeded

2. However, the MongoDB package does not appear in the logs of the "installed-certified-openshift-operators-6f669f766b-48k78" pod, as shown below:
[core@ip-10-0-134-56 ~]$ oc logs installed-certified-openshift-operators-6f669f766b-48k78 
time="2019-02-26T06:03:06Z" level=info msg="Using in-cluster kube client config" port=50051 type=appregistry
time="2019-02-26T06:03:06Z" level=info msg="Using in-cluster kube client config" port=50051 type=appregistry
time="2019-02-26T06:03:06Z" level=info msg="operator source(s) specified are - openshift-marketplace/certified-operators" port=50051 type=appregistry
time="2019-02-26T06:03:06Z" level=info msg="package(s) specified are - couchbase-enterprise" port=50051 type=appregistry
time="2019-02-26T06:03:06Z" level=info msg="input sanitized - sources: [openshift-marketplace/certified-operators], packages: [couchbase-enterprise]" port=50051 type=appregistry
time="2019-02-26T06:03:07Z" level=info msg="resolved the following packages: [certified-operators/couchbase-enterprise:0.0.1]" port=50051 type=appregistry
time="2019-02-26T06:03:07Z" level=info msg="downloading repository: certified-operators/couchbase-enterprise:0.0.1 from https://quay.io/cnr" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="download complete - 1 repositories have been downloaded" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="all manifest(s) have been merged into one" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="loading into sqlite database" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="using configmap loader to build sqlite database" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="loading CRDs" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="loading Bundles" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="loading Packages" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="extracting provided API information" port=50051 type=appregistry
time="2019-02-26T06:03:09Z" level=info msg="serving registry" port=50051 type=appregistry
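
The package list the registry pod actually started with can be read from the "package(s) specified are" line of the log above. The following is only a minimal sketch of that check; the function name `packages_from_log` and the regular expression are illustrative assumptions, not part of any Marketplace tooling.

```python
import re
from typing import List

# Assumption: the log line format matches the appregistry output shown above.
PACKAGES_RE = re.compile(r'msg="package\(s\) specified are - ([^"]*)"')


def packages_from_log(log_text: str) -> List[str]:
    """Return the packages named in the first 'package(s) specified are' line."""
    match = PACKAGES_RE.search(log_text)
    if match is None:
        return []
    return [p.strip() for p in match.group(1).split(",") if p.strip()]


sample = (
    'time="2019-02-26T06:03:06Z" level=info '
    'msg="package(s) specified are - couchbase-enterprise" '
    "port=50051 type=appregistry"
)
print(packages_from_log(sample))  # ['couchbase-enterprise']
```

For the log in this report the result contains only couchbase-enterprise, which matches the observation that mongodb-enterprise was never loaded.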

Also, the arguments passed to appregistry-server are incorrect: the -o option lists only couchbase-enterprise.
sh-4.2$ ps -elf|cat 
F S UID         PID   PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S 1000430+      1      0  0  80   0 - 109805 futex_ 06:03 ?       00:00:08 appregistry-server -s openshift-marketplace/certified-operators -o couchbase-enterprise
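
The mismatch between the CatalogSourceConfig and the running server can be expressed as a small comparison. This is a hedged sketch, not the operator's own logic: `missing_packages` is a hypothetical helper, and it assumes the package list is the token that follows `-o` on the command line, as in the `ps` output above.

```python
from typing import List


def missing_packages(csc_packages: str, server_cmdline: str) -> List[str]:
    """Packages in the CSC spec.packages string that the server was not started with."""
    wanted = [p.strip() for p in csc_packages.split(",") if p.strip()]
    args = server_cmdline.split()
    served = set()
    if "-o" in args:
        idx = args.index("-o")
        if idx + 1 < len(args):
            served = {p.strip() for p in args[idx + 1].split(",") if p.strip()}
    return [p for p in wanted if p not in served]


# Values copied from the CatalogSourceConfig and the ps output in this report.
csc_packages = "couchbase-enterprise,mongodb-enterprise"
cmdline = (
    "appregistry-server -s openshift-marketplace/certified-operators "
    "-o couchbase-enterprise"
)
print(missing_packages(csc_packages, cmdline))  # ['mongodb-enterprise']
```

An empty result would mean the pod was recreated with every package the CatalogSourceConfig names.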


The CatalogSource info:
[core@ip-10-0-134-56 ~]$ oc get catsrc -n openshift-operators installed-certified-openshift-operators -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: 2019-02-26T06:02:58Z
  generation: 1
  name: installed-certified-openshift-operators
  namespace: openshift-operators
  ownerReferences:
  - apiVersion: marketplace.redhat.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: CatalogSourceConfig
    name: installed-certified-openshift-operators
    uid: 2a7a2d94-398c-11e9-b509-02f3b7fd9150
  resourceVersion: "357452"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-operators/catalogsources/installed-certified-openshift-operators
  uid: 2a937c84-398c-11e9-aefc-0a5bc12f5264
spec:
  address: 172.30.98.123:50051
  displayName: Certified Operators
  icon:
    base64data: ""
    mediatype: ""
  publisher: Certified
  sourceType: grpc
status:
  lastSync: 2019-02-26T10:11:29Z
  registryService:
    createdAt: 2019-02-26T10:11:29Z
    protocol: grpc

Comment 1 Jian Zhang 2019-02-27 02:31:28 UTC
FYI.
Cluster version: 4.0.0-0.nightly-2019-02-26-054336
Marketplace Operator info:
             io.openshift.build.commit.id=7b53305ee695597ecedba5b81f3275a2fe1f74fa
             io.openshift.build.commit.url=https://github.com/operator-framework/operator-marketplace/commit/7b53305ee695597ecedba5b81f3275a2fe1f74fa
             io.openshift.build.source-location=https://github.com/operator-framework/operator-marketplace

Comment 2 aravindh 2019-02-27 02:45:54 UTC
Please test this with a build that includes https://github.com/operator-framework/operator-marketplace/commit/255d89f6893b3854e9e3e522aa7d9ddbea17d4d2 and https://github.com/operator-framework/operator-lifecycle-manager/commit/c1db6cd9092a12f3c1b47a0a4549acbc39fd80b2. These help mitigate the GC issues (https://bugzilla.redhat.com/show_bug.cgi?id=1679309). This is to ensure that we are not running into another manifestation of that problem.

Comment 6 Kevin Rizza 2019-02-27 20:23:49 UTC
So a few things I want to bring up:

1. I believe this problem is a coincidental artifact of the Marketplace flapping issue that was documented here:
https://bugzilla.redhat.com/show_bug.cgi?id=1683792

2. The "error" that is being documented will always occur when another operator from the same operator source is subscribed to:

1 queueinformer_operator.go:155] Sync "openshift-operators" failed: {mongodb-enterprise preview mongodboperator.v0.3.2 {installed-certified-openshift-operators openshift-operators}} not found: rpc error: code = Unknown desc = no bundle found for csv mongodboperator.v0.3.2

That log just means the catalog-operator doesn't see the operator metadata in the grpc endpoint yet. This happens because the update is asynchronous: the catalog source and the grpc pod are notified of the new operator that can be subscribed to, but the grpc pod takes a few seconds to download the metadata and start serving content. Therefore, that error message doesn't tell us much about the state of the environment.

3. This process is not synchronous, so when a subscription is added the result is not "instantaneous". Please be aware that some time must be spent waiting for the subscription to succeed or fail before pulling logs.
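
Because the update is asynchronous, a test can poll for the result with a timeout before collecting logs instead of checking immediately. The sketch below is an assumption-laden illustration: `wait_until` and `bundle_is_served` are hypothetical names, the timeout and interval values are arbitrary, and the check here is a stand-in rather than a real query against the cluster.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 5.0,
) -> bool:
    """Call check() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in check: pretends the registry starts serving on the third attempt.
attempts = {"n": 0}


def bundle_is_served() -> bool:
    attempts["n"] += 1
    return attempts["n"] >= 3


print(wait_until(bundle_is_served, timeout=1.0, interval=0.01))  # True
```

In a real test the check would be whatever condition signals success or failure of the subscription; logs are only worth pulling once `wait_until` has returned.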

So, once the bugzilla mentioned above is marked as fixed, please attempt to test this again. My guess is that there is something unexpected going on right now in the environment because of that issue, which is why the subscription is failing to bring up the operator. In the example provided, it was because something went wrong between the catalogsourceconfig getting the update and the grpc pod getting recreated.

If the error still persists after that fix is included in a build and the bug is closed, please also pull the logs from the catalogsourceconfig installed-certified-openshift-operators. If that catalogsourceconfig includes a reference to the second operator that we are trying to subscribe to, it should create an appregistry pod that serves that operator. If it doesn't, the logs should indicate why.

Comment 7 Jian Zhang 2019-02-28 05:13:26 UTC
Kevin,

[jzhang@dhcp-140-18 2019-02-27-213933]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-27-213933   True        False         53m     Cluster version is 4.0.0-0.nightly-2019-02-27-213933

Now the appregistry-server works well; it refers to the correct packages.
[jzhang@dhcp-140-18 2019-02-27-213933]$ oc logs installed-certified-openshift-operators-7997858647-c4wp7
time="2019-02-28T03:14:08Z" level=info msg="Using in-cluster kube client config" port=50051 type=appregistry
time="2019-02-28T03:14:08Z" level=info msg="Using in-cluster kube client config" port=50051 type=appregistry
time="2019-02-28T03:14:08Z" level=info msg="operator source(s) specified are - openshift-marketplace/certified-operators" port=50051 type=appregistry
time="2019-02-28T03:14:08Z" level=info msg="package(s) specified are - couchbase-enterprise,mongodb-enterprise" port=50051 type=appregistry
time="2019-02-28T03:14:08Z" level=info msg="input sanitized - sources: [openshift-marketplace/certified-operators], packages: [couchbase-enterprise mongodb-enterprise]" port=50051 type=appregistry
time="2019-02-28T03:14:08Z" level=info msg="resolved the following packages: [certified-operators/couchbase-enterprise:0.0.2 certified-operators/mongodb-enterprise:0.0.1]" port=50051 type=appregistry
time="2019-02-28T03:14:08Z" level=info msg="downloading repository: certified-operators/couchbase-enterprise:0.0.2 from https://quay.io/cnr" port=50051 type=appregistry
time="2019-02-28T03:14:11Z" level=info msg="downloading repository: certified-operators/mongodb-enterprise:0.0.1 from https://quay.io/cnr" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="download complete - 2 repositories have been downloaded" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="all manifest(s) have been merged into one" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="loading into sqlite database" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="using configmap loader to build sqlite database" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="loading CRDs" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="loading Bundles" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="loading Packages" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="extracting provided API information" port=50051 type=appregistry
time="2019-02-28T03:14:12Z" level=info msg="serving registry" port=50051 type=appregistry
[jzhang@dhcp-140-18 2019-02-27-213933]$ oc rsh installed-certified-openshift-operators-7997858647-c4wp7 
sh-4.2$ ps -elf | cat
F S UID         PID   PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S 1000030+      1      0  0  80   0 - 93553 futex_ 03:14 ?        00:00:00 appregistry-server -s openshift-marketplace/certified-operators -o couchbase-enterprise,mongodb-enterprise
4 S 1000030+    576      0  0  80   0 -  2954 do_wai 03:18 pts/0    00:00:00 /bin/sh
4 R 1000030+    603    576  0  80   0 - 12935 -      03:18 pts/0    00:00:00 ps -elf
0 S 1000030+    604    576  0  80   0 -  1095 pipe_w 03:18 pts/0    00:00:00 cat
sh-4.2$ exit

And, the InstallPlan can be generated correctly.
[jzhang@dhcp-140-18 2019-02-27-213933]$ oc get sub
NAME                   PACKAGE                SOURCE                                    CHANNEL
couchbase-enterprise   couchbase-enterprise   installed-certified-openshift-operators   preview
mongodb-enterprise     mongodb-enterprise     installed-certified-openshift-operators   preview

However, the InstallPlan here refers to all the CSV objects; "install-tqmmb" should refer only to "mongodboperator.v0.3.2". I will open a new bug to track this.
[jzhang@dhcp-140-18 2019-02-27-213933]$ oc get ip
NAME            CSV                         SOURCE   APPROVAL    APPROVED
install-l2m9m   couchbase-operator.v1.1.0            Automatic   true
install-tqmmb   couchbase-operator.v1.1.0            Automatic   true

[jzhang@dhcp-140-18 2019-02-27-213933]$ oc get csv
NAME                        DISPLAY              VERSION   REPLACES                    PHASE
couchbase-operator.v1.1.0   Couchbase Operator   1.1.0     couchbase-operator.v1.0.0   Failed
mongodboperator.v0.3.2      MongoDB              0.3.2                                 Installing

[jzhang@dhcp-140-18 2019-02-27-213933]$ oc get pods
NAME                                           READY   STATUS             RESTARTS   AGE
couchbase-operator-855b57b84c-qfznv            0/1     ImagePullBackOff   0          20m
mongodb-enterprise-operator-5db87fd68b-5ppmr   0/1     ImagePullBackOff   0          4m50s

LGTM; marking this as verified.

Comment 10 errata-xmlrpc 2019-06-04 10:44:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

