Bug 1827544

Summary: The base image "quay.io/openshift/origin-operator-registry:4.5" doesn't work
Product: OpenShift Container Platform Reporter: Jian Zhang <jiazha>
Component: OLM Assignee: Evan Cordell <ecordell>
OLM sub component: OLM QA Contact: yhui
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: abays, agreene, ashoshan, bandrade, bsong, ecordell, hfukumot, jiazha, kuiwang, sasha, yhui, yprokule
Version: 4.4 Keywords: Regression
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1844156 (view as bug list) Environment:
Last Closed: 2020-07-13 17:30:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1838473, 1844156    

Description Jian Zhang 2020-04-24 07:12:28 UTC
Description of problem:
I set the `Severity` to `High` because the default base image never works. I hope we can create a mechanism that verifies the image works before it is pushed to quay. Screenshot: https://user-images.githubusercontent.com/15416633/80183935-4f9bed00-863c-11ea-969c-9309d7a109c5.png
Alternatively, we could stop using the latest `quay.io/openshift/origin-operator-registry` image as the default.

[root@preserve-olm-env ~]# oc adm catalog build --help
Builds a catalog container image from a collection operator manifests.

 Extracts the contents of a collection of operator manifests to disk, and builds them into an operator registry catalog
image.

Usage:
  oc adm catalog build [flags]

Examples:
  # Build an operator catalog from an appregistry repo and store in a file
  oc adm catalog build --appregistry-org=redhat-operators --to=file://offline/redhat-operators:4.3
  
  # Build an operator catalog from an appregistry repo and mirror to a registry
  oc adm catalog build --appregistry-org=redhat-operators --to=quay.io/my/redhat-operators:4.3

Options:
      --appregistry-endpoint='https://quay.io/cnr': Endpoint for pulling from an application registry instance.
      --appregistry-org='': Organization (Namespace) to pull from an application registry instance
      --auth-token='': Auth token for communicating with an application registry.
      --dir='': The directory on disk that file:// images will be copied under.
      --filter-by-os='': A regular expression to control which images are considered when multiple variants are
available. Images will be passed as '<platform>/<architecture>[/<variant>]'.
      --from='quay.io/openshift/origin-operator-registry:latest': The image to use as a base.
      --from-dir='': The directory on disk that file:// images will be read from. Overrides --dir
      --insecure=false: Allow push and pull operations to registries to be made over HTTP
      --manifest-dir='': Local path to cache manifests when downloading.
      --max-per-registry=4: Number of concurrent requests allowed per registry.
  -a, --registry-config='': Path to your registry credentials (defaults to ~/.docker/config.json)
      --skip-verification=false: Skip verifying the integrity of the retrieved content. This is not recommended, but may
be necessary when importing images from older image registries. Only bypass verification if the registry is known to be
trustworthy.
      --to='': The image repository tag to apply to the built catalog image.
      --to-db='': Local path to save the database to.

Version-Release number of selected component (if applicable):
Base image: 'quay.io/openshift/origin-operator-registry:latest'

How reproducible:
always

Steps to Reproduce:
1. Create a CatalogSource image by using the default base image.
[root@preserve-olm-env ~]# oc adm catalog build --appregistry-org="redhat-operators-art" --auth-token="xx" --to=quay.io/olmqe/art:v4

2. Install a 4.4 cluster and create a CatalogSource object to consume this image.
[root@preserve-olm-env ~]# cat cs.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: art
  namespace: openshift-marketplace
spec:
  displayName: ART Operators
  image: quay.io/olmqe/art:v4
  publisher: QE
  sourceType: grpc

3. Check the `packagemanifest` object.

Actual results:
No operators are displayed from this CatalogSource object.

The OLM logs show the "couldn't find service in cache" warning:
...
time="2020-04-24T06:41:10Z" level=warning msg="couldn't find service in cache" service=art
time="2020-04-24T06:41:10Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=CONNECTING"
time="2020-04-24T06:41:13Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=TRANSIENT_FAILURE"
time="2020-04-24T06:41:14Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=CONNECTING"
time="2020-04-24T06:41:30Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=READY"
time="2020-04-24T06:43:53Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2020-04-24T06:48:16Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=CONNECTING"
time="2020-04-24T06:48:16Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=TRANSIENT_FAILURE"
time="2020-04-24T06:48:16Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=CONNECTING"
time="2020-04-24T06:48:36Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=TRANSIENT_FAILURE"
time="2020-04-24T06:48:37Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=CONNECTING"
time="2020-04-24T06:48:37Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=art state.State=READY"
time="2020-04-24T06:48:58Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2020-04-24T06:54:03Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
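
The log excerpt above shows the `art` CatalogSource moving between `CONNECTING`, `TRANSIENT_FAILURE`, and `READY`. The following is a minimal sketch for summarizing those transitions from a saved log. The function names are hypothetical, and the pattern assumes only the `state.Key.Name=... state.State=...` format shown above.

```shell
# Hypothetical helpers; not part of oc or OLM.

# Print the sequence of connection states logged for one CatalogSource.
# $1 = CatalogSource name (used inside a regex, so keep it to plain characters).
# Log lines are read from stdin.
catalog_states() {
  sed -n "s/.*state\.Key\.Name=$1 state\.State=\([A-Z_]*\).*/\1/p"
}

# Count how often the CatalogSource entered a given state.
# $1 = CatalogSource name, $2 = state name such as TRANSIENT_FAILURE.
count_state() {
  catalog_states "$1" | grep -c -x "$2"
}
```

For example, piping `oc logs <catalog-operator pod> -n openshift-operator-lifecycle-manager` into `count_state art TRANSIENT_FAILURE` would report how many times the connection dropped.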


Expected results:
The operators from this custom CatalogSource image should be displayed in the `packagemanifest`.

Additional info:
Workaround:
1. Rebuild the CatalogSource image using the 4.4 tag of the base image.
[root@preserve-olm-env ~]# oc adm catalog build --appregistry-org="redhat-operators-art" --auth-token="xxx" --from="quay.io/openshift/origin-operator-registry:4.4" --to=quay.io/olmqe/art:v4.4
...

2. Modify the CatalogSource object to use the new image.
[root@preserve-olm-env ~]# oc edit catalogsource art
catalogsource.operators.coreos.com/art edited
[root@preserve-olm-env ~]# oc get pods
NAME                                    READY   STATUS              RESTARTS   AGE
art-9pbpz                               0/1     ContainerCreating   0          3s
certified-operators-5567dbc5bf-hmgvk    1/1     Running             0          3h3m
...

3. It works well!
[root@preserve-olm-env ~]# oc get packagemanifest |grep -i art
openshiftansibleservicebroker                ART Operators         8m51s
nfd                                          ART Operators         8m51s
ptp-operator                                 ART Operators         8m51s
cluster-logging                              ART Operators         8m51s
clusterresourceoverride                      ART Operators         8m51s
metering-ocp                                 ART Operators         8m51s
local-storage-operator                       ART Operators

Comment 2 Evan Cordell 2020-05-07 15:01:52 UTC
We may be able to make the flag required for oc, or potentially take a reference to a release image and use that to find the base image for the target mirror ocp version.
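
One way to read the second option in this comment is to derive the base image tag from the target OCP version instead of defaulting to `:latest`. The sketch below is a minimal shell illustration, not what `oc` implements. The function name `base_image_for_version` is hypothetical, and it assumes the repository publishes a `<major>.<minor>` tag for each release (only the 4.4 and 4.5 tags appear in this bug).

```shell
# Hypothetical helper: map an OCP version string (for example
# "4.5.0-0.nightly-2020-05-26-063751") to a pinned base image reference.
# Assumption: a "<major>.<minor>" tag exists for the target release.
base_image_for_version() {
  local repo tag
  repo="quay.io/openshift/origin-operator-registry"
  # Keep only the leading "<major>.<minor>" part of the version string.
  tag=$(printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
  if [ -z "$tag" ]; then
    echo "unrecognized version: $1" >&2
    return 1
  fi
  printf '%s:%s\n' "$repo" "$tag"
}
```

The result could then be passed explicitly, for example `oc adm catalog build --from="$(base_image_for_version 4.5.0)" ...`, so the build no longer depends on whatever `:latest` currently points to.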

Comment 3 Alexander Greene 2020-05-14 14:03:39 UTC
Hello @Jian,

When this bug was filed, the OLM team had just made a number of changes to that image. The OLM team was not aware that oc was pointing to the upstream image as well. We have since reverted the changes to that image. Could you double-check whether this is still an issue?

Comment 4 Jian Zhang 2020-05-18 01:13:50 UTC
Hi Alex,

Thanks for your explanation! Got it. @yhui, please take over this issue, thanks!

Comment 5 Jian Zhang 2020-05-26 09:25:48 UTC
*** Bug 1838473 has been marked as a duplicate of this bug. ***

Comment 6 Jian Zhang 2020-05-26 09:40:27 UTC
Hi Alex,

I tested the default base image 'quay.io/openshift/origin-operator-registry:latest' as well as the 4.5 and 4.4 images, but none of them currently works. Labeling this bug as a TestBlocker.

./oc adm catalog build --appregistry-org redhat-operators  --to=quay.io/olmqe/local-redhat-operators:bug-1838473-3

./oc adm catalog build --appregistry-org redhat-operators --from=quay.io/openshift/origin-operator-registry:4.5 --to=quay.io/olmqe/local-redhat-operators:bug-1838473

./oc adm catalog build --appregistry-org redhat-operators --from=quay.io/openshift/origin-operator-registry:4.4 --to=quay.io/olmqe/local-redhat-operators:bug-1838473-2

[root@preserve-olm-env data]# oc get catalogsource
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
certified-operators   Certified Operators   grpc   Red Hat     47m
community-operators   Community Operators   grpc   Red Hat     47m
olm-operators         OLM Operators         grpc   OLM         15m
olm-operators2        OLM2 Operators        grpc   OLM2        12m
olm-operators3        OLM3 Operators        grpc   OLM3        3m26s
...

[root@preserve-olm-env data]# oc get pods
NAME                                   READY   STATUS    RESTARTS   AGE
certified-operators-65f8dcf6fc-jw7nc   1/1     Running   0          47m
community-operators-64f4d86955-mmfm6   1/1     Running   0          47m
marketplace-operator-8688cfc9-bx4j7    1/1     Running   0          47m
olm-operators-n6tcb                    1/1     Running   0          15m
olm-operators2-bgrl5                   1/1     Running   0          12m
olm-operators3-88bcs                   1/1     Running   0          3m18s
...

[root@preserve-olm-env data]# oc get packagemanifest|grep OLM
[root@preserve-olm-env data]# 

However, the CatalogSource pod itself seems to serve packages correctly, as follows:
[root@preserve-olm-env data]# oc port-forward olm-operators-n6tcb 50051 &
[1] 16483
[root@preserve-olm-env data]# Forwarding from 127.0.0.1:50051 -> 50051
Forwarding from [::1]:50051 -> 50051

[root@preserve-olm-env data]# grpcurl -plaintext  localhost:50051 api.Registry/ListPackages
Handling connection for 50051
{
  "name": "3scale-operator"
}
{
  "name": "advanced-cluster-management"
}
{
  "name": "amq-broker"
}
{
  "name": "amq-broker-lts"
}
{
  "name": "amq-online"
}
{
  "name": "amq-streams"
}
...


[root@preserve-olm-env data]# ./oc version
Client Version: 4.5.0-0.nightly-2020-05-26-063751
Server Version: 4.5.0-0.nightly-2020-05-26-063751
Kubernetes Version: v1.18.2

[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-6648dc47b6-x4dw7 -- olm --version
OLM version: 0.15.1
git commit: 1849f658a5c703a1c15bf4467df7eb928d321b18
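
To make the mismatch in this comment concrete (the registry pod serves packages over gRPC, yet `oc get packagemanifest` shows none), the two lists can be diffed. This is a minimal sketch, assuming the grpcurl output and the package names known to OLM have each been saved to a file. The function names and the file layout are hypothetical.

```shell
# Hypothetical helpers for comparing what the registry pod serves with what OLM lists.

# Read `grpcurl ... api.Registry/ListPackages` output on stdin and print
# the package names, one per line, sorted.
registry_package_names() {
  sed -n 's/^[[:space:]]*"name":[[:space:]]*"\([^"]*\)".*/\1/p' | LC_ALL=C sort
}

# $1 = file holding the grpcurl output
# $2 = file holding the package names known to OLM, one per line
# Prints the packages served by the registry that OLM does not list.
missing_from_olm() {
  local served listed
  served=$(mktemp) || return 1
  listed=$(mktemp) || { rm -f "$served"; return 1; }
  registry_package_names < "$1" > "$served"
  LC_ALL=C sort "$2" > "$listed"
  # comm -23 keeps lines that appear only in the first (served) file.
  LC_ALL=C comm -23 "$served" "$listed"
  rm -f "$served" "$listed"
}
```

In the situation described here, every name returned by grpcurl would be printed, since OLM lists none of them.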

Comment 7 Jian Zhang 2020-05-26 10:05:51 UTC
Changing the Priority to Urgent since there is no workaround for now. Logs:
[root@preserve-olm-env data]# ./oc logs olm-operators3-88bcs
time="2020-05-26T09:30:32Z" level=warning msg="couldn't migrate db" database=/bundles.db error="attempt to write a readonly database" port=50051
time="2020-05-26T09:30:32Z" level=info msg="serving registry" database=/bundles.db port=50051
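
The warning in this log is easy to miss because the pod still reports `serving registry` and stays `Running`. As a minimal sketch, the error text can be pulled out of a pod log with a small filter. The function name `migration_error` is hypothetical, and the pattern assumes only the log format shown above.

```shell
# Hypothetical helper: read registry pod logs on stdin and print the error
# text of any "couldn't migrate db" warning. Prints nothing if none is found.
migration_error() {
  sed -n "s/.*msg=\"couldn't migrate db\".* error=\"\([^\"]*\)\".*/\1/p"
}
```

For example, `oc logs olm-operators3-88bcs | migration_error` would print `attempt to write a readonly database` for the log above.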

Comment 8 Evan Cordell 2020-05-28 11:53:30 UTC
*** Bug 1831698 has been marked as a duplicate of this bug. ***

Comment 13 Jian Zhang 2020-06-02 11:08:49 UTC
Adding the detailed oc version info:

[root@preserve-olm-env hui]# ./oc version -o yaml
clientVersion:
  buildDate: "2020-05-29T14:24:36Z"
  compiler: gc
  gitCommit: 9933eb90790b36d153fcc55f8404724bb0929b96
  gitTreeState: clean
  gitVersion: 4.5.0-202005291417-9933eb9
  goVersion: go1.13.4
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.5.0-0.nightly-2020-05-30-025738
serverVersion:
  buildDate: "2020-05-30T00:35:39Z"
  compiler: gc
  gitCommit: 224c8a2
  gitTreeState: clean
  gitVersion: v1.18.3+224c8a2
  goVersion: go1.13.4
  major: "1"
  minor: 18+
  platform: linux/amd64

Comment 18 yhui 2020-06-09 09:08:12 UTC
Version:
[root@preserve-olm-env hui]# /data/hui/oc version
Client Version: 4.5.0-202005291417-9933eb9
Server Version: 4.5.0-0.nightly-2020-06-08-204500
Kubernetes Version: v1.18.3+a637491

[root@preserve-olm-env ~]# /data/hui/oc version -o yaml
clientVersion:
  buildDate: "2020-05-29T14:24:36Z"
  compiler: gc
  gitCommit: 9933eb90790b36d153fcc55f8404724bb0929b96
  gitTreeState: clean
  gitVersion: 4.5.0-202005291417-9933eb9
  goVersion: go1.13.4
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.5.0-0.nightly-2020-06-08-204500
serverVersion:
  buildDate: "2020-06-06T16:45:47Z"
  compiler: gc
  gitCommit: a637491
  gitTreeState: clean
  gitVersion: v1.18.3+a637491
  goVersion: go1.13.4
  major: "1"
  minor: 18+
  platform: linux/amd64

[root@preserve-olm-env hui]# /data/hui/oc -n openshift-operator-lifecycle-manager exec catalog-operator-cd67f87c4-v84nn -- olm --version
OLM version: 0.15.1
git commit: 0bcd497a01ff14faef72648b7d3131a45e41150d


Steps to test:
1. Create the CatalogSource image by using the quay.io/openshift/origin-operator-registry:4.4 image.
[root@preserve-olm-env hui]# /data/hui/oc adm catalog build --auth-token="basic eXVodWkxMjpRV0Vhc2QxMjM0NTY9PT0=" --appregistry-org redhat-operators --from=quay.io/openshift/origin-operator-registry:4.4 --to=quay.io/yuhui12/local-redhat-operators:bug-1838473-4

2. Create a CatalogSource object to consume the 4.4 image.
[root@preserve-olm-env 1827544]# cat cs2.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: olm-operators3
  namespace: openshift-marketplace
spec:
  displayName: OLM3 Operators
  image: quay.io/yuhui12/local-redhat-operators:bug-1838473-4
  publisher: QE
  sourceType: grpc
[root@preserve-olm-env 1827544]# oc create -f cs2.yaml 
[root@preserve-olm-env 1827544]# oc get catsrc -n openshift-marketplace
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
certified-operators   Certified Operators   grpc   Red Hat     42m
community-operators   Community Operators   grpc   Red Hat     43m
olm-operators3        OLM3 Operators        grpc   QE          5m47s
qe-app-registry                             grpc               24m
redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     43m
redhat-operators      Red Hat Operators     grpc   Red Hat     42m
[root@preserve-olm-env 1827544]# 
[root@preserve-olm-env 1827544]# oc get pod -n openshift-marketplace
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-6786bddbd4-msv74    1/1     Running   0          43m
community-operators-6876b858df-dxcxd    1/1     Running   0          43m
marketplace-operator-68fcdccdbb-lvjz2   1/1     Running   0          45m
olm-operators3-zzlhk                    1/1     Running   0          6m10s
qe-app-registry-567579c9bc-v4zd7        1/1     Running   0          25m
redhat-marketplace-6757bd48b5-sxdp4     1/1     Running   0          43m
redhat-operators-69b8b894c6-gk4bp       1/1     Running   0          43m

3. Check the `packagemanifest` object.
[root@preserve-olm-env 1827544]# oc get packagemanifest |grep OLM
[root@preserve-olm-env 1827544]# 

No packagemanifest is listed for this CatalogSource, which uses the quay.io/openshift/origin-operator-registry:4.4 image.

4. Query the registry with grpcurl.
[root@preserve-olm-env 1827544]# oc port-forward olm-operators3-zzlhk  50051 -n openshift-marketplace
Forwarding from 127.0.0.1:50051 -> 50051
Forwarding from [::1]:50051 -> 50051

[root@preserve-olm-env ~]# grpcurl -plaintext  localhost:50051 api.Registry/ListPackages
{
  "name": "3scale-operator"
}
{
  "name": "advanced-cluster-management"
}
{
  "name": "amq-broker"
}
{
  "name": "amq-broker-lts"
}
{
  "name": "amq-online"
}
{
  "name": "amq-streams"
}
{
  "name": "amq7-cert-manager"
}
```
The packages can be listed through grpcurl.

Comment 19 yhui 2020-06-09 09:28:18 UTC
I saw that Comment 17 said the 4.4 image does not need to be tested in this bug. Please ignore Comment 18 above.
Another question: Do we need to test 'oc adm catalog build' without the `--from` option in this bug?

Comment 24 yhui 2020-06-16 01:14:11 UTC
Version:
[root@preserve-olm-env ~]# /data/hui/oc  version
Client Version: 4.5.0-202005291417-9933eb9
Server Version: 4.5.0-0.nightly-2020-06-11-183238
Kubernetes Version: v1.18.3+91d0edd
[root@preserve-olm-env ~]# oc exec catalog-operator-696f8fb9f7-2n65g -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.15.1
git commit: 0bcd497a01ff14faef72648b7d3131a45e41150d


Steps to test:
1. Create the CatalogSource image by using the quay.io/openshift/origin-operator-registry:4.5 image.
[root@preserve-olm-env new-feature]# /data/hui/oc adm catalog build --auth-token="basic eXVodWkxMjpRV0Vhc2QxMjM0NTY9PT0=" --appregistry-org redhat-operators --from=quay.io/openshift/origin-operator-registry:4.5 --to=quay.io/yuhui12/local-redhat-operators:bug-1838473-5

2. Create a CatalogSource object to consume the 4.5 image.
[root@preserve-olm-env new-feature]# cat cs.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: olm-operators
  namespace: openshift-marketplace
spec:
  displayName: OLM Operators
  image: quay.io/yuhui12/local-redhat-operators:bug-1838473-5
  publisher: QE
  sourceType: grpc
[root@preserve-olm-env new-feature]# oc create -f cs.yaml 

[root@preserve-olm-env ~]# oc get catsrc -n openshift-marketplace
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
certified-operators   Certified Operators   grpc   Red Hat     62m
community-operators   Community Operators   grpc   Red Hat     62m
olm-operators         OLM Operators         grpc   QE          7m
qe-app-registry                             grpc               50m
qitang-operators      Red Hat Operators     grpc   Red Hat     19m
redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     62m
redhat-operators      Red Hat Operators     grpc   Red Hat     62m
[root@preserve-olm-env ~]# oc get pod -n openshift-marketplace
NAME                                   READY   STATUS    RESTARTS   AGE
certified-operators-99f874b98-zkblk    1/1     Running   0          63m
community-operators-64459687f7-zhxw9   1/1     Running   0          63m
marketplace-operator-84d8777f9-8pt72   1/1     Running   0          63m
olm-operators-7q6bg                    1/1     Running   0          7m
qe-app-registry-84cc4b89c8-nmm8h       1/1     Running   0          51m
qitang-operators-5b5fd68d7-s97d5       1/1     Running   0          19m
redhat-marketplace-7dfd44cfb7-khrtj    1/1     Running   0          63m
redhat-operators-7676678689-r2ccf      1/1     Running   0          63m

3. Check the `packagemanifest` object.
[root@preserve-olm-env ~]# oc get packagemanifest |grep OLM
eap                                          OLM Operators         7m
kiali-ossm                                   OLM Operators         7m
3scale-operator                              OLM Operators         7m
fuse-apicurito                               OLM Operators         7m
datagrid                                     OLM Operators         7m
amq-online                                   OLM Operators         7m
nfd                                          OLM Operators         7m
openshifttemplateservicebroker               OLM Operators         7m
amq7-interconnect-operator                   OLM Operators         7m
```

Only the 4.5 image is tracked in this bug; other images will be tracked in other bugs. The 4.5 image works well, so I am labeling the bug as verified.

Comment 25 errata-xmlrpc 2020-07-13 17:30:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409