Bug 2076323 - OLM blocks all operator installs if an openshift-marketplace catalogsource is unavailable
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Per da Silva
QA Contact: Jian Zhang
Duplicates: 2048197 2082676
Reported: 2022-04-18 17:56 UTC by Naveen Malik
Modified: 2022-11-27 23:59 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A catalog source in the openshift-marketplace namespace is in a bad state.
Consequence: All subscriptions are blocked.
Fix: Setting the annotation olm.operatorframework.io/exclude-global-namespace-resolution to "true" on an OperatorGroup excludes global catalog sources from resolution in that namespace.
Result: If there is a bad catalog source in the global namespace (openshift-marketplace), the user can subscribe to an operator from a good catalog source in their own namespace with the OG annotation. A subscription that points to a good catalog source in the global namespace is still blocked. If there is a bad catalog source in the local namespace (the user's namespace), the user cannot subscribe to any operator in that namespace, regardless of whether the subscription points to a good catalog source in the local or the global namespace.
Last Closed: 2022-08-10 11:07:40 UTC



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift operator-framework-olm pull 320 | 0 | None | open | Bug 2076323: Disable global catalogs from resolution | 2022-06-21 15:20:53 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:08:02 UTC

Description Naveen Malik 2022-04-18 17:56:51 UTC
Description of problem:
If any catalogsource in openshift-marketplace is in a bad state, all operator installation via OLM is blocked.


Version-Release number of selected component (if applicable):
4.10 and 4.8 OSD clusters


How reproducible:
100%


Steps to Reproduce:
1. break one of the catalogsources in openshift-marketplace
2. create custom catalogsource in new namespace
3. create subscription with `source` and `sourceNamespace` set for the custom catalogsource

Actual results:
Operator installation does not happen.

Expected results:
Operator is installed.

Additional info:
Was seen on ROSA/OSD over the weekend of April 14/17, when the image signing component of registry.redhat.io experienced a few hours of outage. All ROSA/OSD cluster installs were impacted. The workaround was to delete or disable the broken catalogsources in openshift-marketplace to allow other operators to install. Note that there are no dependencies between operators, all custom catalogsources are outside openshift-marketplace, and all subscriptions have source and sourceNamespace set.

Comment 5 Per da Silva 2022-04-27 19:04:03 UTC
Setting this as not a blocker, since it's working as designed. However, we should still aim to improve the UX for this use-case.

Comment 10 Per da Silva 2022-05-27 08:46:33 UTC
Summary of the path forward:

From the OLM side we feel strongly about the promise we make users about the determinism of the resolver. This is why we fail resolution when a catalog source cannot be reached. Rolling this back could lead to confusion for admins and a large blast radius for problems.

To mitigate the issue above, we suggest adding a mechanism that allows certain namespaces to opt out of using the global catalogs during resolution. This should make it easier for non-CVO managed namespaces to rely solely on the catalog sources they provide.

Use-cases:

1. Self-management of operators through local catalog sources

In this case, the admin provides all operators in locally namespaced catalog sources. Resolution will be robust to global catalog source failures by ignoring them entirely. Local catalog source errors will still be surfaced and affect resolution.

2. Self-managed + Global catalog sources

In this case, if you depend on global catalog sources and there's an issue with them, resolution will fail. This guards against non-deterministic resolution, and guarantees to admins that the intended operator will be used independently of the underlying network conditions.

Back-portability:

Since we don't backport API changes, we propose the following compromise:

For OCP versions <= 4.10: the admin can add an annotation (olm.operatorframework.io/exclude-global-catalog-resolution) to the namespace operator group.
For OCP versions >= 4.11: the OperatorGroup API will include a toggle excludeGlobalCatalogResolution = true | false

P.S. I need to double check the versions. It may well be that in 4.11 we only use the annotation as well and push the OG changes to 4.12.
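
The proposal above can be sketched as a toy model. This is not OLM's resolver code, only an illustration of the two rules described in this comment: a namespace that opts out ignores the global catalogs, and any unreachable catalog that does take part fails resolution. The names Catalog, ResolutionError and catalogs_for_resolution are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Catalog:
    name: str
    namespace: str
    healthy: bool


class ResolutionError(Exception):
    """Raised when a catalog that takes part in resolution cannot be reached."""


def catalogs_for_resolution(local, global_catalogs, exclude_global):
    """Return the catalogs consulted for one namespace (hypothetical model)."""
    considered = list(local)
    if not exclude_global:
        # Default behavior: global catalogs always take part in resolution.
        considered.extend(global_catalogs)
    # Fail closed: one unreachable catalog in the considered set blocks
    # resolution, which keeps the outcome deterministic.
    for catalog in considered:
        if not catalog.healthy:
            raise ResolutionError(
                f"cannot reach {catalog.name}/{catalog.namespace}"
            )
    return considered


if __name__ == "__main__":
    global_catalogs = [
        Catalog("community-operators", "openshift-marketplace", True),
        Catalog("qe-app-registry", "openshift-marketplace", False),
    ]
    local = [Catalog("community-operators", "test", True)]
    for exclude in (False, True):
        try:
            used = catalogs_for_resolution(local, global_catalogs, exclude)
            print(exclude, [f"{c.name}/{c.namespace}" for c in used])
        except ResolutionError as err:
            print(exclude, "blocked:", err)
```

In this model a broken local catalog still blocks the namespace even when it opts out, which matches use-case 1 above.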

Comment 11 Daniel Sover 2022-06-01 20:50:35 UTC
Moving to assigned as this is currently in-progress.

Comment 12 tflannag 2022-06-07 20:20:37 UTC
*** Bug 2048197 has been marked as a duplicate of this bug. ***

Comment 13 Daniel Sover 2022-06-16 21:18:00 UTC
Upstream PR has merged: https://github.com/operator-framework/operator-lifecycle-manager/pull/2788

This should get pulled in during the next downstream sync. 

The operatorgroup annotation key is olm.operatorframework.io/exclude-global-namespace-resolution and setting the value to "true" will cause resolution to exclude global catalogs in that namespace.
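
As a minimal sketch of what an annotated OperatorGroup could look like, the snippet below builds the manifest as a Python dict and prints it as JSON. The annotation key and value come from this comment; the apiVersion and kind match the OperatorGroup manifests shown in later comments. The helper name operator_group and the example name/namespace ("default-og", "test") are placeholders chosen for illustration.

```python
import json

# Annotation key as stated in this comment; the value must be the string "true".
EXCLUDE_GLOBAL = "olm.operatorframework.io/exclude-global-namespace-resolution"


def operator_group(name, namespace):
    """Build an OperatorGroup manifest that opts its namespace out of global catalogs."""
    return {
        "apiVersion": "operators.coreos.com/v1",
        "kind": "OperatorGroup",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {EXCLUDE_GLOBAL: "true"},
        },
        "spec": {"targetNamespaces": [namespace]},
    }


if __name__ == "__main__":
    # Placeholder name and namespace; adjust for the target cluster.
    print(json.dumps(operator_group("default-og", "test"), indent=2))
```

The printed JSON can be saved to a file and created with oc create -f, in the same way as the YAML manifests elsewhere in this report.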

Comment 14 Jian Zhang 2022-06-22 10:21:25 UTC
1, Build a cluster that contains the fixed PR via cluster-bot.
mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.ci.test-2022-06-22-084806-ci-ln-mmpd1lk-latest   True        False         12m     Cluster version is 4.11.0-0.ci.test-2022-06-22-084806-ci-ln-mmpd1lk-latest

2, Install a bad CatalogSource in the openshift-marketplace project.

mac:~ jianzhang$ oc create -f cs-qe.yaml 
catalogsource.operators.coreos.com/qe-app-registry created
mac:~ jianzhang$ cat ~/cs-qe.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: qe-app-registry
  namespace: openshift-marketplace
spec:
  displayName: Production Operators
  image: quay.io/openshift-qe-optional-operators/ocp4-index:latest
  publisher: OpenShift QE
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 15m

mac:~ jianzhang$ oc get catalogsource
NAME                  DISPLAY                TYPE   PUBLISHER      AGE
certified-operators   Certified Operators    grpc   Red Hat        32m
community-operators   Community Operators    grpc   Red Hat        32m
qe-app-registry       Production Operators   grpc   OpenShift QE   61s
redhat-marketplace    Red Hat Marketplace    grpc   Red Hat        32m
redhat-operators      Red Hat Operators      grpc   Red Hat        32m
mac:~ jianzhang$ oc get pods
NAME                                    READY   STATUS         RESTARTS      AGE
certified-operators-fpbtc               1/1     Running        0             32m
community-operators-8d6fw               1/1     Running        0             32m
marketplace-operator-5d5cc746d4-skxjn   1/1     Running        1 (26m ago)   35m
qe-app-registry-9mfdg                   0/1     ErrImagePull   0             65s
redhat-marketplace-5wnzx                1/1     Running        0             32m
redhat-operators-k82bt                  1/1     Running        0             32m

3, Subscribe to the etcd operator (from community-operators) in the default project.

mac:~ jianzhang$ oc get sub -A
NAMESPACE   NAME   PACKAGE   SOURCE                CHANNEL
default     etcd   etcd      community-operators   singlenamespace-alpha

It is still blocked.
mac:~ jianzhang$  oc get sub -n default etcd  -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
...
  conditions:
  - lastTransitionTime: "2022-06-22T09:39:42Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'failed to populate resolver cache from source qe-app-registry/openshift-marketplace:
      failed to list bundles: rpc error: code = Unavailable desc = connection error:
      desc = "transport: Error while dialing dial tcp 172.30.33.41:50051: i/o timeout"'
    reason: ErrorPreventedResolution
    status: "True"
    type: ResolutionFailed

3-1, add olm.operatorframework.io/exclude-global-namespace-resolution: "true" to the OperatorGroup.
mac:~ jianzhang$ oc get og default-bk9zf -o yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  annotations:
    olm.operatorframework.io/exclude-global-namespace-resolution: "true"
    olm.providedAPIs: ""
  creationTimestamp: "2022-06-22T09:38:55Z"
  generateName: default-
  generation: 1
  name: default-bk9zf
  namespace: default
  resourceVersion: "47944"
  uid: 8dad82ef-faa8-4963-94f3-2b6ffd768a15
spec:
  targetNamespaces:
  - default
  upgradeStrategy: Default
status:
  lastUpdated: "2022-06-22T09:38:55Z"
  namespaces:
  - default

Nothing changed.
mac:~ jianzhang$ oc get ip
No resources found in default namespace.
mac:~ jianzhang$ oc get csv
No resources found in default namespace.

3-2, Resubscribe. This produces a different error: "constraints not satisfiable".

mac:~ jianzhang$ oc get sub
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha
mac:~ jianzhang$ oc get ip
mac:~ jianzhang$ oc get sub etcd -o yaml
...
...
  conditions:
  - lastTransitionTime: "2022-06-22T10:14:05Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'constraints not satisfiable: no operators found from catalog community-operators
      in namespace openshift-marketplace referenced by subscription etcd, subscription
      etcd exists'
    reason: ConstraintsNotSatisfiable
    status: "True"
    type: ResolutionFailed
  lastUpdated: "2022-06-22T10:14:05Z"

PS: even if this step worked, I would still have a concern:
1) The OperatorGroup is created automatically when subscribing through the web console. So how does the user add the annotation? Must they create the OperatorGroup before subscribing?


4, Remove the bad CatalogSource from the openshift-marketplace project, and install it in another project.

mac:~ jianzhang$ oc delete catalogsource qe-app-registry
catalogsource.operators.coreos.com "qe-app-registry" deleted

mac:~ jianzhang$ oc create -f cs-qe.yaml 
catalogsource.operators.coreos.com/qe-app-registry created
mac:~ jianzhang$ 
mac:~ jianzhang$ cat cs-qe.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: qe-app-registry
  namespace: jian
spec:
  displayName: Production Operators
  image: quay.io/openshift-qe-optional-operators/ocp4-index:latest
  publisher: OpenShift QE
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 15m

5, Subscribe to the etcd operator (from community-operators) in the default project.

mac:~ jianzhang$ oc get catalogsource -n jian
NAME              DISPLAY                TYPE   PUBLISHER      AGE
qe-app-registry   Production Operators   grpc   OpenShift QE   3m15s
mac:~ jianzhang$ oc get pods -n jian
NAME                    READY   STATUS         RESTARTS   AGE
qe-app-registry-5c7c4   0/1     ErrImagePull   0          3m21s

mac:~ jianzhang$ oc get sub -n default
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha
mac:~ jianzhang$ oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-hnh8h   etcdoperator.v0.9.4   Automatic   true
mac:~ jianzhang$ oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Installing
mac:~ jianzhang$ oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded

6, Subscribe to the etcd operator (from community-operators) in the "jian" project, where the bad CatalogSource is running.
mac:~ jianzhang$ oc get sub -n jian
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha

  - message: 'failed to populate resolver cache from source qe-app-registry/jian:
      failed to list bundles: rpc error: code = Unavailable desc = connection error:
      desc = "transport: Error while dialing dial tcp 172.30.8.68:50051: i/o timeout"'
    reason: ErrorPreventedResolution
    status: "True"


Changing the status to ASSIGNED.

Comment 15 Daniel Sover 2022-06-22 13:49:41 UTC
QE did not verify the behavior that this PR addresses; the failure Jian saw is unrelated. Per spoke to Jian on Slack, and a correct QE test should follow shortly. Moving back to POST.

Comment 16 Jian Zhang 2022-06-23 07:07:02 UTC
Below is the explanation from Per:
If there is a bad catalog source in the global namespace (openshift-marketplace), it blocks subscription resolution across the whole cluster. This doesn't change. Even if you have a custom catalog source in your own namespace and a subscription pointing to it, it will not resolve. Adding the OG annotation tells the resolver to consider only local catalog sources during resolution for the OG's namespace. So, if you have a local catalog source and a subscription pointing to it, it will resolve once the annotation is added to the OG.

Testing:

>> Test scenario: a bad catalog source in the global namespace and a good catalog source in the user's namespace. Subscribe to an operator from the good catalog source in the local namespace. It works with the OG annotation.

mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.ci.test-2022-06-23-025214-ci-ln-039sl4k-latest   True        False         2m11s   Cluster version is 4.11.0-0.ci.test-2022-06-23-025214-ci-ln-039sl4k-latest

1, Create a bad catalog source in the global namespace.
mac:bug2076323 jianzhang$ oc get pods -n openshift-marketplace
NAME                                                              READY   STATUS             RESTARTS   AGE
certified-operators-zghnf                                         1/1     Running            0          75m
community-operators-zwtvp                                         1/1     Running            0          75m
e8c9651078ae45ddb2807e3a07727d459b82d7def5572a7b7ccaae332b6klgx   0/1     Completed          0          51m
marketplace-operator-5b56956987-l7bhb                             1/1     Running            0          79m
qe-app-registry-fr5t2                                             0/1     ImagePullBackOff   0          77s
redhat-marketplace-dtdg9                                          1/1     Running            0          75m
redhat-operators-b5w2q                                            1/1     Running            0          75m

2, Create a good catalog source in a project called "test".
mac:bug2076323 jianzhang$ oc get catalogsource -n test
NAME                  DISPLAY   TYPE   PUBLISHER   AGE
community-operators             grpc   Red Hat     15m
mac:bug2076323 jianzhang$ oc get pods -n test
NAME                                                              READY   STATUS      RESTARTS   AGE
community-operators-692x8                                         1/1     Running     0          15m

3, Create an OG without the annotation.

mac:bug2076323 jianzhang$ oc create -f ~/og.yaml 
operatorgroup.operators.coreos.com/default-og created
mac:bug2076323 jianzhang$ cat ~/og.yaml 
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: default-og
  namespace: test
spec:
  targetNamespaces:
  - test

4, subscribe to an operator from the good one.
mac:bug2076323 jianzhang$ cat ~/sub-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: test
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: test
  startingCSV: etcdoperator.v0.9.4

mac:bug2076323 jianzhang$ oc create -f ~/sub-etcd.yaml 
subscription.operators.coreos.com/etcd created
mac:bug2076323 jianzhang$ oc get sub
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha

mac:bug2076323 jianzhang$ oc get sub etcd -o yaml
...
  conditions:
  - lastTransitionTime: "2022-06-23T04:24:25Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'failed to populate resolver cache from source qe-app-registry/openshift-marketplace:
      failed to list bundles: rpc error: code = Unavailable desc = connection error:
      desc = "transport: Error while dialing dial tcp 172.30.254.241:50051: i/o timeout"'
    reason: ErrorPreventedResolution
    status: "True"
    type: ResolutionFailed

5, Update the OG to add the annotation.

mac:bug2076323 jianzhang$ oc edit og default-og
operatorgroup.operators.coreos.com/default-og edited
mac:bug2076323 jianzhang$ oc get og default-og -o yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  annotations:
    olm.operatorframework.io/exclude-global-namespace-resolution: "true"
    olm.providedAPIs: EtcdBackup.v1beta2.etcd.database.coreos.com,EtcdCluster.v1beta2.etcd.database.coreos.com,EtcdRestore.v1beta2.etcd.database.coreos.com
  creationTimestamp: "2022-06-23T04:23:14Z"
  generation: 1
  name: default-og
  namespace: test
  resourceVersion: "51863"
  uid: b78d9b49-a8b7-41e8-a705-e4ba70e3b687
spec:
  targetNamespaces:
  - test
  upgradeStrategy: Default
status:
  lastUpdated: "2022-06-23T04:23:14Z"
  namespaces:
  - test

mac:bug2076323 jianzhang$ oc get sub
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha
mac:bug2076323 jianzhang$ oc get ip
NAME            CSV                   APPROVAL    APPROVED
install-hn4tt   etcdoperator.v0.9.4   Automatic   true
mac:bug2076323 jianzhang$ oc get csv
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded

The subscription succeeded. Looks good.

>> Test scenario: a bad catalog source in the global namespace and a good catalog source in the user's namespace. Subscribe to an operator from a good catalog source in the global namespace. It failed.

6, Subscribe to an operator from a good catalog source running in the global namespace. It failed, but with a different error.

mac:bug2076323 jianzhang$ cat ~/sub-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: test
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4

mac:bug2076323 jianzhang$ oc get sub
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha
mac:bug2076323 jianzhang$ oc get ip
No resources found in test namespace.
mac:bug2076323 jianzhang$ oc get sub etcd -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2022-06-23T04:32:39Z"
  generation: 1
  labels:
    operators.coreos.com/etcd.test: ""
  name: etcd
  namespace: test
  resourceVersion: "54460"
  uid: a1d3c005-bcb1-4716-8863-a6b36b4006f5
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4
...
  conditions:
  - lastTransitionTime: "2022-06-23T04:32:39Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'constraints not satisfiable: no operators found from catalog community-operators
      in namespace openshift-marketplace referenced by subscription etcd, subscription
      etcd exists'
    reason: ConstraintsNotSatisfiable
    status: "True"
    type: ResolutionFailed
  lastUpdated: "2022-06-23T04:32:39Z"
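
The failures in this comment are distinguished only by the reason on the ResolutionFailed condition (ErrorPreventedResolution versus ConstraintsNotSatisfiable). A small helper like the one below could pick that out of a Subscription's status.conditions, for example after loading the output of oc get sub etcd -o json. The function name resolution_failure is hypothetical, and the sample data is abbreviated from the output above.

```python
def resolution_failure(conditions):
    """Return (reason, message) of an active ResolutionFailed condition, or None."""
    for cond in conditions:
        # Condition status values are strings ("True"/"False"), not booleans.
        if cond.get("type") == "ResolutionFailed" and cond.get("status") == "True":
            return cond.get("reason"), cond.get("message")
    return None


if __name__ == "__main__":
    # Abbreviated from the Subscription status shown above.
    conditions = [
        {
            "type": "CatalogSourcesUnhealthy",
            "status": "False",
            "reason": "AllCatalogSourcesHealthy",
            "message": "all available catalogsources are healthy",
        },
        {
            "type": "ResolutionFailed",
            "status": "True",
            "reason": "ConstraintsNotSatisfiable",
            "message": "constraints not satisfiable: no operators found from "
                       "catalog community-operators in namespace "
                       "openshift-marketplace referenced by subscription etcd, "
                       "subscription etcd exists",
        },
    ]
    print(resolution_failure(conditions))
```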


>> Test scenario: a bad and a good catalog source in the local namespace. Subscribe to an operator from the good one in the local namespace. It failed.

mac:bug2076323 jianzhang$ oc get catalogsource
NAME                  DISPLAY                TYPE   PUBLISHER      AGE
community-operators                          grpc   Red Hat        11s
qe-app-registry       Production Operators   grpc   OpenShift QE   42m

mac:bug2076323 jianzhang$ oc get pods
NAME                        READY   STATUS             RESTARTS   AGE
community-operators-692x8   1/1     Running            0          35s
qe-app-registry-jwkkp       0/1     ImagePullBackOff   0          43m
qe-app-registry-vrwzq       0/1     ImagePullBackOff   0          27m

1, Create a new project called test, and create an OG with the annotation.
mac:bug2076323 jianzhang$ oc get og -o yaml
apiVersion: v1
items:
- apiVersion: operators.coreos.com/v1
  kind: OperatorGroup
  metadata:
    annotations:
      olm.operatorframework.io/exclude-global-namespace-resolution: "true"
    creationTimestamp: "2022-06-23T03:36:46Z"
    generation: 1
    name: default-og
    namespace: test
    resourceVersion: "33658"
    uid: 3627d484-36d8-4501-97d7-62aa653ef5c9
  spec:
    targetNamespaces:
    - test
    upgradeStrategy: Default
  status:
    lastUpdated: "2022-06-23T03:36:46Z"
    namespaces:
    - test
kind: List
metadata:
  resourceVersion: ""

2, subscribe to the etcd operator from the good catalog source.
mac:bug2076323 jianzhang$ cat ~/sub-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: test
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: test
  startingCSV: etcdoperator.v0.9.4

mac:bug2076323 jianzhang$ oc get sub etcd -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
...
  conditions:
  - lastTransitionTime: "2022-06-23T04:09:04Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'failed to populate resolver cache from source qe-app-registry/test:
      failed to list bundles: rpc error: code = Unavailable desc = connection error:
      desc = "transport: Error while dialing dial tcp 172.30.157.83:50051: i/o timeout"'
    reason: ErrorPreventedResolution
    status: "True"
    type: ResolutionFailed
  lastUpdated: "2022-06-23T04:09:28Z"


>> Test scenario: a bad catalog source in the local namespace. Subscribe to an operator from a good catalog source in the global namespace. It failed.

3, Remove the good catalog source and keep only the bad catalog source in the namespace.
mac:~ jianzhang$ oc get catalogsource -n test
NAME              DISPLAY                TYPE   PUBLISHER      AGE
qe-app-registry   Production Operators   grpc   OpenShift QE   23s
mac:~ jianzhang$ oc get pods -n test
NAME                    READY   STATUS         RESTARTS   AGE
qe-app-registry-jwkkp   0/1     ErrImagePull   0          30s

4, Subscribe to the etcd operator from the community-operators catalog source that is running in the global namespace.

mac:~ jianzhang$ cat sub-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: test
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4

mac:~ jianzhang$ oc get sub 
NAME   PACKAGE   SOURCE                CHANNEL
etcd   etcd      community-operators   singlenamespace-alpha
mac:~ jianzhang$ oc get ip
No resources found in test namespace.

mac:~ jianzhang$ oc get sub etcd -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
...
  conditions:
  - lastTransitionTime: "2022-06-23T03:37:49Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'failed to populate resolver cache from source qe-app-registry/test:
      failed to list bundles: rpc error: code = DeadlineExceeded desc = context deadline
      exceeded'
    reason: ErrorPreventedResolution
    status: "True"
    type: ResolutionFailed
  lastUpdated: "2022-06-23T03:40:55Z"

  conditions:
  - lastTransitionTime: "2022-06-23T03:37:49Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  - message: 'failed to populate resolver cache from source qe-app-registry/test:
      failed to list bundles: rpc error: code = Unavailable desc = connection error:
      desc = "transport: Error while dialing dial tcp 172.30.157.83:50051: i/o timeout"'
    reason: ErrorPreventedResolution
    status: "True"
    type: ResolutionFailed
  lastUpdated: "2022-06-23T03:41:41Z"


So this PR fixes only the first scenario:
If there is a bad catalog source in the global namespace (openshift-marketplace), the user can subscribe to an operator from a good catalog source in their own namespace with the OG annotation. A subscription that points to a good catalog source in the global namespace is still blocked.
If there is a bad catalog source in the local namespace (the user's namespace), the user cannot subscribe to any operator in that namespace, regardless of whether the subscription points to a good catalog source in the local or the global namespace.
Correct me if I'm wrong, thanks!

Including the documentation team here: it would be good to document this point in the 4.11 release notes.

Comment 20 Alexander Greene 2022-07-01 19:30:10 UTC
*** Bug 2048197 has been marked as a duplicate of this bug. ***

Comment 21 Per da Silva 2022-07-07 10:01:57 UTC
*** Bug 2082676 has been marked as a duplicate of this bug. ***

Comment 22 errata-xmlrpc 2022-08-10 11:07:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

