Bug 1891995 - OperatorHub displaying old content
Summary: OperatorHub displaying old content
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.7.0
Assignee: Vu Dinh
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks: 1894163
 
Reported: 2020-10-27 20:14 UTC by Kevin Rizza
Modified: 2024-03-25 16:50 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Marketplace operator does not clean up stale Services during a cluster upgrade, and OLM accepts a stale Service without checking it.
Consequence: The stale Service directs traffic to an old CatalogSource pod that contains outdated content.
Fix: OLM adds a hash of the Service spec to the Service's labels and compares that hash to verify that the Service has the correct spec. If OLM deems the Service stale, it deletes and recreates it.
Result: The Service spec is accurate and directs traffic to the correct CatalogSource pod.
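
The hash comparison described in the Fix can be illustrated with a minimal sketch. This is not OLM's actual implementation (OLM is written in Go and its hash function is not shown in this bug). The label name `olm.service-spec-hash` is taken from the `oc get svc` output in this bug; the function names, the use of SHA-256 over canonical JSON, and the 10-character digest length are assumptions made only for illustration.

```python
import hashlib
import json

# Label name as it appears in the `oc get svc --show-labels` output in this bug.
HASH_LABEL = "olm.service-spec-hash"


def spec_hash(spec: dict) -> str:
    """Return a short, deterministic digest of a Service spec.

    Assumption: SHA-256 over canonical JSON, truncated to 10 hex characters.
    OLM's real hash function may differ.
    """
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:10]


def is_stale(existing: dict, desired_spec: dict) -> bool:
    """A Service is stale if its hash label is missing or does not match the desired spec."""
    labels = existing.get("metadata", {}).get("labels") or {}
    return labels.get(HASH_LABEL) != spec_hash(desired_spec)


def reconcile(existing, desired_spec: dict):
    """Return (action, service), where action is 'keep', 'create', or 'recreate'."""
    if existing is not None and not is_stale(existing, desired_spec):
        return "keep", existing
    fresh = {
        "metadata": {"labels": {HASH_LABEL: spec_hash(desired_spec)}},
        "spec": desired_spec,
    }
    return ("create" if existing is None else "recreate"), fresh
```

A Service left over from before the upgrade carries no hash label, so under this sketch it is always treated as stale and replaced, which matches the delete-and-recreate behavior described above.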
Clone Of:
Environment:
Last Closed: 2021-02-24 15:28:35 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github operator-framework operator-lifecycle-manager pull 1848 0 None closed Bug 1891995: Add spec hash to service's label to ensure service is correct 2021-02-18 21:31:42 UTC
Red Hat Knowledge Base (Solution) 5532081 0 None None None 2020-11-01 01:51:29 UTC
Red Hat Knowledge Base (Solution) 5532201 0 None None None 2020-11-01 14:30:23 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:29:09 UTC

Description Kevin Rizza 2020-10-27 20:14:51 UTC
Description of problem:

On upgrade from 4.5 to 4.6 (the first pair of versions that allowed differing default content between minor versions) the embedded OperatorHub UI is displaying old content that cannot be installed or used since it is not actually available on the latest catalog attached to the cluster.

Version-Release number of selected component (if applicable):

4.6

How reproducible:

Always

Steps to Reproduce:
1. Create a 4.5 cluster
2. Upgrade to 4.6
3. Go to the OperatorHub page

Actual results:

Operators that are not included in 4.6 are shown as available and cannot be installed

Expected results:

Only 4.6 content is displayed

Additional info:

This is likely either a cache problem in the package manifests/package server or a UI cache issue with the OperatorHub page.

Comment 2 Heiko Braun 2020-10-30 07:12:18 UTC
Is https://bugzilla.redhat.com/show_bug.cgi?id=1891993 the actual root cause of this?

Comment 3 Heiko Braun 2020-10-30 07:14:00 UTC
More importantly, is there a workaround for the OCP 4.6 operators to show up?

Comment 12 Jian Zhang 2020-11-05 02:55:34 UTC
Cluster version is 4.7.0-0.nightly-2020-11-05-010603
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-67dbcb669b-jk4ks -- olm --version
OLM version: 0.17.0
git commit: 594996a0f09040c56312fdb8c9321284529283fe

This bug targets 4.7, but because there is no `OperatorSource` object in 4.6, I cannot reproduce the issue by upgrading a cluster from 4.6 to 4.7. Instead, I checked the service selectors, and they work as expected:

[root@preserve-olm-env data]# oc get svc --show-labels -o wide
NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE   SELECTOR                                LABELS
certified-operators            ClusterIP   172.30.28.166    <none>        50051/TCP           48m   olm.catalogSource=certified-operators   olm.service-spec-hash=7b54949b85
community-operators            ClusterIP   172.30.170.216   <none>        50051/TCP           48m   olm.catalogSource=community-operators   olm.service-spec-hash=78cd454589
marketplace-operator-metrics   ClusterIP   172.30.206.80    <none>        8383/TCP,8081/TCP   56m   name=marketplace-operator               name=marketplace-operator
redhat-marketplace             ClusterIP   172.30.167.86    <none>        50051/TCP           48m   olm.catalogSource=redhat-marketplace    olm.service-spec-hash=5cdf7cd58
redhat-operators               ClusterIP   172.30.18.218    <none>        50051/TCP           48m   olm.catalogSource=redhat-operators      olm.service-spec-hash=b74c4648d

[root@preserve-olm-env data]# oc get pods --show-labels
NAME                                                              READY   STATUS      RESTARTS   AGE     LABELS
certified-operators-mxdcc                                         1/1     Running     0          46m     olm.catalogSource=certified-operators
community-operators-m8xsz                                         1/1     Running     0          46m     olm.catalogSource=community-operators
marketplace-operator-85c6b76fd6-b5d58                             1/1     Running     0          53m     name=marketplace-operator,pod-template-hash=85c6b76fd6
redhat-marketplace-wk9ht                                          1/1     Running     0          46m     olm.catalogSource=redhat-marketplace
redhat-operators-zhvmz                                            1/1     Running     0          46m     olm.catalogSource=redhat-operators

[root@preserve-olm-env data]# oc get packagemanifest|grep "Red Hat"
anzograph-operator-rhmp                              Red Hat Marketplace   35m
businessautomation-operator                          Red Hat Operators     35m
...

Operators can also be installed successfully:
[root@preserve-olm-env data]# oc get csv -A
NAMESPACE                              NAME                               DISPLAY                 VERSION   REPLACES                           PHASE
default                                etcdoperator.v0.9.4                etcd                    0.9.4     etcdoperator.v0.9.2                Succeeded
openshift-operator-lifecycle-manager   packageserver                      Package Server          0.17.0                                       Succeeded
test                                   3scale-community-operator.v0.5.1   3scale API Management   0.5.1     3scale-community-operator.v0.4.0   Succeeded

LGTM, marking it as verified.
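
The selector check in this comment can be sketched programmatically: for each Service, confirm that at least one pod carries every label in the Service's selector. The selectors and pod labels below are copied from the `oc get svc` and `oc get pods` output above; the function names are hypothetical, and the sketch works on plain dictionaries rather than querying a cluster.

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    """True if every key/value in the selector is present in the pod's labels.

    An empty selector is treated as matching nothing, since a Service
    without a selector does not select pods on its own.
    """
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def services_without_pods(services: dict, pods: dict) -> list:
    """Return the names of Services whose selector matches no pod."""
    return sorted(
        name
        for name, selector in services.items()
        if not any(selector_matches(selector, labels) for labels in pods.values())
    )


# Selectors from the `oc get svc --show-labels -o wide` output above.
SERVICES = {
    "certified-operators": {"olm.catalogSource": "certified-operators"},
    "community-operators": {"olm.catalogSource": "community-operators"},
    "marketplace-operator-metrics": {"name": "marketplace-operator"},
    "redhat-marketplace": {"olm.catalogSource": "redhat-marketplace"},
    "redhat-operators": {"olm.catalogSource": "redhat-operators"},
}

# Labels from the `oc get pods --show-labels` output above.
PODS = {
    "certified-operators-mxdcc": {"olm.catalogSource": "certified-operators"},
    "community-operators-m8xsz": {"olm.catalogSource": "community-operators"},
    "marketplace-operator-85c6b76fd6-b5d58": {
        "name": "marketplace-operator",
        "pod-template-hash": "85c6b76fd6",
    },
    "redhat-marketplace-wk9ht": {"olm.catalogSource": "redhat-marketplace"},
    "redhat-operators-zhvmz": {"olm.catalogSource": "redhat-operators"},
}

if __name__ == "__main__":
    # An empty list means every Service selects at least one pod.
    print(services_without_pods(SERVICES, PODS))
```

A Service that still selected on a label no current pod carries would show up in the returned list, which is the stale-Service symptom this bug describes.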

Comment 13 Pedro Amoedo 2020-12-01 11:54:42 UTC
Hi all, I can see this BZ is still in the VERIFIED state. However, I ran the following test and the issue is no longer present. Can someone please corroborate?

1) AWS IPI 4.5.16 from scratch with default values.

2) Changed cluster channel to "fast-4.6".

~~~
$ oc patch clusterversion/version -p '{"spec":{"channel":"fast-4.6"}}' --type=merge
~~~

3) Successfully upgraded to 4.6.4:

~~~
$ oc adm upgrade --to=4.6.4
Updating to 4.6.4

---

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.4     True        False         13m     Cluster version is 4.6.4
~~~

NOTE: The latest version is 4.6.6, but I wanted to show the earliest one in which the problem is no longer present.

4) Confirmed via OperatorHub that the new 4.6 operator channels are visible, for example, with the Logging operator.

5) Confirmed also that the legacy deployments are no longer present:

~~~
$ oc get all -n openshift-marketplace
NAME                                        READY   STATUS    RESTARTS   AGE
pod/certified-operators-pl8nc               1/1     Running   0          13m
pod/community-operators-dgldj               1/1     Running   0          16m
pod/marketplace-operator-5bbff88564-ddsr9   1/1     Running   0          11m
pod/redhat-marketplace-pkkn5                1/1     Running   0          16m
pod/redhat-operators-lzqc7                  1/1     Running   0          12m

NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/certified-operators            ClusterIP   172.30.28.166    <none>        50051/TCP           44m
service/community-operators            ClusterIP   172.30.40.160    <none>        50051/TCP           44m
service/marketplace-operator-metrics   ClusterIP   172.30.33.234    <none>        8383/TCP,8081/TCP   99m
service/redhat-marketplace             ClusterIP   172.30.163.254   <none>        50051/TCP           44m
service/redhat-operators               ClusterIP   172.30.122.0     <none>        50051/TCP           44m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/marketplace-operator   1/1     1            1           99m

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/marketplace-operator-5bbff88564   1         1         1       45m
replicaset.apps/marketplace-operator-c679bd65b    0         0         0       99m
~~~

Best Regards.

Comment 14 Pedro Amoedo 2020-12-01 11:57:41 UTC
Please disregard my previous comment. I have found the backport BZ#1894163, which corroborates the fix in OCP 4.6.4.

Thanks and regards.

Comment 17 errata-xmlrpc 2021-02-24 15:28:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

