Bug 1891995
| Summary: | OperatorHub displaying old content | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Kevin Rizza <krizza> |
| Component: | OLM | Assignee: | Vu Dinh <vdinh> |
| OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | urgent | CC: | anbhatta, dkulkarn, dmoessne, hbraun, mas-hatada, mfuruta, pamoedom, rh-container, vdinh, ychoukse |
| Version: | 4.6 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.7.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text:

- Cause: The Marketplace operator does not clean up the stale Service during a cluster upgrade, and OLM accepts the stale Service without checking it.
- Consequence: The stale Service directs traffic to an old CatalogSource pod that serves outdated content.
- Fix: OLM now adds spec hash information to each Service and verifies that the Service has the correct spec by comparing the hash. If OLM deems the Service stale, it deletes and recreates it.
- Result: The Service spec is accurate and directs traffic to the correct CatalogSource pod.
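The hash-comparison approach described in the fix can be sketched as follows. This is a minimal, hypothetical illustration of hash-based staleness detection, not OLM's actual implementation: the `olm.service-spec-hash` label name appears in the `oc get svc` output later in this report, but `serviceSpec`, `hashSpec`, and `needsRecreate` are invented stand-ins for the real types and reconciliation logic.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// serviceSpec is a simplified stand-in for corev1.ServiceSpec.
type serviceSpec struct {
	Selector map[string]string
	Port     int32
}

// specHashLabel is the label OLM uses to record the Service spec hash,
// as seen on the catalog Services in this report.
const specHashLabel = "olm.service-spec-hash"

// hashSpec returns a short hex hash of the spec; a stand-in for the
// deep hash OLM computes over the full ServiceSpec.
func hashSpec(s serviceSpec) string {
	h := fnv.New32a()
	fmt.Fprintf(h, "%v", s) // map keys print in sorted order, so this is deterministic
	return fmt.Sprintf("%x", h.Sum32())
}

// needsRecreate reports whether the hash recorded on the existing
// Service differs from the hash of the desired spec. A mismatch means
// the Service is stale and should be deleted and recreated.
func needsRecreate(existingLabels map[string]string, desired serviceSpec) bool {
	return existingLabels[specHashLabel] != hashSpec(desired)
}

func main() {
	desired := serviceSpec{
		Selector: map[string]string{"olm.catalogSource": "redhat-operators"},
		Port:     50051,
	}
	// A Service left over from before the upgrade carries an outdated hash.
	staleLabels := map[string]string{specHashLabel: "stale"}
	fmt.Println(needsRecreate(staleLabels, desired)) // → true: delete and recreate
}
```

Without the recorded hash, OLM had no cheap way to tell that the stale Service's selector pointed at an old CatalogSource pod; comparing a single label value avoids a field-by-field spec diff on every sync.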
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-02-24 15:28:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1894163 | | |
Description
Kevin Rizza
2020-10-27 20:14:51 UTC
Is https://bugzilla.redhat.com/show_bug.cgi?id=1891993 the actual root cause of this? More importantly, is there a workaround for the OCP 4.6 operators to show up?

Comment

Cluster version is 4.7.0-0.nightly-2020-11-05-010603.

~~~
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-67dbcb669b-jk4ks -- olm --version
OLM version: 0.17.0
git commit: 594996a0f09040c56312fdb8c9321284529283fe
~~~

Since this bug is for 4.7, but there is no `OperatorSource` object in 4.6, I cannot reproduce this issue by upgrading the cluster from 4.6 to 4.7. So, I checked the service selector; it works as expected. As follows:

~~~
[root@preserve-olm-env data]# oc get svc --show-labels -o wide
NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE   SELECTOR                                LABELS
certified-operators            ClusterIP   172.30.28.166    <none>        50051/TCP           48m   olm.catalogSource=certified-operators   olm.service-spec-hash=7b54949b85
community-operators            ClusterIP   172.30.170.216   <none>        50051/TCP           48m   olm.catalogSource=community-operators   olm.service-spec-hash=78cd454589
marketplace-operator-metrics   ClusterIP   172.30.206.80    <none>        8383/TCP,8081/TCP   56m   name=marketplace-operator               name=marketplace-operator
redhat-marketplace             ClusterIP   172.30.167.86    <none>        50051/TCP           48m   olm.catalogSource=redhat-marketplace    olm.service-spec-hash=5cdf7cd58
redhat-operators               ClusterIP   172.30.18.218    <none>        50051/TCP           48m   olm.catalogSource=redhat-operators      olm.service-spec-hash=b74c4648d

[root@preserve-olm-env data]# oc get pods --show-labels
NAME                                    READY   STATUS    RESTARTS   AGE   LABELS
certified-operators-mxdcc               1/1     Running   0          46m   olm.catalogSource=certified-operators
community-operators-m8xsz               1/1     Running   0          46m   olm.catalogSource=community-operators
marketplace-operator-85c6b76fd6-b5d58   1/1     Running   0          53m   name=marketplace-operator,pod-template-hash=85c6b76fd6
redhat-marketplace-wk9ht                1/1     Running   0          46m   olm.catalogSource=redhat-marketplace
redhat-operators-zhvmz                  1/1     Running   0          46m   olm.catalogSource=redhat-operators

[root@preserve-olm-env data]# oc get packagemanifest | grep "Red Hat"
anzograph-operator-rhmp       Red Hat Marketplace   35m
businessautomation-operator   Red Hat Operators     35m
...
~~~

And operators can be installed well:

~~~
[root@preserve-olm-env data]# oc get csv -A
NAMESPACE                              NAME                               DISPLAY                 VERSION   REPLACES                           PHASE
default                                etcdoperator.v0.9.4                etcd                    0.9.4     etcdoperator.v0.9.2                Succeeded
openshift-operator-lifecycle-manager   packageserver                      Package Server          0.17.0                                       Succeeded
test                                   3scale-community-operator.v0.5.1   3scale API Management   0.5.1     3scale-community-operator.v0.4.0   Succeeded
~~~

LGTM, verify it.

Comment

Hi all, I can see this BZ is still in VERIFIED state; however, I have made the following test and the issue is no longer present. Can someone please corroborate?

1) AWS IPI 4.5.16 from scratch with default values.

2) Changed the cluster channel to "fast-4.6":

~~~
$ oc patch clusterversion/version -p '{"spec":{"channel":"fast-4.6"}}' --type=merge
~~~

3) Successfully upgraded to 4.6.4:

~~~
$ oc adm upgrade --to=4.6.4
Updating to 4.6.4
---
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.4     True        False         13m     Cluster version is 4.6.4
~~~

NOTE: The latest version is 4.6.6, but I wanted to show the minimal one on which the problem is no longer present.

4) Confirmed via OperatorHub that the new 4.6 operator channels are visible, for example with the Logging operator.

5) Confirmed also that the legacy deployments are no longer present:

~~~
$ oc get all -n openshift-marketplace
NAME                                        READY   STATUS    RESTARTS   AGE
pod/certified-operators-pl8nc               1/1     Running   0          13m
pod/community-operators-dgldj               1/1     Running   0          16m
pod/marketplace-operator-5bbff88564-ddsr9   1/1     Running   0          11m
pod/redhat-marketplace-pkkn5                1/1     Running   0          16m
pod/redhat-operators-lzqc7                  1/1     Running   0          12m

NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/certified-operators            ClusterIP   172.30.28.166    <none>        50051/TCP           44m
service/community-operators            ClusterIP   172.30.40.160    <none>        50051/TCP           44m
service/marketplace-operator-metrics   ClusterIP   172.30.33.234    <none>        8383/TCP,8081/TCP   99m
service/redhat-marketplace             ClusterIP   172.30.163.254   <none>        50051/TCP           44m
service/redhat-operators               ClusterIP   172.30.122.0     <none>        50051/TCP           44m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/marketplace-operator   1/1     1            1           99m

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/marketplace-operator-5bbff88564   1         1         1       45m
replicaset.apps/marketplace-operator-c679bd65b    0         0         0       99m
~~~

Best Regards.

Comment

Please disregard my latest comment; I have found the backport BZ#1894163, which corroborates the fix within OCP 4.6.4. Thanks and regards.

Comment

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633