Bug 2026343

Summary: [upgrade from 4.5 to 4.6] .status.connectionState.address of catsrc community-operators is not correct
Product: OpenShift Container Platform
Reporter: xzha
Component: OLM
Assignee: Alexander Greene <agreene>
OLM sub component: OLM
QA Contact: xzha
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: davegord, jmcmeek, tflannag
Version: 4.6
Target Milestone: ---
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Within the CatalogSource resource, the RegistryServiceStatus stores the service information used to generate the address that OLM relies on to establish a connection with the associated pod.
Consequence: If the RegistryServiceStatus is not nil but is missing the namespace, name, and port information for its service, OLM is unable to recover until the CatalogSource's associated pod has an invalid image or spec.
Fix: When reconciling a CatalogSource, OLM now ensures that its RegistryServiceStatus is valid and updates the CatalogSource's status to reflect the change. The address is also stored in the CatalogSource's status, in the status.GRPCConnectionState.Address field. If the address changes, OLM updates this field to reflect the new address as well.
Result: The `.status.connectionState.address` field of a CatalogSource should no longer be nil.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-12 04:39:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description xzha 2021-11-24 12:13:45 UTC
Description of problem:
Upgrade path: 4.5.41-x86_64 --> 4.6.0-0.nightly-2021-11-22-174225
After the upgrade, .status.connectionState.address of the catsrc community-operators is not correct, and no community package manifests are listed.

zhaoxia@xzha-mac JIRA-2196 % oc get catsrc community-operators -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  annotations:
    operatorframework.io/managed-by: marketplace-operator
  creationTimestamp: "2021-11-24T08:39:47Z"
  generation: 2
  labels:
    olm-visibility: hidden
    openshift-marketplace: "true"
    opsrc-datastore: "true"
    opsrc-provider: community
  name: community-operators
  namespace: openshift-marketplace
  resourceVersion: "106325"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-marketplace/catalogsources/community-operators
  uid: 6084e6b3-5d13-458d-9cca-a68f02ee36de
spec:
  displayName: Community Operators
  icon:
    base64data: ""
    mediatype: ""
  image: registry.redhat.io/redhat/community-operator-index:v4.6
  priority: -400
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
status:
  connectionState:
    address: '..svc:'
    lastConnect: "2021-11-24T11:56:04Z"
    lastObservedState: TRANSIENT_FAILURE
  latestImageRegistryPoll: "2021-11-24T11:52:03Z"
  registryService:
    createdAt: "2021-11-24T08:39:48Z"
    protocol: grpc

zhaoxia@xzha-mac JIRA-2196 % oc get packagemanifests | grep -i comm
zhaoxia@xzha-mac JIRA-2196 % 


Version-Release number of selected component (if applicable):
zhaoxia@xzha-mac JIRA-2196 % oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-11-22-174225   True        False         23m     Cluster version is 4.6.0-0.nightly-2021-11-22-174225


How reproducible:
not always

Steps to Reproduce:
1. Upgrade along the path 4.5.41-x86_64 --> 4.6.0-0.nightly-2021-11-22-174225.
2. Check the community-operators catsrc.

Actual results:
address: '..svc:'

Expected results:
The address has the correct value.

Additional info:
If I delete the pod community-operators-crxrv, the address is correct once the new pod has been created.
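
The value '..svc:' is what an address of the form <service name>.<service namespace>.svc:<port> looks like when all three parts are empty, which matches the registryService status in the YAML above (only createdAt and protocol are set). A minimal Python sketch of that reasoning follows. The function names and the dictionary keys serviceName, serviceNamespace, and port are assumptions made for illustration; this is not OLM's actual code.

```python
def registry_address(registry_service: dict) -> str:
    """Build '<name>.<namespace>.svc:<port>' from a registryService status dict.

    The keys serviceName, serviceNamespace and port are assumed for this
    sketch; missing keys are treated as empty strings.
    """
    name = registry_service.get("serviceName", "")
    namespace = registry_service.get("serviceNamespace", "")
    port = registry_service.get("port", "")
    return f"{name}.{namespace}.svc:{port}"


def is_registry_service_valid(registry_service: dict) -> bool:
    """Return True only if name, namespace and port are all non-empty."""
    required = ("serviceName", "serviceNamespace", "port")
    return all(registry_service.get(key) for key in required)


# Status as reported in this bug: only createdAt and protocol are set.
broken = {"createdAt": "2021-11-24T08:39:48Z", "protocol": "grpc"}
print(registry_address(broken))           # ..svc:
print(is_registry_service_valid(broken))  # False

# Hypothetical healthy status, using the address seen after verification.
healthy = {
    "protocol": "grpc",
    "serviceName": "community-operators",
    "serviceNamespace": "openshift-marketplace",
    "port": "50051",
}
print(registry_address(healthy))  # community-operators.openshift-marketplace.svc:50051
```

A check like is_registry_service_valid is the kind of validation the Doc Text describes OLM performing during reconciliation, so that an incomplete status is repaired instead of being turned into an unusable address.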

Comment 3 Alexander Greene 2021-11-25 17:29:42 UTC
A fix can be found here: https://github.com/operator-framework/operator-lifecycle-manager/pull/2499

Comment 5 Kevin Rizza 2022-01-05 19:45:58 UTC
*** Bug 1949279 has been marked as a duplicate of this bug. ***

Comment 9 xzha 2022-01-28 04:49:57 UTC
Verified: after upgrading to 4.10.0-0.nightly-2022-01-27-144113 the issue does not occur and the address is correct.

zhaoxia@xzha-mac ~ % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-27-144113   True        False         29m     Cluster version is 4.10.0-0.nightly-2022-01-27-144113

zhaoxia@xzha-mac ~ % oc get catsrc -A -o yaml| grep address
      address: certified-operators.openshift-marketplace.svc:50051
      address: community-operators.openshift-marketplace.svc:50051
      address: qe-app-registry.openshift-marketplace.svc:50051
      address: redhat-marketplace.openshift-marketplace.svc:50051
      address: redhat-operators.openshift-marketplace.svc:50051
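
The grep above can also be expressed as an automated check. The sketch below is hypothetical: it assumes the output of `oc get catsrc -A -o json` has been parsed into a dictionary with the same status.connectionState.address layout as the YAML in the description, and it reports every catalog source whose address does not look like <name>.<namespace>.svc:<port>. The sample data is made up for illustration.

```python
import json
import re

# Matches '<name>.<namespace>.svc:<port>' where every part is non-empty.
ADDRESS_RE = re.compile(r"^[^.:\s]+\.[^.:\s]+\.svc:\d+$")


def malformed_addresses(catsrc_list: dict) -> list:
    """Return (namespace, name, address) for each catalog source whose
    address is malformed. A missing address is reported as ''."""
    bad = []
    for item in catsrc_list.get("items", []):
        meta = item.get("metadata", {})
        state = item.get("status", {}).get("connectionState", {})
        address = state.get("address", "")
        if not ADDRESS_RE.match(address):
            bad.append((meta.get("namespace"), meta.get("name"), address))
    return bad


# Made-up sample: one healthy catalog source and one showing this bug.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "certified-operators", "namespace": "openshift-marketplace"},
      "status": {"connectionState": {"address": "certified-operators.openshift-marketplace.svc:50051"}}
    },
    {
      "metadata": {"name": "community-operators", "namespace": "openshift-marketplace"},
      "status": {"connectionState": {"address": "..svc:"}}
    }
  ]
}
""")

print(malformed_addresses(sample))
```

On a cluster with the fix, the returned list would be expected to be empty for grpc catalog sources.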

Checked the latest upgrade CI results; no such issue.

LGTM, verified.

Comment 12 errata-xmlrpc 2022-03-12 04:39:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056