Bug 2060837

Summary: [oc-mirror] Catalog merging error when two or more bundles do not have a Replaces field set
Description of problem:

I am hitting a problem when running the redhat-operators CatalogSource built with oc-mirror. It appears to fail when a new release of a selected operator is added to the previously built catalog and the new bundle does not have a Replaces field, as with the sriov-network-operator or the cluster-logging operator.

oc logs -f redhat-operators-7zxnf 
Error: could not build index model from declarative config: invalid index:
├── invalid package "sriov-network-operator":
│   ├── invalid channel "4.9":
│   │   └── multiple channel heads found in graph: sriov-network-operator.4.9.0-202202150216, sriov-network-operator.4.9.0-202202211206
│   └── invalid channel "stable":
│       └── multiple channel heads found in graph: sriov-network-operator.4.9.0-202202150216, sriov-network-operator.4.9.0-202202211206
└── invalid package "cluster-logging":
    ├── invalid channel "stable":
    │   └── multiple channel heads found in graph: cluster-logging.5.3.4-13, cluster-logging.5.3.5-20
    ├── invalid channel "stable-5.2":
    │   └── multiple channel heads found in graph: cluster-logging.5.2.7-18, cluster-logging.5.2.8-21
    └── invalid channel "stable-5.3":
        └── multiple channel heads found in graph: cluster-logging.5.3.4-13, cluster-logging.5.3.5-20
  opm serve <source_path> [flags]

      --debug                    enable debug logging
  -h, --help                     help for serve
  -p, --port string              port number to serve on (default "50051")
  -t, --termination-log string   path to a container termination log file (default "/dev/termination-log")

Global Flags:
      --skip-tls   skip TLS certificate verification for container image registries while pulling bundles or index

My workaround is to delete the catalogsource image in the registry and mirror it again. 
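
To illustrate why the "multiple channel heads" message appears, here is a minimal Python sketch of a head check. It is an assumption about the general idea (a bundle is a head when no other bundle in the channel replaces or skips it), not the actual opm code, and it ignores skipRange. The `Entry` class and `channel_heads` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for one bundle entry in a channel."""
    name: str
    replaces: Optional[str] = None
    skips: List[str] = field(default_factory=list)


def channel_heads(entries):
    """Return the names of entries that no other entry replaces or skips."""
    superseded = set()
    for entry in entries:
        if entry.replaces:
            superseded.add(entry.replaces)
        superseded.update(entry.skips)
    return sorted(e.name for e in entries if e.name not in superseded)


OLD = "sriov-network-operator.4.9.0-202202150216"
NEW = "sriov-network-operator.4.9.0-202202211206"

# Merged channel where neither bundle declares what it replaces:
# both end up as heads, which is the situation reported in the log.
unlinked_heads = channel_heads([Entry(OLD), Entry(NEW)])
print(unlinked_heads)

# Same channel, but the newer bundle points at the older one:
# only one head remains.
linked_heads = channel_heads([Entry(OLD), Entry(NEW, replaces=OLD)])
print(linked_heads)
```

In the first case both bundles are reported as heads, which matches the shape of the error; in the second case only the newer bundle is a head.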

Version-Release number of selected component (if applicable):

How reproducible:
Always with the following oc-mirror version:

Client Version: version.Info{Major:"0", Minor:"2", GitVersion:"v0.2.0-alpha.1", GitCommit:"", GitTreeState:"clean", BuildDate:"2022-02-24T09:01:48Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}

Steps to Reproduce:

1. Create mirrored content from this ImageSetConfiguration, which includes both releases and operators:

apiVersion: mirror.openshift.io/v1alpha1
kind: ImageSetConfiguration
storageConfig:
  registry:
    imageURL: fx2-1a.cloud.lab.eng.bos.redhat.com:8443/ocp4/openshift4
    skipTLS: true
  #  path: /path/to/dir
mirror:
  additionalImages:
    - name: quay.io/alosadag/tc:latest
    - name: quay.io/alosadag/troubleshoot:latest
    - name: quay.io/alosadag/trex:v2.95
    - name: quay.io/alosadag/testpmd:21.11.1
    #- name: registry.redhat.io/rhacm2/assisted-installer-agent-rhel8:v2.4
    #- name: registry.redhat.io/rhacm2/assisted-installer-rhel8:v2.4
  ocp:
    channels:
      - name: stable-4.9
        versions:
          - 4.9.12
          - 4.9.19
          - 4.9.21
    graph: true
  operators:
    - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.9 # References entire catalog
      headsOnly: false # References latest version of each operator in catalog (true is the default value and can be omitted)
      packages:
        - name: advanced-cluster-management
          channels:
            - name: 'release-2.4'
        - name: local-storage-operator
          channels:
            - name: 'stable'
        - name: ocs-operator
          channels:
            - name: 'stable-4.9'
        - name: performance-addon-operator
          channels:
            - name: '4.9'
        - name: ptp-operator
          channels:
            - name: '4.9'
        - name: sriov-network-operator
          channels:
            - name: '4.9'
        - name: cluster-logging
          channels:
            - name: 'stable'
    - catalog: registry.redhat.io/redhat/certified-operator-index:v4.9
      headsOnly: false # References latest version of each operator in catalog (true is the default value and can be omitted)
      packages:
        - name: sriov-fec
          channels:
            - name: 'stable'
    - catalog: registry.redhat.io/redhat/community-operator-index:v4.9
      headsOnly: false # References latest version of each operator in catalog (true is the default value and can be omitted)
      packages:
        - name: hive-operator
          channels:
            - name: 'ocm-2.5'

2. Run oc-mirror again with the same ImageSetConfiguration after a new version of the sriov-network-operator or the cluster-logging operator (or both) has been released.
3. The mirroring completes without errors.

Actual results:
The redhat-operators Pod with the redhat-operators catalogsource does not start properly 

oc get pods
NAME                                                              READY   STATUS             RESTARTS         AGE
4aa84e6b6cad28070402c0dfcd8a700c65e7d45bc1420b68a9327d--1-mx95h   0/1     Completed          0                93m
certified-operators-g42zz                                         1/1     Running            0                98m
community-operators-tfzxg                                         1/1     Running            0                98m
marketplace-operator-6dfcd5b64c-9nkd6                             1/1     Running            2 (120m ago)     128m
redhat-operators-7zxnf                                            0/1     CrashLoopBackOff   23 (4m29s ago)   98m

oc logs -f redhat-operators-7zxnf
Error: could not build index model from declarative config: invalid index:
├── invalid package "sriov-network-operator":
│   ├── invalid channel "4.9":
│   │   └── multiple channel heads found in graph: sriov-network-operator.4.9.0-202202150216, sriov-network-operator.4.9.0-202202211206
│   └── invalid channel "stable":
│       └── multiple channel heads found in graph: sriov-network-operator.4.9.0-202202150216, sriov-network-operator.4.9.0-202202211206
└── invalid package "cluster-logging":
    ├── invalid channel "stable":
    │   └── multiple channel heads found in graph: cluster-logging.5.3.4-13, cluster-logging.5.3.5-20
    ├── invalid channel "stable-5.2":
    │   └── multiple channel heads found in graph: cluster-logging.5.2.7-18, cluster-logging.5.2.8-21
    └── invalid channel "stable-5.3":
        └── multiple channel heads found in graph: cluster-logging.5.3.4-13, cluster-logging.5.3.5-20
  opm serve <source_path> [flags]

      --debug                    enable debug logging
  -h, --help                     help for serve
  -p, --port string              port number to serve on (default "50051")
  -t, --termination-log string   path to a container termination log file (default "/dev/termination-log")

Global Flags:
      --skip-tls   skip TLS certificate verification for container image registries while pulling bundles or index
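
For triage, the affected packages and channels can be pulled out of an error tree like the one above with a short script. This is a minimal Python sketch that assumes the log keeps the same `invalid package` / `invalid channel` / `multiple channel heads found in graph` wording; `parse_multiple_heads` is a hypothetical helper, and the sample is an abridged copy of the log.

```python
import re

PACKAGE_RE = re.compile(r'invalid package "([^"]+)"')
CHANNEL_RE = re.compile(r'invalid channel "([^"]+)"')
HEADS_RE = re.compile(r"multiple channel heads found in graph: (.+)$")


def parse_multiple_heads(log_text):
    """Map (package, channel) to the list of conflicting head bundles."""
    conflicts = {}
    package = channel = None
    for line in log_text.splitlines():
        match = PACKAGE_RE.search(line)
        if match:
            # A new package starts; forget the previous channel.
            package, channel = match.group(1), None
            continue
        match = CHANNEL_RE.search(line)
        if match:
            channel = match.group(1)
            continue
        match = HEADS_RE.search(line)
        if match and package and channel:
            heads = [h.strip() for h in match.group(1).split(",")]
            conflicts[(package, channel)] = heads
    return conflicts


# Abridged sample taken from the log in this report.
sample_log = """\
Error: could not build index model from declarative config: invalid index:
├── invalid package "sriov-network-operator":
│   └── invalid channel "4.9":
│       └── multiple channel heads found in graph: sriov-network-operator.4.9.0-202202150216, sriov-network-operator.4.9.0-202202211206
└── invalid package "cluster-logging":
    └── invalid channel "stable-5.2":
        └── multiple channel heads found in graph: cluster-logging.5.2.7-18, cluster-logging.5.2.8-21
"""

conflicts = parse_multiple_heads(sample_log)
for (pkg, chan), heads in conflicts.items():
    print(f"{pkg}/{chan}: {', '.join(heads)}")
```

The script could be fed the full output of `oc logs` for the failing pod instead of the inline sample.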

Expected results:
redhat-operators catalogSource is exposed correctly.

Additional info:

A workaround is to delete the catalogsource container image in the registry and mirror it again

Comment 1 Alberto Losada 2022-03-04 11:01:18 UTC
More information can be found in this slack thread: https://coreos.slack.com/archives/C02JW6VCYS1/p1646332535137299

Comment 2 Alex 2022-03-04 21:58:31 UTC
Thank you for bringing this to our attention @alosadag . We have identified a possible solution and will work to release it as soon as possible. More information to follow.

Comment 5 zhou ying 2022-03-24 12:12:39 UTC
Checked with the latest oc-mirror; the issue can no longer be reproduced:

1. Create two registry servers.
2. Copy the older catalog to the connected registry:
    `skopeo copy docker://registry.redhat.io/redhat/redhat-operator-index:v4.9-1646041330 docker://localhost:5000/fake/redhat-operator-index:v4.9`
3. Run oc-mirror with the config below:
apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
storageConfig:
  local:
    path: metadata
mirror:
  operators:
    - catalog: localhost:5000/fake/redhat-operator-index:v4.9
      headsOnly: false # References latest version of each operator in catalog (true is the default value and can be omitted)
      packages:
        - name: sriov-network-operator
          channels:
            - name: '4.9'
        - name: cluster-logging
          channels:
            - name: 'stable'

   `oc-mirror --config imageset.yaml docker://ec2-52-15-159-170.us-east-2.compute.amazonaws.com:5000/zhouy  --source-use-http`

4. Create the CatalogSource and ICSP.
5. Check the redhat-operator pod:
[root@localhost results-1648107735]# oc get pods
NAME                                    READY   STATUS             RESTARTS   AGE
redhat-operator-index-q9slv             1/1     Running            0          4m19s

6. Copy the newer catalog to the connected registry under the same image name:
  `skopeo copy docker://registry.redhat.io/redhat/redhat-operator-index:v4.9 docker://localhost:5000/fake/redhat-operator-index:v4.9`

7. Run oc-mirror again with the same config.

8. Check that the pod is still running:
[root@localhost results-1648107735]# oc get pods
NAME                                    READY   STATUS             RESTARTS   AGE
redhat-operator-index-q9slv             1/1     Running            0          3h31m

Comment 7 errata-xmlrpc 2022-08-10 10:52:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.
