Bug 1875247 - Dependency resolution error "found more than one head for channel" is unhelpful for users
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.7.0
Assignee: Evan Cordell
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-03 06:22 UTC by Jian Zhang
Modified: 2022-10-11 09:38 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-14 16:12:43 UTC
Target Upstream Version:
Embargoed:
davegord: needinfo-
davegord: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Github operator-framework operator-lifecycle-manager pull 1748 0 None closed Bug 1875247: Add package check for the channel head querying 2021-02-12 15:55:51 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:17:26 UTC

Description Jian Zhang 2020-09-03 06:22:36 UTC
Description of problem:
Installing the quay-operator fails with the error "found more than one head for channel".
I know that this is a bug in the quay-operator itself, not an OLM issue.
However, I'd like to use this bug to improve the error message so that end users can clearly see what is wrong.

Version-Release number of selected component (if applicable):
4.6
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-86f6ccdcd4-42l56 -- olm --version
OLM version: 0.16.0
git commit: d2dc60abcc554058497d6feb85bd6e26a0fec338

How reproducible:
always

Steps to Reproduce:
1. Install OCP 4.6.
[root@preserve-olm-env data]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-31-220837   True        False         47h     Cluster version is 4.6.0-0.nightly-2020-08-31-220837

2. Install the quay-operator from the web console.

3. Check the quay-operator.
[root@preserve-olm-env data]# oc get sub -n default
NAME            PACKAGE         SOURCE             CHANNEL
quay-operator   quay-operator   redhat-operators   quay-v3.3
[root@preserve-olm-env data]# oc get ip -n default
No resources found in default namespace.


Actual results:
No InstallPlan is generated. The OLM logs show the errors below:
time="2020-09-03T05:34:55Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/default/subscriptions/quay-operator
E0903 05:34:55.578024       1 queueinformer_operator.go:290] sync "default" failed: found more than one head for channel

Expected results:
1, The quay-operator can be installed successfully. A bundle without a replacement is a channel head. The bundle files should be modified to add the "replaces" or "skipRange" fields. Feel free to transfer this to the Quay team.

2, Show more details about the channel's head.

Additional info:

Comment 1 Jian Zhang 2020-09-03 06:59:39 UTC
For the quay-operator problem, there are 3 bundles without `replaces`: quay-bridge-operator.v3.3.0, red-hat-quay.v3.3.0, container-security-operator.v3.3.0.

[root@preserve-olm-env data]# oc cp redhat-operators-9rwhc:/database/index.db dex.db
tar: Removing leading `/' from member names
[root@preserve-olm-env data]# sqlite3 dex.db 
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .header on
sqlite> .mode column
sqlite> .table
api                channel            operatorbundle     related_image    
api_provider       channel_entry      package            schema_migrations
api_requirer       dependencies       properties       
sqlite> .width 20 20 60 60 10 10 
sqlite> select * from channel_entry;
entry_id              channel_name          package_name                                                  operatorbundle_name                                           replaces    depth     
--------------------  --------------------  ------------------------------------------------------------  ------------------------------------------------------------  ----------  ----------
3                     stable                jaeger-product                                                jaeger-operator.v1.17.6                                                   0         
4                     1.13-stable           jaeger-product                                                jaeger-operator.v1.13.2-1                                                 0         
5                     1.17-stable           jaeger-product                                                jaeger-operator.v1.17.6                                                   0         
9                     quay-v3.3             quay-bridge-operator                                          quay-bridge-operator.v3.3.1                                   10          0         
10                    quay-v3.3             quay-bridge-operator                                          quay-bridge-operator.v3.3.0                                               1         
15                    quay-v3.3             quay-operator                                                 red-hat-quay.v3.3.1                                           16          0         
16                    quay-v3.3             quay-operator                                                 red-hat-quay.v3.3.0                                                       1         
17                    quay-v3.3             container-security-operator                                   container-security-operator.v3.3.1                            18          0         
18                    quay-v3.3             container-security-operator                                   container-security-operator.v3.3.0                                        1    


But the Subscription for the quay-operator specifies the package name "quay-operator". Within this package_name (quay-operator), there is only one head bundle: red-hat-quay.v3.3.0.

Based on my understanding, OLM should determine the head bundle per package name, not per channel.


[root@preserve-olm-env data]# oc get sub quay-operator -n default -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
...
spec:
  channel: quay-v3.3
  installPlanApproval: Automatic
  name: quay-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: red-hat-quay.v3.3.1
...

Comment 2 Jian Zhang 2020-09-03 07:05:29 UTC
Adjusted the column widths to make the output clearer.

sqlite> .width 10 10 30 30 10 10
sqlite> select * from channel_entry;
entry_id    channel_na  package_name                    operatorbundle_name             replaces    depth     
----------  ----------  ------------------------------  ------------------------------  ----------  ----------
3           stable      jaeger-product                  jaeger-operator.v1.17.6                     0         
4           1.13-stabl  jaeger-product                  jaeger-operator.v1.13.2-1                   0         
5           1.17-stabl  jaeger-product                  jaeger-operator.v1.17.6                     0         
9           quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.1     10          0         
10          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.0                 1         
15          quay-v3.3   quay-operator                   red-hat-quay.v3.3.1             16          0         
16          quay-v3.3   quay-operator                   red-hat-quay.v3.3.0                         1         
17          quay-v3.3   container-security-operator     container-security-operator.v3  18          0         
18          quay-v3.3   container-security-operator     container-security-operator.v3              1 
168         amq-stream  amq-streams                     amqstreams.v1.5.3               169         0         
169         amq-stream  amq-streams                     amqstreams.v1.5.2               170         1         
170         amq-stream  amq-streams                     amqstreams.v1.5.1               171         2
...

Based on my understanding, OLM supports multiple packages in a channel, so OLM should determine the head bundle per package, not per channel. Correct me if I'm wrong, thanks!
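The difference between counting heads per channel name and per package/channel pair can be checked directly against the table. The following is a minimal Python sketch, not OLM code: it loads the quay-v3.3 rows from the channel_entry listing in Comment 1 into an in-memory SQLite table. The column names come from that listing; the column types and the definition of a head as "an entry that no other entry replaces" are assumptions made for illustration.

```python
import sqlite3

# Rows copied from the quay-v3.3 part of the channel_entry listing in Comment 1:
# (entry_id, channel_name, package_name, operatorbundle_name, replaces, depth)
ROWS = [
    (9, "quay-v3.3", "quay-bridge-operator", "quay-bridge-operator.v3.3.1", 10, 0),
    (10, "quay-v3.3", "quay-bridge-operator", "quay-bridge-operator.v3.3.0", None, 1),
    (15, "quay-v3.3", "quay-operator", "red-hat-quay.v3.3.1", 16, 0),
    (16, "quay-v3.3", "quay-operator", "red-hat-quay.v3.3.0", None, 1),
    (17, "quay-v3.3", "container-security-operator", "container-security-operator.v3.3.1", 18, 0),
    (18, "quay-v3.3", "container-security-operator", "container-security-operator.v3.3.0", None, 1),
]

conn = sqlite3.connect(":memory:")
# Simplified table: column names match the listing, types are assumed.
conn.execute(
    "CREATE TABLE channel_entry (entry_id INTEGER PRIMARY KEY, channel_name TEXT,"
    " package_name TEXT, operatorbundle_name TEXT, replaces INTEGER, depth INTEGER)"
)
conn.executemany("INSERT INTO channel_entry VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Assumption: a head is an entry that no other entry replaces.
HEADS = """
SELECT {cols}, COUNT(*) AS heads
FROM channel_entry
WHERE entry_id NOT IN (
    SELECT replaces FROM channel_entry WHERE replaces IS NOT NULL
)
GROUP BY {cols}
ORDER BY {cols}
"""

# Grouping by channel name alone mixes the three packages together.
by_channel = conn.execute(HEADS.format(cols="channel_name")).fetchall()
# Grouping by channel name and package name keeps each upgrade chain separate.
by_package = conn.execute(HEADS.format(cols="channel_name, package_name")).fetchall()

print(by_channel)
for row in by_package:
    print(row)
```

With these rows, the channel-only grouping reports three heads for quay-v3.3, while the package/channel grouping reports exactly one head for each of the three packages.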

Comment 3 Jian Zhang 2020-09-03 09:10:25 UTC
Tried to fix this issue: https://github.com/operator-framework/operator-lifecycle-manager/pull/1748, any comments are welcome! Thanks!

Comment 4 Ben Luddy 2020-09-03 13:28:11 UTC
Hi Jian,

It is okay to have a channel with the same name in multiple packages, but the "head" specifically refers to the latest version in a single package's channel.

At the moment, almost all of the "more than one head in channel" cases are caused by https://bugzilla.redhat.com/show_bug.cgi?id=1866861 -- the 4.6 index images are currently built from 4.5 opm, which is not providing complete information to the resolver in catalog-operator.

The list of bundles passed to "sortChannel" should only ever contain bundles from a single package/channel combination. If there's a case where that doesn't happen, then there's a mistake in the caller, but I haven't observed that.

Comment 6 Jian Zhang 2020-09-04 06:26:34 UTC
Hi Ben,

Thanks for your information! 
> The list of bundles passed to "sortChannel" should only ever contain bundles from a single package/channel combination. If there's a case where that doesn't happen, then there's a mistake in the caller, but I haven't observed that.

I'd suggest that we print the "headCandidates" in the error logs so that end users can clearly see what is wrong.

Comment 8 Ben Luddy 2020-09-14 13:33:49 UTC
> I'd suggest that we print the "headCandidates" in the error logs so that end users can clearly see what is wrong.

Yes, I agree. That would be helpful.
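
As an illustration of the suggestion, here is a minimal Python sketch of a head check that names the candidates in its error. It is not OLM's Go implementation and does not reproduce the message emitted by the linked pull request; `Bundle`, `find_channel_head`, and the message format are hypothetical. It assumes a head is a bundle that no other bundle in the list replaces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bundle:
    # Hypothetical record type used only for this sketch.
    name: str
    package: str
    channel: str
    replaces: Optional[str] = None


def find_channel_head(bundles: List[Bundle]) -> Bundle:
    """Return the single bundle that no other bundle in the list replaces."""
    names = {b.name for b in bundles}
    # Only count replacements that point at a bundle present in the list.
    replaced = {b.replaces for b in bundles if b.replaces in names}
    head_candidates = [b for b in bundles if b.name not in replaced]
    if len(head_candidates) != 1:
        listing = ", ".join(
            f"{b.name} (package={b.package}, channel={b.channel})"
            for b in head_candidates
        )
        raise ValueError(
            f"found {len(head_candidates)} heads for channel, expected 1: [{listing}]"
        )
    return head_candidates[0]


# Bundles from two packages that share the channel name quay-v3.3,
# passed in one list by mistake.
mixed = [
    Bundle("red-hat-quay.v3.3.1", "quay-operator", "quay-v3.3", "red-hat-quay.v3.3.0"),
    Bundle("red-hat-quay.v3.3.0", "quay-operator", "quay-v3.3"),
    Bundle("quay-bridge-operator.v3.3.1", "quay-bridge-operator", "quay-v3.3",
           "quay-bridge-operator.v3.3.0"),
    Bundle("quay-bridge-operator.v3.3.0", "quay-bridge-operator", "quay-v3.3"),
]

try:
    find_channel_head(mixed)
except ValueError as err:
    message = str(err)
    print(message)
```

Because the message lists each candidate with its package and channel, a reader of the log can tell at once whether the bundle metadata is wrong or the caller mixed packages.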

Comment 9 Evan Cordell 2020-09-16 16:58:07 UTC

*** This bug has been marked as a duplicate of bug 1869441 ***

Comment 10 Jian Zhang 2020-09-18 10:34:53 UTC
[root@preserve-olm-env data]# oc cp redhat-operators-dmknm:/database/index.db quay.db
tar: Removing leading `/' from member names
[root@preserve-olm-env data]# 
[root@preserve-olm-env data]# sqlite3 quay.db 
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .header on
sqlite> .mode column
sqlite> .table
api                channel            operatorbundle     related_image    
api_provider       channel_entry      package            schema_migrations
api_requirer       dependencies       properties       
sqlite> .width 10 10 30 30 10 10
sqlite> select * from channel_entry;
entry_id    channel_na  package_name                    operatorbundle_name             replaces    depth     
----------  ----------  ------------------------------  ------------------------------  ----------  ----------
8           stable      jaeger-product                  jaeger-operator.v1.17.6                     0         
9           1.13-stabl  jaeger-product                  jaeger-operator.v1.13.2-1                   0         
10          1.17-stabl  jaeger-product                  jaeger-operator.v1.17.6                     0         
21          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.1     22          0         
22          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.0                 1         
23          quay-v3.3   quay-operator                   red-hat-quay.v3.3.1             24          0         
24          quay-v3.3   quay-operator                   red-hat-quay.v3.3.0                         1         
31          quay-v3.3   container-security-operator     container-security-operator.v3  32          0         
32          quay-v3.3   container-security-operator     container-security-operator.v3              1 


It works well now. I am reopening this bug only so that the PR (which addresses the error logging) can go through the merge workflow.
[root@preserve-olm-env data]# oc get sub -A
NAMESPACE   NAME            PACKAGE         SOURCE             CHANNEL
default     quay-operator   quay-operator   redhat-operators   quay-v3.3
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-fth4l   red-hat-quay.v3.3.1   Automatic   true

[root@preserve-olm-env data]# oc get csv -n default
NAME                  DISPLAY        VERSION   REPLACES              PHASE
red-hat-quay.v3.3.1   Red Hat Quay   3.3.1     red-hat-quay.v3.3.0   Succeeded

Comment 13 Jian Zhang 2020-10-20 11:03:21 UTC
The Quay operator can be installed successfully, LGTM.
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-66999f6d68-q8wg2 -- olm --version
OLM version: 0.16.1
git commit: e2c0f2c47573ec5dfc509502881fa3dd8eb7bae9

[root@preserve-olm-env data]# oc rsh redhat-operators-fbjf9 
sh-4.2$ ls
db-954327980
sh-4.2$ sqlite3 db-954327980 
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .header on
sqlite> .mode column
sqlite> .width 10 10 30 30 10 10
sqlite> select * from channel_entry;
entry_id    channel_na  package_name                    operatorbundle_name             replaces    depth  
...
72          quay-v3.3   container-security-operator     container-security-operator.v3  73          0         
73          quay-v3.3   container-security-operator     container-security-operator.v3              1         
76          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.1     77          0         
77          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.0                 1 

[root@preserve-olm-env data]# oc get sub -n default
NAME            PACKAGE         SOURCE             CHANNEL
quay-operator   quay-operator   redhat-operators   quay-v3.3
[root@preserve-olm-env data]# oc get csv -n default
NAME                           DISPLAY        VERSION          REPLACES   PHASE
red-hat-quay.v3.3.2-20200903   Red Hat Quay   3.3.2-20200903              Succeeded
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                            APPROVAL    APPROVED
install-mkqvn   red-hat-quay.v3.3.2-20200903   Automatic   true

@Ben To test the channel head check, I couldn't find a suitable operator bundle. Do you know how to make one? Thanks very much!

Comment 14 Ben Luddy 2020-10-20 16:39:40 UTC
Hi Jian,

The index bug that was triggering this error has been fixed, but you should be able to reproduce it by checking out operator-registry at commit fb73be25997ffec408400e01cdab9855007a37ac, building an image from upstream-builder.Dockerfile, then using the image you just built as the --binary-image option to "opm index add".

Comment 15 Jian Zhang 2020-10-21 09:36:26 UTC
Hi Ben,

Thanks for your information! I tried it, but everything works well; details follow. It seems I still cannot reproduce the channel head issue. Any suggestions? Thanks!

1) Go back to the problem commit.
[root@preserve-olm-env operator-registry]# git reset --hard fb73be25997ffec408400e01cdab9855007a37ac
HEAD is now at fb73be2 Merge pull request #355 from openshift-cherrypick-robot/cherry-pick-353-to-release-4.5
[root@preserve-olm-env operator-registry]# 
[root@preserve-olm-env operator-registry]# git log
commit fb73be25997ffec408400e01cdab9855007a37ac
Merge: a146011 b811e50
Author: OpenShift Merge Robot <openshift-merge-robot.github.com>
Date:   Wed Jun 17 11:37:08 2020 +0200
...

2) Compile it, and check the version.
[root@preserve-olm-env operator-registry]# ./bin/opm version
Version: version.Version{OpmVersion:"1.12.3", GitCommit:"fb73be2", BuildDate:"2020-10-21T08:01:46Z", GoOs:"linux", GoArch:"amd64"}

3) Build the Builder image
[root@preserve-olm-env operator-registry]# cat upstream-builder.Dockerfile 
FROM golang:1.13-alpine

RUN apk update && apk add sqlite build-base git mercurial bash
WORKDIR /build

COPY vendor vendor
COPY cmd cmd
COPY pkg pkg
COPY Makefile Makefile
COPY go.mod go.mod
RUN make static
RUN GRPC_HEALTH_PROBE_VERSION=v0.2.1 && \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-$(go env GOARCH) && \
    chmod +x /bin/grpc_health_probe
RUN cp /build/bin/opm /bin/opm && \
    cp /build/bin/initializer /bin/initializer && \
    cp /build/bin/appregistry-server /bin/appregistry-server && \
    cp /build/bin/configmap-server /bin/configmap-server && \
    cp /build/bin/registry-server /bin/registry-server

[root@preserve-olm-env operator-registry]# docker build -f upstream-builder.Dockerfile -t quay.io/olmqe/builder:bug1875247 .
Sending build context to Docker daemon    789MB
Step 1/11 : FROM golang:1.13-alpine
...

[root@preserve-olm-env operator-registry]# docker push quay.io/olmqe/builder:bug1875247
The push refers to repository [quay.io/olmqe/builder]
...

4) Create a bundle image. For example, etcd 0.9.4
[root@preserve-olm-env operator-registry]# ./bin/opm alpha bundle build -d /data/goproject/src/github.com/operator-framework/operator-registry/manifests/etcd/0.9.4 -p etcd -c alpha -e alpha -t quay.io/olmqe/etcd-bundle:0.9.4-1875247 -o
INFO[0000] Building annotations.yaml 
...
[root@preserve-olm-env operator-registry]# docker push  quay.io/olmqe/etcd-bundle:0.9.4-1875247
The push refers to repository [quay.io/olmqe/etcd-bundle]
...

5) Add this bundle image to the index image based on the above Builder image
[root@preserve-olm-env operator-registry]# ./bin/opm index add -i quay.io/olmqe/builder:bug1875247 -b quay.io/olmqe/etcd-bundle:0.9.4-1875247 -t quay.io/olmqe/etcd-index:0.9.4-1875247 
INFO[0000] building the index                            bundles="[quay.io/olmqe/etcd-bundle:0.9.4-1875247]"
...
[root@preserve-olm-env operator-registry]# podman push quay.io/olmqe/etcd-index:0.9.4-1875247
...

6) Subscribe to etcd 0.9.4; it works well.
[root@preserve-olm-env data]# oc get sub -n default
NAME   PACKAGE   SOURCE      CHANNEL
etcd   etcd      etcd-test   alpha
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-fqcvg   etcdoperator.v0.9.4   Automatic   true
[root@preserve-olm-env data]# oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.4   etcd      0.9.4                Succeeded


7) Add etcd 0.9.5 bundle image to this index image based on this Builder image.
[root@preserve-olm-env operator-registry]# ./bin/opm index add -i quay.io/olmqe/builder:bug1875247 -b quay.io/olmqe/etcd-bundle:0.9.5 -f quay.io/olmqe/etcd-index:0.9.4-1875247 -t quay.io/olmqe/etcd-index:0.9.4-1875247 -p podman
INFO[0000] building the index                            bundles="[quay.io/olmqe/etcd-bundle:0.9.5]"
...
[root@preserve-olm-env operator-registry]# podman push quay.io/olmqe/etcd-index:0.9.4-1875247
Getting image source signatures

8) Subscribe to etcd 0.9.5; it works well.
[root@preserve-olm-env data]# oc get sub -n default
NAME   PACKAGE   SOURCE      CHANNEL
etcd   etcd      etcd-test   4.7
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-hml4d   etcdoperator.v0.9.5   Automatic   true
[root@preserve-olm-env data]# oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.5   etcd      0.9.5                Succeeded

Anyway, I am marking this as verified for now, since the operator can be subscribed to successfully.

Comment 16 Ben Luddy 2020-11-06 02:45:07 UTC
Hi Jian, the bundles seem to be in different channels (alpha and 4.7), so they are both unique channel heads.
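
The point about channels can be sketched in a few lines of Python. This is only an illustration, not OLM or opm code: `heads_per_channel` is a hypothetical helper, a head is assumed to be a bundle that nothing else in the same package/channel group replaces, and the second data set is a made-up variation in which both etcd bundles share one channel.

```python
from collections import defaultdict


def heads_per_channel(bundles):
    """Map (package, channel) to the bundles that nothing in that group replaces."""
    groups = defaultdict(list)
    for name, package, channel, replaces in bundles:
        groups[(package, channel)].append((name, replaces))
    heads = {}
    for key, members in groups.items():
        replaced = {replaces for _, replaces in members}
        heads[key] = sorted(name for name, _ in members if name not in replaced)
    return heads


# As in Comment 15: the two bundles end up in different channels, and the
# REPLACES column is empty for both, so replaces is left as None.
as_tested = [
    ("etcdoperator.v0.9.4", "etcd", "alpha", None),
    ("etcdoperator.v0.9.5", "etcd", "4.7", None),
]

# Hypothetical variation: both bundles in the same channel, neither replacing
# the other. This is the shape that yields two head candidates.
same_channel = [
    ("etcdoperator.v0.9.4", "etcd", "alpha", None),
    ("etcdoperator.v0.9.5", "etcd", "alpha", None),
]

for label, data in (("as tested", as_tested), ("same channel", same_channel)):
    for key, names in heads_per_channel(data).items():
        print(label, key, len(names), names)
```

In the tested layout every package/channel group has a single head, so no error is expected; only the hypothetical same-channel layout produces a group with two heads.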

Comment 19 errata-xmlrpc 2021-02-24 15:17:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

Comment 24 Pamela Escorza 2021-10-18 09:11:37 UTC
Just for information: Bug 2015022 has been opened to request the backport of this fix to OCP 4.6 EUS.

