Description of problem:
Failed to install the quay-operator due to the error "found more than one head for channel". I know that this is a bug in the quay-operator itself, not an OLM issue. However, I'd like to use this bug to improve the error message so that the end user can clearly see what is wrong.

Version-Release number of selected component (if applicable):
4.6
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-86f6ccdcd4-42l56 -- olm --version
OLM version: 0.16.0
git commit: d2dc60abcc554058497d6feb85bd6e26a0fec338

How reproducible:
always

Steps to Reproduce:
1. Install OCP 4.6.
[root@preserve-olm-env data]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-31-220837   True        False         47h     Cluster version is 4.6.0-0.nightly-2020-08-31-220837
2. Install the quay-operator from the web console.
3. Check the quay-operator.
[root@preserve-olm-env data]# oc get sub -n default
NAME            PACKAGE         SOURCE             CHANNEL
quay-operator   quay-operator   redhat-operators   quay-v3.3
[root@preserve-olm-env data]# oc get ip -n default
No resources found in default namespace.

Actual results:
No InstallPlan is generated. The OLM logs show the errors below:
time="2020-09-03T05:34:55Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/default/subscriptions/quay-operator
E0903 05:34:55.578024 1 queueinformer_operator.go:290] sync "default" failed: found more than one head for channel

Expected results:
1. The quay-operator can be installed successfully. A bundle without a replacement is a channel head, so the bundle files should be modified to add the "replaces" or "skipRange" fields. Feel free to transfer this to the Quay team.
2. Show more details about the channel's head.

Additional info:
For the quay-operator problem, there are 3 bundles without `replaces`: quay-bridge-operator.v3.3.0, red-hat-quay.v3.3.0, container-security-operator.v3.3.0.

[root@preserve-olm-env data]# oc cp redhat-operators-9rwhc:/database/index.db dex.db
tar: Removing leading `/' from member names
[root@preserve-olm-env data]# sqlite3 dex.db
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .header on
sqlite> .mode column
sqlite> .table
api                channel            operatorbundle     related_image
api_provider       channel_entry      package            schema_migrations
api_requirer       dependencies       properties
sqlite> .width 20 20 60 60 10 10
sqlite> select * from channel_entry;
entry_id              channel_name          package_name                                                  operatorbundle_name                                           replaces    depth
--------------------  --------------------  ------------------------------------------------------------  ------------------------------------------------------------  ----------  ----------
3                     stable                jaeger-product                                                jaeger-operator.v1.17.6                                                   0
4                     1.13-stable           jaeger-product                                                jaeger-operator.v1.13.2-1                                                 0
5                     1.17-stable           jaeger-product                                                jaeger-operator.v1.17.6                                                   0
9                     quay-v3.3             quay-bridge-operator                                          quay-bridge-operator.v3.3.1                                   10          0
10                    quay-v3.3             quay-bridge-operator                                          quay-bridge-operator.v3.3.0                                               1
15                    quay-v3.3             quay-operator                                                 red-hat-quay.v3.3.1                                           16          0
16                    quay-v3.3             quay-operator                                                 red-hat-quay.v3.3.0                                                       1
17                    quay-v3.3             container-security-operator                                   container-security-operator.v3.3.1                            18          0
18                    quay-v3.3             container-security-operator                                   container-security-operator.v3.3.0                                        1

However, the subscription for the quay-operator specifies the package name "quay-operator". Within this package_name (quay-operator), there is only one head bundle: red-hat-quay.v3.3.0. Based on my understanding, OLM should check the head bundle based on the package name, not the channel.

[root@preserve-olm-env data]# oc get sub quay-operator -n default -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  ...
spec:
  channel: quay-v3.3
  installPlanApproval: Automatic
  name: quay-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  startingCSV: red-hat-quay.v3.3.1
...
I adjusted the column widths to make the output clearer.

sqlite> .width 10 10 30 30 10 10
sqlite> select * from channel_entry;
entry_id    channel_na  package_name                    operatorbundle_name             replaces    depth
----------  ----------  ------------------------------  ------------------------------  ----------  ----------
3           stable      jaeger-product                  jaeger-operator.v1.17.6                     0
4           1.13-stabl  jaeger-product                  jaeger-operator.v1.13.2-1                   0
5           1.17-stabl  jaeger-product                  jaeger-operator.v1.17.6                     0
9           quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.1     10          0
10          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.0                 1
15          quay-v3.3   quay-operator                   red-hat-quay.v3.3.1             16          0
16          quay-v3.3   quay-operator                   red-hat-quay.v3.3.0                         1
17          quay-v3.3   container-security-operator     container-security-operator.v3  18          0
18          quay-v3.3   container-security-operator     container-security-operator.v3              1
168         amq-stream  amq-streams                     amqstreams.v1.5.3               169         0
169         amq-stream  amq-streams                     amqstreams.v1.5.2               170         1
170         amq-stream  amq-streams                     amqstreams.v1.5.1               171         2
...

Based on my understanding, OLM supports multiple packages in a channel, so OLM should check the head bundle based on the package, not the channel. Correct me if I'm wrong, thanks!
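The per-package versus per-channel distinction raised here can be checked against the index data itself. Below is a minimal Python sketch, not OLM or operator-registry code. It assumes a simplified channel_entry schema (the column types are guesses) and treats a head as an entry that no other entry in the same package and channel replaces. The rows are copied from the quay-v3.3 entries in the query output above.

```python
import sqlite3
from collections import Counter

# Rows copied from the channel_entry output in this report.
# Columns: entry_id, channel_name, package_name, operatorbundle_name, replaces, depth
ROWS = [
    (9, "quay-v3.3", "quay-bridge-operator", "quay-bridge-operator.v3.3.1", 10, 0),
    (10, "quay-v3.3", "quay-bridge-operator", "quay-bridge-operator.v3.3.0", None, 1),
    (15, "quay-v3.3", "quay-operator", "red-hat-quay.v3.3.1", 16, 0),
    (16, "quay-v3.3", "quay-operator", "red-hat-quay.v3.3.0", None, 1),
    (17, "quay-v3.3", "container-security-operator", "container-security-operator.v3.3.1", 18, 0),
    (18, "quay-v3.3", "container-security-operator", "container-security-operator.v3.3.0", None, 1),
]

# Assumed, simplified schema: the real index.db schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE channel_entry (
           entry_id INTEGER PRIMARY KEY,
           channel_name TEXT,
           package_name TEXT,
           operatorbundle_name TEXT,
           replaces INTEGER,
           depth INTEGER)"""
)
conn.executemany("INSERT INTO channel_entry VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# A head is an entry that no other entry in the same package/channel replaces.
HEADS_QUERY = """
SELECT e.package_name, e.channel_name, e.operatorbundle_name
FROM channel_entry AS e
WHERE NOT EXISTS (
    SELECT 1 FROM channel_entry AS r
    WHERE r.replaces = e.entry_id
      AND r.package_name = e.package_name
      AND r.channel_name = e.channel_name)
ORDER BY e.package_name
"""
heads = conn.execute(HEADS_QUERY).fetchall()

# Counting by channel name alone mixes packages; counting by (package, channel) does not.
heads_per_channel = Counter(channel for _, channel, _ in heads)
heads_per_package_channel = Counter((package, channel) for package, channel, _ in heads)

for row in heads:
    print(row)
print(dict(heads_per_channel))
print(dict(heads_per_package_channel))
```

Grouping by channel name alone reports several heads for quay-v3.3, while grouping by package and channel reports one head for each package.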
I tried to fix this issue in https://github.com/operator-framework/operator-lifecycle-manager/pull/1748. Any comments are welcome! Thanks!
Hi Jian, It is okay to have a channel with the same name in multiple packages, but the "head" specifically refers to the latest version in a single package's channel. At the moment, almost all of the "more than one head in channel" cases are caused by https://bugzilla.redhat.com/show_bug.cgi?id=1866861 -- the 4.6 index images are currently built from 4.5 opm, which is not providing complete information to the resolver in catalog-operator. The list of bundles passed to "sortChannel" should only ever contain bundles from a single package/channel combination. If there's a case where that doesn't happen, then there's a mistake in the caller, but I haven't observed that.
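To illustrate the point about "sortChannel" only seeing one package/channel combination, here is a minimal Python sketch of head detection. It is not OLM's Go implementation: the function name find_head_candidates and the (name, replaces) tuple layout are invented for illustration. The second input shows one hypothetical way incomplete index data (a missing "replaces" value) could lead to two head candidates.

```python
def find_head_candidates(bundles):
    """Return the names of bundles that no other bundle in the list replaces.

    bundles: list of (name, replaces) tuples for ONE package/channel
    combination; replaces is None when the bundle replaces nothing.
    """
    names = {name for name, _ in bundles}
    # Only count a replacement if the replaced bundle is part of this list.
    replaced = {replaces for _, replaces in bundles if replaces in names}
    return [name for name, _ in bundles if name not in replaced]


# Complete data for package "quay-operator", channel "quay-v3.3".
complete = [
    ("red-hat-quay.v3.3.1", "red-hat-quay.v3.3.0"),
    ("red-hat-quay.v3.3.0", None),
]

# Hypothetical incomplete data: the "replaces" link is missing.
incomplete = [
    ("red-hat-quay.v3.3.1", None),
    ("red-hat-quay.v3.3.0", None),
]

print(find_head_candidates(complete))
print(find_head_candidates(incomplete))
```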
Hi Ben,
Thanks for your information!
> The list of bundles passed to "sortChannel" should only ever contain bundles from a single package/channel combination. If there's a case where that doesn't happen, then there's a mistake in the caller, but I haven't observed that.
I'd suggest that we print the "headCandidates" in the error logs so that the end users can clearly know what is wrong.
> I'd suggest that we print the "headCandidates" in the error logs so that the end users can clearly know what is wrong.
Yes, I agree. That would be helpful.
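A sketch of what such a message could look like, written in Python, not OLM's Go. The function name multiple_heads_error and the exact wording are illustrative only and are not taken from the PR. The candidate list in the example is hypothetical.

```python
def multiple_heads_error(package, channel, head_candidates):
    """Build an error that names every head candidate (illustrative wording)."""
    names = ", ".join(sorted(head_candidates))
    return ValueError(
        f'found more than one head for channel "{channel}" '
        f'of package "{package}": {names}; '
        'check the "replaces"/"skipRange" fields of these bundles'
    )


# Hypothetical candidate list for the quay-operator package.
err = multiple_heads_error(
    "quay-operator",
    "quay-v3.3",
    ["red-hat-quay.v3.3.1", "red-hat-quay.v3.3.0"],
)
print(err)
```

Sorting the names keeps the message stable across runs, which makes it easier to search for in logs.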
*** This bug has been marked as a duplicate of bug 1869441 ***
[root@preserve-olm-env data]# oc cp redhat-operators-dmknm:/database/index.db quay.db
tar: Removing leading `/' from member names
[root@preserve-olm-env data]#
[root@preserve-olm-env data]# sqlite3 quay.db
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .header on
sqlite> .mode column
sqlite> .table
api                channel            operatorbundle     related_image
api_provider       channel_entry      package            schema_migrations
api_requirer       dependencies       properties
sqlite> .width 10 10 30 30 10 10
sqlite> select * from channel_entry;
entry_id    channel_na  package_name                    operatorbundle_name             replaces    depth
----------  ----------  ------------------------------  ------------------------------  ----------  ----------
8           stable      jaeger-product                  jaeger-operator.v1.17.6                     0
9           1.13-stabl  jaeger-product                  jaeger-operator.v1.13.2-1                   0
10          1.17-stabl  jaeger-product                  jaeger-operator.v1.17.6                     0
21          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.1     22          0
22          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.0                 1
23          quay-v3.3   quay-operator                   red-hat-quay.v3.3.1             24          0
24          quay-v3.3   quay-operator                   red-hat-quay.v3.3.0                         1
31          quay-v3.3   container-security-operator     container-security-operator.v3  32          0
32          quay-v3.3   container-security-operator     container-security-operator.v3              1

It works well now. I am reopening this bug only for the merge workflow of the PR (which addresses the error logs).

[root@preserve-olm-env data]# oc get sub -A
NAMESPACE   NAME            PACKAGE         SOURCE             CHANNEL
default     quay-operator   quay-operator   redhat-operators   quay-v3.3
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-fth4l   red-hat-quay.v3.3.1   Automatic   true
[root@preserve-olm-env data]# oc get csv -n default
NAME                  DISPLAY        VERSION   REPLACES              PHASE
red-hat-quay.v3.3.1   Red Hat Quay   3.3.1     red-hat-quay.v3.3.0   Succeeded
The Quay operator can be installed successfully, LGTM.

[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-66999f6d68-q8wg2 -- olm --version
OLM version: 0.16.1
git commit: e2c0f2c47573ec5dfc509502881fa3dd8eb7bae9
[root@preserve-olm-env data]# oc rsh redhat-operators-fbjf9
sh-4.2$ ls
db-954327980
sh-4.2$ sqlite3 db-954327980
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .header on
sqlite> .mode column
sqlite> .width 10 10 30 30 10 10
sqlite> select * from channel_entry;
entry_id    channel_na  package_name                    operatorbundle_name             replaces    depth
...
72          quay-v3.3   container-security-operator     container-security-operator.v3  73          0
73          quay-v3.3   container-security-operator     container-security-operator.v3              1
76          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.1     77          0
77          quay-v3.3   quay-bridge-operator            quay-bridge-operator.v3.3.0                 1

[root@preserve-olm-env data]# oc get sub -n default
NAME            PACKAGE         SOURCE             CHANNEL
quay-operator   quay-operator   redhat-operators   quay-v3.3
[root@preserve-olm-env data]# oc get csv -n default
NAME                           DISPLAY        VERSION          REPLACES   PHASE
red-hat-quay.v3.3.2-20200903   Red Hat Quay   3.3.2-20200903              Succeeded
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                            APPROVAL    APPROVED
install-mkqvn   red-hat-quay.v3.3.2-20200903   Automatic   true

@Ben For the channel head check, I couldn't find an appropriate operator bundle to test with. Do you know how to make one? Thanks very much!
Hi Jian,
The index bug that was triggering this error has been fixed, but you should be able to reproduce it by checking out operator-registry at commit fb73be25997ffec408400e01cdab9855007a37ac, building an image from upstream-builder.Dockerfile, then using the image you just built as the --binary-image option to "opm index add".
Hi Ben,
Thanks for your information! I tried it, but it works well; details follow. It seems I still cannot reproduce the channel head issue. Any suggestions? Thanks!

1) Go back to the problem commit.
[root@preserve-olm-env operator-registry]# git reset --hard fb73be25997ffec408400e01cdab9855007a37ac
HEAD is now at fb73be2 Merge pull request #355 from openshift-cherrypick-robot/cherry-pick-353-to-release-4.5
[root@preserve-olm-env operator-registry]#
[root@preserve-olm-env operator-registry]# git log
commit fb73be25997ffec408400e01cdab9855007a37ac
Merge: a146011 b811e50
Author: OpenShift Merge Robot <openshift-merge-robot.github.com>
Date: Wed Jun 17 11:37:08 2020 +0200
...

2) Compile it, and check the version.
[root@preserve-olm-env operator-registry]# ./bin/opm version
Version: version.Version{OpmVersion:"1.12.3", GitCommit:"fb73be2", BuildDate:"2020-10-21T08:01:46Z", GoOs:"linux", GoArch:"amd64"}

3) Build the builder image.
[root@preserve-olm-env operator-registry]# cat upstream-builder.Dockerfile
FROM golang:1.13-alpine
RUN apk update && apk add sqlite build-base git mercurial bash
WORKDIR /build
COPY vendor vendor
COPY cmd cmd
COPY pkg pkg
COPY Makefile Makefile
COPY go.mod go.mod
RUN make static
RUN GRPC_HEALTH_PROBE_VERSION=v0.2.1 && \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-$(go env GOARCH) && \
    chmod +x /bin/grpc_health_probe
RUN cp /build/bin/opm /bin/opm && \
    cp /build/bin/initializer /bin/initializer && \
    cp /build/bin/appregistry-server /bin/appregistry-server && \
    cp /build/bin/configmap-server /bin/configmap-server && \
    cp /build/bin/registry-server /bin/registry-server
[root@preserve-olm-env operator-registry]# docker build -f upstream-builder.Dockerfile -t quay.io/olmqe/builder:bug1875247 .
Sending build context to Docker daemon 789MB
Step 1/11 : FROM golang:1.13-alpine
...
[root@preserve-olm-env operator-registry]# docker push quay.io/olmqe/builder:bug1875247
The push refers to repository [quay.io/olmqe/builder]
...

4) Create a bundle image. For example, etcd 0.9.4.
[root@preserve-olm-env operator-registry]# ./bin/opm alpha bundle build -d /data/goproject/src/github.com/operator-framework/operator-registry/manifests/etcd/0.9.4 -p etcd -c alpha -e alpha -t quay.io/olmqe/etcd-bundle:0.9.4-1875247 -o
INFO[0000] Building annotations.yaml
...
[root@preserve-olm-env operator-registry]# docker push quay.io/olmqe/etcd-bundle:0.9.4-1875247
The push refers to repository [quay.io/olmqe/etcd-bundle]
...

5) Add this bundle image to the index image, using the builder image above.
[root@preserve-olm-env operator-registry]# ./bin/opm index add -i quay.io/olmqe/builder:bug1875247 -b quay.io/olmqe/etcd-bundle:0.9.4-1875247 -t quay.io/olmqe/etcd-index:0.9.4-1875247
INFO[0000] building the index bundles="[quay.io/olmqe/etcd-bundle:0.9.4-1875247]"
...
[root@preserve-olm-env operator-registry]# podman push quay.io/olmqe/etcd-index:0.9.4-1875247
...

6) Subscribe to etcd 0.9.4; it works well.
[root@preserve-olm-env data]# oc get sub -n default
NAME   PACKAGE   SOURCE      CHANNEL
etcd   etcd      etcd-test   alpha
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-fqcvg   etcdoperator.v0.9.4   Automatic   true
[root@preserve-olm-env data]# oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.4   etcd      0.9.4                Succeeded

7) Add the etcd 0.9.5 bundle image to this index image, using the same builder image.
[root@preserve-olm-env operator-registry]# ./bin/opm index add -i quay.io/olmqe/builder:bug1875247 -b quay.io/olmqe/etcd-bundle:0.9.5 -f quay.io/olmqe/etcd-index:0.9.4-1875247 -t quay.io/olmqe/etcd-index:0.9.4-1875247 -p podman
INFO[0000] building the index bundles="[quay.io/olmqe/etcd-bundle:0.9.5]"
...
[root@preserve-olm-env operator-registry]# podman push quay.io/olmqe/etcd-index:0.9.4-1875247
Getting image source signatures

8) Subscribe to etcd 0.9.5; it works well.
[root@preserve-olm-env data]# oc get sub -n default
NAME   PACKAGE   SOURCE      CHANNEL
etcd   etcd      etcd-test   4.7
[root@preserve-olm-env data]# oc get ip -n default
NAME            CSV                   APPROVAL    APPROVED
install-hml4d   etcdoperator.v0.9.5   Automatic   true
[root@preserve-olm-env data]# oc get csv -n default
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.5   etcd      0.9.5                Succeeded

Anyway, I am marking this as verified for now, since the operator can be subscribed to successfully.
Hi Jian, the bundles seem to be in different channels (alpha and 4.7), so they are both unique channel heads.
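This can be sketched in Python (an illustration only, not OLM code; the tuple layout and the function name heads_by_group are hypothetical). Heads are computed per (package, channel) group, so the two etcd bundles from the steps above each end up as the only head of their own channel. The second input assumes, hypothetically, that both bundles were published to the same channel with no "replaces" link, which is the situation that would yield two head candidates.

```python
from collections import defaultdict


def heads_by_group(entries):
    """Map each (package, channel) to the bundles nothing in that group replaces.

    entries: list of (package, channel, bundle_name, replaces) tuples.
    """
    groups = defaultdict(list)
    for package, channel, name, replaces in entries:
        groups[(package, channel)].append((name, replaces))

    heads = {}
    for key, bundles in groups.items():
        names = {name for name, _ in bundles}
        replaced = {replaces for _, replaces in bundles if replaces in names}
        heads[key] = [name for name, _ in bundles if name not in replaced]
    return heads


# As observed in the thread: one bundle in "alpha", one in "4.7".
observed = [
    ("etcd", "alpha", "etcdoperator.v0.9.4", None),
    ("etcd", "4.7", "etcdoperator.v0.9.5", None),
]

# Hypothetical: both bundles in the same channel, neither replacing the other.
same_channel = [
    ("etcd", "alpha", "etcdoperator.v0.9.4", None),
    ("etcd", "alpha", "etcdoperator.v0.9.5", None),
]

print(heads_by_group(observed))
print(heads_by_group(same_channel))
```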
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633
Just for information: Bug 2015022 was opened to request the backport of this fix to OCP 4.6 EUS.