Bug 1982294
Summary: | OLM fails with 'ResolutionFailed' found multiple channel heads | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Kevin Rizza <krizza> | |
Component: | OLM | Assignee: | Kevin Rizza <krizza> | |
OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | bluddy, fdeutsch, jiazha, kliberti, rbaumgar, rgopired, stirabos, swoodman | |
Version: | 4.7 | Keywords: | Reopened | |
Target Milestone: | --- | |||
Target Release: | 4.8.z | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | 1981832 | |||
: | 1989712 (view as bug list) | Environment: | ||
Last Closed: | 2021-08-27 15:51:53 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1981832 | |||
Bug Blocks: | 1989712 |
Description
Kevin Rizza
2021-07-14 16:06:07 UTC
1, Install an OCP cluster that contains the fixed PR.
[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-07-24-211147   True        False         29m     Cluster version is 4.8.0-0.nightly-2021-07-24-211147
[cloud-user@preserve-olm-env jian]$ oc -n openshift-operator-lifecycle-manager exec deploy/catalog-operator -- olm --version
OLM version: 0.17.0
git commit: 0e121cb2620547040ef455874d9c7a70e1533f2f
2, Create a CatalogSource that consumes an index image that contains AMQ Streams v1.6.2.
[cloud-user@preserve-olm-env jian]$ cat cs-amq.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: amq-operators
  namespace: openshift-marketplace
spec:
  displayName: Jian Operators
  image: quay.io/olmqe/rh-index:amq
  priority: -100
  publisher: Jian
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
[cloud-user@preserve-olm-env jian]$ oc create -f cs-amq.yaml
catalogsource.operators.coreos.com/amq-operators created
[cloud-user@preserve-olm-env jian]$ oc get packagemanifest|grep Jian|grep amq
amq-broker                   Jian Operators   3m50s
amq-broker-lts               Jian Operators   3m50s
amq-online                   Jian Operators   3m50s
amq-broker-rhel8             Jian Operators   3m50s
amq7-interconnect-operator   Jian Operators   3m50s
amq-streams                  Jian Operators   3m50s
3, Subscribe to the amqstreams.v1.6.2.
[cloud-user@preserve-olm-env jian]$ oc get sub -n default
NAME          PACKAGE       SOURCE          CHANNEL
amq-streams   amq-streams   amq-operators   amq-streams-1.6.x
[cloud-user@preserve-olm-env jian]$ oc get ip -n default
NAME            CSV                 APPROVAL    APPROVED
install-95sft   amqstreams.v1.6.2   Automatic   true
[cloud-user@preserve-olm-env jian]$ oc get csv -n default
NAME                             DISPLAY                             VERSION   REPLACES            PHASE
amqstreams.v1.6.2                Red Hat Integration - AMQ Streams   1.6.2     amqstreams.v1.6.1   Installing
elasticsearch-operator.5.1.1-1   OpenShift Elasticsearch Operator    5.1.1-1                       Succeeded
[cloud-user@preserve-olm-env jian]$ oc get csv -n default
NAME                             DISPLAY                             VERSION   REPLACES            PHASE
amqstreams.v1.6.2                Red Hat Integration - AMQ Streams   1.6.2     amqstreams.v1.6.1   Succeeded
elasticsearch-operator.5.1.1-1   OpenShift Elasticsearch Operator    5.1.1-1                       Succeeded
4, Upgrade it to AMQ Streams v1.7.2 by switching the channel to `stable`.
[cloud-user@preserve-olm-env jian]$ oc get sub -n default
NAME          PACKAGE       SOURCE          CHANNEL
amq-streams   amq-streams   amq-operators   stable
However, no v1.7.x install plan was generated, although one should have been.
[cloud-user@preserve-olm-env jian]$ oc get ip -n default
NAME            CSV                 APPROVAL    APPROVED
install-95sft   amqstreams.v1.6.2   Automatic   true
[cloud-user@preserve-olm-env jian]$ oc get ip -n default
NAME            CSV                 APPROVAL    APPROVED
install-95sft   amqstreams.v1.6.2   Automatic   true
[cloud-user@preserve-olm-env jian]$ oc get csv -n default
NAME                             DISPLAY                             VERSION   REPLACES            PHASE
amqstreams.v1.6.2                Red Hat Integration - AMQ Streams   1.6.2     amqstreams.v1.6.1   Succeeded
elasticsearch-operator.5.1.1-1   OpenShift Elasticsearch Operator    5.1.1-1                       Succeeded
No helpful info found in the Catalog-operator logs, but:
time="2021-07-26T07:27:21Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-07-26T07:27:22Z" level=info msg="error updating subscription status" channel=amq-streams-1.6.x error="Operation cannot be fulfilled on subscriptions.operators.coreos.com \"amq-streams\": the object has been modified; please apply your changes to the latest version and try again" id=H2Xb5 namespace=default pkg=amq-streams source=amq-operators sub=amq-streams
E0726 07:27:22.186569 1 queueinformer_operator.go:290] sync "default" failed: error updating Subscription status: Operation cannot be fulfilled on subscriptions.operators.coreos.com "amq-streams": the object has been modified; please apply your changes to the latest version and try again
...
[cloud-user@preserve-olm-env jian]$ oc get sub -n default amq-streams -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2021-07-26T07:27:01Z"
  generation: 2
  labels:
    operators.coreos.com/amq-streams.default: ""
  name: amq-streams
  namespace: default
  resourceVersion: "157444"
  uid: 53a123c9-c79b-4c56-969f-89125e325c17
spec:
  channel: stable
  installPlanApproval: Automatic
  name: amq-streams
  source: amq-operators
  sourceNamespace: openshift-marketplace
  startingCSV: amqstreams.v1.6.2
status:
  catalogHealth:
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: amq-operators
      namespace: openshift-marketplace
      resourceVersion: "153252"
      uid: 83b81708-e6bf-492b-8a41-8e3f58d299a1
    healthy: true
    lastUpdated: "2021-07-26T07:27:01Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: certified-operators
      namespace: openshift-marketplace
      resourceVersion: "152212"
      uid: 395735b0-ab35-4bb4-bf59-046e477914be
    healthy: true
    lastUpdated: "2021-07-26T07:27:01Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: community-operators
      namespace: openshift-marketplace
      resourceVersion: "154640"
      uid: 03f59699-8e30-4285-8b91-fc789ab8c0f5
    healthy: true
    lastUpdated: "2021-07-26T07:27:01Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: qe-app-registry
      namespace: openshift-marketplace
      resourceVersion: "150961"
      uid: 29a843b9-4fd9-454b-928b-471f33fc3037
    healthy: true
    lastUpdated: "2021-07-26T07:27:01Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: redhat-marketplace
      namespace: openshift-marketplace
      resourceVersion: "153317"
      uid: 4179525d-e74d-47c2-b892-db6a69161535
    healthy: true
    lastUpdated: "2021-07-26T07:27:01Z"
  - catalogSourceRef:
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      name: redhat-operators
      namespace: openshift-marketplace
      resourceVersion: "153366"
      uid: 1e6a7f69-85a8-4fb6-ab34-5011a6877e98
    healthy: true
    lastUpdated: "2021-07-26T07:27:01Z"
  conditions:
  - lastTransitionTime: "2021-07-26T07:27:01Z"
    message: all available catalogsources are healthy
    reason: AllCatalogSourcesHealthy
    status: "False"
    type: CatalogSourcesUnhealthy
  currentCSV: amqstreams.v1.6.2
  installPlanGeneration: 1
  installPlanRef:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-95sft
    namespace: default
    resourceVersion: "155243"
    uid: 68bc4c2f-2503-4884-982d-6a2151076479
  installedCSV: amqstreams.v1.6.2
  installplan:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-95sft
    uuid: 68bc4c2f-2503-4884-982d-6a2151076479
  lastUpdated: "2021-07-26T07:31:24Z"
  state: UpgradeAvailable
[cloud-user@preserve-olm-env jian]$ oc cp amq-operators-jq5cn:/database/index.db bug.db
tar: Removing leading `/' from member names
[cloud-user@preserve-olm-env jian]$
[cloud-user@preserve-olm-env jian]$ sqlite3 bug.db
SQLite version 3.26.0 2018-12-01 12:34:55
Enter ".help" for usage hints.
sqlite> .header on
sqlite> .mode column
sqlite> .table
api           channel        deprecated      properties
api_provider  channel_entry  operatorbundle  related_image
api_requirer  dependencies   package         schema_migrations
sqlite>
sqlite>
sqlite> select * from channel_entry;
...
6331 amq-streams-1.x amq-streams amqstreams.v1.7.2 6332 0
6332 amq-streams-1.x amq-streams amqstreams.v1.7.1 6333 1
6333 amq-streams-1.x amq-streams amqstreams.v1.7.0 6334 2
6334 amq-streams-1.x amq-streams amqstreams.v1.6.3 6335 3
6335 amq-streams-1.x amq-streams amqstreams.v1.6.2 4
6336 stable amq-streams amqstreams.v1.7.2 6337 0
6337 stable amq-streams amqstreams.v1.7.1 6338 1
6338 stable amq-streams amqstreams.v1.7.0 6339 2
6339 stable amq-streams amqstreams.v1.6.3 6340 3
6340 stable amq-streams amqstreams.v1.6.2 4
6341 amq-streams-1.7.x amq-streams amqstreams.v1.7.2 6342 0
6342 amq-streams-1.7.x amq-streams amqstreams.v1.7.1 6343 1
6343 amq-streams-1.7.x amq-streams amqstreams.v1.7.0 6344 2
6344 amq-streams-1.7.x amq-streams amqstreams.v1.6.3 6345 3
6345 amq-streams-1.7.x amq-streams amqstreams.v1.6.2 4
6346 amq-streams-1.6.x amq-streams amqstreams.v1.6.2 6347 0
6347 amq-streams-1.6.x amq-streams amqstreams.v1.6.1 6348 1
6348 amq-streams-1.6.x amq-streams amqstreams.v1.6.0 6349 2
6349 amq-streams-1.6.x amq-streams amqstreams.v1.5.4 3

Moving this back to MODIFIED.
Jian, the test you're doing is not valid. There are additional properties at play beyond what's described in the channel_entry table, and the version you have installed 1.6.2 is marked as deprecated and there's no valid upgrade edge that the on cluster OLM resolver will actually act on. There's no error message in this case because that's just expected behavior from the resolver and OLM.
See this thread for more details: https://coreos.slack.com/archives/GHMALGJV6/p1627488096388800

Hi Kevin,
> There are additional properties at play beyond what's described in the channel_entry table, and the version you have installed 1.6.2 is marked as deprecated and there's no valid upgrade edge that the on cluster OLM resolver will actually act on.
In the "deprecated" table I only find amqstreams.v1.6.3 marked as deprecated, not v1.6.2.
sqlite> select * from deprecated;
operatorbundle_name
-------------------
amqstreams.v1.6.3
ocs-operator.v4.5.2
ocs-operator.v4.3.0
ocs-operator.v4.4.2
kubevirt-hyperconve
amqstreams.v1.5.4
Also, I subscribed to "amqstreams.v1.6.2" from the "amq-streams-1.6.x" channel, and it works well. I then switched the subscription to the `stable` channel; based on the channel_entry table below, it should be upgraded to the channel head, amqstreams.v1.7.2. Or am I missing something?
sqlite> select * from channel_entry where package_name="amq-streams";
entry_id channel_name package_name operatorbundle_name replaces depth
---------- -------------------- ------------ ------------------- ---------- ----------
6331 amq-streams-1.x amq-streams amqstreams.v1.7.2 6332 0
6332 amq-streams-1.x amq-streams amqstreams.v1.7.1 6333 1
6333 amq-streams-1.x amq-streams amqstreams.v1.7.0 6334 2
6334 amq-streams-1.x amq-streams amqstreams.v1.6.3 6335 3
6335 amq-streams-1.x amq-streams amqstreams.v1.6.2 4
6336 stable amq-streams amqstreams.v1.7.2 6337 0
6337 stable amq-streams amqstreams.v1.7.1 6338 1
6338 stable amq-streams amqstreams.v1.7.0 6339 2
6339 stable amq-streams amqstreams.v1.6.3 6340 3
6340 stable amq-streams amqstreams.v1.6.2 4
6341 amq-streams-1.7.x amq-streams amqstreams.v1.7.2 6342 0
6342 amq-streams-1.7.x amq-streams amqstreams.v1.7.1 6343 1
6343 amq-streams-1.7.x amq-streams amqstreams.v1.7.0 6344 2
6344 amq-streams-1.7.x amq-streams amqstreams.v1.6.3 6345 3
6345 amq-streams-1.7.x amq-streams amqstreams.v1.6.2 4
6346 amq-streams-1.6.x amq-streams amqstreams.v1.6.2 6347 0
6347 amq-streams-1.6.x amq-streams amqstreams.v1.6.1 6348 1
6348 amq-streams-1.6.x amq-streams amqstreams.v1.6.0 6349 2
6349 amq-streams-1.6.x amq-streams amqstreams.v1.5.4 3
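When reading the output above, it can help to resolve each `replaces` entry ID to a bundle name and cross-check it against the `deprecated` table. The following is a minimal sketch in Python, assuming only the column names shown in the sqlite output; the column types, the in-memory database, and the helper name `stable_edges` are assumptions made for illustration. It inspects the index tables only and does not reproduce what the OLM resolver does.

```python
import sqlite3

# Rows of the `stable` channel copied from the channel_entry output above:
# (entry_id, channel_name, package_name, operatorbundle_name, replaces, depth)
STABLE_ENTRIES = [
    (6336, "stable", "amq-streams", "amqstreams.v1.7.2", 6337, 0),
    (6337, "stable", "amq-streams", "amqstreams.v1.7.1", 6338, 1),
    (6338, "stable", "amq-streams", "amqstreams.v1.7.0", 6339, 2),
    (6339, "stable", "amq-streams", "amqstreams.v1.6.3", 6340, 3),
    (6340, "stable", "amq-streams", "amqstreams.v1.6.2", None, 4),
]
# amq-streams rows copied from the `deprecated` table output above.
DEPRECATED = [("amqstreams.v1.6.3",), ("amqstreams.v1.5.4",)]


def stable_edges():
    """Return (bundle, replaced_bundle, is_deprecated) for the stable channel."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the thread shows only the column names.
    conn.execute(
        "CREATE TABLE channel_entry (entry_id INTEGER, channel_name TEXT, "
        "package_name TEXT, operatorbundle_name TEXT, replaces INTEGER, "
        "depth INTEGER)"
    )
    conn.execute("CREATE TABLE deprecated (operatorbundle_name TEXT)")
    conn.executemany(
        "INSERT INTO channel_entry VALUES (?, ?, ?, ?, ?, ?)", STABLE_ENTRIES
    )
    conn.executemany("INSERT INTO deprecated VALUES (?)", DEPRECATED)
    rows = conn.execute(
        """
        SELECT e.operatorbundle_name,
               r.operatorbundle_name,
               d.operatorbundle_name IS NOT NULL
        FROM channel_entry e
        LEFT JOIN channel_entry r ON r.entry_id = e.replaces
        LEFT JOIN deprecated d ON d.operatorbundle_name = e.operatorbundle_name
        WHERE e.channel_name = 'stable' AND e.package_name = 'amq-streams'
        ORDER BY e.depth
        """
    ).fetchall()
    conn.close()
    return [(bundle, replaced, bool(dep)) for bundle, replaced, dep in rows]


if __name__ == "__main__":
    for bundle, replaced, dep in stable_edges():
        print(bundle, "replaces", replaced, "(deprecated)" if dep else "")
```

Run against a copy of the real index.db, the same join can be issued directly in the sqlite3 shell; the in-memory tables are used here only to keep the sketch self-contained.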
Hi Kevin, Ben
> fwiw the channel_entry table is not the source of truth for valid upgrade edges anymore in the olm resolver, what matters is the values of the skips skiprange and replaces fields on the bundle table, which OLM's resolver uses to recreate the graph at resolution time
Thanks for your clarification! This "replaces" field of the "channel_name" is really confusing for users. If it doesn't indicate the upgrade path, can we fill in a new field for the upgrade path? It would be really helpful when checking upgrade issues. Thanks!
> these days i would have expected a clear resolutionfailed event text for this, but maybe i'm thinking of 4.9+ stuff
here is the catalog contents for the channel of the updated subscription:
{"packageName":"amq-streams","channelName":"stable","csvName":"amqstreams.v1.6.2","replaces":null,"deprecated":false}
{"packageName":"amq-streams","channelName":"stable","csvName":"amqstreams.v1.6.3","replaces":"amqstreams.v1.6.2","deprecated":true}
{"packageName":"amq-streams","channelName":"stable","csvName":"amqstreams.v1.7.0","replaces":"amqstreams.v1.6.3","deprecated":false}
{"packageName":"amq-streams","channelName":"stable","csvName":"amqstreams.v1.7.1","replaces":"amqstreams.v1.7.0","deprecated":false}
{"packageName":"amq-streams","channelName":"stable","csvName":"amqstreams.v1.7.2","replaces":"amqstreams.v1.7.1","deprecated":false}
Since the only upgrade path, "amqstreams.v1.6.3", is deprecated, "amqstreams.v1.6.2" would not be upgraded to the channel head, "amqstreams.v1.7.2". That makes sense, but can we log a warning for this scenario? Thanks!

Verify it first since no `multiple channel heads` error was found.

In this case, because it is an upgrade, there is no failure to report: not upgrading the current version is a successful result. That said, you should see a clear error when a resolution attempt fails due to deprecation.
The easiest way to reproduce this would be to attempt directly installing the deprecated version via startingCSV in a namespace with nothing already installed. I just ran that scenario and see this:
resolution failed: constraints not satisfiable: subscription test-subscription requires test-catalogsource/test-namespace/stable/amqstreams.v1.6.3, subscription test-subscription exists, bundle test-catalogsource/test-namespace/stable/amqstreams.v1.6.3 is deprecated

Any updates on this ticket? I notice the ticket is in the "Verified" state; has the issue been resolved? If so, what is the ETA for this reaching OCP 4.7?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.8.4 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2983

Moving back to assigned because I'm still getting this issue on OCP 4.8.5, please see:
$ oc logs -n openshift-operator-lifecycle-manager $(oc get pods -n openshift-operator-lifecycle-manager -l=app=catalog-operator -o name) | grep "found multiple channel heads" | wc -l
354
$ oc version
Client Version: 4.7.8
Server Version: 4.8.5
Kubernetes Version: v1.21.1+9807387
I see it on:
multiple channel heads: [kubevirt-hyperconverged-operator.v2.5.4 kubevirt-hyperconverged-operator.v4.8.1 kubevirt-hyperconverged-operator.v2.6.1], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [kubevirt-hyperconverged-operator.v2.6.0 kubevirt-hyperconverged-operator.v2.6.5], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [kubevirt-hyperconverged-operator.v4.8.0 kubevirt-hyperconverged-operator.v2.6.1 kubevirt-hyperconverged-operator.v2.5.2], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [cert-utils-operator.v0.1.0 cert-utils-operator.v1.0.0 cert-utils-operator.v0.2.1 cert-utils-operator.v0.0.1 cert-utils-operator.v1.1.0 cert-utils-operator.v1.0.2 cert-utils-operator.v1.0.4], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [cert-utils-operator.v1.0.3 cert-utils-operator.v0.0.1], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [crwoperator.v2.2.0 crwoperator.v2.6.0 crwoperator.v2.4.0 crwoperator.v2.10.1 crwoperator.v2.8.0], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [crwoperator.v2.3.0 crwoperator.v2.7.0], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [crwoperator.v2.4.0 crwoperator.v2.10.0 crwoperator.v2.1.1], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [crwoperator.v2.6.0 crwoperator.v2.9.0 crwoperator.v2.1.1], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [openshift-gitops-operator.v1.2.0 openshift-gitops-operator.v1.1.1], please check the `replaces`/`skipRange` fields of the operator bundles, found multiple channel heads: [web-terminal.v1.1.0 web-terminal.v1.3.0], please check the `replaces`/`skipRange` fields of the operator bundles]
multiple channel heads: [redhat-openshift-pipelines-operator.v1.2.3 redhat-openshift-pipelines.v1.4.0], please check the `replaces`/`skipRange` fields of the operator bundles
multiple channel heads: [web-terminal.v1.2.1 web-terminal.v1.1.0], please check the `replaces`/`skipRange` fields of the operator bundles
On a cluster with:
cert-utils-operator.v1.1.0
costmanagement-metrics-operator.1.0.0
crwoperator.v2.10.1
devworkspace-operator.v0.8.0
jaeger-operator.v1.24.0
kubevirt-hyperconverged-operator.v4.8.1
openshift-gitops-operator.v1.2.0
red-hat-camel-k-operator.v1.3.3
redhat-openshift-pipelines.v1.5.0
web-terminal.v1.3.0
I think that in order to correctly reproduce this you have to install one of the aforementioned operators (kubevirt-hyperconverged-operator, cert-utils-operator, crwoperator, openshift-gitops-operator, redhat-openshift-pipelines-operator, web-terminal).

Hi Simone, I've created a separate BZ for your report in https://bugzilla.redhat.com/show_bug.cgi?id=1998571. None of the packages mentioned in the errors you observe would have been affected by this bug, so there must be another problem.
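
Two ideas recur in this thread: a channel should end in a single head, and a deprecated bundle is not offered as an upgrade. The sketch below illustrates both in Python over a plain mapping of bundle name to the bundle it replaces. It is a deliberate simplification and not OLM's implementation: it ignores `skips` and `skipRange`, the helper names `find_channel_heads` and `next_candidates` are hypothetical, and the `example-operator` bundles are invented to show a broken chain. The amq-streams data is copied from the catalog contents quoted earlier in the thread.

```python
# Simplified illustration only: real OLM resolution also considers `skips`
# and `skipRange`, which this sketch ignores.

# `stable` channel of amq-streams, from the catalog contents quoted in this
# thread: bundle name -> name of the bundle it replaces (None for the tail).
STABLE_REPLACES = {
    "amqstreams.v1.6.2": None,
    "amqstreams.v1.6.3": "amqstreams.v1.6.2",
    "amqstreams.v1.7.0": "amqstreams.v1.6.3",
    "amqstreams.v1.7.1": "amqstreams.v1.7.0",
    "amqstreams.v1.7.2": "amqstreams.v1.7.1",
}
STABLE_DEPRECATED = {"amqstreams.v1.6.3"}

# Hypothetical channel with a broken `replaces` chain (names are invented):
# v2.0.0 replaces nothing, so nothing links it to the v1.x bundles.
BROKEN_REPLACES = {
    "example-operator.v1.0.0": None,
    "example-operator.v1.1.0": "example-operator.v1.0.0",
    "example-operator.v2.0.0": None,
}


def find_channel_heads(replaces):
    """Return the bundles that no other bundle in the channel replaces."""
    replaced = {old for old in replaces.values() if old is not None}
    return sorted(name for name in replaces if name not in replaced)


def next_candidates(installed, replaces, deprecated):
    """Return non-deprecated bundles that directly replace `installed`."""
    return sorted(
        name
        for name, old in replaces.items()
        if old == installed and name not in deprecated
    )


if __name__ == "__main__":
    print(find_channel_heads(STABLE_REPLACES))
    print(find_channel_heads(BROKEN_REPLACES))
    print(next_candidates("amqstreams.v1.6.2", STABLE_REPLACES, STABLE_DEPRECATED))
```

In this simplified model, more than one entry returned by `find_channel_heads` corresponds to the situation the `multiple channel heads` message asks you to investigate, and an empty result from `next_candidates` corresponds to the subscription staying on amqstreams.v1.6.2.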