Created attachment 1801144 [details]
Warning logs

Description of problem:
When installing the AMQ Streams operator via OperatorHub, the install hangs and never completes. The following error is given:

```
I0710 01:13:39.895527 1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"", Name:"openshift-operators", UID:"e0ddaf02-3c94-4c2a-b274-4664d9a75ed1", APIVersion:"v1", ResourceVersion:"1941", FieldPath:""}): type: 'Warning' reason: 'ResolutionFailed' found multiple channel heads: [amqstreams.v1.7.2 amqstreams.v1.6.2], please check the `replaces`/`skipRange` fields of the operator bundles
```

How reproducible:
100%

Steps to Reproduce:
1. Install AMQ Streams v1.6.2 via OperatorHub using the `amq-streams-1.6.x` channel.
2. Upgrade to AMQ Streams v1.7.2 by switching the channel to `stable`, `amq-streams-1.7.x`, or `amq-streams-1.x`.

Actual results:
The AMQ Streams installation hangs and never completes.

Expected results:
The AMQ Streams installation completes, installing AMQ Streams v1.7.2.

Additional info:
Looks similar or potentially related to the following tickets [1] [2].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1969902#c7
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1942522
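When triaging, it can help to pull the conflicting head names out of the catalog-operator log instead of reading the line by eye. The following is a minimal Python sketch, assuming the message keeps the format of the single log line quoted in this report. The function name `find_channel_heads` and the regular expression are illustrative and are not part of OLM.

```python
import re

# Matches the bracketed list after "found multiple channel heads:".
# The pattern is an assumption based on the one log line quoted in this bug.
HEADS_RE = re.compile(r"found multiple channel heads: \[([^\]]*)\]")


def find_channel_heads(log_text):
    """Return one list of head CSV names per matching occurrence in the log."""
    return [match.group(1).split() for match in HEADS_RE.finditer(log_text)]


# Tail of the event message from the description, used as sample input.
sample = (
    "type: 'Warning' reason: 'ResolutionFailed' found multiple channel heads: "
    "[amqstreams.v1.7.2 amqstreams.v1.6.2], please check the "
    "`replaces`/`skipRange` fields of the operator bundles"
)

print(find_channel_heads(sample))
```

An empty result for a given log means no line in the assumed format was found; it does not by itself prove that resolution succeeded.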
*** This bug has been marked as a duplicate of bug 1969902 ***
After reviewing this, I now believe that this issue and the bz I marked it as a duplicate of are unrelated. This issue appears to be caused by the deprecated property, which OLM was not handling correctly on the cluster. There is already an upstream fix for that issue, https://github.com/operator-framework/operator-lifecycle-manager/pull/2154, which still needs to be pulled downstream. Reopening this bz to track that change making its way downstream, and marking the target release as 4.9.0.
This fix will still be backported to OCP 4.7, correct?
Yes, it just needs to make its way back via the ocp backporting process. This has also made its way into master, marking this as modified.
Hi Kevin, Is the PR in downstream for this bug https://github.com/openshift/operator-framework-olm/pull/116? Thanks
Changing the status to ASSIGNED to confirm the PR, because I cannot find a downstream PR that fixes this issue.
*** Bug 1983010 has been marked as a duplicate of this bug. ***
The number of our customers, users, and engineers blocked by this issue is increasing every day, so I am raising the priority to urgent.
1, Install an OCP cluster that contains the fixed PR.

[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-07-20-125820   True        False         31m     Cluster version is 4.9.0-0.nightly-2021-07-20-125820

[cloud-user@preserve-olm-env jian]$ oc -n openshift-operator-lifecycle-manager exec deploy/catalog-operator -- olm --version
OLM version: 0.18.3
git commit: 1dc76f08ed05a635458420ffa979aebbe59a3890

2, Subscribe to AMQ Stream v1.6.2

[cloud-user@preserve-olm-env jian]$ oc get sub -A
NAMESPACE   NAME          PACKAGE       SOURCE            CHANNEL
default     amq-streams   amq-streams   qe-app-registry   amq-streams-1.6.x

[cloud-user@preserve-olm-env jian]$ oc get ip -n default
NAME            CSV                 APPROVAL    APPROVED
install-v2k67   amqstreams.v1.6.3   Automatic   true

[cloud-user@preserve-olm-env jian]$ oc get sub amq-streams -n default -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2021-07-21T02:54:44Z"
  generation: 1
  labels:
    operators.coreos.com/amq-streams.default: ""
  name: amq-streams
  namespace: default
  resourceVersion: "48479"
  uid: f8c2a486-6802-47e3-9179-af2d45895db2
spec:
  channel: amq-streams-1.6.x
  installPlanApproval: Automatic
  name: amq-streams
  source: qe-app-registry
  sourceNamespace: openshift-marketplace
  startingCSV: amqstreams.v1.6.3

3, Upgrade to AMQ Streams v1.7.2 by switching the channel to `stable`.

[cloud-user@preserve-olm-env jian]$ oc get ip -n default
NAME            CSV                 APPROVAL    APPROVED
install-hfsp2   amqstreams.v1.7.0   Automatic   true
install-v2k67   amqstreams.v1.6.3   Automatic   true

[cloud-user@preserve-olm-env jian]$ oc get csv -n default
NAME                DISPLAY                             VERSION   REPLACES            PHASE
amqstreams.v1.6.3   Red Hat Integration - AMQ Streams   1.6.3     amqstreams.v1.6.2   Pending
amqstreams.v1.7.0   Red Hat Integration - AMQ Streams   1.7.0     amqstreams.v1.6.3   Pending

[cloud-user@preserve-olm-env jian]$ oc logs catalog-operator-7db49d957f-8tv4p | grep "multiple channel heads"

No "multiple channel heads" message is logged now. Looks good, marking this as verified.

Note: the CSVs are Pending because this AMQ Streams version still uses the v1beta1 CRD API, which is not supported in OCP 4.9 (K8s 1.22). That is unrelated to this bug.

613 E0721 02:55:13.408530 1 queueinformer_operator.go:290] sync {"update" "default/install-v2k67"} failed: the server could not find the requested resource
Is it possible that this issue is blocking the installation of other operators? We just saw the same AMQ error as described above. When we tried to install Kiali, the Kiali operator stayed in status == Unknown. Clicking into the operator shows only "Unknown failure"; no event is emitted and no pod is created.

In the catalog-operator pods we see the AMQ error plus:

"an error was encountered during reconciliation" error="Operation cannot be fulfilled on subscriptions.operators.coreos.com \"kiali-ossm\": the object has been modified; please apply your changes to the latest version and try again[...]

I reproduced this in my lab environment, version = 4.7.19:
1. install AMQ 1.6
2. try to upgrade to AMQ 1.7.2 --> the upgrade never happens because of the error above
3. install Kiali --> Kiali is never installed
4. uninstall AMQ and Kiali
5. install Kiali again --> Kiali is installed immediately
I think Thomas's comment is spot-on: this is preventing other operator installations. I tested installing the AMQ Streams operator and then the ACS operator, and the ACS operator never installed. After removing the AMQ Streams operator, everything worked fine.
Hi Thomas and Raffael,

Could you provide more details? If AMQ fails to install, then no other operator can be installed in the same namespace; that is expected.

> "an error was encountered during reconciliation" error="Operation cannot be fulfilled on subscriptions.operators.coreos.com \"kiali-ossm\": the object has been modified; please apply your changes to the latest version and try again[...]

@Kevin, @Ben I know this warning comes from the K8s mechanism, but can we mute it? It is really confusing for users. Thanks!
Hi,

Yes, that's true: it affects only the namespace where AMQ Streams is installed. But if AMQ is installed under "openshift-operators", it blocks any other deployments there, such as those required for Service Mesh. For example, IHAC who has been using AMQ for years and would now like to add Service Mesh, which is not possible.

br
thomas
> Is the PR in downstream for this bug https://github.com/openshift/operator-framework-olm/pull/116?
> Change to Assign to confirm the PR because I do not find the PR information in downstream to fix the issue.

Can someone get the information Kui (kuiwang) requested so we can move this fix forward? Kevin (krizza), do you know the answer to Kui's question, or know someone who does? We are eager to move this issue forward: more and more users are hitting it every day, and it is starting to block the testing and release of our next product release.
Is there any documentation about this issue available for customers?
This issue has been addressed and backported to OCP 4.7.24. It's worth noting that this issue was not the cause of the AMQ Streams upgrade failure; it was a symptom of a broken upgrade graph in the production OCP 4.7 index. The AMQ Streams upgrade graph has been fixed in the production OCP 4.7 index, and the upgrade now works properly. This ticket can be closed.
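To illustrate what a broken upgrade graph looks like in terms of the "multiple channel heads" message: in a simplified model, a channel head is a bundle that no other bundle in the same channel replaces or skips, and a well-formed channel has exactly one. The Python sketch below is only that simplified model, not OLM's resolver. It ignores `skipRange` and the deprecated property, and the function name `channel_heads` is illustrative. The `replaces` links in `connected` are taken from the `oc get csv` output earlier in this thread; the `broken` graph is hypothetical, since the actual broken index contents are not shown in this report.

```python
def channel_heads(bundles):
    """Return the bundles that nothing else in the channel replaces or skips.

    bundles maps a CSV name to a dict with an optional "replaces" name
    and an optional "skips" list of names.
    """
    superseded = set()
    for entry in bundles.values():
        if entry.get("replaces"):
            superseded.add(entry["replaces"])
        superseded.update(entry.get("skips", []))
    return sorted(name for name in bundles if name not in superseded)


# Links as reported by `oc get csv` in the verification comment.
connected = {
    "amqstreams.v1.6.2": {},
    "amqstreams.v1.6.3": {"replaces": "amqstreams.v1.6.2"},
    "amqstreams.v1.7.0": {"replaces": "amqstreams.v1.6.3"},
}

# Hypothetical broken channel: the 1.7 bundle no longer points at the 1.6 line,
# so both bundles end up as heads.
broken = {
    "amqstreams.v1.6.2": {},
    "amqstreams.v1.7.2": {"replaces": "amqstreams.v1.7.1"},
}

print(channel_heads(connected))
print(channel_heads(broken))
```

With the connected graph the function returns a single head, while the hypothetical broken graph yields two, which is the condition the resolver reported.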
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759
*** Bug 1986248 has been marked as a duplicate of this bug. ***
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days