2030489 – OLM fails to upgrade operators immediately

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	OLM
Sub Component:
Version:	4.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.8.z
Assignee:	Vu Dinh
QA Contact:	xzha
Docs Contact:
URL:
Whiteboard:
Depends On:	2024048
Blocks:

Reported:	2021-12-08 22:55 UTC by Vu Dinh
Modified:	2022-08-24 21:19 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-01-25 12:13:09 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift operator-framework-olm pull 222	0	None	open	Bug 2030489: Remove oudated subscription update logic to improve resolution delay	2021-12-08 22:56:41 UTC
Red Hat Product Errata	RHBA-2022:0172	0	None	None	None	2022-01-25 12:13:25 UTC

Description Vu Dinh 2021-12-08 22:55:58 UTC

This bug was initially created as a copy of Bug #2024048

I am copying this bug because: 



This bug was initially created as a copy of Bug #2002276

I am copying this bug because: 
Backporting

Description of problem:
Upgrading the descheduler operator from 4.8 to 4.9 fails. When the channel and startingCSV are set, no upgrade starts. Jian Zhang looked further, and below is what he found.

1, Remove only the sub (not the csv) and resubscribe to the 4.9 one; this produces the errors below:
  - message: 'constraints not satisfiable: @existing/openshift-kube-descheduler-operator//clusterkubedescheduleroperator.4.8.0-202108312109
      and qe-app-registry/openshift-marketplace/4.9/clusterkubedescheduleroperator.4.9.0-202109071344
      originate from package cluster-kube-descheduler-operator, subscription cluster-kube-descheduler-operator
      requires qe-app-registry/openshift-marketplace/4.9/clusterkubedescheduleroperator.4.9.0-202109071344,
      subscription cluster-kube-descheduler-operator exists, clusterserviceversion
      clusterkubedescheduleroperator.4.8.0-202108312109 exists and is not referenced
      by a subscription'
    reason: ConstraintsNotSatisfiable
    status: "True"
    type: ResolutionFailed
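
The condition above can be read directly from the Subscription status instead of dumping the whole object. This is a minimal sketch, assuming a logged-in `oc` session; the helper name `resolution_failure` is hypothetical and not part of the original report.

```shell
# Hypothetical helper (not from the original report): print the message of the
# ResolutionFailed condition on a Subscription. Prints nothing if the
# condition is not present.
# $1 = Subscription name, $2 = namespace
resolution_failure() {
  oc get subscription.operators.coreos.com "$1" -n "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="ResolutionFailed")].message}'
}

# Usage against the Subscription from this report:
#   resolution_failure cluster-kube-descheduler-operator openshift-kube-descheduler-operator
```

The fully qualified resource name is used because the short name `sub` can be ambiguous on clusters that install other Subscription CRDs.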



2, Remove both the sub and the csv, then recreate the sub; this works well.
[cloud-user@preserve-olm-env jian]$ cat sub-descheduler.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-kube-descheduler-operator
  namespace: openshift-kube-descheduler-operator
spec:
  channel: "4.9"
  installPlanApproval: Automatic
  name: cluster-kube-descheduler-operator
  source: qe-app-registry
  sourceNamespace: openshift-marketplace
  startingCSV: clusterkubedescheduleroperator.4.9.0-202109071344

[cloud-user@preserve-olm-env jian]$ oc get sub
NAME                                PACKAGE                             SOURCE            CHANNEL
cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.9

[cloud-user@preserve-olm-env jian]$ oc get ip
NAME            CSV                                                 APPROVAL    APPROVED
install-4w9lm   clusterkubedescheduleroperator.4.9.0-202109071344   Automatic   true

[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                                                DISPLAY                            VERSION              REPLACES                          PHASE
clusterkubedescheduleroperator.4.9.0-202109071344   Kube Descheduler Operator          4.9.0-202109071344                                     Succeeded
elasticsearch-operator.5.2.0-60                     OpenShift Elasticsearch Operator   5.2.0-60             elasticsearch-operator.5.1.1-56   Succeeded
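
The workaround in step 2 can be sketched as a small script. This is an illustration only, assuming a logged-in `oc` session and a manifest such as the sub-descheduler.yaml shown above; the helper name `resubscribe_clean` is hypothetical. Deleting the CSV uninstalls the running operator, so this is not an in-place upgrade.

```shell
# Hypothetical helper (not from the original report): delete the Subscription
# and the old CSV, then recreate the Subscription from a manifest.
# Each step runs only if the previous one succeeded.
# $1 = Subscription name, $2 = CSV name, $3 = namespace, $4 = manifest file
resubscribe_clean() {
  oc delete subscription.operators.coreos.com "$1" -n "$3" &&
  oc delete clusterserviceversion.operators.coreos.com "$2" -n "$3" &&
  oc apply -f "$4"
}

# Usage with the names from this report:
#   resubscribe_clean cluster-kube-descheduler-operator \
#     clusterkubedescheduleroperator.4.8.0-202108312109 \
#     openshift-kube-descheduler-operator sub-descheduler.yaml
```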

3, check the bundle content:
clusterkubedescheduleroperator.4.8.0-202108312109   {"apiVersion":"opera  {"apiVersion":"operators.coreos.com/v1alpha1","kin  registry-proxy.engineering.redhat.com/rh-osbs/openshift-ose-  >=4.6.0 <4.8.0-202108312109     4.8.0-202108312109

clusterkubedescheduleroperator.4.9.0-202109071344   {"apiVersion":"opera  {"apiVersion":"operators.coreos.com/v1alpha1","kin  registry-proxy.engineering.redhat.com/rh-osbs/openshift-ose-  >=4.6.0 <4.9.0-202109071344     4.9.0-202109071344

Version-Release number of selected component (if applicable):

[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-06-004132   True        False         23h     Cluster version is 4.9.0-0.nightly-2021-09-06-004132

clusterkubedescheduleroperator.4.8.0-202108312109

How reproducible:
1, Install a 4.8 cluster.
2, Install the 4.8 descheduler operator.
3, Upgrade the cluster to 4.9.
4, Edit the descheduler sub: set the channel to 4.9 and startingCSV to clusterkubedescheduleroperator.4.9.0-202109071344:
spec:
  channel: "4.9"
  installPlanApproval: Automatic
  name: cluster-kube-descheduler-operator
  source: qe-app-registry
  sourceNamespace: openshift-marketplace
  startingCSV: clusterkubedescheduleroperator.4.9.0-202109071344
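
The same edit can be made non-interactively with a merge patch rather than `oc edit`. This is a minimal sketch, assuming a logged-in `oc` session; the helper names `subscription_patch` and `switch_channel` are hypothetical.

```shell
# Hypothetical helper (not from the original report): build the JSON merge
# patch that sets spec.channel and spec.startingCSV.
# $1 = channel, $2 = starting CSV name
subscription_patch() {
  printf '{"spec":{"channel":"%s","startingCSV":"%s"}}' "$1" "$2"
}

# Hypothetical helper: apply that patch to a Subscription.
# $1 = Subscription name, $2 = namespace, $3 = channel, $4 = starting CSV name
switch_channel() {
  oc patch subscription.operators.coreos.com "$1" -n "$2" \
    --type merge -p "$(subscription_patch "$3" "$4")"
}

# Usage with the values from this report:
#   switch_channel cluster-kube-descheduler-operator \
#     openshift-kube-descheduler-operator 4.9 \
#     clusterkubedescheduleroperator.4.9.0-202109071344
```

A merge patch touches only the two listed fields, so installPlanApproval, source, and sourceNamespace stay as they were.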


Actual Results:
 Upgrade does not start at all

Expected Results:
 Upgrade should work fine.

Comment 1 Scott Dodson 2022-01-07 20:50:44 UTC

Since the upstream bug here was suspected to have triggered problems with OCS/ODF upgrades [1], can we make sure to discuss this with the OCS/ODF folks before we merge the changes in the linked PR?

https://bugzilla.redhat.com/show_bug.cgi?id=2034098#c19
https://bugzilla.redhat.com/show_bug.cgi?id=2035484#c3

Comment 2 Vu Dinh 2022-01-07 21:58:23 UTC

Hi Scott,

I did have a meeting with OCS to discuss their upgrade process. This fix isn't the root cause, as it doesn't change anything to do with OLM dependency resolution. The issue was on the OCS side, in how they install their dependent operator, and it was specific to 4.9. They did open a BZ, which I closed after the meeting: https://bugzilla.redhat.com/show_bug.cgi?id=2035484

Vu

Comment 3 Scott Dodson 2022-01-08 19:51:22 UTC

Vu,

That's fine. I gathered that we didn't believe the fix with the pending backport was the root cause, but I wanted to make sure they were informed that it was being backported to 4.8 and that both teams agreed that was OK to do.

Comment 6 xzha 2022-01-14 06:35:56 UTC

verify:

[root@preserve-olm-agent-test ~]# oc48 version
Client Version: 4.8.0-0.nightly-2022-01-14-012354
Server Version: 4.8.0-0.nightly-2022-01-14-012354
Kubernetes Version: v1.21.6+bb8d50a

[root@preserve-olm-agent-test ~]# oc48 adm release info registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2022-01-14-012354 --commits|grep operator-lifecycle-manager
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         b3aabf273e0ac0bd6e84d257332e2eac08f5e6c

1, create project
[root@preserve-olm-agent-test ~]# oc48 adm new-project openshift-kube-descheduler-operator
Created project openshift-kube-descheduler-operator
[root@preserve-olm-agent-test ~]# oc48 project openshift-kube-descheduler-operator
Now using project "openshift-kube-descheduler-operator" on server "https://api.xzha-4.8.qe.devcluster.openshift.com:6443".

2, install sub
[root@preserve-olm-agent-test 2030489]# cat sub.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-kube-descheduler-operator
  namespace: openshift-kube-descheduler-operator
spec:
  channel: "4.7"
  installPlanApproval: Automatic
  name: cluster-kube-descheduler-operator
  source: qe-app-registry
  sourceNamespace: openshift-marketplace
[root@preserve-olm-agent-test 2030489]# cat og.yaml 
kind: OperatorGroup
apiVersion: operators.coreos.com/v1
metadata:
  name: og-single
  namespace: openshift-kube-descheduler-operator
spec:
  targetNamespaces:
  - openshift-kube-descheduler-operator


[root@preserve-olm-agent-test 2030489]# oc48 apply -f sub.yaml 
subscription.operators.coreos.com/cluster-kube-descheduler-operator created

[root@preserve-olm-agent-test 2030489]# oc48 apply -f og.yaml 
operatorgroup.operators.coreos.com/og-single created

3, check csv
[root@preserve-olm-agent-test 2030489]# oc48 get csv
NAME                                                DISPLAY                            VERSION              REPLACES   PHASE
clusterkubedescheduleroperator.4.7.0-202201082234   Kube Descheduler Operator          4.7.0-202201082234              Succeeded
elasticsearch-operator.5.1.6-27                     OpenShift Elasticsearch Operator   5.1.6-27                        Succeeded


4, edit sub to channel "4.8"
[root@preserve-olm-agent-test 2030489]# oc48 edit sub cluster-kube-descheduler-operator
subscription.operators.coreos.com/cluster-kube-descheduler-operator edited

5, check ip/csv
[root@preserve-olm-agent-test 2030489]# oc48 get ip
NAME            CSV                                                 APPROVAL    APPROVED
install-pxmj9   clusterkubedescheduleroperator.4.8.0-202112141153   Automatic   true
install-r4tjt   clusterkubedescheduleroperator.4.7.0-202201082234   Automatic   true
[root@preserve-olm-agent-test 2030489]# oc48 get csv
NAME                                                DISPLAY                            VERSION              REPLACES                                            PHASE
clusterkubedescheduleroperator.4.7.0-202201082234   Kube Descheduler Operator          4.7.0-202201082234                                                       Replacing
clusterkubedescheduleroperator.4.8.0-202112141153   Kube Descheduler Operator          4.8.0-202112141153   clusterkubedescheduleroperator.4.7.0-202201082234   InstallReady

[root@preserve-olm-agent-test 2030489]#  oc48 get csv
NAME                                                DISPLAY                            VERSION              REPLACES                                            PHASE
clusterkubedescheduleroperator.4.8.0-202112141153   Kube Descheduler Operator          4.8.0-202112141153   clusterkubedescheduleroperator.4.7.0-202201082234   Succeeded
elasticsearch-operator.5.1.6-27                     OpenShift Elasticsearch Operator   5.1.6-27                                                                 Succeeded
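
Rather than re-running `oc get csv` by hand until the phase moves from Replacing/InstallReady to Succeeded, the check in step 5 could be polled. This is a rough sketch, assuming a logged-in `oc` session; the helper name `wait_csv_succeeded` and its default attempt count and delay are hypothetical choices, not values from this verification.

```shell
# Hypothetical helper (not from the original report): poll a CSV until its
# status.phase is Succeeded, or give up after a number of attempts.
# $1 = CSV name, $2 = namespace,
# $3 = max attempts (default 30), $4 = seconds between attempts (default 10)
wait_csv_succeeded() {
  attempts=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # An error (for example, the CSV does not exist yet) yields an empty phase.
    phase=$(oc get clusterserviceversion.operators.coreos.com "$1" -n "$2" \
      -o jsonpath='{.status.phase}' 2>/dev/null)
    if [ "$phase" = "Succeeded" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "CSV $1 did not reach Succeeded (last phase: ${phase:-unknown})" >&2
  return 1
}

# Usage with the CSV from this verification:
#   wait_csv_succeeded clusterkubedescheduleroperator.4.8.0-202112141153 \
#     openshift-kube-descheduler-operator
```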

LGTM, verified

Comment 9 errata-xmlrpc 2022-01-25 12:13:09 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.28 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0172
