Bug 1994038 - OLM upgradeable condition message unclear with MaxOpenShiftVersion set
Summary: OLM upgradeable condition message unclear with MaxOpenShiftVersion set
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.8.z
Assignee: Ankita Thomas
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On: 1992677
Blocks:
Reported: 2021-08-16 14:41 UTC by Scott Dodson
Modified: 2021-12-27 07:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1992677
Environment:
Last Closed: 2021-09-14 06:57:48 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift operator-framework-olm pull 173 | 0 | None | None | None | 2021-08-23 16:40:08 UTC
Github | openshift operator-framework-olm pull 174 | 0 | None | None | None | 2021-08-23 17:21:23 UTC
Red Hat Product Errata | RHBA-2021:3429 | 0 | None | None | None | 2021-09-14 06:58:07 UTC

Description Scott Dodson 2021-08-16 14:41:52 UTC
Cloning this so we have a 4.8.z Upgradeblocker to inhibit promotion of upgrades to the stable channel until we've resolved this.

+++ This bug was initially created as a clone of Bug #1992677 +++

Description of problem:

When OLM reconciles the clusteroperator upgradeable condition, it sets that condition to false when installed operators set the maxopenshiftversion to a minor ocp version less than or equal to the current ocp version. This blocks minor version upgrades until those workloads are handled correctly.

However, when we set that condition we also propagate a message to the cluster that describes this. It includes a reference to the maxocpversion, but it appends the patch version in order to describe the semantic version of the cluster. Example:

Reason: IncompatibleOperatorsInstalled
  Message: Cluster operator operator-lifecycle-manager cannot be upgraded between minor versions: The following operators block OpenShift upgrades: Operator openshift-gitops-operator.v1.2.0 in namespace openshift-operators is not compatible with OpenShift versions greater than 4.8.0

This message is incorrect. This status does not block upgrades to versions greater than 4.8.0; it blocks upgrades to versions >=4.9.0 when the operator sets MaxOpenShiftVersion = "4.8".

This message needs to be updated so that it is clear to end users that this does not block z-stream updates in the 4.8 upgrade path.
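
The intended behaviour described above can be sketched as follows. This is a minimal Python illustration, not OLM's actual Go implementation: the helper names `major_minor`, `blocks_minor_upgrade`, and `format_message` are hypothetical, and the message wording is only one possible phrasing that leaves out the patch component.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor), ignoring patch, pre-release, and build parts."""
    core = version.split("+", 1)[0].split("-", 1)[0]
    parts = core.split(".")
    return int(parts[0]), int(parts[1])


def blocks_minor_upgrade(max_openshift_version: str, cluster_version: str) -> bool:
    """True when the operator's maximum minor is at or below the cluster's minor."""
    return major_minor(max_openshift_version) <= major_minor(cluster_version)


def format_message(csv_name: str, namespace: str, max_openshift_version: str) -> str:
    """Illustrative wording only: report major.minor, never a patch version."""
    major, minor = major_minor(max_openshift_version)
    return (
        f"Operator {csv_name} in namespace {namespace} is not compatible "
        f"with OpenShift minor versions greater than {major}.{minor}"
    )
```

Under these assumptions, `blocks_minor_upgrade("4.8", "4.8.4")` is true for a minor upgrade to 4.9, while the message no longer suggests that 4.8.z updates are affected.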

Version-Release number of selected component (if applicable):

4.8

How reproducible:

always

Steps to Reproduce:
1. Install an operator on 4.8 that sets the maxopenshiftversion=4.8
2. Look at the clusteroperator status field for the operator-lifecycle-manager clusteroperator object
3.

Actual results:

Status looks like:

Cluster operator operator-lifecycle-manager cannot be upgraded between minor versions: The following operators block OpenShift upgrades: Operator $OperatorName in namespace $OperatorNamespace is not compatible with OpenShift versions greater than 4.8.0

Expected results:

The message should be updated to be clear that z-streams for the current version can still be installed instead of referencing 4.8.0

Additional info:

--- Additional comment from Kevin Rizza on 2021-08-12 08:18:35 EDT ---



--- Additional comment from Jimmy Scott on 2021-08-13 03:57:19 EDT ---

Hi there,

Currently upgrading to 4.8.4 and bumping into a similar issue:

> $ oc get clusterversion
>
> NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
> version   4.7.21    True        True          10h     Unable to apply 4.8.4: the cluster operator authentication has not yet successfully rolled out

> $ oc get clusterversion -o yaml
> ...
>     - lastTransitionTime: "2021-08-12T10:31:28Z"
>       message: 'Cluster operator operator-lifecycle-manager cannot be upgraded between minor versions: The following operators block OpenShift upgrades: Operator ocs-operator.v4.8.0 in namespace openshift-storage is not compatible with OpenShift versions greater than 4.8.0,Operator kubevirt-hyperconverged-operator.v4.8.0 in namespace openshift-cnv is not compatible with OpenShift versions greater than 4.8.0'
>       reason: IncompatibleOperatorsInstalled
>       status: "False"
>       type: Upgradeable
> ...

What would be a valid workaround?

I'm thinking of updating the operator yaml from:

> olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.8"}]'

to:

> olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.8.4"}]'

But suggestions are welcome!

Kind regards,
Jimmy Scott
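
For reference, the `olm.properties` annotation quoted in this comment is a JSON list of typed properties. A minimal Python sketch of reading the `olm.maxOpenShiftVersion` value out of it is shown below; the function name `max_openshift_version` is hypothetical, and the sketch says nothing about which values OLM accepts.

```python
import json
from typing import Optional


def max_openshift_version(olm_properties: str) -> Optional[str]:
    """Return the olm.maxOpenShiftVersion value from an olm.properties annotation, if any."""
    for prop in json.loads(olm_properties):
        if prop.get("type") == "olm.maxOpenShiftVersion":
            return prop.get("value")
    return None


# Annotation string as it appears in the comment above.
annotation = '[{"type": "olm.maxOpenShiftVersion", "value": "4.8"}]'
print(max_openshift_version(annotation))
```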

--- Additional comment from Scott Dodson on 2021-08-16 10:40:03 EDT ---

This bug doesn't affect cluster availability but the messaging is confusing to the point that I believe we should fix this before we promote 4.7 to 4.8 upgrades to the stable channel. So marking this as Upgrades, UpgradeBlocker to help track that and I'll go ahead and clone this.

Comment 4 Ankita Thomas 2021-08-25 17:30:16 UTC
I can see the commit on the release-4.8 branch: https://github.com/openshift/operator-framework-olm/commit/99e57bef036046809a53a9d32d8627b8bebf1cc8. It should be present there. Can you take another look?

Comment 5 Jian Zhang 2021-08-26 01:17:43 UTC
Hi Ankita,

Yes, I know the fix PR is on the release-4.8 branch, but it wasn't merged into the 4.8.7 payload. See:
[cloud-user@preserve-olm-env jian]$ oc adm release info quay.io/openshift-release-dev/ocp-release:4.8.7-x86_64 --commits|grep lifecycle
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         8ff5f22d9336dd8df45d7a839bf756a492bb4332

So, I have to test it in the next z-stream release 4.8.8. Change the status to POST first.
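
The check above reads the commit for one component out of the `oc adm release info --commits` output. A minimal Python sketch of that parsing step follows, assuming the output is whitespace-separated columns of name, repository URL, and commit as in the transcript; `parse_commits` is a hypothetical helper, not part of `oc`.

```python
from typing import Dict, Tuple


def parse_commits(output: str) -> Dict[str, Tuple[str, str]]:
    """Map component name to (repository URL, commit) for three-column lines."""
    components = {}
    for line in output.splitlines():
        fields = line.split()
        # Assumption: component lines have exactly three fields, the second a URL.
        if len(fields) == 3 and fields[1].startswith("https://"):
            components[fields[0]] = (fields[1], fields[2])
    return components


# Line copied from the 4.8.7 transcript above.
sample = (
    "  operator-lifecycle-manager                     "
    "https://github.com/openshift/operator-framework-olm                         "
    "8ff5f22d9336dd8df45d7a839bf756a492bb4332\n"
)
repo, commit = parse_commits(sample)["operator-lifecycle-manager"]
```

Whether a payload commit already contains a given fix is then a question of git ancestry in that repository, which this sketch does not attempt to answer.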

Comment 8 W. Trevor King 2021-08-31 04:50:49 UTC
You shouldn't need to wait for a named release.  Picking a 4.8 nightly from [1], gets me [2], which has:

  $ oc adm release info --commits registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-08-30-145438 | grep olm
    operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         dca2c2c2c2054e1f282843a5dd34111402cc2e9b
    operator-registry                              https://github.com/openshift/operator-framework-olm                         dca2c2c2c2054e1f282843a5dd34111402cc2e9b

Checking the repo:

  $ git --no-pager log --oneline --first-parent -2 origin/release-4.8
  dca2c2c (HEAD -> release-4.8, origin/release-4.8) Merge pull request #173 from timflannagan/release-4.8-bz-1992677-parent-manual-cp
  f936939 Merge pull request #174 from timflannagan/release-4.8-fix-build-root-image

So that nightly has both pulls for this bug.

[1]: https://amd64.ocp.releases.ci.openshift.org/#4.8.0-0.nightly
[2]: https://amd64.ocp.releases.ci.openshift.org/releasestream/4.8.0-0.nightly/release/4.8.0-0.nightly-2021-08-30-145438

Comment 9 Jian Zhang 2021-08-31 06:50:41 UTC
Hi W. Trevor, 

Thanks for your suggestion! I know it, but the cluster version is `4.8.0-0.xxx` via this solution, not the `4.8.x`. 
$ oc get clusterversion version -o=jsonpath={.status.desired.version}

This is not suitable for verifying this bug, because the verification needs a z-stream version semver. Change the status to POST first.

Comment 10 W. Trevor King 2021-08-31 16:11:54 UTC
The nightly version names are still valid SemVer, even with their pre-release suffix [1].  And the change being made is dropping everything except major.minor [2], so there should be no functional difference between "4.8.0..." and "4.8.10" (or whatever), right?

[1]: https://semver.org/spec/v2.0.0.html#spec-item-9
[2]: https://github.com/openshift/operator-framework-olm/pull/173/files#diff-76bd4c8f59530318d3b5e7e1644f00e9d813a0e8c5c22b27df79e1b3370d6495R227
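
The two points in this comment can be sketched in Python: the pattern below is assembled from the SemVer 2.0.0 grammar, and `minor_stream` is a hypothetical helper that keeps only major.minor. It illustrates the argument and is not the code from the pull request.

```python
import re

NUM = r"(?:0|[1-9]\d*)"
IDENT = r"(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
SEMVER = re.compile(
    rf"^({NUM})\.({NUM})\.({NUM})"                   # major.minor.patch
    rf"(?:-({IDENT}(?:\.{IDENT})*))?"                # optional pre-release
    rf"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"  # optional build metadata
)


def minor_stream(version: str) -> str:
    """Return 'major.minor' for a valid SemVer string, dropping everything else."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a SemVer 2.0.0 version: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


nightly = "4.8.0-0.nightly-2021-08-30-145438"
print(minor_stream(nightly) == minor_stream("4.8.10"))
```

Once both versions are reduced to major.minor, the nightly name and a named z-stream release are indistinguishable, which is the premise of the question in this comment.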

Comment 11 Jian Zhang 2021-09-01 01:09:06 UTC
> so there should be no functional difference between "4.8.0..." and "4.8.10" (or whatever), right?

Yeah, there is no difference in the payload, but for "4.8.0..." and "4.8.10", the OLM have different handle logic on them, that's why I have to wait for the "4.8.10" payload. Or can I change the cluster version manually? I guess no since OLM read the cluster version from the 'status' field.
$ oc get clusterversion version -o=jsonpath={.status.desired.version}

Comment 12 W. Trevor King 2021-09-01 20:05:06 UTC
Ah, I'd been missing the "OLM have different handle logic on them" bit.  But... really?  Why would OLM care about the patch version for this sort of thing?  Can you link me to the relevant code or summarize?

But anyway, we now have a 4.8.10 [1,2], so should be good now :)

$ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.8.10-x86_64 | grep olm
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         dca2c2c2c2054e1f282843a5dd34111402cc2e9b
  operator-registry                              https://github.com/openshift/operator-framework-olm                         dca2c2c2c2054e1f282843a5dd34111402cc2e9b

[1]: https://amd64.ocp.releases.ci.openshift.org/releasestream/4-stable/release/4.8.10
[2]: https://mirror.openshift.com/pub/openshift-v4/amd64/clients/ocp/4.8.10/

Comment 13 Ankita Thomas 2021-09-01 20:40:04 UTC
The 4.8.10 payload has the fix, it looks good to test.

https://amd64.ocp.releases.ci.openshift.org/releasestream/4-stable/release/4.8.10

Comment 14 W. Trevor King 2021-09-01 22:10:56 UTC
I installed 4.8.10, installed the OpenShift Update Service 4.6.0 operator, and can confirm that the OLM ClusterOperator condition has improved wording:

  IncompatibleOperatorsInstalled  ClusterServiceVersions blocking cluster upgrade: openshift-update-service/update-service-operator.v4.6.0 is incompatible with OpenShift minor versions greater than 4.8

So hooray :).  I'll leave this ON_QA, in case Jian has any other tires to kick before moving to VERIFIED.

Comment 15 Jian Zhang 2021-09-02 03:03:18 UTC
Thanks! W. Trevor! 

> Why would OLM care about the patch version for this sort of thing?

Sorry for the confusion, you know, nightlies are 4.8.0-nightly so they're always smaller than maxOpenShiftVersion 4.8.0. This bug fix is to stop comparing patch versions, so only 4.y.0 or 4.y should be accepted.
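
The ordering mentioned here follows from SemVer precedence, where a pre-release version sorts below the release with the same major.minor.patch. A minimal Python sketch of that rule is given below; `precedence_key` is a hypothetical helper written for illustration, not OLM code, and the nightly name is the one from comment 8.

```python
def precedence_key(version: str):
    """Sort key following SemVer 2.0.0 precedence; build metadata is ignored."""
    core, _, prerelease = version.split("+", 1)[0].partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    if not prerelease:
        # A release sorts after every pre-release of the same core version.
        return numbers, (1,)
    identifiers = tuple(
        # Numeric identifiers compare as integers and sort below alphanumeric ones.
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in prerelease.split(".")
    )
    return numbers, (0, identifiers)


nightly = "4.8.0-0.nightly-2021-08-30-145438"
print(precedence_key(nightly) < precedence_key("4.8.0"))
```

A comparison on full versions therefore treats a 4.8 nightly differently from 4.8.10, whereas a comparison on major.minor alone does not.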

[cloud-user@preserve-olm-env jian]$ oc adm release info quay.io/openshift-release-dev/ocp-release:4.8.10-x86_64 --commits|grep lifecycle
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         dca2c2c2c2054e1f282843a5dd34111402cc2e9b

[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.10    True        False         7m54s   Cluster version is 4.8.10

1, Install a CatalogSource that consumes an index image that contains an operator with the "maxOpenShiftVersion=4.8"

[cloud-user@preserve-olm-env jian]$ cat cs-max.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: max-operators
  namespace: openshift-marketplace
spec:
  displayName: Jian Operators
  image: quay.io/olmqe/etcd-index:bug-1994038
  priority: -200
  publisher: Jian
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
[cloud-user@preserve-olm-env jian]$ 
[cloud-user@preserve-olm-env jian]$ oc create -f cs-max.yaml 
catalogsource.operators.coreos.com/max-operators created

[cloud-user@preserve-olm-env jian]$ oc get catalogsource -n openshift-marketplace
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
certified-operators   Certified Operators   grpc   Red Hat     48m
community-operators   Community Operators   grpc   Red Hat     48m
max-operators         Jian Operators        grpc   Jian        70s
redhat-marketplace    Red Hat Marketplace   grpc   Red Hat     48m
redhat-operators      Red Hat Operators     grpc   Red Hat     48m
[cloud-user@preserve-olm-env jian]$ oc get packagemanifest|grep etcd
etcd                                                 Jian Operators        80s
etcd                                                 Community Operators   48m

2, Subscribe to this etcd operator.
[cloud-user@preserve-olm-env jian]$ cat og.yaml 
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: default-og
  namespace: default
spec:
  targetNamespaces:
  - default

[cloud-user@preserve-olm-env jian]$ oc create -f og.yaml 
operatorgroup.operators.coreos.com/default-og created


[cloud-user@preserve-olm-env jian]$ cat sub-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: default
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: max-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4
[cloud-user@preserve-olm-env jian]$ oc create -f sub-etcd.yaml 
subscription.operators.coreos.com/etcd created

[cloud-user@preserve-olm-env jian]$ oc get sub
NAME   PACKAGE   SOURCE          CHANNEL
etcd   etcd      max-operators   singlenamespace-alpha
[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.4   etcd      0.9.4                Succeeded


3, Check the Upgradeable info.
[cloud-user@preserve-olm-env jian]$  oc get csv -o yaml|grep "maxOpenShiftVersion"
      olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.8"}]'

[cloud-user@preserve-olm-env jian]$ oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].status}
False

[cloud-user@preserve-olm-env jian]$ oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].message}
ClusterServiceVersions blocking cluster upgrade: default/etcdoperator.v0.9.4 is incompatible with OpenShift minor versions greater than 4.8

LGTM, verify it.
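
The JSONPath filters in step 3 select the condition whose type is `Upgradeable`. The same lookup can be sketched in Python against the object's JSON form; the function name `upgradeable_condition` is hypothetical, and the sample document is hand-written and trimmed, reusing only the status and message shown above.

```python
import json
from typing import Optional


def upgradeable_condition(cluster_operator_json: str) -> Optional[dict]:
    """Return the Upgradeable condition from a ClusterOperator JSON document, if present."""
    doc = json.loads(cluster_operator_json)
    for condition in doc.get("status", {}).get("conditions", []):
        if condition.get("type") == "Upgradeable":
            return condition
    return None


# Hand-written, trimmed sample; a real object carries more fields and conditions.
sample = json.dumps({
    "status": {
        "conditions": [
            {
                "type": "Upgradeable",
                "status": "False",
                "message": (
                    "ClusterServiceVersions blocking cluster upgrade: "
                    "default/etcdoperator.v0.9.4 is incompatible with "
                    "OpenShift minor versions greater than 4.8"
                ),
            }
        ]
    }
})
condition = upgradeable_condition(sample)
print(condition["status"], condition["message"])
```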

Comment 17 W. Trevor King 2021-09-09 01:41:37 UTC
Because this was POST when 4.8.10 was created (see around comment 12), it didn't get automatically swept into 4.8.10's errata.  But also as shown in comment 12, the code did ship in 4.8.10.  So it's going to get associated with the next 4.8.z errata, because it's hard to fix errata after they ship, but the fixed release is already live and in stable channels and all that.

Comment 19 errata-xmlrpc 2021-09-14 06:57:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.11 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3429

Comment 20 W. Trevor King 2021-09-15 16:03:38 UTC
No need for UpgradeBlocker now that this shipped in 4.8.10 (comment 17); details in [1].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1992677#c9

