Bug 1989711 - Invalid olm.maxOpenShiftVersion properties have unclear/undefined behavior in OLM
Summary: Invalid olm.maxOpenShiftVersion properties have unclear/undefined behavior in...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.z
Assignee: Nick Hale
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On: 1989704
Blocks:
 
Reported: 2021-08-03 18:38 UTC by Nick Hale
Modified: 2021-08-16 18:32 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1989704
Environment:
Last Closed: 2021-08-16 18:32:12 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift operator-framework-olm pull 155 0 None None None 2021-08-04 03:42:40 UTC
Red Hat Product Errata RHBA-2021:3121 0 None None None 2021-08-16 18:32:22 UTC

Description Nick Hale 2021-08-03 18:38:03 UTC
+++ This bug was initially created as a clone of Bug #1989704 +++

Description of problem:

The behavior of OLM when installed operators specify olm.maxOpenShiftVersion:

- with an empty value
- with a poorly formed value; e.g. bad semver
- more than once; ambiguous max

is ill-defined and inconsistent: OLM silently allows cluster upgrades in the face of invalid values.

Version-Release number of selected component (if applicable): 

OpenShift: 4.9.0-0.nightly-2021-07-27-125952
OLM: version: 0.18.3, git commit: 3475b1d5d8d481394ba90b2823645bed0fb2b076

How reproducible: Always

Steps to Reproduce:

1. Write an operator that specifies an invalid, empty, and/or multiple olm.maxOpenShiftVersion properties
2. Install that operator on a cluster

(Here's a demo that walks through configuring properties, building bundles, and installing on a cluster: https://youtu.be/pa2-YKo07cw?list=PLw471UtJC4H39vNAduRz6CixDetHwljKs)

Expected results:

OLM should set the Upgradeable condition of the operator-lifecycle-manager ClusterOperator resource to False -- i.e. block cluster upgrades -- whenever:

- an installed operator has an invalid olm.maxOpenShiftVersion property; e.g. empty or malformed semver
- an installed operator has declared more than one olm.maxOpenShiftVersion property
- cluster information is unavailable; e.g. the desired version of the cluster is undefined (non-functional requirement)

Additionally, the message of that condition should identify the operator(s) blocking the upgrade and the reason for each.
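The expected fail-closed behavior can be sketched roughly as follows. This is a minimal illustration, not OLM's actual implementation: the function names (`parse_major_minor`, `blocking_reason`, `upgradeable`), the message wording, and the decision to reject prerelease suffixes are all assumptions made for the example.

```python
import re

MAX_PROP = "olm.maxOpenShiftVersion"


def parse_major_minor(value):
    """Parse 'X.Y' or 'X.Y.Z' into (major, minor); raise ValueError otherwise.

    Simplification (assumption): prerelease and build suffixes are rejected.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(\.\d+)?", str(value).strip())
    if match is None:
        raise ValueError(f"invalid {MAX_PROP} value {value!r}")
    return int(match.group(1)), int(match.group(2))


def blocking_reason(properties, next_minor):
    """Return a reason string if these properties block the upgrade, else None."""
    values = [p.get("value") for p in properties if p.get("type") == MAX_PROP]
    if not values:
        return None  # no declared maximum: nothing to enforce
    if len(values) > 1:
        return f"more than one {MAX_PROP} property declared"
    try:
        maximum = parse_major_minor(values[0])
    except ValueError as err:
        return str(err)  # fail closed on empty or malformed values
    if maximum < next_minor:
        return (f"maximum supported version {maximum[0]}.{maximum[1]} "
                f"is below {next_minor[0]}.{next_minor[1]}")
    return None


def upgradeable(operators, desired):
    """operators maps operator name -> property list; desired is (major, minor) or None."""
    if desired is None:
        # Cluster information unavailable: block rather than guess.
        return False, "desired cluster version is undefined"
    next_minor = (desired[0], desired[1] + 1)
    reasons = []
    for name, properties in sorted(operators.items()):
        reason = blocking_reason(properties, next_minor)
        if reason is not None:
            reasons.append(f"{name}: {reason}")
    return (not reasons), "; ".join(reasons)
```

For example, `upgradeable({"etcd": [{"type": MAX_PROP, "value": "4.9"}]}, (4, 8))` returns `(True, "")`, while an empty value, a duplicated property, or a `None` desired version each yield `False` with a message naming the cause.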

Additional info:

This behavior was undefined but originally assumed to fail-open; i.e. invalid properties should be ignored (see https://bugzilla.redhat.com/show_bug.cgi?id=1986753). However, it was decided during review that failing-closed (i.e. invalid properties should block upgrades) with appropriate error messages leaves OpenShift upgrades less susceptible to operator publishing errors.

Upstream PR (merged): https://github.com/operator-framework/operator-lifecycle-manager/pull/2302

--- Additional comment from Nick Hale on 2021-08-03 18:35:18 UTC ---

Comment 5 Jian Zhang 2021-08-10 03:26:28 UTC
[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-08-09-135211   True        False         88s     Cluster version is 4.8.0-0.nightly-2021-08-09-135211

[cloud-user@preserve-olm-env jian]$ oc -n openshift-operator-lifecycle-manager  exec deploy/catalog-operator -- olm --version
OLM version: 0.17.0
git commit: 2ab4171de3930b89b6b186724c29bffcf6f39ce2


1, Add olm.properties to the etcd-operator CSVs:

1) for etcd-operator v0.9.2, add an empty value: `olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": " "}]'`. An empty value is invalid, so it should block the 4.9 upgrade.
2) for etcd-operator v0.9.4, add `'[{"type": "olm.maxOpenShiftVersion", "value": "4.9"}]'`. The max version is 4.9 but the cluster's next version is 4.10.0, and 4.9 < 4.10.0, so it should block the 4.9 upgrade.
3) for etcd-operator v0.9.2-cluster, add the value `olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.10.0"}]'`. 4.10.0-xxx < 4.10.0, so it should NOT block the 4.9 upgrade.
4) for etcd-operator v0.9.4-cluster, add two values: `olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.10.0"},{"type": "olm.maxOpenShiftVersion", "value": "4.11.0"}]'`. Declaring the property twice is invalid, so it should block the 4.9 cluster upgrade.
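To make the four cases concrete, here is a small sketch that decodes each `olm.properties` annotation and counts its `olm.maxOpenShiftVersion` entries. The helper name `max_version_values` and the `cases` dictionary are hypothetical and exist only for this illustration; the annotation strings are the ones listed above.

```python
import json

MAX_PROP = "olm.maxOpenShiftVersion"


def max_version_values(annotation):
    """Return every olm.maxOpenShiftVersion value in an olm.properties annotation."""
    # json.loads raises json.JSONDecodeError if the annotation is not valid JSON.
    properties = json.loads(annotation)
    return [p.get("value") for p in properties if p.get("type") == MAX_PROP]


# Annotation strings for the four CSVs under test.
cases = {
    "v0.9.2": '[{"type": "olm.maxOpenShiftVersion", "value": " "}]',
    "v0.9.4": '[{"type": "olm.maxOpenShiftVersion", "value": "4.9"}]',
    "v0.9.2-cluster": '[{"type": "olm.maxOpenShiftVersion", "value": "4.10.0"}]',
    "v0.9.4-cluster": '[{"type": "olm.maxOpenShiftVersion", "value": "4.10.0"},'
                      '{"type": "olm.maxOpenShiftVersion", "value": "4.11.0"}]',
}

for name, annotation in cases.items():
    values = max_version_values(annotation)
    # One non-blank value is the only well-formed shape; anything else is invalid.
    well_formed = len(values) == 1 and values[0].strip() != ""
    print(name, values, "well-formed" if well_formed else "invalid")
```

Only the v0.9.4 and v0.9.2-cluster annotations carry exactly one non-blank value; the other two are the invalid shapes this bug is about.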

2, Create a CatalogSource that consumes this index image.

[cloud-user@preserve-olm-env jian]$ cat cs-max.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: max-operators
  namespace: openshift-marketplace
spec:
  displayName: Jian Operators
  image: quay.io/olmqe/etcd-index:upgrade-max3
  priority: -200
  publisher: Jian
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 10m0s
[cloud-user@preserve-olm-env jian]$ oc create -f cs-max.yaml 
catalogsource.operators.coreos.com/max-operators created

3, Subscribe to etcd-operator v0.9.2 with Manual approval, as follows, and approve it.

[cloud-user@preserve-olm-env jian]$ cat og.yaml 
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: default-og
  namespace: default
spec:
  targetNamespaces:
  - default
[cloud-user@preserve-olm-env jian]$ oc create -f og.yaml 
operatorgroup.operators.coreos.com/default-og created

[cloud-user@preserve-olm-env jian]$ cat sub-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: default
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Manual
  name: etcd
  source: max-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.2

[cloud-user@preserve-olm-env jian]$ oc create -f sub-etcd.yaml 
subscription.operators.coreos.com/etcd created

4, The `empty` value is invalid and should block the upgrade.
[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.2   etcd      0.9.2                Succeeded
[cloud-user@preserve-olm-env jian]$ oc get csv -o yaml|grep "maxOpenShiftVersion"
      olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": " "}]'
      operatorframework.io/properties: '{"properties":[{"type":"olm.gvk","value":{"group":"etcd.database.coreos.com","kind":"EtcdBackup","version":"v1beta2"}},{"type":"olm.gvk","value":{"group":"etcd.database.coreos.com","kind":"EtcdCluster","version":"v1beta2"}},{"type":"olm.gvk","value":{"group":"etcd.database.coreos.com","kind":"EtcdRestore","version":"v1beta2"}},{"type":"olm.maxOpenShiftVersion","value":"


5, Approve the upgrade to v0.9.4, and then check the status of the Upgradeable condition.
[cloud-user@preserve-olm-env jian]$ oc get csv -o yaml|grep "maxOpenShiftVersion"
      olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.9"}]'
...

The max version is 4.9 and the cluster's next version is 4.9.0-xxx. Because 4.9.0-xxx < 4.9.0, it should NOT block the 4.8 upgrade.
[cloud-user@preserve-olm-env jian]$ oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].status}
True
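The comparison above relies on the semver rule that a prerelease sorts before the release with the same major.minor.patch. A minimal sketch of that ordering follows; `semver_key` is a hypothetical helper that pads short versions such as "4.9" to "4.9.0" and does not order two prereleases against each other.

```python
def semver_key(version):
    """Build a sort key in which a prerelease precedes its final release.

    Simplification (assumption): all prereleases of one version compare equal.
    """
    core, _, prerelease = version.strip().partition("-")
    parts = [int(piece) for piece in core.split(".")]
    parts += [0] * (3 - len(parts))  # tolerate "4.9" by padding to 4.9.0
    # 0 marks a prerelease, 1 marks a final release, so prereleases sort first.
    return (parts[0], parts[1], parts[2], 0 if prerelease else 1)


nightly = "4.9.0-0.nightly-2021-07-27-125952"
print(semver_key(nightly) < semver_key("4.9"))   # prerelease vs. declared maximum
print(semver_key("4.10.0") > semver_key("4.9"))  # numeric, not lexical, comparison
```

Both comparisons print `True`: the nightly build is still below a declared maximum of 4.9, and 4.10.0 is above 4.9 because the components are compared as integers.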

6, Uninstall this v0.9.4 etcd operator, and check the status of the Upgradeable condition.

[cloud-user@preserve-olm-env jian]$  oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].status}
True

7, Subscribe to etcd-operator v0.9.2-cluster with Manual approval, as follows, and approve it.
[cloud-user@preserve-olm-env jian]$ oc get ip
NAME            CSV                               APPROVAL   APPROVED
install-2gpng   etcdoperator.v0.9.4-clusterwide   Manual     false
install-4pmls   etcdoperator.v0.9.2-clusterwide   Manual     true
[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                              DISPLAY   VERSION             REPLACES   PHASE
etcdoperator.v0.9.2-clusterwide   etcd      0.9.2-clusterwide              Succeeded

8, Check the status of the Upgradeable condition. Looks good.
[cloud-user@preserve-olm-env jian]$ oc get csv -o yaml|grep "maxOpenShiftVersion"
      olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.10.0"}]'
[cloud-user@preserve-olm-env jian]$ oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].status}
True

9, Approve the upgrade to v0.9.4-cluster, and then check the status of the Upgradeable condition.
[cloud-user@preserve-olm-env jian]$ oc get csv -o yaml|grep "maxOpenShiftVersion"
      olm.properties: '[{"type": "olm.maxOpenShiftVersion", "value": "4.11.0"},{"type":
        "olm.maxOpenShiftVersion", "value": "4.10.0"}]'

Although both 4.11.0 and 4.10.0 are > 4.9.0-xxx, there are two values for maxOpenShiftVersion, which is invalid, so it should block the cluster upgrade. Looks good.
[cloud-user@preserve-olm-env jian]$  oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].status}
False

10, Uninstall this v0.9.4-cluster etcd operator, and check the status of the Upgradeable condition. Its value should be `True`, i.e. it should not block the cluster upgrade. Looks good.
[cloud-user@preserve-olm-env jian]$ oc get co operator-lifecycle-manager -o=jsonpath={.status.conditions[?(@.type==\"Upgradeable\")].status}
True
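The same check can be scripted against the JSON form of the resource (for instance the output of `oc get co operator-lifecycle-manager -o json`). The sketch below is illustrative only: `condition_status` is a hypothetical helper, and the sample document is a hand-written, trimmed stand-in rather than output captured from this cluster.

```python
import json


def condition_status(cluster_operator, condition_type):
    """Return the status string of the named condition, or None if it is absent."""
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    for condition in conditions:
        if condition.get("type") == condition_type:
            return condition.get("status")
    return None


# Hypothetical, trimmed ClusterOperator document used only for illustration.
sample = json.loads("""
{
  "kind": "ClusterOperator",
  "metadata": {"name": "operator-lifecycle-manager"},
  "status": {
    "conditions": [
      {"type": "Available", "status": "True"},
      {"type": "Upgradeable", "status": "True"}
    ]
  }
}
""")

print(condition_status(sample, "Upgradeable"))
```

Returning `None` for a missing condition keeps "not reported" distinguishable from an explicit `"False"`.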

Verified.

Comment 7 errata-xmlrpc 2021-08-16 18:32:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.5 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3121

