Bug 2011951 - [4.9] ClusterVersion Upgradeable=False MultipleReasons should include all messages
Summary: [4.9] ClusterVersion Upgradeable=False MultipleReasons should include all mes...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.8
Hardware: Unspecified
OS: Unspecified
urgent
urgent
Target Milestone: ---
: 4.9.0
Assignee: Over the Air Updates
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On: 2011896
Blocks: 2011954
TreeView+ depends on / blocked
 
Reported: 2021-10-07 19:21 UTC by W. Trevor King
Modified: 2022-05-06 12:29 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 2011896
: 2011954 (view as bug list)
Environment:
Last Closed: 2021-10-18 17:52:32 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-version-operator pull 671 0 None open Bug 2011951: pkg/cvo/upgradeable: Include messages for multiple-reason Upgradeable=False 2021-10-07 19:23:33 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:52:42 UTC

Description W. Trevor King 2021-10-07 19:21:44 UTC
+++ This bug was initially created as a clone of Bug #2011896 +++

Because:

Upgradeable=False

  Reason: MultipleReasons
  Message: Cluster cannot be upgraded between minor versions for multiple reasons: AdminAckRequired,IncompatibleOperatorsInstalled

doesn't include all the useful information needed to resolve those issues.  We should pivot to using the same approach we use today when aggregating multiple Upgradeable=False ClusterOperators, and use a bulleted list to append all the constituent messages.

The CVO's current logic goes way back, but the need to urgently fix this begins in 4.8.14, when we grew admin-ack via bug 1999092, colliding with OLM's IncompatibleOperatorsInstalled, which a lot of 4.8 clusters were already experiencing.

--- Additional comment from W. Trevor King on 2021-10-07 17:24:30 UTC ---

Verification should look something like:

1. Install a version with the fix.
2. Put something in spec.overrides to trigger ClusterVersionOverridesSet:

     $ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/overrides", "value": [{"kind": "Deployment", "group": "apps/v1", "name": "network-operator", "namespace": "openshift-network-operator", "unmanaged": true}]}]'

3. Create a ClusterOperator to trigger ClusterOperatorsNotUpgradeable:

     $ cat co.yaml 
     apiVersion: config.openshift.io/v1
     kind: ClusterOperator
     metadata:
       name: testing
     spec: {}
     $ oc apply -f co.yaml
     $ oc proxy &  # working around the lack of --subresource: https://github.com/kubernetes/kubernetes/pull/99556
     [1] 16920
     Starting to serve on 127.0.0.1:8001
     $ curl -k -XPATCH -H "Accept: application/json" -H "Content-Type: application/json-patch+json" 'http://127.0.0.1:8001/apis/config.openshift.io/v1/clusteroperators/testing/status' -d '[{"op": "add", "path": "/status", "value": {"conditions": [{"lastTransitionTime": "2021-08-31T01:01:01Z", "type": "Upgradeable", "status": "False", "reason": "Testing", "message": "Testing upgradeable https://example.com/a."}]}}]'
     $ fg
     oc proxy
     ^C

3. Wait a minute or so for the CVO to notice.

4. Check the 'oc adm upgrade' output.  It should include:

     Upgradeable=False

     Reason: MultipleReasons
     Message: Cluster should not be upgraded between minor versions for multiple reasons: ClusterVersionOverridesSet,Testing
     * Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing.
     * Cluster operator testing should not be upgraded between minor versions: Testing upgradeable https://example.com/a.

5. Check the web-console output at:

   * The cluster settings page: ${CONSOLE}/settings/cluster
   * The ClusterVersion detail page ${CONSOLE}/k8s/cluster/config.openshift.io~v1~ClusterVersion/version

   They should both include the full message, clearly formatted.

Comment 1 Scott Dodson 2021-10-08 15:58:51 UTC
This bug is not a blocker, but I've labeled the PR for merge as we'd like to get this UX improvement into 4.8 ASAP.

Comment 5 Johnny Liu 2021-10-09 09:28:09 UTC
Verified this bug with 4.9.0-0.nightly-2021-10-08-232649, and PASS.

Install a private disconnected cluster on aws with manuall cco.

[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.nightly-2021-10-08-232649

Upgradeable=False

  Reason: MissingUpgradeableAnnotation
  Message: Cluster operator cloud-credential should not be upgraded between minor versions: Upgradeable annotation cloudcredential.openshift.io/upgradeable-to on cloudcredential.operator.openshift.io/cluster object needs updating before upgrade. See Manually Creating IAM documentation for instructions on preparing a cluster for upgrade.


[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed patch cm admin-gates --patch '{"data":{"ack-4.9-dummy":"testing"}}' --type=merge
configmap/admin-gates patched


[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.nightly-2021-10-08-232649

Upgradeable=False

  Reason: MultipleReasons
  Message: Cluster should not be upgraded between minor versions for multiple reasons: AdminAckRequired,MissingUpgradeableAnnotation
* testing
* Cluster operator cloud-credential should not be upgraded between minor versions: Upgradeable annotation cloudcredential.openshift.io/upgradeable-to on cloudcredential.operator.openshift.io/cluster object needs updating before upgrade. See Manually Creating IAM documentation for instructions on preparing a cluster for upgrade.


All the Upgradeable=False reason message is listed in multiple lines.

Comment 7 errata-xmlrpc 2021-10-18 17:52:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759


Note You need to log in before you can comment on or make changes to this bug.