Bug 2011954 - [4.8] ClusterVersion Upgradeable=False MultipleReasons should include all messages
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.8.z
Assignee: Over the Air Updates
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On: 2011951
Blocks:
Reported: 2021-10-07 19:24 UTC by W. Trevor King
Modified: 2022-05-06 12:34 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 2011951
Environment:
Last Closed: 2021-10-19 20:35:31 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-version-operator pull 672 | 0 | None | open | Bug 2011954: pkg/cvo/upgradeable: Include messages for multiple-reason Upgradeable=False | 2021-10-07 19:32:48 UTC
Red Hat Product Errata | RHBA-2021:3821 | 0 | None | None | None | 2021-10-19 20:35:32 UTC

Description W. Trevor King 2021-10-07 19:24:54 UTC
+++ This bug was initially created as a clone of Bug #2011951 +++

+++ This bug was initially created as a clone of Bug #2011896 +++

Because:

Upgradeable=False

  Reason: MultipleReasons
  Message: Cluster cannot be upgraded between minor versions for multiple reasons: AdminAckRequired,IncompatibleOperatorsInstalled

doesn't include all the useful information needed to resolve those issues.  We should pivot to using the same approach we use today when aggregating multiple Upgradeable=False ClusterOperators, and use a bulleted list to append all the constituent messages.
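
As a rough illustration of the proposed format (a shell sketch only, not the CVO's actual Go code), the aggregation could look like the following. The function name aggregate_upgradeable and the "Reason|Message" argument convention are hypothetical; the summary wording is borrowed from the expected output in the verification steps.

```shell
# Hypothetical helper: build an aggregated Upgradeable=False message
# from "Reason|Message" arguments. This is an illustration of the
# bulleted-list format, not code from the cluster-version operator.
aggregate_upgradeable() {
  reasons=""
  bullets=""
  for pair in "$@"; do
    reason="${pair%%|*}"     # text before the first '|'
    message="${pair#*|}"     # text after the first '|'
    # Join reasons with commas, without a leading comma.
    reasons="${reasons:+$reasons,}$reason"
    # Append one bullet line per constituent message.
    bullets="$bullets
* $message"
  done
  printf 'Cluster should not be upgraded between minor versions for multiple reasons: %s%s\n' "$reasons" "$bullets"
}
```

Calling it as aggregate_upgradeable 'AdminAckRequired|first message.' 'IncompatibleOperatorsInstalled|second message.' prints the summary line followed by one "* " line per message, so no constituent message is lost.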

The CVO's current logic goes way back, but the need to urgently fix this begins in 4.8.14, when we grew admin-ack via bug 1999092, colliding with OLM's IncompatibleOperatorsInstalled, which a lot of 4.8 clusters were already experiencing.

--- Additional comment from W. Trevor King on 2021-10-07 17:24:30 UTC ---

Verification should look something like:

1. Install a version with the fix.
2. Put something in spec.overrides to trigger ClusterVersionOverridesSet:

     $ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/overrides", "value": [{"kind": "Deployment", "group": "apps/v1", "name": "network-operator", "namespace": "openshift-network-operator", "unmanaged": true}]}]'

3. Create a ClusterOperator to trigger ClusterOperatorsNotUpgradeable:

     $ cat co.yaml 
     apiVersion: config.openshift.io/v1
     kind: ClusterOperator
     metadata:
       name: testing
     spec: {}
     $ oc apply -f co.yaml
     $ oc proxy &  # working around the lack of --subresource: https://github.com/kubernetes/kubernetes/pull/99556
     [1] 16920
     Starting to serve on 127.0.0.1:8001
     $ curl -k -XPATCH -H "Accept: application/json" -H "Content-Type: application/json-patch+json" 'http://127.0.0.1:8001/apis/config.openshift.io/v1/clusteroperators/testing/status' -d '[{"op": "add", "path": "/status", "value": {"conditions": [{"lastTransitionTime": "2021-08-31T01:01:01Z", "type": "Upgradeable", "status": "False", "reason": "Testing", "message": "Testing upgradeable https://example.com/a."}]}}]'
     $ fg
     oc proxy
     ^C

4. Wait a minute or so for the CVO to notice.

5. Check the 'oc adm upgrade' output.  It should include:

     Upgradeable=False

     Reason: MultipleReasons
     Message: Cluster should not be upgraded between minor versions for multiple reasons: ClusterVersionOverridesSet,Testing
     * Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing.
     * Cluster operator testing should not be upgraded between minor versions: Testing upgradeable https://example.com/a.

6. Check the web-console output at:

   * The cluster settings page: ${CONSOLE}/settings/cluster
   * The ClusterVersion detail page ${CONSOLE}/k8s/cluster/config.openshift.io~v1~ClusterVersion/version

   They should both include the full message, clearly formatted.
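
Besides 'oc adm upgrade', the raw condition message can probably be read straight from the ClusterVersion object, which may be handy when scripting the check. The sketch below assumes a logged-in 'oc' client; the function name upgradeable_message is hypothetical, and the jsonpath filter is the usual kubectl/oc syntax for selecting a condition by type.

```shell
# Hypothetical helper: print the message of the ClusterVersion
# Upgradeable condition. Assumes 'oc' is on PATH and logged in to
# the cluster under test.
upgradeable_message() {
  oc get clusterversion version \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].message}'
}
```

Running upgradeable_message after the CVO has noticed the changes should print the summary line followed by the bulleted constituent messages.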

Comment 1 Johnny Liu 2021-10-09 09:33:38 UTC
Verified this bug with a pre-merge build: PASS.


Installed a private, disconnected cluster on AWS with the CCO in manual mode.

[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.8.0-0.ci.test-2021-10-09-074151-ci-ln-g0icijt-latest

Upgradeable=False

  Reason: MultipleReasons
  Message: Cluster should not be upgraded between minor versions for multiple reasons: AdminAckRequired,MissingUpgradeableAnnotation
* Kubernetes 1.22 and therefore OpenShift 4.9 remove several APIs which require admin consideration. Please see
the knowledge article https://access.redhat.com/articles/6329921 for details and instructions.

* Cluster operator cloud-credential should not be upgraded between minor versions: Upgradeable annotation cloudcredential.openshift.io/upgradeable-to on cloudcredential.operator.openshift.io/cluster object needs updating before upgrade. See Manually Creating IAM documentation for instructions on preparing a cluster for upgrade.


All of the Upgradeable=False reason messages are listed, one bullet per reason, across multiple lines.
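
A check like this could be scripted roughly as below: count the comma-separated reasons on the summary line and compare that with the number of "* " bullet lines. This is a sketch only; the function name bullets_match_reasons is hypothetical, and it assumes the message text is passed in as a single argument with bullets starting at the beginning of a line, as in the output above.

```shell
# Hypothetical check: succeed when the message carries one "* " bullet
# line for every comma-separated reason named on its first line.
bullets_match_reasons() {
  msg="$1"
  first_line="$(printf '%s\n' "$msg" | head -n 1)"
  # Keep only the text after the last ": ", i.e. the reason list.
  reasons="${first_line##*: }"
  reason_count="$(printf '%s\n' "$reasons" | tr ',' '\n' | grep -c . || true)"
  bullet_count="$(printf '%s\n' "$msg" | grep -c '^\* ' || true)"
  [ "$reason_count" -eq "$bullet_count" ]
}
```

Wrapped or blank lines inside a bullet are ignored, because only lines beginning with "* " are counted.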

Comment 6 errata-xmlrpc 2021-10-19 20:35:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.15 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3821