Bug 2089950 - Upgrade fails with message Cluster operator console is not available
Summary: Upgrade fails with message Cluster operator console is not available
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.12.0
Assignee: Jakub Hadvig
QA Contact: Yanping Zhang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-24 19:07 UTC by Ian Miller
Modified: 2023-01-17 19:49 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:49:26 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift console-operator pull 663 | 0 | None | open | Bug 2089950: Deleting downloads deployment should not fail if already deleted | 2022-07-11 11:25:49 UTC
Red Hat Product Errata | RHSA-2022:7399 | 0 | None | None | None | 2023-01-17 19:49:41 UTC

Description Ian Miller 2022-05-24 19:07:21 UTC
Description of problem: Some upgrades failed during scale testing with messages indicating that the console cluster operator is degraded or not available. In total, 5 out of 2200 clusters failed with this pattern.

These clusters are all configured with the console operator disabled in order to reduce overall OCP CPU use in the telecom environment. The following CR is applied:
apiVersion: operator.openshift.io/v1
kind: Console
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "false"
    include.release.openshift.io/self-managed-high-availability: "false"
    include.release.openshift.io/single-node-developer: "false"
    release.openshift.io/create-only: "true"
    ran.openshift.io/ztp-deploy-wave: "10"
  name: cluster
spec:
  logLevel: Normal
  managementState: Removed
  operatorLogLevel: Normal


From one cluster (sno01175) the ClusterVersion conditions show:

# oc get clusterversion version -o jsonpath='{.status.conditions}' | jq
[
  {
    "lastTransitionTime": "2022-05-19T01:44:13Z",
    "message": "Done applying 4.9.26",
    "status": "True",
    "type": "Available"
  },
  {
    "lastTransitionTime": "2022-05-24T14:57:50Z",
    "message": "Cluster operator console is degraded",
    "reason": "ClusterOperatorDegraded",
    "status": "True",
    "type": "Failing"
  },
  {
    "lastTransitionTime": "2022-05-24T13:49:43Z",
    "message": "Unable to apply 4.10.13: wait has exceeded 40 minutes for these operators: console",
    "reason": "ClusterOperatorDegraded",
    "status": "True",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2022-05-21T02:07:06Z",
    "status": "True",
    "type": "RetrievedUpdates"
  },
  {
    "lastTransitionTime": "2022-05-24T13:53:05Z",
    "message": "Payload loaded version=\"4.10.13\" image=\"quay.io/openshift-release-dev/ocp-release@sha256:4f516616baed3cf84585e753359f7ef2153ae139c2e80e0191902fbd073c4143\"",
    "reason": "PayloadLoaded",
    "status": "True",
    "type": "ReleaseAccepted"
  },
  {
    "lastTransitionTime": "2022-05-24T13:57:05Z",
    "message": "Cluster operator kube-apiserver should not be upgraded between minor versions: KubeletMinorVersionUpgradeable: Kubelet minor version (1.22.5+5c84e52) on node sno01175 will not be supported in the next OpenShift minor version upgrade.",
    "reason": "KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade",
    "status": "False",
    "type": "Upgradeable"
  }
]

Another cluster (sno01959) has very similar conditions with slight variation in the Failing and Progressing messages:
  {
    "lastTransitionTime": "2022-05-24T14:32:42Z",
    "message": "Cluster operator console is not available",
    "reason": "ClusterOperatorNotAvailable",
    "status": "True",
    "type": "Failing"
  },
  {
    "lastTransitionTime": "2022-05-24T13:52:04Z",
    "message": "Unable to apply 4.10.13: the cluster operator console has not yet successfully rolled out",
    "reason": "ClusterOperatorNotAvailable",
    "status": "True",
    "type": "Progressing"
  },
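For triage across many clusters, the condition lists above could be filtered down to the entries that indicate a stuck upgrade. The following is only a minimal sketch: the function name `blocking_conditions` and the rule "Failing is True, or Progressing is True with a reason starting with ClusterOperator" are assumptions made for illustration, not logic taken from the cluster-version operator. The sample data is a shortened copy of the sno01175 conditions.

```python
def blocking_conditions(conditions):
    """Return (type, reason, message) tuples for conditions that suggest a stuck upgrade.

    Heuristic (an assumption, not CVO logic): a condition is reported when
    Failing is True, or when Progressing is True and its reason starts with
    "ClusterOperator".
    """
    blocking = []
    for cond in conditions:
        ctype = cond.get("type")
        is_true = cond.get("status") == "True"
        reason = cond.get("reason", "")
        failing = ctype == "Failing" and is_true
        stalled = ctype == "Progressing" and is_true and reason.startswith("ClusterOperator")
        if failing or stalled:
            blocking.append((ctype, reason, cond.get("message", "")))
    return blocking


# Shortened copy of the conditions reported for sno01175.
sample = [
    {"type": "Available", "status": "True", "message": "Done applying 4.9.26"},
    {
        "type": "Failing",
        "status": "True",
        "reason": "ClusterOperatorDegraded",
        "message": "Cluster operator console is degraded",
    },
    {
        "type": "Progressing",
        "status": "True",
        "reason": "ClusterOperatorDegraded",
        "message": "Unable to apply 4.10.13: wait has exceeded 40 minutes for these operators: console",
    },
    {"type": "RetrievedUpdates", "status": "True"},
]

for ctype, reason, message in blocking_conditions(sample):
    print(f"{ctype} ({reason}): {message}")
```

The JSON printed by the `oc get clusterversion version -o jsonpath='{.status.conditions}'` command shown earlier could be loaded with `json.loads` and passed to the same function.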


Version-Release number of selected component (if applicable): 4.9.26 upgrade to 4.10.13


How reproducible: 5 out of 2200


Steps to Reproduce:
1. Disable the console by setting managementState: Removed
2. Start from OCP version 4.9.26
3. Initiate an upgrade to 4.10.13 via the ClusterVersion CR
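Step 1 could be scripted by generating a merge-patch payload for the Console operator config. This is a hedged sketch: the helper name `console_management_patch` is hypothetical, and restricting the accepted values to the two states mentioned in this report is a choice made for the example. The clusters in this report received the setting through the CR shown in the description, not through a patch.

```python
import json

# Only the two states that appear in this report are accepted here;
# this restriction is an assumption made for the example.
ALLOWED_STATES = {"Managed", "Removed"}


def console_management_patch(state):
    """Build a JSON merge-patch body that sets spec.managementState."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported managementState: {state!r}")
    return json.dumps({"spec": {"managementState": state}})


print(console_management_patch("Removed"))
# prints {"spec": {"managementState": "Removed"}}
```

Assuming cluster-admin access, the payload could then be applied with something like `oc patch console.operator.openshift.io cluster --type merge -p '<payload>'`.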

Actual results: Cluster upgrade is stuck (no longer progressing) for 5+ hours


Expected results: Cluster upgrade completes


Additional info:

Comment 3 Yanping Zhang 2022-08-16 09:06:55 UTC
Steps to verify:
1. Create a cluster with payload 4.12.0-0.nightly-2022-08-12-053438
# oc get clusterversions.config.openshift.io 
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.nightly-2022-08-12-053438   True        False         89m     Cluster version is 4.12.0-0.nightly-2022-08-12-053438

2. Set the console to Removed in the console operator.
spec:
  logLevel: Normal
  managementState: Removed

3. Update the cluster to new build:
# oc adm upgrade 
info: An upgrade is in progress. Working towards 4.12.0-0.nightly-2022-08-15-092951

Upstream: https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/graph
Channel: stable-4.12

Recommended updates:

  VERSION                            IMAGE
  4.12.0-0.nightly-2022-08-15-150248 registry.ci.openshift.org/ocp/release@sha256:acbff11e154fef25f7244d20b7cda9c3b30c7ef062a23ccccb1c164a45a7f32b

4. Wait for the upgrade to finish successfully.
# oc adm upgrade 
Cluster version is 4.12.0-0.nightly-2022-08-15-092951

Upstream: https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/graph
Channel: stable-4.12

Recommended updates:

  VERSION                            IMAGE
  4.12.0-0.nightly-2022-08-15-150248 registry.ci.openshift.org/ocp/release@sha256:acbff11e154fef25f7244d20b7cda9c3b30c7ef062a23ccccb1c164a45a7f32b
# oc get clusterversions.config.openshift.io 
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.nightly-2022-08-15-092951   True        False         29m     Cluster version is 4.12.0-0.nightly-2022-08-15-092951
# oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.12.0-0.nightly-2022-08-15-092951   True        False         False      179m    
baremetal                                  4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
cloud-controller-manager                   4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
cloud-credential                           4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h16m   
cluster-autoscaler                         4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
config-operator                            4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
console                                    4.12.0-0.nightly-2022-08-15-092951   True        False         False      99s     
csi-snapshot-controller                    4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
dns                                        4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
etcd                                       4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
image-registry                             4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h7m    
ingress                                    4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h7m    
insights                                   4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h8m    
kube-apiserver                             4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h11m   
kube-controller-manager                    4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h12m   
kube-scheduler                             4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h11m   
kube-storage-version-migrator              4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
machine-api                                4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h8m    
machine-approver                           4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
machine-config                             4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
marketplace                                4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h13m   
monitoring                                 4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h5m    
network                                    4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h15m   
node-tuning                                4.12.0-0.nightly-2022-08-15-092951   True        False         False      52m     
openshift-apiserver                        4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h8m    
openshift-controller-manager               4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h8m    
openshift-samples                          4.12.0-0.nightly-2022-08-15-092951   True        False         False      54m     
operator-lifecycle-manager                 4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
operator-lifecycle-manager-catalog         4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
operator-lifecycle-manager-packageserver   4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h8m    
service-ca                                 4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   
storage                                    4.12.0-0.nightly-2022-08-15-092951   True        False         False      3h14m   

5. Set the console back to Managed in the console operator. The console can be accessed normally.

The bug is fixed.

Comment 7 errata-xmlrpc 2023-01-17 19:49:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

