Bug 1862156 - Cannot upgrade a cluster when adding Performance Profile Operator
Summary: Cannot upgrade a cluster when adding Performance Profile Operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.5.z
Assignee: Stefan Schimanski
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On: 1861431
Blocks: 1863076
Reported: 2020-07-30 15:05 UTC by Martin Sivák
Modified: 2020-08-11 20:12 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Cause: A bug in the feature gate upgradeability logic.
Consequence: The CVO marked the cluster as not upgradeable to the next minor version when the LatencySensitive FeatureGate was in use.
Workaround (if any): Upgrade to a version that has this bug fixed.
Result: The upgrade is performed, and the upgraded version includes this bug fix, so the CVO no longer treats the LatencySensitive FeatureGate as blocking minor-version upgrades.
Clone Of: 1861431
Clones: 1863076
Environment:
Last Closed: 2020-08-10 13:50:53 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Github openshift cluster-kube-apiserver-operator pull 921 None closed [release-4.5] Bug 1862156: LatencySensitive feature gate allows upgrades 2020-09-04 00:53:01 UTC
Red Hat Product Errata RHBA-2020:3188 None None None 2020-08-10 13:51:03 UTC
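
The Doc Text above attributes the problem to the feature gate upgradeability logic, and the linked pull request title says the LatencySensitive feature gate should allow upgrades. A minimal Python sketch of that kind of check follows. It is not the operator's actual code (the operator is written in Go); the function name, the allow-list, and the reason and message strings are assumptions made for illustration only.

```python
# Hypothetical sketch: which feature sets leave the cluster upgradeable.
# The allow-list is an assumption based on this bug's description, where
# LatencySensitive was wrongly treated as blocking before the fix.
ALLOWED_FEATURE_SETS = {"", "LatencySensitive"}  # "" stands for the default set


def upgradeable_condition(feature_set: str) -> dict:
    """Return an Upgradeable-style condition for the given feature set."""
    if feature_set in ALLOWED_FEATURE_SETS:
        return {"type": "Upgradeable", "status": "True"}
    return {
        "type": "Upgradeable",
        "status": "False",
        "reason": "RestrictedFeatureGates",  # illustrative reason string
        "message": f"feature set {feature_set!r} does not allow minor-version upgrades",
    }


print(upgradeable_condition("LatencySensitive")["status"])
print(upgradeable_condition("TechPreviewNoUpgrade")["status"])
```

In this sketch the buggy behavior corresponds to an allow-list that lacks "LatencySensitive", so the first call would have reported "False".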

Comment 3 Ke Wang 2020-08-03 15:31:25 UTC
Verified with OCP 4.5.0-0.nightly-2020-08-01-204100; steps below.

Case 1: Upgrade 4.4.z to 4.5.z

$ oc edit featuregate/cluster

$ oc describe featuregate/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  release.openshift.io/create-only: true
API Version:  config.openshift.io/v1
Kind:         FeatureGate
...
Spec:
  Feature Set:  LatencySensitive
Events:         <none>

$ cat topologymanager-kubeletconfig.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
    cpuManagerPolicy: static
    cpuManagerReconcilePeriod: 5s
    topologyManagerPolicy: single-numa-node

$ oc create -f topologymanager-kubeletconfig.yaml 
kubeletconfig.machineconfiguration.openshift.io/cpumanager-enabled created

$ oc get KubeletConfig
NAME                 AGE
cpumanager-enabled   35s

$  oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched

$ oc adm upgrade
Cluster version is 4.4.15

Updates:

VERSION                      IMAGE
4.4.0-0.ci-2020-07-31-153948 registry.svc.ci.openshift.org/ocp/release@sha256:816d581120c2f4e42ae99600cd5e475be0e253a42a416dc12ea418fd4c7697a3

$ oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100 --allow-explicit-upgrade --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.  You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.15    True        True          22m     Working towards 4.5.0-0.nightly-2020-08-01-204100: 79% complete

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-08-01-204100   True        False         52m     Cluster version is 4.5.0-0.nightly-2020-08-01-204100

$ oc get clusterversion -o json | jq .items[0].status
{
  "availableUpdates": null,
  "conditions": [
    {
      "lastTransitionTime": "2020-08-03T12:12:06Z",
      "message": "Done applying 4.5.0-0.nightly-2020-08-01-204100",
      "status": "True",
      "type": "Available"
    },
    {
      "lastTransitionTime": "2020-08-03T14:21:16Z",
      "status": "False",
      "type": "Failing"
    },
    {
      "lastTransitionTime": "2020-08-03T14:22:31Z",
      "message": "Cluster version is 4.5.0-0.nightly-2020-08-01-204100",
      "status": "False",
      "type": "Progressing"
    },
    {
      "lastTransitionTime": "2020-08-03T13:36:24Z",
      "status": "True",
      "type": "RetrievedUpdates"
    },
    {
      "lastTransitionTime": "2020-08-03T12:14:48Z",
      "message": "Cluster operator marketplace cannot be upgraded between minor versions: The cluster has custom OperatorSource, which is deprecated in future versions. Please visit this link for further details: https://docs.openshift.com/container-platform/4.4/release_notes/ocp-4-4-release-notes.html#ocp-4-4-marketplace-apis-deprecated",
      "reason": "DeprecatedAPIsInUse",
      "status": "False",
      "type": "Upgradeable"
    }
  ],
  "desired": {
    "force": true,
    "image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
    "version": "4.5.0-0.nightly-2020-08-01-204100"
  },
  "history": [
    {
      "completionTime": "2020-08-03T14:22:31Z",
      "image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
      "startedTime": "2020-08-03T13:37:47Z",
      "state": "Completed",
      "verified": false,
      "version": "4.5.0-0.nightly-2020-08-01-204100"
    },
    {
      "completionTime": "2020-08-03T12:12:06Z",
      "image": "quay.io/openshift-release-dev/ocp-release@sha256:cf3f799779fb0646c43dd16d376bf67fddd29597009d21223f956f5dd7a4c02f",
      "startedTime": "2020-08-03T11:48:16Z",
      "state": "Completed",
      "verified": false,
      "version": "4.4.15"
    }
  ],
  "observedGeneration": 3,
  "versionHash": "BQVhuXCVbRE="
}
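
Note that Upgradeable is still False in the status above, but its reason is DeprecatedAPIsInUse, reported by the marketplace operator, rather than anything related to the feature gate. A small Python sketch for picking one condition out of such a status document is shown here. The helper name `find_condition` is hypothetical, and the embedded JSON is a trimmed copy of the output above with only two conditions kept.

```python
import json

# Trimmed copy of the status shown above; only two conditions are kept.
STATUS_JSON = """
{
  "conditions": [
    {"status": "True", "type": "Available"},
    {"reason": "DeprecatedAPIsInUse", "status": "False", "type": "Upgradeable"}
  ]
}
"""


def find_condition(status: dict, cond_type: str):
    """Return the first condition of the given type, or None if absent."""
    for cond in status.get("conditions") or []:
        if cond.get("type") == cond_type:
            return cond
    return None


status = json.loads(STATUS_JSON)
upgradeable = find_condition(status, "Upgradeable")
print(upgradeable["status"], upgradeable["reason"])
```

Against a live cluster, the same helper could be fed the parsed output of `oc get clusterversion -o json` after selecting `items[0].status`.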

-----------------------------------------

Case 2: Upgrade between 4.5.z versions
Applied the same featuregate settings as above.

$  oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched

$ oc adm upgrade
Cluster version is 4.5.4

No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.

$ oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100 --allow-explicit-upgrade --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.  You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.4     True        True          12m     Working towards 4.5.0-0.nightly-2020-08-01-204100: 77% complete

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-08-01-204100   True        False         48m     Cluster version is 4.5.0-0.nightly-2020-08-01-204100

$ oc get clusterversion -o json | jq .items[0].status
{
  "availableUpdates": [
    {
      "force": false,
      "image": "registry.svc.ci.openshift.org/ocp/release@sha256:0a2171761c02ca895b33887d5d4991e932f4e87c3e07670557a06fca9923ff87",
      "version": "4.5.0-0.nightly-2020-08-03-123303"
    }
  ],
  "conditions": [
    {
      "lastTransitionTime": "2020-08-03T12:04:27Z",
      "message": "Done applying 4.5.0-0.nightly-2020-08-01-204100",
      "status": "True",
      "type": "Available"
    },
    {
      "lastTransitionTime": "2020-08-03T14:09:15Z",
      "status": "False",
      "type": "Failing"
    },
    {
      "lastTransitionTime": "2020-08-03T14:26:39Z",
      "message": "Cluster version is 4.5.0-0.nightly-2020-08-01-204100",
      "status": "False",
      "type": "Progressing"
    },
    {
      "lastTransitionTime": "2020-08-03T11:34:59Z",
      "status": "True",
      "type": "RetrievedUpdates"
    },
    {
      "lastTransitionTime": "2020-08-03T12:07:09Z",
      "message": "Cluster operator marketplace cannot be upgraded between minor versions: The cluster has custom OperatorSource, which is deprecated in future versions. Please visit this link for further details: https://docs.openshift.com/container-platform/4.4/release_notes/ocp-4-4-release-notes.html#ocp-4-4-marketplace-apis-deprecated",
      "reason": "DeprecatedAPIsInUse",
      "status": "False",
      "type": "Upgradeable"
    }
  ],
  "desired": {
    "force": true,
    "image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
    "version": "4.5.0-0.nightly-2020-08-01-204100"
  },
  "history": [
    {
      "completionTime": "2020-08-03T14:26:39Z",
      "image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
      "startedTime": "2020-08-03T13:48:01Z",
      "state": "Completed",
      "verified": false,
      "version": "4.5.0-0.nightly-2020-08-01-204100"
    },
    {
      "completionTime": "2020-08-03T12:04:27Z",
      "image": "quay.io/openshift-release-dev/ocp-release@sha256:02dfcae8f6a67e715380542654c952c981c59604b1ba7f569b13b9e5d0fbbed3",
      "startedTime": "2020-08-03T11:34:59Z",
      "state": "Completed",
      "verified": false,
      "version": "4.5.4"
    }
  ],
  "observedGeneration": 3,
  "versionHash": "BQVhuXCVbRE="
}

The test results above show that the fix works, so moving the bug to Verified.

Comment 4 Scott Dodson 2020-08-04 19:37:00 UTC
The Upgradeable=False condition applies to the currently running version and blocks you from upgrading to the next minor version, therefore this shouldn't block 4.4 to 4.5 upgrades but would block upgrades from 4.5 to 4.6. So the minimum version for the 4.5 to 4.6 upgrade should be at least 4.5.5 and we should consider raising the minimum 4.4 version once the fix has been backported to 4.4. It appears there are currently four affected clusters based on telemetry data.
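
The rule described in this comment can be sketched in Python as follows: Upgradeable=False on the running version only matters when the target is on a later minor version. The function names are hypothetical, and the version parsing is a simplification that drops any suffix such as the nightly tag.

```python
def parse_version(version: str) -> tuple:
    """Parse 'X.Y.Z' into a tuple of ints, ignoring any '-suffix'."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def blocked_by_upgradeable(current: str, target: str, upgradeable: bool) -> bool:
    """Upgradeable=False only blocks updates that move to a later minor version."""
    cur, tgt = parse_version(current), parse_version(target)
    crosses_minor = tgt[:2] > cur[:2]
    return crosses_minor and not upgradeable


# A 4.5 cluster reporting Upgradeable=False: patch updates proceed, 4.6 is blocked.
print(blocked_by_upgradeable("4.5.4", "4.5.5", upgradeable=False))
print(blocked_by_upgradeable("4.5.4", "4.6.0", upgradeable=False))
```

This is why an affected 4.5 cluster can still take the 4.5.5 update that carries the fix, and only afterwards becomes eligible for 4.6.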

Comment 6 errata-xmlrpc 2020-08-10 13:50:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.5 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3188

Comment 7 W. Trevor King 2020-08-11 20:12:33 UTC
Updating the doc text as described in [1], although with the release shipped it may be too late for updates ;).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1863076#c5

