Cause:
A bug in the feature gate upgradeability logic.
Consequence:
The CVO marked the cluster as not upgradeable to the next minor version when the LatencySensitive FeatureGate was in use.
Workaround (if any):
Upgrade to a version that has this bug fixed.
Result:
After upgrading to a version that includes this fix, the CVO no longer treats the LatencySensitive FeatureGate as blocking minor-version upgrades.
Verified with OCP 4.5.0-0.nightly-2020-08-01-204100. Steps are below.
Case 1: Upgrade 4.4.z to 4.5.z
$ oc edit featuregate/cluster
$ oc describe featuregate/cluster
Name: cluster
Namespace:
Labels: <none>
Annotations: release.openshift.io/create-only: true
API Version: config.openshift.io/v1
Kind: FeatureGate
...
Spec:
Feature Set: LatencySensitive
Events: <none>
$ cat topologymanager-kubeletconfig.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
    cpuManagerPolicy: static
    cpuManagerReconcilePeriod: 5s
    topologyManagerPolicy: single-numa-node
$ oc create -f topologymanager-kubeletconfig.yaml
kubeletconfig.machineconfiguration.openshift.io/cpumanager-enabled created
$ oc get KubeletConfig
NAME AGE
cpumanager-enabled 35s
$ oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched
$ oc adm upgrade
Cluster version is 4.4.15
Updates:
VERSION IMAGE
4.4.0-0.ci-2020-07-31-153948 registry.svc.ci.openshift.org/ocp/release@sha256:816d581120c2f4e42ae99600cd5e475be0e253a42a416dc12ea418fd4c7697a3
$ oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100 --allow-explicit-upgrade --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.4.15 True True 22m Working towards 4.5.0-0.nightly-2020-08-01-204100: 79% complete
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.5.0-0.nightly-2020-08-01-204100 True False 52m Cluster version is 4.5.0-0.nightly-2020-08-01-204100
$ oc get clusterversion -o json | jq .items[0].status
{
"availableUpdates": null,
"conditions": [
{
"lastTransitionTime": "2020-08-03T12:12:06Z",
"message": "Done applying 4.5.0-0.nightly-2020-08-01-204100",
"status": "True",
"type": "Available"
},
{
"lastTransitionTime": "2020-08-03T14:21:16Z",
"status": "False",
"type": "Failing"
},
{
"lastTransitionTime": "2020-08-03T14:22:31Z",
"message": "Cluster version is 4.5.0-0.nightly-2020-08-01-204100",
"status": "False",
"type": "Progressing"
},
{
"lastTransitionTime": "2020-08-03T13:36:24Z",
"status": "True",
"type": "RetrievedUpdates"
},
{
"lastTransitionTime": "2020-08-03T12:14:48Z",
"message": "Cluster operator marketplace cannot be upgraded between minor versions: The cluster has custom OperatorSource, which is deprecated in future versions. Please visit this link for further details: https://docs.openshift.com/container-platform/4.4/release_notes/ocp-4-4-release-notes.html#ocp-4-4-marketplace-apis-deprecated",
"reason": "DeprecatedAPIsInUse",
"status": "False",
"type": "Upgradeable"
}
],
"desired": {
"force": true,
"image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
"version": "4.5.0-0.nightly-2020-08-01-204100"
},
"history": [
{
"completionTime": "2020-08-03T14:22:31Z",
"image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
"startedTime": "2020-08-03T13:37:47Z",
"state": "Completed",
"verified": false,
"version": "4.5.0-0.nightly-2020-08-01-204100"
},
{
"completionTime": "2020-08-03T12:12:06Z",
"image": "quay.io/openshift-release-dev/ocp-release@sha256:cf3f799779fb0646c43dd16d376bf67fddd29597009d21223f956f5dd7a4c02f",
"startedTime": "2020-08-03T11:48:16Z",
"state": "Completed",
"verified": false,
"version": "4.4.15"
}
],
"observedGeneration": 3,
"versionHash": "BQVhuXCVbRE="
}
-----------------------------------------
Case 2: Upgrade between 4.5.z versions
Applied the same featuregate settings as above.
$ oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched
$ oc adm upgrade
Cluster version is 4.5.4
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.
$ oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100 --allow-explicit-upgrade --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.5.4 True True 12m Working towards 4.5.0-0.nightly-2020-08-01-204100: 77% complete
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.5.0-0.nightly-2020-08-01-204100 True False 48m Cluster version is 4.5.0-0.nightly-2020-08-01-204100
$ oc get clusterversion -o json | jq .items[0].status
{
"availableUpdates": [
{
"force": false,
"image": "registry.svc.ci.openshift.org/ocp/release@sha256:0a2171761c02ca895b33887d5d4991e932f4e87c3e07670557a06fca9923ff87",
"version": "4.5.0-0.nightly-2020-08-03-123303"
}
],
"conditions": [
{
"lastTransitionTime": "2020-08-03T12:04:27Z",
"message": "Done applying 4.5.0-0.nightly-2020-08-01-204100",
"status": "True",
"type": "Available"
},
{
"lastTransitionTime": "2020-08-03T14:09:15Z",
"status": "False",
"type": "Failing"
},
{
"lastTransitionTime": "2020-08-03T14:26:39Z",
"message": "Cluster version is 4.5.0-0.nightly-2020-08-01-204100",
"status": "False",
"type": "Progressing"
},
{
"lastTransitionTime": "2020-08-03T11:34:59Z",
"status": "True",
"type": "RetrievedUpdates"
},
{
"lastTransitionTime": "2020-08-03T12:07:09Z",
"message": "Cluster operator marketplace cannot be upgraded between minor versions: The cluster has custom OperatorSource, which is deprecated in future versions. Please visit this link for further details: https://docs.openshift.com/container-platform/4.4/release_notes/ocp-4-4-release-notes.html#ocp-4-4-marketplace-apis-deprecated",
"reason": "DeprecatedAPIsInUse",
"status": "False",
"type": "Upgradeable"
}
],
"desired": {
"force": true,
"image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
"version": "4.5.0-0.nightly-2020-08-01-204100"
},
"history": [
{
"completionTime": "2020-08-03T14:26:39Z",
"image": "registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-08-01-204100",
"startedTime": "2020-08-03T13:48:01Z",
"state": "Completed",
"verified": false,
"version": "4.5.0-0.nightly-2020-08-01-204100"
},
{
"completionTime": "2020-08-03T12:04:27Z",
"image": "quay.io/openshift-release-dev/ocp-release@sha256:02dfcae8f6a67e715380542654c952c981c59604b1ba7f569b13b9e5d0fbbed3",
"startedTime": "2020-08-03T11:34:59Z",
"state": "Completed",
"verified": false,
"version": "4.5.4"
}
],
"observedGeneration": 3,
"versionHash": "BQVhuXCVbRE="
}
The test results above show that the fix works, so moving the bug to Verified.
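The check done by eye on the `jq .items[0].status` output can be sketched in Python. This is only an illustration: the helper name `find_condition` is hypothetical, and the sample is a trimmed copy of the Case 1 conditions shown above. It pulls out the Upgradeable condition so that its reason can be inspected; in both cases the only reason reported is DeprecatedAPIsInUse from the marketplace operator.

```python
import json


def find_condition(status, cond_type):
    """Return the condition dict with the given type, or None if absent."""
    for cond in status.get("conditions") or []:
        if cond.get("type") == cond_type:
            return cond
    return None


# Trimmed sample of the ClusterVersion status from the Case 1 output.
sample = json.loads("""
{
  "conditions": [
    {"status": "True", "type": "Available"},
    {"status": "False", "type": "Failing"},
    {"reason": "DeprecatedAPIsInUse", "status": "False", "type": "Upgradeable"}
  ]
}
""")

upgradeable = find_condition(sample, "Upgradeable")
print(upgradeable["status"], upgradeable["reason"])
```

On a live cluster, the same function could be fed the output of `oc get clusterversion -o json` after selecting `items[0].status`.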
The Upgradeable=False condition applies to the currently running version and blocks upgrades to the next minor version. Therefore this shouldn't block 4.4 to 4.5 upgrades, but it would block upgrades from 4.5 to 4.6. The minimum version for the 4.5 to 4.6 upgrade should therefore be at least 4.5.5, and we should consider raising the minimum 4.4 version once the fix has been backported to 4.4. Based on telemetry data, there appear to be four affected clusters at present.
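The rule described here, that Upgradeable=False only gates a move to a different minor version, can be modelled roughly as follows. This is a simplified sketch, not the CVO's actual implementation; the function names are hypothetical and the version strings are examples.

```python
import re


def minor_of(version):
    """Extract (major, minor) from a version string such as '4.5.4'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def blocked_by_upgradeable(current, target, upgradeable_status):
    """Upgradeable=False blocks only updates that change the minor version."""
    if upgradeable_status != "False":
        return False
    return minor_of(current) != minor_of(target)


print(blocked_by_upgradeable("4.5.4", "4.5.5", "False"))  # False
print(blocked_by_upgradeable("4.5.4", "4.6.0", "False"))  # True
```

Under this model, a 4.5.4 cluster with the condition set can still take the 4.5.5 fix, after which the 4.5 to 4.6 path is no longer blocked by this bug.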
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (OpenShift Container Platform 4.5.5 bug fix update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:3188
Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel this bug still needs to be a suspect, please add the keyword again.
[1]: https://github.com/openshift/enhancements/pull/475