Ben added UpgradeBlocker, but I don't think this is something we're going to block update recommendations on [1], is it? If I'm missing something, please restore the keyword and explain which edges are impacted. I do expect we'll be raising minor_min in our build suggestions [2] to ensure future 4.8 -> 4.9 recommendations have the backported-to-4.8 version of this code.

[1]: https://github.com/openshift/enhancements/pull/475
[2]: https://github.com/openshift/cincinnati-graph-data/blob/master/build-suggestions/4.9.yaml
> Ben added UpgradeBlocker, but I don't think this is something we're going to block update recommendations on [1], is it?

Yes, it is an UpgradeBlocker (it blocks upgrades to 4.9.0). We cannot create an upgrade edge from any 4.8.z to 4.9.0 until we are confident that admins cannot upgrade from that 4.8.z to 4.9.0 without first acknowledging the APIs that are being removed in 4.9.0. Technically this is a bug against 4.8.z (that is where the code change needs to go), but ultimately the release of (and upgrade to) 4.9.0 is what needs to be blocked.
Verification plan will be something like:

1. Launch a 4.9 nightly that has the new code.
2. Confirm Upgradeable=True condition in the ClusterVersion object.
3. Add an 'ack-4.9-testing: Acknowledge this to unblock updates to 4.10.' entry to the admin-gates ConfigMap in the openshift-config-managed namespace.
4. Confirm Upgradeable=False condition in the ClusterVersion object with an AdminAckRequired reason and a message including the value from step 3.
5. Add an 'ack-4.9-testing: true' entry to the admin-acks ConfigMap in the openshift-config namespace.
6. Confirm Upgradeable=True condition in the ClusterVersion object.
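For anyone who wants to reason about the expected results of the plan above, here is a rough model of the gate/ack comparison. It is only a sketch: evaluate_gates is a hypothetical helper and not the CVO's code, and it assumes a gate counts as acknowledged only when admin-acks holds the same key with the value "true". Joining several pending messages with a space is also an assumption.

```python
from typing import Dict, Tuple


def evaluate_gates(gates: Dict[str, str], acks: Dict[str, str]) -> Tuple[bool, str, str]:
    """Model the plan's expectation as (upgradeable, reason, message).

    gates: data of the admin-gates ConfigMap (key -> message shown to the admin)
    acks:  data of the admin-acks ConfigMap (key -> "true" once acknowledged)
    Hypothetical helper for illustration; not taken from the CVO source.
    """
    # A gate is pending until the same key appears in acks with value "true".
    pending = [message for key, message in sorted(gates.items())
               if acks.get(key) != "true"]
    if not pending:
        return True, "", ""
    # Assumption: multiple pending messages are joined with a space.
    return False, "AdminAckRequired", " ".join(pending)


if __name__ == "__main__":
    gates = {"ack-4.9-testing": "Acknowledge this to unblock updates to 4.10."}
    print(evaluate_gates({}, {}))                                # steps 1-2
    print(evaluate_gates(gates, {}))                             # steps 3-4
    print(evaluate_gates(gates, {"ack-4.9-testing": "true"}))    # steps 5-6
```

The three calls in the main block correspond to steps 2, 4, and 6 of the plan.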
Trevor's steps in comment 8 are correct. Reference enhancement [1] for more details on ConfigMap formats. [1] https://github.com/openshift/enhancements/blob/master/enhancements/update/upgrades-blocking-on-ack.md
Maybe it should be a separate bug, since I don't want to block verifying the gating function, but part of calling this work "complete" also means delivering product documentation that walks users through the verification process, and ensuring that the gating message directs the user to that documentation.
From the enhancement doc, my understanding is that the ideal customer case for this feature is the 4.8 -> 4.9 upgrade. So 4.8.z will eventually pick up the PR after we verify that this feature works on the 4.9 -> 4.10 upgrade path, am I right?
> From enhancement doc, per my understanding, the ideal customer case for this feature should be 4.8 -> 4.9 upgrade. So that means 4.8.z will eventually pick up the PR after we verify this feature works on 4.9->4.10 upgrade path, am I right?

You are correct.
I played around with it a bit, but it looks like I cannot trigger the Upgradeable=False alert. Here are my steps:

1. Install a cluster using 4.9.0-0.nightly-2021-08-25-185404.

2. On the fresh install, there is no "Upgradeable=True" condition in the ClusterVersion object.

3. Run `oc adm upgrade`. The output did not show "Upgradeable=False", which means the user is allowed to upgrade.

[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.nightly-2021-08-25-185404
Upstream: https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/graph
Channel: stable-4.9
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and may result in downtime or data loss.

4. Add an 'ack-4.9-testing: Acknowledge this to unblock updates to 4.10.' entry to the admin-gates ConfigMap in the openshift-config-managed namespace.

[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed patch cm admin-gates --patch '{"data":{"ack-4.9-testing":"Acknowledge this to unblock updates to 4.10"}}' --type=merge
configmap/admin-gates patched
[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed get cm admin-gates -o yaml
apiVersion: v1
data:
  ack-4.9-testing: Acknowledge this to unblock updates to 4.10
kind: ConfigMap
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-08-26T08:14:51Z"
  name: admin-gates
  namespace: openshift-config-managed
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 1b591818-d938-45f0-9f35-5cbe8ebae251
  resourceVersion: "133524"
  uid: 5dae5651-f8a0-406a-b979-58b3e719c94e

5. No "Upgradeable=False" condition is seen in the ClusterVersion object.

[root@preserve-jialiu-ansible ~]# oc get clusterversion -ojson | jq -r '.items[].status.conditions[] | select(.type == "Upgradeable" and .status == "True")'
[root@preserve-jialiu-ansible ~]# oc get clusterversion -ojson | jq -r '.items[].status.conditions[] | select(.type == "Upgradeable")'

6. Tried to upgrade the cluster to 4.9.0-0.nightly-2021-08-26-040328 without adding an 'ack-4.9-testing: true' entry to the admin-acks ConfigMap in the openshift-config namespace. The upgrade rolled out, which is not expected.

[root@preserve-jialiu-ansible ~]# oc adm upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:664f96b82baddeecc57d41de9d9369d0456c566daa0762d4bbcfd0bc17de46fd --allow-explicit-upgrade
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
Updating to release image registry.ci.openshift.org/ocp/release@sha256:664f96b82baddeecc57d41de9d9369d0456c566daa0762d4bbcfd0bc17de46fd
[root@preserve-jialiu-ansible ~]# oc adm upgrade
info: An upgrade is in progress. Working towards 4.9.0-0.nightly-2021-08-26-040328: 9 of 732 done (1% complete)

Did I miss something?
Thinking more about this enhancement: if this PR lands in 4.8.z, then per the logic, a user upgrading the cluster from 4.8.z to 4.8.z+1 will also need to add an 'ack-4.9-testing: true' entry. User acknowledgment should only be required for the 4.8.z to 4.9 upgrade scenario, am I right?
Your steps look correct, but it's possible your updates to the admin-gates ConfigMap got stomped by the CVO. Did you confirm your changes persisted after you saved your changes?

With respect to your other questions:

The way the ack keys work, the format is "ack-$version-description", where $version is the version you expect the user to be on when you want to block them from going to the next y-stream. So in your case you were on 4.9 and added an "ack-4.9" gate, which is correct; it should block the cluster from going to 4.10.

After this is backported to 4.8, we will have 4.8 create an "ack-4.8" gate, to prevent users from going to 4.9 until they have performed the ack. This is described in the EP here:
https://github.com/openshift/enhancements/blob/master/enhancements/update/upgrades-blocking-on-ack.md#48z-specific-implementation
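To make the key format concrete, here is a small sketch of how an "ack-$version-description" key could be split and matched against the running version. The helper names and the regular expression are hypothetical and only follow the description in this comment; the enhancement proposal remains the authoritative definition, and "ack-4.8-example" below is a made-up key.

```python
import re
from typing import Optional, Tuple

# Assumed shape: "ack-<major>.<minor>-<description>", per the comment above.
_ACK_KEY = re.compile(r"^ack-(\d+)\.(\d+)-(.+)$")


def parse_ack_key(key: str) -> Optional[Tuple[int, int, str]]:
    """Split a gate key into (major, minor, description), or None if it does not match."""
    match = _ACK_KEY.match(key)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def gate_applies(key: str, current_version: str) -> bool:
    """True when the gate's version equals the major.minor the cluster is running."""
    parsed = parse_ack_key(key)
    if parsed is None:
        return False
    major, minor, _ = parsed
    current = current_version.split(".")
    return (int(current[0]), int(current[1])) == (major, minor)


def blocked_minor(key: str) -> Optional[str]:
    """The next y-stream that the gate is meant to block, e.g. "4.10" for "ack-4.9-...". """
    parsed = parse_ack_key(key)
    if parsed is None:
        return None
    major, minor, _ = parsed
    return f"{major}.{minor + 1}"


if __name__ == "__main__":
    print(parse_ack_key("ack-4.9-testing"))
    print(gate_applies("ack-4.9-testing", "4.9.0-0.nightly-2021-08-25-185404"))
    print(blocked_minor("ack-4.9-testing"))
```

Under these assumptions an "ack-4.9" gate is only relevant on a 4.9 cluster and guards the move to 4.10, while an "ack-4.8" gate would be ignored there.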
> did you confirm your changes persisted after you saved your changes?

Yeah, I played with it for hours, and checked that the setting was there, forwards and backwards, when I found it did not work.

> After this is backported to 4.8, we will have 4.8 create an "ack-4.8" gate, to prevent users from going to 4.9 until they have performed the ack.

When my cluster runs the backported 4.8.N version and the "ack-4.8" gate is installed, what happens if the user wants to upgrade to 4.8.(N+1) rather than 4.9? In the code logic, I did not see a check of the target version; it just asks the user to acknowledge the gate.
> Yeah, I played with it for hours, and check the setting was there forwards and backwards when I found it did not work.

OK. Your procedure sounds OK to me; I guess you'll have to work with Jack on why it didn't work as expected.

> When my cluster the backported 4.8.N version, the "ack-4.8" gate is installed, if user want to upgrade to 4.8.(N+1), but not 4.9, then what happened?

Upgradeable=False will be set on the cluster, but Upgradeable=False does not prevent z-stream upgrades, so you can still upgrade to 4.8.n+1 without doing the ack.
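The distinction between z-stream and minor updates can be pictured roughly as follows. This is a sketch only: update_blocked is a hypothetical helper, the version strings in the example are illustrative, and treating any change in major.minor as gated is an assumption. This comment only states that z-stream updates are not blocked.

```python
from typing import Tuple


def minor_of(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version such as "4.8.8" or "4.9.0-0.nightly-...". """
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def update_blocked(current: str, target: str, upgradeable: bool) -> bool:
    """Model: Upgradeable=False only stands in the way when major.minor changes."""
    if upgradeable:
        return False
    # Same major.minor means a z-stream update, which is allowed regardless.
    return minor_of(current) != minor_of(target)


if __name__ == "__main__":
    # Illustrative versions, not taken from a real update graph.
    print(update_blocked("4.8.10", "4.8.11", upgradeable=False))  # z-stream
    print(update_blocked("4.8.10", "4.9.0", upgradeable=False))   # minor
```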
With a cluster-bot cluster:

$ oc get clusterversion -o jsonpath='{.status.desired.version}{"\n"}' version
4.9.0-0.nightly-2021-08-26-164418
$ oc adm release info --commits | grep cluster-version-operator
cluster-version-operator https://github.com/openshift/cluster-version-operator 111166dbb45a8e1693685d6095674c77d02bb2cf
$ git --no-pager log --oneline --first-parent -3
111166db (HEAD -> master, origin/release-4.9, origin/release-4.10, origin/master, origin/HEAD) Merge pull request #642 from jottofar/bug-1986707
17d9690b Merge pull request #643 from wking/UpdateAvailable-labels
d2adc87e Merge pull request #637 from jottofar/etcd-225-backup
$ oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2021-08-26T23:30:59Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
2021-08-26T23:49:31Z Available=True : Done applying 4.9.0-0.nightly-2021-08-26-164418
2021-08-26T23:48:31Z Failing=False :
2021-08-26T23:49:31Z Progressing=False : Cluster version is 4.9.0-0.nightly-2021-08-26-164418

On into the test:

$ oc -n openshift-config-managed patch configmap admin-gates --patch '{"data":{"ack-4.9-testing":"Acknowledge this to unblock updates to 4.10"}}' --type=merge

Indeed, no Upgradeable yet:

$ oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2021-08-26T23:30:59Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
2021-08-26T23:49:31Z Available=True : Done applying 4.9.0-0.nightly-2021-08-26-164418
2021-08-26T23:48:31Z Failing=False :
2021-08-26T23:49:31Z Progressing=False : Cluster version is 4.9.0-0.nightly-2021-08-26-164418

Looking in the CVO logs, it did notice the change:

$ oc -n openshift-cluster-version logs cluster-version-operator-5c7778c6d4-4c5x2 | grep upgradeable.go | grep -v 'were recently checked'
I0826 23:41:18.029482 1 upgradeable.go:396] ConfigMap openshift-config-managed/admin-gates added.
I0826 23:41:35.545618 1 upgradeable.go:396] ConfigMap openshift-config/admin-acks added.
I0827 00:03:11.661379 1 upgradeable.go:407] ConfigMap openshift-config-managed/admin-gates updated.

although the setUpgradeableConditions call that should trigger is currently commented out [1]. So instead we wait out the usual syncUpgradeable poll. We have some of those:

$ oc -n openshift-cluster-version logs cluster-version-operator-5c7778c6d4-4c5x2 | grep 'upgradeable.go\|Started syncing upgradeable' | tail -n3
I0827 00:03:11.661379 1 upgradeable.go:407] ConfigMap openshift-config-managed/admin-gates updated.
I0827 00:04:29.298743 1 cvo.go:602] Started syncing upgradeable "openshift-cluster-version/version" (2021-08-27 00:04:29.298737078 +0000 UTC m=+1715.248783130)
I0827 00:07:41.859516 1 cvo.go:602] Started syncing upgradeable "openshift-cluster-version/version" (2021-08-27 00:07:41.859511373 +0000 UTC m=+1907.809557429)

Aha, apparently we did not include this check in 4.9 [2], so it's all dead code. I'm not sure how we can verify this in 4.9 before backporting. Or if we're even getting any benefit from soaking in 4.9. I'll file a PR to make this more alive, and we can revisit.

[1]: https://github.com/openshift/cluster-version-operator/pull/633/files#diff-45df4901f88b6867d7fbf7e50690f376812864a0b85ec80dda5f77e6e19097b9R409
[2]: https://github.com/openshift/cluster-version-operator/pull/633/files#diff-45df4901f88b6867d7fbf7e50690f376812864a0b85ec80dda5f77e6e19097b9R383
Ok, here's some pre-merge testing with 'launch 4.9,openshift/cluster-version-operator#645':

$ oc get clusterversion -o jsonpath='{.status.desired.version}{"\n"}' version
4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
$ oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2021-08-27T04:19:17Z Available=True : Done applying 4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
2021-08-27T04:19:17Z Failing=False :
2021-08-27T04:19:17Z Progressing=False : Cluster version is 4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
2021-08-27T03:57:26Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
$ oc -n openshift-config-managed patch configmap admin-gates --patch '{"data":{"ack-4.9-testing":"Acknowledge this to unblock updates to 4.10"}}' --type=merge
$ oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2021-08-27T04:19:17Z Available=True : Done applying 4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
2021-08-27T04:19:17Z Failing=False :
2021-08-27T04:19:17Z Progressing=False : Cluster version is 4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
2021-08-27T03:57:26Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
2021-08-27T04:25:17Z Upgradeable=False AdminAckRequired: Acknowledge this to unblock updates to 4.10
$ oc -n openshift-config patch configmap admin-acks --patch '{"data":{"ack-4.9-testing":"true"}}' --type=merge
$ oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2021-08-27T04:19:17Z Available=True : Done applying 4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
2021-08-27T04:19:17Z Failing=False :
2021-08-27T04:19:17Z Progressing=False : Cluster version is 4.9.0-0.ci.test-2021-08-27-035113-ci-ln-f10sqvt-latest
2021-08-27T03:57:26Z RetrievedUpdates=False NoChannel: The update channel has not been configured.

So looks good. None of the explicit Upgradeable=True I was expecting in comment 8, but a lack of Upgradeable condition in ClusterVersion has the same effect.
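For scripted checks, the "missing condition has the same effect as True" observation can be captured along these lines. This is a sketch, not a supported tool: find_condition and is_upgradeable are hypothetical names, the input is assumed to be the parsed JSON from `oc get -o json clusterversion version`, and treating only an explicit "False" status as blocking is an assumption.

```python
from typing import Optional


def find_condition(cluster_version: dict, cond_type: str) -> Optional[dict]:
    """Return the first status condition of the given type, or None when absent."""
    for condition in cluster_version.get("status", {}).get("conditions", []):
        if condition.get("type") == cond_type:
            return condition
    return None


def is_upgradeable(cluster_version: dict) -> bool:
    """Treat a missing Upgradeable condition the same as Upgradeable=True."""
    condition = find_condition(cluster_version, "Upgradeable")
    # Assumption: only an explicit "False" status blocks; absence does not.
    return condition is None or condition.get("status") != "False"


if __name__ == "__main__":
    gated = {"status": {"conditions": [{
        "type": "Upgradeable",
        "status": "False",
        "reason": "AdminAckRequired",
        "message": "Acknowledge this to unblock updates to 4.10",
    }]}}
    print(is_upgradeable(gated))
    print(is_upgradeable({"status": {"conditions": []}}))
```

The dictionary can be produced with json.loads on the command output, so the helper works the same before the gate is added and after the ack is recorded.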
Verified this bug with a pre-merge build.

Scenario 1:

[root@preserve-jialiu-ansible ~]# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest True False 3m9s Cluster version is 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest
[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest not found in the "stable-4.9" channel
[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed patch cm admin-gates --patch '{"data":{"ack-4.9-testing":"Acknowledge this to unblock updates to 4.10"}}' --type=merge
configmap/admin-gates patched
[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed get cm admin-gates -o yaml
apiVersion: v1
data:
  ack-4.9-testing: Acknowledge this to unblock updates to 4.10
kind: ConfigMap
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-08-30T08:20:21Z"
  name: admin-gates
  namespace: openshift-config-managed
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 6e87b1b0-5721-4f64-a8e8-97f82f7a6f55
  resourceVersion: "30822"
  uid: a0932a39-88a0-4994-832d-282de44cf71f
[root@preserve-jialiu-ansible ~]# oc get clusterversion -ojson | jq -r '.items[].status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2021-08-30T08:49:01Z",
  "message": "Acknowledge this to unblock updates to 4.10",
  "reason": "AdminAckRequired",
  "status": "False",
  "type": "Upgradeable"
}
[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest
Upgradeable=False
  Reason: AdminAckRequired
  Message: Acknowledge this to unblock updates to 4.10
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest not found in the "stable-4.9" channel

Trigger a z-stream upgrade to ensure it is not blocked.

[root@preserve-jialiu-ansible ~]# oc adm upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:57c87dff1c29de881608160da96ff4243ce05444c8271cddca41006191d70aac --allow-explicit-upgrade
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
Updating to release image registry.ci.openshift.org/ocp/release@sha256:57c87dff1c29de881608160da96ff4243ce05444c8271cddca41006191d70aac
[root@preserve-jialiu-ansible ~]# oc adm upgrade
info: An upgrade is in progress. Working towards 4.9.0-0.nightly-2021-08-29-010334: 95 of 732 done (12% complete), waiting on kube-apiserver
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.9.0-0.nightly-2021-08-29-010334 not found in the "stable-4.9" channel

Scenario 2:

Fresh-install another cluster to validate upgrades between minor releases.

[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest not found in the "stable-4.9" channel
[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed patch cm admin-gates --patch '{"data":{"ack-4.9-testing":"Acknowledge this to unblock updates to 4.10"}}' --type=merge
configmap/admin-gates patched
[root@preserve-jialiu-ansible ~]# oc -n openshift-config-managed get cm admin-gates -o yaml
apiVersion: v1
data:
  ack-4.9-testing: Acknowledge this to unblock updates to 4.10
kind: ConfigMap
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-08-30T10:11:00Z"
  name: admin-gates
  namespace: openshift-config-managed
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: f2f2b070-4d24-4024-b709-f1a9df0c4e48
  resourceVersion: "33920"
  uid: 5cc246cd-3009-4fa1-a5f6-5cd948ad6710
[root@preserve-jialiu-ansible ~]# oc get clusterversion -ojson | jq -r '.items[].status.conditions[] | select(.type == "Upgradeable" and .status == "False")'
{
  "lastTransitionTime": "2021-08-30T10:54:29Z",
  "message": "Acknowledge this to unblock updates to 4.10",
  "reason": "AdminAckRequired",
  "status": "False",
  "type": "Upgradeable"
}
[root@preserve-jialiu-ansible ~]# oc adm upgrade
Cluster version is 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest
Upgradeable=False
  Reason: AdminAckRequired
  Message: Acknowledge this to unblock updates to 4.10
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest not found in the "stable-4.9" channel

Upgrade to a 4.8 version, e.g. 4.8.8:

[root@preserve-jialiu-ansible ~]# oc adm upgrade --to-image quay.io/openshift-release-dev/ocp-release@sha256:d7a39773aec3cb5e3599be828ac101e062c0b587c9e922ed1f3a8cc71b01a93f --allow-explicit-upgrade
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
Updating to release image quay.io/openshift-release-dev/ocp-release@sha256:d7a39773aec3cb5e3599be828ac101e062c0b587c9e922ed1f3a8cc71b01a93f
[root@preserve-jialiu-ansible ~]# oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.8.8: it may not be safe to apply this update
Upgradeable=False
  Reason: AdminAckRequired
  Message: Acknowledge this to unblock updates to 4.10
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest not found in the "stable-4.9" channel
[root@preserve-jialiu-ansible ~]# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.ci.test-2021-08-30-074010-ci-ln-yqqs6bt-latest True True 3m42s Unable to apply 4.8.8: it may not be safe to apply this update

Ack the upgrade:

[root@preserve-jialiu-ansible ~]# oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.9-testing":"true"}}' --type=merge
configmap/admin-acks patched
[root@preserve-jialiu-ansible ~]# oc -n openshift-config get cm admin-acks -o yaml
apiVersion: v1
data:
  ack-4.9-testing: "true"
kind: ConfigMap
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2021-08-30T10:11:02Z"
  name: admin-acks
  namespace: openshift-config
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: f2f2b070-4d24-4024-b709-f1a9df0c4e48
  resourceVersion: "38204"
  uid: 607ebfaf-5d20-4758-9192-ab69a4dd523c
[root@preserve-jialiu-ansible ~]# oc get clusterversion -ojson | jq -r '.items[].status.conditions[] | select(.type == "Upgradeable")'

The upgrade is ongoing.

[root@preserve-jialiu-ansible ~]# oc adm upgrade
info: An upgrade is in progress. Working towards 4.8.8: 69 of 678 done (10% complete)
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.9
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.8.8 not found in the "stable-4.9" channel
I've created the clone to 4.8.z so we can start the backport process once this bug is verified: https://bugzilla.redhat.com/show_bug.cgi?id=1999092
Per comment 21, moving this bug to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759
Everything from comment 7 has been addressed, so we can remove the UpgradeBlocker reminder keyword now.