Document URL: https://docs.openshift.com/container-platform/4.6/updating/updating-cluster-cli.html#update-upgrading-cli_updating-cluster-cli
Section Number and Name: 7
Describe the issue: The procedure does not describe checking node status after running 'oc get clusterversion', which shows only the control plane's update progress.
Suggestions for improvement: Add an eighth procedure step to check each node's status and version.
Additional information: https://coreos.slack.com/archives/C2ZA5QGMV/p1620053053198000
@Trevor can you confirm that adding an additional step for verifying the node statuses during a cluster upgrade (oc get nodes) is useful on top of the current recommendation of checking the cluster version (oc get clusterversion)?
Querying nodes (or MachineConfigPools, for machine-config-managed nodes) might be useful in some cases (e.g. you want to deploy a workload that requires your compute to all be v1.2.3 or greater). I'm fuzzy on how bring-your-own RHEL and such fit in, but for MachineConfigPools, the machine-config operator will complain about them if they get stuck. So if folks are blocking some action on "wait until $POOL reaches $VERSION", then yeah, some kind of polling/waiting command suggestion seems useful. But for folks who are not blocking an action, it seems easier to wait and react to alert push-notifications. Maybe ask the machine-config folks if they have opinions?
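For the blocking case above, a polling step could look roughly like the sketch below. It is only an illustration: `wait_for_pool` and the `OC_BIN` override are hypothetical names, and the `worker` pool and 30-minute timeout are assumed defaults. It relies on `oc wait` and the `Updated` condition on MachineConfigPools, which reports that a pool has reached its target configuration rather than a specific version.

```shell
# Hypothetical helper: block until a MachineConfigPool reports Updated.
# Pool name (default: worker) and timeout (default: 30m) are assumptions.
# OC_BIN is an illustrative override so the client binary can be swapped out.
wait_for_pool() {
  "${OC_BIN:-oc}" wait "machineconfigpool/${1:-worker}" \
    --for=condition=Updated --timeout="${2:-30m}"
}

# Example usage:
#   wait_for_pool worker 45m && echo "worker pool finished updating"
```

`oc wait` exits non-zero if the timeout expires first, so the helper can gate a follow-up action directly.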
@Jerry do you have any opinions on the question I posed in comment#1? And/or suggestions for use cases where checking nodes would be helpful beyond the generic 'oc get clusterversion' upgrade confirmation? Thanks!
So as of 4.8, the MCO team has changed that a bit so that a degraded worker pool will now block upgrade completion, so in that sense "oc get clusterversion" will also report if workers fail. In the future the worker pool may also be considered required for upgrade, though that is still in discussion.
With that in mind, I think I'm ok either way. For older versions, `oc get nodes` will serve as a double check that nodes have fully completed the update, so maybe it's worth having there as a just-in-case.
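As a rough sketch of that just-in-case check, the helper below filters `oc get nodes --no-headers` output for nodes that still need attention. `nodes_not_updated` is a hypothetical name, the version string in the usage line is only an example, and the filter assumes the default column layout (NAME STATUS ROLES AGE VERSION).

```shell
# Hypothetical helper: reads `oc get nodes --no-headers` output on stdin
# (assumed columns: NAME STATUS ROLES AGE VERSION) and prints the name of
# every node whose status is not exactly "Ready" or whose kubelet version
# differs from the expected version passed as the first argument.
nodes_not_updated() {
  awk -v want="$1" '$2 != "Ready" || $5 != want { print $1 }'
}

# Example usage (the version value is an illustrative assumption):
#   oc get nodes --no-headers | nodes_not_updated v1.19.0+7070803
```

Empty output means every node is Ready and at the expected version. A node that is still cordoned, shown as `Ready,SchedulingDisabled`, is listed as well.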
@Jia can you confirm this doc change? https://github.com/openshift/openshift-docs/pull/32232. Thanks!
Yes, the PR lgtm.
This has been merged. I'll provide the live links when they're available.
This is now live: