Description of problem:
Create an EUS control loop that checks the version skew between the API server and the nodes. If it finds pools beyond the n-2 skew, it emits an event with a message warning the user to upgrade the nodes to a version supported by kube-apiserver.

Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
Hi Qi,
Can you tell me how to pause the worker MachineConfigPool before the upgrade?
(In reply to MinLi from comment #2)
> Hi, Qi
> Can you tell me how to pause the worker MachineConfigPool before upgrade?

Run `oc edit machineconfigpool/worker` and set `paused: true` under `spec`, or run `oc patch mcp/worker --type merge --patch '{"spec":{"paused":true}}'`.
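The pause step can also be scripted so that it is easy to repeat before and after an upgrade. This is only a minimal sketch: it assumes a logged-in `oc` client, and the helper names `pause_patch` and `set_pool_paused` are hypothetical, not part of `oc`.

```shell
# Hypothetical helpers for pausing or resuming a MachineConfigPool.

# Print the merge patch that sets spec.paused to "true" or "false".
pause_patch() {
  case "$1" in
    true|false) printf '{"spec":{"paused":%s}}\n' "$1" ;;
    *) echo "usage: pause_patch true|false" >&2; return 1 ;;
  esac
}

# Apply the patch to a pool (second argument, default "worker").
# Needs a logged-in oc client; nothing is contacted until this is called.
set_pool_paused() {
  _patch="$(pause_patch "$1")" || return 1
  oc patch "machineconfigpool/${2:-worker}" --type merge --patch "$_patch"
}
```

For example, `set_pool_paused true` before the upgrade and `set_pool_paused false` afterwards; `oc get mcp worker -o jsonpath='{.spec.paused}'` should show the current value.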
Hi Qi,

I tested this bug with the following steps, but did not find an event warning the user to upgrade the node to a version supported by kube-apiserver. Please help to check:

1. Create a 4.6 nightly build cluster.
2. Run `oc edit mcp worker` and set "paused: true".
3. Upgrade the cluster to 4.7.24; the upgrade succeeds.
4. Upgrade the cluster to 4.8.5; the upgrade succeeds.
5. Check the descriptions of many ClusterOperators, such as config-operator (`oc describe co config-operator`), kube-apiserver, kube-controller-manager, machine-api, machine-config, etc.; none of them shows a hint like "kubelet version skew".
6. Run `oc get events -A | grep "an unsupported kubelet version skew"`; the output is empty.

When I set "paused: false" for the worker mcp, the kubelet version rolls out to match the API server.
(In reply to MinLi from comment #4)
> Hi, Qi
>
> I test this bug as the following steps, but don't find an event to warn the
> user to upgrade the node to kube-apiserver supported version.
> Please help to check:
>
> 1.create a 4.6 nightly build cluster
> 2.$oc edit mcp worker, and set "paused: true"
> 3.upgrade the cluster to 4.7.24, and succeed
> 4.upgrade the cluster to 4.8.5, and succeed
> 5.check many ClusterOperator description, such as config-operator($ oc
> describe co config-operator), kube-apiserver, kube-controller-manager,
> machine-api, machine-config, etc, but don't see tips like "kubelet version
> skew"
> 6.$ oc get events -A | grep "an unsupported kubelet version skew" , the
> output is empty
>
> And when I set "paused: false" for mcp worker, then the kubelet version roll
> out to the same with API server.

Sorry, I forgot to mention that the wording is different when the version difference is exactly 2 than when it is greater than 2. If you pause the pool at 4.6 and upgrade to 4.8, the version difference is 2 (kubelet 1.19 and kube-apiserver 1.21; I haven't checked), and the status is KubeletSkewPresent, like:

$ oc describe ClusterOperator
Spec:
Status:
  .....
  Message: Current kubelet version 1.19 will not be supported by newer kube-apiserver. Please upgrade the kubelet first if plan to upgrade the kube-apiserver.
  Reason: KubeletSkewPresent

You can upgrade to 4.9 to see KubeletSkewUnsupported, if the kube-apiserver is 1.22.

On the current 4.8.5 cluster, can you check whether the KubeletSkewPresent status above appears in the ClusterOperator, and then upgrade to 4.9 to see whether the KubeletSkewUnsupported warning appears? I think the kube-apiserver in 4.9 has been upgraded to 1.22 now.
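The distinction described in this comment (a difference of exactly 2 versus more than 2) can be sketched as a small classifier. This is an illustration built only from the thread's description, not the operator's actual code; the function names `minor_of` and `skew_reason` are hypothetical, and `OK` is a placeholder for "no warning" rather than a real status reason.

```shell
# Sketch of the skew classification as described in this thread
# (an assumption, not the operator's implementation).

# Print the minor number of a version such as "1.19", "v1.21.1"
# or "v1.19.0+abc123".
minor_of() {
  _v="${1#v}"             # drop an optional leading "v"
  _v="${_v#*.}"           # drop the major component and its dot
  echo "${_v%%[!0-9]*}"   # keep only the leading digits of the rest
}

# Usage: skew_reason <kube-apiserver version> <kubelet version>
skew_reason() {
  _skew=$(( $(minor_of "$1") - $(minor_of "$2") ))
  if [ "$_skew" -gt 2 ]; then
    echo "KubeletSkewUnsupported"   # kubelet more than two minors behind
  elif [ "$_skew" -eq 2 ]; then
    echo "KubeletSkewPresent"       # still allowed, but blocks the next upgrade
  else
    echo "OK"                       # placeholder, not a reason from the thread
  fi
}
```

With the versions mentioned here, `skew_reason 1.21 1.19` prints `KubeletSkewPresent` and `skew_reason 1.22 1.19` prints `KubeletSkewUnsupported`.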
Pausing the pool at 4.6 and upgrading to 4.8 (kubelet 1.19, kube-apiserver 1.21), I can't find any relevant message:

[lyman@localhost env]$ oc describe ClusterOperator | grep -i Skew
[lyman@localhost env]$

Pausing the pool at 4.6 and upgrading to 4.9 (kubelet 1.19, kube-apiserver 1.22), I find a message like this:

[lyman@localhost env]$ oc describe ClusterOperator | grep -i Skew
Message: One or more nodes have an unsupported kubelet version skew. Please see `oc get nodes` for details and upgrade all nodes so that they have a kubelet version of at least 1.20.0.
Reason: KubeletSkewUnsupported

@Qi Wang, is this expected? When the version skew is 2, no relevant message is shown.
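The warning points at `oc get nodes` for details. As a rough sketch of how the lagging nodes could be picked out, the filter below reads "name version" pairs and prints the nodes whose kubelet minor version is below a given minimum. The function name `nodes_below_minor` is hypothetical, and the input format is an assumption about how the node list is fed in.

```shell
# Hypothetical filter: read "name kubeletVersion" lines on stdin and
# print the nodes whose kubelet minor version is below the minimum in $1.
nodes_below_minor() {
  awk -v min="$1" '{
    split($2, parts, "[.]")      # "v1.19.0+abc" -> parts[2] is "19"
    if (parts[2] + 0 < min + 0) print $1, $2
  }'
}
```

On a live cluster the input could come from `oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' | nodes_below_minor 20`, where 20 matches the "at least 1.20.0" in the message above.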
(In reply to MinLi from comment #6)
> Pause the pool at 4.6, and upgrade to 4.8, kubelet 1.19 and kube-apiserver
> is 1.21, I can't find relevant message:
> [lyman@localhost env]$ oc describe ClusterOperator | grep -i Skew
> [lyman@localhost env]$
>
>
> Pause the pool at 4.6, and upgrade to 4.9, kubelet 1.19 and kube-apiserver
> is 1.22, find message like this:
> [lyman@localhost env]$ oc describe ClusterOperator | grep -i Skew
> Message: One or more nodes have an unsupported kubelet
> version skew. Please see `oc get nodes` for details and upgrade all nodes so
> that they have a kubelet version of at least 1.20.0.
> Reason: KubeletSkewUnsupported
>
> @Qi Wang, is this expected? when the version skew is 2, will not prompt
> relevant message.

Yes, that's expected. I just realized that the version skew check only starts from OpenShift 4.9. The version skew is 2, but the cluster is on 4.8, so the version has not been checked.
Verified. Upgrade path: 4.6.0-0.nightly-2021-08-22-084748 -> 4.7.0-0.nightly-2021-08-21-153346 -> 4.8.0-0.nightly-2021-08-22-035234 -> 4.9.0-0.nightly-2021-08-22-070405
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759