Bug 1786274
| Summary: | RHEL7 worker nodes may go to NotReady,SchedulingDisabled while upgrading from 4.2.12 to 4.3.0 | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Weinan Liu <weinliu> |
| Component: | Documentation | Assignee: | Kathryn Alexander <kalexand> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Gaoyun Pei <gpei> |
| Severity: | high | Docs Contact: | Vikram Goyal <vigoyal> |
| Priority: | high | | |
| Version: | 4.3.0 | CC: | amurdaca, aos-bugs, gpei, jokerman, juzhao, kalexand, mifiedle, rphillips, scuppett, sdodson, sjenning, wking, wsun, xtian |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 4.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1792139 (view as bug list) | Environment: | |
| Last Closed: | 2020-01-24 21:02:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1792139 | | |
**Description** (Weinan Liu, 2019-12-24 07:23:07 UTC)
```
Dec 24 00:33:52 weinliu-1223-7674g-rhel-0 hyperkube[19781]: F1224 00:33:52.417436 19781 server.go:206] unrecognized feature gate: LegacyNodeRoleBehavior
```

---

Ryan, can you take a look at this? I'm not sure whether it's the cause of the upgrade failure on the RHEL nodes, but it's worth checking, as I don't see anything wrong MCO-wise.

---

The openshift/api PR that introduced this feature gate (new in 4.3): https://github.com/openshift/api/pull/467

---

The kubelet config controller in the MCO currently assumes that the set of OCP feature gates is equal to the set of kube feature gates. LegacyNodeRoleBehavior, and the others introduced in that PR, are the first OCP feature gates that are not kube feature gates, leading to this issue.

https://github.com/openshift/machine-config-operator/blob/master/pkg/controller/kubelet-config/kubelet_config_features.go#L189-L212

---

Wow, OK, completely wrong: LegacyNodeRoleBehavior _is_ an upstream feature gate, introduced in 1.16: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/

So this is a skew issue. The kubelet needs to be updated on the node before the machine-config operator is updated. Updating the MCO updates the MCC, and thus the kubelet-config controller, which imports the new openshift/api and includes new feature gates in the config that may be (and in this case are) incompatible with the old kubelet.

https://github.com/openshift/api/blob/master/config/v1/types_feature.go#L113-L124

---

Additionally, we test this upgrade with RHCOS workers in CI, and it works. This issue occurs only with RHEL workers.

I'm pretty sure that when the MachineConfig changes during an RHCOS upgrade, we pivot the ostree first, upgrading the kubelet, then reboot and pick up the new files, including the new kubelet config file. That explains why we don't see this in the RHCOS worker case.

Antonio, can you confirm this?

---

(In reply to Seth Jennings from comment #9)
> Additionally, we test this upgrade with RHCOS workers in CI and it works.
> This issue is only for RHEL workers.
>
> I'm pretty sure that when the MachineConfig changes during RHCOS upgrade, we
> pivot the ostree first, upgrading the kubelet, then reboot and get the new
> files, including the new kubelet config file. That explains why we don't
> see this in the RHCOS worker case.
>
> Antonio, can you confirm this?

That is the case indeed, and it explains why we only see this on RHEL7 workers. Is this something we need to take care of in the MCC kubelet controller, or is it RHEL/Ansible's responsibility?

---

It seems to me that the RHEL worker upgrade pattern should be to upgrade the workers first, then upgrade the cluster. The newer kubelet will be compatible with the older config, at least n-1 skewed. Am I missing some obvious issue with that?

Attempting this: the playbook currently reads the running cluster version and will only install openshift RPMs that match the cluster version, even if a repo that has the newer version is installed:

https://github.com/openshift/openshift-ansible/blob/91645ed18b8e0b6c84dcc0229d02aee77db3fae2/roles/openshift_node/tasks/install.yml#L23-L60

This currently forces the "upgrade cluster, then upgrade workers" ordering.

---

(In reply to Seth Jennings from comment #11)
> It seems to me that the RHEL worker upgrade pattern should be to upgrade the
> workers first, then upgrade the cluster. Newer kubelet will be compatible
> with the older config, at least n-1 skewed.
>
> Am I missing some obvious issue with that?

Just that we'd have to ensure that the API was upgraded prior to the kubelet, because we don't support kubelet > API.

---

The behavior during a 4.2 to 4.3 upgrade is that when the MCO rolls out the new configuration, it cordons and marks unavailable the number of hosts specified by the `maxUnavailable` field on the machine config pool. It then applies the new configuration and reboots the host.
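To make the skew concrete, here is a minimal sketch (purely illustrative, not the actual kubelet code; both gate sets below are hypothetical subsets): an older kubelet validates the feature gates in the config rendered for it against the gates it knows about, and fails fatally on an unknown one, which is the `unrecognized feature gate: LegacyNodeRoleBehavior` error in the description.

```python
# Illustrative sketch of the version skew; NOT the real kubelet logic.
# Hypothetical, abridged gate set for a 4.2-era (Kubernetes 1.14) kubelet:
KNOWN_GATES_4_2 = {"RotateKubeletServerCertificate", "SupportPodPidsLimit"}

def validate_feature_gates(rendered_gates):
    """Return the gates an old kubelet would reject as unrecognized."""
    return sorted(g for g in rendered_gates if g not in KNOWN_GATES_4_2)

# The 4.3 MCO renders a config that includes a gate new in Kubernetes 1.16:
rendered = {"RotateKubeletServerCertificate", "LegacyNodeRoleBehavior"}

for gate in validate_feature_gates(rendered):
    # The real kubelet logs this fatally and exits (server.go:206)
    print(f"unrecognized feature gate: {gate}")
# prints: unrecognized feature gate: LegacyNodeRoleBehavior
```

An RHCOS worker never hits this path during upgrade because the ostree pivot updates the kubelet before the new config takes effect; a RHEL7 worker keeps the old kubelet binary.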
When this happens on a RHEL worker, the process does not update the kubelet, so configuration specified by 4.3 is applied to a 4.2 kubelet, and the host never returns to the Ready state. This stops the rollout until that host becomes available again. Under the assumption that `maxUnavailable`, which defaults to 1, has been configured at a level acceptable for normal cluster operation, this should not be seen as a critical situation.

Therefore, we will amend the documentation to make it clear that this will happen during 4.2 to 4.3 upgrades in clusters with RHEL workers, and that the admin will need to run the RHEL worker upgrade playbooks to complete the upgrade. Running the upgrade playbooks updates the kubelet on all specified RHEL workers and reboots them one by one. Once a RHEL worker has been updated, it returns to the Ready state and the upgrade completes as expected. This process also ensures that the API has been upgraded prior to the kubelets, whereas other patterns may not. We will evaluate additional changes in the future to make this a more seamless upgrade.

---

Junqi, did you install the 4.3 repo on the RHEL worker? I would have thought the upgrade playbook would fail if you had not, but maybe not.

---

(In reply to Seth Jennings from comment #18)
> Junqi, did you install the 4.3 repo on the RHEL worker? I would have thought
> the upgrade playbook would fail if you had not, but maybe not

Hmm, I'd assumed that was in the docs for the RHEL worker upgrade, but I'm not finding it. We need to make sure that the repo toggling bits are added too, roughly the same as the subscription manager snippet here:

https://docs.openshift.com/container-platform/3.11/upgrading/automated_upgrades.html#preparing-for-an-automated-upgrade

---

PR is here: https://github.com/openshift/openshift-docs/pull/19059

Gaoyun Pei, will you PTAL?

---

Added a comment to the doc PR.

---

The proposed doc PR lgtm; moving this bug to verified for 4.3.0.
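For reference, the repo toggle in question is done with `subscription-manager repos --disable=...` / `--enable=...` on each RHEL 7 worker before running the upgrade playbook. The repo IDs below follow the usual 4.x naming scheme but are assumptions here; verify them against the published docs for your versions.

```
# Hypothetical excerpt of the worker's repo state after toggling
# (repo IDs assumed; confirm against the 4.3 documentation):
[rhel-7-server-ose-4.2-rpms]
enabled = 0

[rhel-7-server-ose-4.3-rpms]
enabled = 1
```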
And also cloned the bug to 4.4.0 to see if we can come up with a better solution.

---

This change is live on docs.openshift:

https://docs.openshift.com/container-platform/4.3/updating/updating-cluster-rhel-compute.html#rhel-compute-updating_updating-cluster-rhel-compute

And on the portal:

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.3/html-single/updating_clusters/index#rhel-compute-updating_updating-cluster-rhel-compute