Bug 1979433
Summary: | Default PodTopologySpread doesn't work in non-CloudProvider env in OpenShift 4.7 | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Takayoshi Kimura <tkimura> |
Component: | kube-scheduler | Assignee: | Jan Chaloupka <jchaloup> |
Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 4.7 | CC: | ahoffer, aos-bugs, kahara, mfojtik, mori |
Target Milestone: | --- | | |
Target Release: | 4.7.z | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-08-17 12:12:09 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Takayoshi Kimura
2021-07-06 02:57:41 UTC
> The PodTopologySpread does not work when the required labels are not defined.
>
> It seems this change is not well documented in 4.7, putting a lot of users at risk of non-HA pod placement that does not tolerate node-level failures.
>
> Most bare-metal UPI users who don't have a zone label and rely on the default scheduler will be affected. Possibly RHV and vSphere users as well.

This is a well-known limitation. It is documented upstream: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#prerequisites.

It's also mentioned in the OpenShift 4.7 documentation (https://docs.openshift.com/container-platform/4.7/nodes/scheduling/nodes-scheduler-pod-topology-spread-constraints.html#nodes-scheduler-pod-topology-spread-constraints-configuring_nodes-scheduler-pod-topology-spread-constraints):

```
Prerequisites
A cluster administrator has added the required labels to nodes.
```

It's true that the migration from SelectorSpread to PodTopologySpread is performed under the hood, so a user may not know in advance that all the relevant nodes must have the labels set for PodTopologySpread to work properly. This is something we might stress in the release notes, for example:

```
Starting with 4.7, the SelectorSpread plugin is replaced by PodTopologySpread. Some of the original SelectorSpread plugin functionality is emulated by the PodTopologySpread plugin. It's strongly recommended to switch to the PodTopologySpread plugin directly. In both cases, all nodes have to be correctly labeled from now on, as described in https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#prerequisites, so that both plugins work as expected.
```

or similar. Andrea, can you take a look at this?

Known upstream issue: https://github.com/kubernetes/kubernetes/issues/102136

Moving the bug to the verified state, as the cluster changes required to ensure that pod replicas are spread properly are already present in the 4.7 release notes and they work well.
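For anyone who wants to check the labeling prerequisite discussed in this bug, here is a minimal sketch in Python. It is an illustration, not something taken from the fix: the function name `nodes_missing_topology_labels` is hypothetical, the input is assumed to be the `items` list from `oc get nodes -o json`, and the two label keys are assumed to be the ones the default spreading constraints rely on (hostname and zone). Adjust the keys to match the constraints used in your cluster.

```python
# Hypothetical helper: report nodes that lack topology labels.
# Assumption: these two keys are the ones used by the default spreading
# constraints; change them if your constraints use other topology keys.
REQUIRED_TOPOLOGY_KEYS = (
    "kubernetes.io/hostname",
    "topology.kubernetes.io/zone",
)


def nodes_missing_topology_labels(nodes, required_keys=REQUIRED_TOPOLOGY_KEYS):
    """Return a dict mapping node name to the required label keys it lacks.

    `nodes` is a list of dicts shaped like the `items` entries in the
    output of `oc get nodes -o json` (metadata.name, metadata.labels).
    """
    missing = {}
    for node in nodes:
        metadata = node.get("metadata", {})
        labels = metadata.get("labels") or {}
        absent = [key for key in required_keys if key not in labels]
        if absent:
            missing[metadata.get("name", "<unnamed>")] = absent
    return missing


# Made-up example data: one node without a zone label, one fully labeled.
example_nodes = [
    {"metadata": {"name": "worker-0",
                  "labels": {"kubernetes.io/hostname": "worker-0"}}},
    {"metadata": {"name": "worker-1",
                  "labels": {"kubernetes.io/hostname": "worker-1",
                             "topology.kubernetes.io/zone": "rack-a"}}},
]

report = nodes_missing_topology_labels(example_nodes)
for name, keys in report.items():
    print(f"{name}: missing {', '.join(keys)}")
```

Any node listed in the report would need the missing labels added by a cluster administrator before spreading can take that node into account.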
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.24 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3032