Description of problem:
Node affinity is an alpha feature, but it cannot be disabled in OpenShift. As a result, a user in one project can set it and cause scheduling issues across the cluster. The fix for the issue was merged in 3.6 with this upstream PR: https://github.com/kubernetes/kubernetes/pull/45352

Version-Release number of selected component (if applicable):
3.5

Additional info:
57s 4m 15 backend-27-mc7jp Pod Warning FailedScheduling {default-scheduler } pod (backend-2-xxxx) failed to fit in any node
fit failure summary on nodes : CheckServiceAffinity (12), MatchInterPodAffinity (5), MatchNodeSelector (12)

When the log level is increased to 10, the master controllers log shows:

Cannot schedule project2/backend-2-xxxx onto node node1.example.com,because of PodAntiAffinityTerm &{&LabelSelector{MatchLabels:map[string]string{},MatchExpressions:[{deploymentconfig In [backend]}],} []

This message appears for every node that matches the default node selector region=east. The affinity rule of another user and project is causing the scheduling failure.

dc/backend in project bugtest-1:

scheduler.alpha.kubernetes.io/affinity: |
  {
    "podAntiAffinity": {
      "requiredDuringSchedulingIgnoredDuringExecution": [{
        "labelSelector": {
          "matchExpressions": [{
            "key": "deploymentconfig",
            "operator": "In",
            "values": ["backend"]
          }]
        },
        "topologyKey": "kubernetes.io/hostname"
      }]
    }
  }
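To make the failure mode in the description easier to follow, here is a toy model in Python. It is not the scheduler's code. It assumes that each existing pod's required anti-affinity term is compared against the incoming pod's labels, and that the fix amounts to applying a term only within the namespace of the pod that declares it; the `namespace_scoped` flag, the `Pod` class, and the pod name `backend-1-aaaa` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Pod:
    name: str
    namespace: str
    labels: Dict[str, str]
    node: Optional[str] = None  # None means the pod is not scheduled yet
    # Required anti-affinity reduced to "key In [values]" expressions,
    # with topologyKey fixed to the node hostname (an assumption of this model).
    anti_affinity: Dict[str, List[str]] = field(default_factory=dict)


def selector_matches(selector: Dict[str, List[str]], labels: Dict[str, str]) -> bool:
    """True if every "key In values" expression is satisfied by the labels."""
    return all(labels.get(key) in values for key, values in selector.items())


def blocked_nodes(incoming: Pod, existing: List[Pod], namespace_scoped: bool) -> Set[str]:
    """Nodes ruled out for `incoming` by the anti-affinity of already placed pods."""
    blocked = set()
    for pod in existing:
        if pod.node is None or not pod.anti_affinity:
            continue
        # Assumed post-fix behaviour: a term only applies inside its own namespace.
        if namespace_scoped and pod.namespace != incoming.namespace:
            continue
        if selector_matches(pod.anti_affinity, incoming.labels):
            blocked.add(pod.node)
    return blocked


# A pod from dc/backend in project bugtest-1, already running on node1.
existing = [Pod("backend-1-aaaa", "bugtest-1", {"deploymentconfig": "backend"},
                node="node1.example.com",
                anti_affinity={"deploymentconfig": ["backend"]})]
# A pod from a different project that happens to use the same label.
incoming = Pod("backend-2-xxxx", "project2", {"deploymentconfig": "backend"})

print(blocked_nodes(incoming, existing, namespace_scoped=False))  # {'node1.example.com'}
print(blocked_nodes(incoming, existing, namespace_scoped=True))   # set()
```

In the unscoped case the pod in project2 loses node1.example.com only because its label matches a rule written in bugtest-1, which is the cross-project effect reported here.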
Ryan, I was able to reproduce the issue in Origin 1.5. My observations:
- The upstream fix you mentioned (https://github.com/kubernetes/kubernetes/pull/45352) solves this problem. I tested it against 1.5, and we no longer see this behaviour once the patch has been applied.
- However, OCP 3.6 (which is based on the Origin branch release-3.6) has the same problem (https://github.com/openshift/origin/blob/v3.6.0/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/algorithm/predicates/predicates.go#L1077). The cherry-pick to the kube 1.6 branch happened on May 5th, but our kube rebase happened on April 27th, so the rebase did not pick up the fix.
Hi ravig, could you give detailed steps to reproduce? Thanks.
Hi Weihua Meng,

Steps to reproduce:
Create a 2-node OCP 3.5 cluster (1 master + 2 nodes + 1 optional infra node). Save the file at http://pastebin.test.redhat.com/512145 as sample.yaml, then run:
- oc create project sample
- oc create project sample1
(Alternatively, the YAML could include a namespace, which avoids the 2 steps below.)
- oc create -f sample.yaml -n sample
- oc create -f sample.yaml -n sample1

If we then run
- oc get pods -n sample1
we will see an error about one of the pods not reaching the Running state.

The original upstream issue for Kube 1.6 is at https://github.com/kubernetes/kubernetes/issues/45484 (this could be used for OCP 3.6 testing).
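The contents of the pastebin file are not shown in this report. As a stand-in, the sketch below builds a Pod manifest that carries the affinity annotation quoted in the bug description. It is a guess at a minimal equivalent, not the original sample.yaml: the function `build_pod`, the pod name, and the image are hypothetical, and the use of a bare Pod rather than a DeploymentConfig is an assumption.

```python
import json

# Anti-affinity rule copied from the bug description.
AFFINITY = {
    "podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [{
            "labelSelector": {
                "matchExpressions": [{
                    "key": "deploymentconfig",
                    "operator": "In",
                    "values": ["backend"],
                }]
            },
            "topologyKey": "kubernetes.io/hostname",
        }]
    }
}


def build_pod(name, image):
    """Return a Pod manifest (as a dict) carrying the alpha affinity annotation."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Label that the anti-affinity selector above matches.
            "labels": {"deploymentconfig": "backend"},
            # The annotation value is a JSON string, not a nested object.
            "annotations": {
                "scheduler.alpha.kubernetes.io/affinity": json.dumps(AFFINITY)
            },
        },
        "spec": {"containers": [{"name": "backend", "image": image}]},
    }


if __name__ == "__main__":
    # Hypothetical name and image; substitute an image your cluster can pull.
    manifest = build_pod("backend-test", "registry.example.com/backend:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to oc create -f in place of sample.yaml, though this has not been checked against the original reproduction. Because the rule uses the hostname as its topology key, a single pod only rules out the one node it runs on, so the first project would presumably need one such pod (under different names) on every schedulable node before pods in the second project stay pending.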
Thanks ravig, this is very helpful. I tried it: this bug causes pending pods not only for the same user but may also cause them for different users. Just curious whether we officially announced this alpha feature to customers in 3.5.
Yes, this would happen to any user. As a matter of fact, multi-tenancy is at the project level. I believe the feature is in tech preview in 3.5 (which is based on Origin 1.5). https://github.com/openshift/origin/tree/release-1.5 contains a table of features. Pod affinity and anti-affinity are in tech preview, but I am not sure about the support terms.
This is a tech preview, correct, but there is no way to disable any tech preview (alpha/beta) features. Because of this, any user can implement this and cause an issue in the cluster, which is why we are requesting that this be backported to 3.5 and 3.6. Thank you.
Verified on openshift v3.6.173.0.37. Fixed. All pods in the different projects are scheduled.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3049