Bug 1250416 - [scheduler] It will keep trying to put the pod on the node which matches node selector but does not match other predicates [NEEDINFO]
Status: NEW
Product: OpenShift Origin
Classification: Red Hat
Component: Pod
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Assigned To: Abhishek Gupta
Jianwei Hou
Depends On:
Reported: 2015-08-05 06:59 EDT by Meng Bo
Modified: 2017-11-06 12:15 EST
CC List: 4 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed:
Type: Bug
Regression: ---
abhgupta: needinfo? (ccoleman)

Attachments: None
Description Meng Bo 2015-08-05 06:59:09 EDT
Description of problem:
Create scheduler.json without all the predicates added (MatchNodeSelector is required). Try to create an rc with pods that match the node selector but do not satisfy other predicates, e.g. host port. The scheduler will keep generating pods on the nodes that match the node selector.

Version-Release number of selected component (if applicable):
openshift v1.0.4-59-g786896f
kubernetes v1.0.0

How reproducible:

Steps to Reproduce:
1. Setup env with 4 nodes

2. Configure the scheduler without all the predicates
# cat /tmp/scheduler.json
{
        "kind" : "Policy",
        "version" : "v1",
        "predicates" : [
                {"name" : "MatchNodeSelector"}
        ],
        "priorities" : []
}

3. Add labels to some of the nodes
# oc get node 
NAME                 LABELS                                                  STATUS
master.bmeng.local   kubernetes.io/hostname=master.bmeng.local,node=master   Ready
node1.bmeng.local    kubernetes.io/hostname=node1.bmeng.local,node=master    Ready
node2.bmeng.local    kubernetes.io/hostname=node2.bmeng.local,node=node      Ready
node3.bmeng.local    kubernetes.io/hostname=node3.bmeng.local,node=node      Ready

4. Create a router with a --selector that matches the label above, but with a replica count greater than the number of nodes carrying that label
# oadm router --create --credentials=/root/openshift.local.config/master/openshift-router.kubeconfig --replicas=3 --service-account=default --selector=node=master

Actual results:
After the first two pods are placed on the nodes that match the selector, the scheduler keeps trying to place the third pod on the same nodes, and it keeps failing due to the host port conflict.

Expected results:
The scheduler should report a failure when there is no place to schedule the pods, instead of retrying indefinitely.

Additional info:
# oc get po
NAME             READY     STATUS             RESTARTS   AGE
router-1-01a6j   0/1       HostPortConflict   0          3m
router-1-02rj6   0/1       HostPortConflict   0          3m
router-1-03fdf   0/1       HostPortConflict   0          48s
router-1-04aev   0/1       HostPortConflict   0          30s
router-1-04z8v   0/1       HostPortConflict   0          3m
router-1-07euh   0/1       HostPortConflict   0          3m
router-1-09mfu   0/1       HostPortConflict   0          4m
router-1-0cvb6   0/1       HostPortConflict   0          4m
router-1-0gn3i   0/1       HostPortConflict   0          53s
router-1-0lav2   0/1       HostPortConflict   0          3m
router-1-0mr7e   0/1       HostPortConflict   0          1m
router-1-0ozgu   0/1       HostPortConflict   0          34s
router-1-0qlvt   0/1       HostPortConflict   0          3m
router-1-0sz9c   0/1       HostPortConflict   0          3m
router-1-jdsk9   1/1       Running            0          4m
router-1-mnjct   1/1       Running            0          4m
Comment 1 Abhishek Gupta 2015-08-05 14:10:08 EDT
If port conflicts need to be avoided, then the appropriate predicate needs to be specified in the scheduler configuration. You cannot exclude the port conflict predicate and then expect the scheduler to deal with it. Predicates are the intended mechanism for performing the required node filtering.

However, the current behavior calls into question whether the scheduler configuration should allow excluding these default predicates at all. Since the kubelet does its own validation (similar to a couple of the predicates), we should perhaps ALWAYS include such predicates in the scheduler configuration.
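For reference, a scheduler policy that also filters on host ports might look like the following. This is a sketch, assuming the predicate name "PodFitsPorts" used in Kubernetes v1.0 (later releases renamed it "PodFitsHostPorts"):

# cat /tmp/scheduler.json
{
        "kind" : "Policy",
        "version" : "v1",
        "predicates" : [
                {"name" : "MatchNodeSelector"},
                {"name" : "PodFitsPorts"}
        ],
        "priorities" : []
}

With the port predicate included, nodes that already bind the router's host ports are filtered out, so the third replica stays Pending rather than repeatedly failing with HostPortConflict.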
Comment 2 Abhishek Gupta 2015-08-17 12:48:11 EDT
Clayton: I would like to get your input on the comment above. Should we prevent users from excluding predicates that perform the sort of node filtering that the system (kubelet) itself validates?

MatchNodeSelector, HostName, and possibly PortConflict would be the predicates we should consider always adding to the scheduler configuration.
Comment 3 Abhishek Gupta 2015-08-17 12:52:17 EDT
Reducing severity, since the user is expected to specify the appropriate predicates if they wish for the restrictions to be applied correctly.
