Description of problem:
It looks like scheduler.spec.defaultNodeSelector and pod.spec.nodeSelector are combined as an intersection (ANDed), not a union. When a "nodeSelector" is already defined on the pod resource, the scheduler still appends the "defaultNodeSelector" values to it. This makes the pod "unschedulable" if, as is common, no node matches both selectors. The field descriptions are misleading, however, and suggest that it works in a different way.

Version-Release number of selected component (if applicable):
OCP 4.x (currently 4.4, the latest)
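The intersection behavior described above can be sketched as follows. This is an illustration only, not OpenShift code; the function names and the conflict-resolution assumption (the pod's own key wins on a duplicate key) are mine, not taken from the scheduler source.

```python
# Sketch (not OpenShift code): a default node selector combined with a
# pod's own nodeSelector yields a requirement a node must satisfy in
# full -- the two selectors are ANDed, not treated as alternatives.

def effective_selector(default_selector, pod_selector):
    """Merge the cluster-wide default with the pod's selector.

    Assumption for this sketch: on a key conflict the pod's value is
    kept; the real admission logic may handle conflicts differently.
    """
    merged = dict(default_selector)
    merged.update(pod_selector)
    return merged

def schedulable_nodes(nodes, selector):
    """Return nodes whose labels satisfy every key/value in the selector."""
    return [
        name for name, labels in nodes.items()
        if all(labels.get(k) == v for k, v in selector.items())
    ]

nodes = {
    "node1": {"disktype": "ssd"},
    "node2": {"disktype": "hdd", "region": "east"},
}

# defaultNodeSelector is region=east, the pod asks for disktype=ssd:
# node1 lacks the region label, node2 has the wrong disktype, so no
# node matches both selectors and the pod stays unschedulable.
sel = effective_selector({"region": "east"}, {"disktype": "ssd"})
print(schedulable_nodes(nodes, sel))  # -> []
```

The point of the sketch is that the merged selector is strictly more restrictive than either selector alone, which is why adding a defaultNodeSelector can break pods that previously scheduled fine.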
Adding to this from an offline discussion, to clarify: the problem doesn't seem to be with the scheduler itself but with the description provided here: https://github.com/openshift/api/blob/d0b31d707c464221d1eb24846b0d0bbe57040102/config/v1/types_scheduling.go#L31-L51 Will open a PR to update this.
Hi Mike,

I saw that the fix here is only to the description and nothing related to the code itself. Below are the steps I performed to verify the bug; can you please take a look and let me know if the verification looks good? Thanks!

Test 1:
================================
1) Add defaultNodeSelector to the scheduler spec as below:

spec:
  defaultNodeSelector: disktype=ssd
  mastersSchedulable: false
  policy:
    name: ""
status: {}

2) Add the label disktype=ssd to one of the nodes in the cluster by running the command below:

oc label nodes node1 disktype=ssd

3) Now schedule a pod using the spec below and see that the pod gets scheduled:

[ramakasturinarra@dhcp35-60 ~]$ cat podtest.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx1
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent

4) The pod gets scheduled and runs without any issues.

Test 2:
======================================
1) Keep the defaultNodeSelector in the scheduler spec.

2) Now add the label to a second node in the cluster by running the command below:

oc label nodes node2 disktype=hdd

3) Now schedule a pod using the spec below and see that the pod gets scheduled:

[ramakasturinarra@dhcp35-60 ~]$ cat podtest.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx1
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: hdd

4) Verify that the pod runs successfully on node2:

nginx1   1/1   Running   0   10s   x.x.x.x   <node2>   <none>   <none>
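The two tests above can be simulated with a short sketch. This is assumed logic for illustration, not OpenShift code; in particular, the assumption that the pod's own nodeSelector key takes precedence over the cluster default on a conflict is inferred from Test 2 landing on node2, and the helper names are hypothetical.

```python
# Sketch (assumed logic, not OpenShift code): simulate the two
# verification tests. The cluster-wide defaultNodeSelector is merged
# into each pod's nodeSelector before scheduling; we assume the pod's
# own keys win on conflict, consistent with Test 2 landing on node2.

def merge(default_selector, pod_selector):
    merged = dict(default_selector)
    merged.update(pod_selector)  # pod's keys take precedence (assumption)
    return merged

def place(nodes, selector):
    """Pick the first node whose labels satisfy the whole selector."""
    for name, labels in nodes.items():
        if all(labels.get(k) == v for k, v in selector.items()):
            return name
    return None  # unschedulable

default = {"disktype": "ssd"}      # scheduler.spec.defaultNodeSelector
nodes = {
    "node1": {"disktype": "ssd"},  # Test 1: oc label nodes node1 disktype=ssd
    "node2": {"disktype": "hdd"},  # Test 2: oc label nodes node2 disktype=hdd
}

print(place(nodes, merge(default, {})))                   # Test 1 -> node1
print(place(nodes, merge(default, {"disktype": "hdd"})))  # Test 2 -> node2
```

A pod whose merged selector matches no node (e.g. a selector key no node carries) would get None here, mirroring the unschedulable case from the original report.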
Verified with the payload below:

[ramakasturinarra@dhcp35-60 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-06-20-011219   True        False         3h49m   Cluster version is 4.6.0-0.nightly-2020-06-20-011219
Yes, we can't change anything about the actual behavior for this bug, so this is only testing that the description of the field now matches the observed behavior.
Verified with the payload below, and I see that the description of the field now matches the observed behaviour.

[ramakasturinarra@dhcp35-60 verification-tests]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-06-23-053310   True        False         64m     Cluster version is 4.6.0-0.nightly-2020-06-23-053310

Tested the description by reading the "defaultNodeSelector" field via the command "oc get crd schedulers.config.openshift.io -o yaml", and as per comment 5 the behaviour matches as well. Based on the above, moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
No qe_test_coverage to add here; it is just a description change in the code.