Description of problem:
The desiredNumberScheduled of DS is incorrect, ref: https://bugzilla.redhat.com/show_bug.cgi?id=1501514#c25

Version-Release number of selected component (if applicable):
openshift v3.10.0-0.27.0
kubernetes v1.10.0+b81c8f8

How reproducible:
Always

Steps to Reproduce:
1. Create a DS and check the desiredNumberScheduled

[root@ip-172-18-9-197 ~]# oc get no
NAME                            STATUS    ROLES     AGE       VERSION
ip-172-18-11-225.ec2.internal   Ready     compute   5h        v1.10.0+b81c8f8
ip-172-18-12-238.ec2.internal   Ready     compute   5h        v1.10.0+b81c8f8
ip-172-18-9-197.ec2.internal    Ready     master    5h        v1.10.0+b81c8f8

[root@ip-172-18-9-197 ~]# oc get ds -n dma
NAME              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
hello-daemonset   3         2         2         2            2           <none>          1h

[root@ip-172-18-9-197 ~]# oc get ds hello-daemonset -n dma -o yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: 2018-04-24T05:13:28Z
  generation: 1
  labels:
    name: hello-daemonset
  name: hello-daemonset
  namespace: dma
  resourceVersion: "37816"
  selfLink: /apis/extensions/v1beta1/namespaces/dma/daemonsets/hello-daemonset
  uid: 3935778e-477e-11e8-8311-0e11fb53aa4e
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: hello-daemonset
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: hello-daemonset
    spec:
      containers:
      - image: openshift/hello-openshift
        imagePullPolicy: Always
        name: registry
        ports:
        - containerPort: 80
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 10
  templateGeneration: 1
  updateStrategy:
    type: OnDelete
status:
  currentNumberScheduled: 2
  desiredNumberScheduled: 3
  numberAvailable: 2
  numberMisscheduled: 0
  numberReady: 2
  numberUnavailable: 1
  observedGeneration: 1
  updatedNumberScheduled: 2

Actual results:
1. desiredNumberScheduled is 3

Expected results:
1. desiredNumberScheduled is 2

Additional info:
upstream tracked issue: https://github.com/kubernetes/kubernetes/issues/53023
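For reference, a minimal standalone sketch of how the two counters end up disagreeing. This is not the actual controller code; the node role labels, the project default selector value, and the matches helper are all illustrative assumptions. The point is that an empty DS nodeSelector matches every node (so desiredNumberScheduled counts all 3), while the project default selector only admits the two compute nodes (so only 2 pods are actually placed).

package main

import "fmt"

// node models only the labels relevant to this illustration.
type node struct {
    name   string
    labels map[string]string
}

// matches reports whether every key/value in selector is present in labels.
// An empty selector matches everything, like an empty DS nodeSelector.
func matches(selector, labels map[string]string) bool {
    for k, v := range selector {
        if labels[k] != v {
            return false
        }
    }
    return true
}

func main() {
    nodes := []node{
        {name: "ip-172-18-11-225.ec2.internal", labels: map[string]string{"node-role.kubernetes.io/compute": "true"}},
        {name: "ip-172-18-12-238.ec2.internal", labels: map[string]string{"node-role.kubernetes.io/compute": "true"}},
        {name: "ip-172-18-9-197.ec2.internal", labels: map[string]string{"node-role.kubernetes.io/master": "true"}},
    }

    dsSelector := map[string]string{}                                               // hello-daemonset has no nodeSelector
    projectSelector := map[string]string{"node-role.kubernetes.io/compute": "true"} // hypothetical project default

    desired, current := 0, 0
    for _, n := range nodes {
        if !matches(dsSelector, n.labels) {
            continue
        }
        desired++ // status counting consults only the DS's own selector
        if matches(projectSelector, n.labels) {
            current++ // but pods are only placed on nodes the project selector allows
        }
    }
    fmt.Println("desiredNumberScheduled:", desired) // 3
    fmt.Println("currentNumberScheduled:", current) // 2
}

This reproduces the 3 vs. 2 split visible in the status above.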
Not sure that's it. I think in this case this is caused by the carry patch we have for targeting the right nodes when a project default node selector is present; we didn't patch the part that counts the status. But I'd have to check.
> I think in this case this is caused by the carry patch we have for targeting the right nodes when a project default node selector is present; we didn't patch the part that counts the status. But I'd have to check.

Adding David.

There was no node selector on the DS, indicating it wanted to run on all nodes, which is not allowed by the project's node selector. I think the current behavior is accurate.
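If it helps when verifying this, the project default node selector should live in the openshift.io/node-selector annotation on the namespace. A rough client-go sketch for reading it; the kubeconfig path is just an example, and the context-less Get signature assumes a 1.10-era client-go:

package main

import (
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Kubeconfig path is an assumption; adjust for your environment.
    config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
    if err != nil {
        panic(err)
    }
    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }
    // Older (1.10-era) client-go: Get takes no context argument.
    ns, err := client.CoreV1().Namespaces().Get("dma", metav1.GetOptions{})
    if err != nil {
        panic(err)
    }
    // The project default node selector is stored in this annotation.
    fmt.Println("project node selector:", ns.Annotations["openshift.io/node-selector"])
}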
If we wanted to act as though the DS limited itself to the nodes allowed by the project selector, we could do this:

	if matches, matchErr := dsc.namespaceNodeSelectorMatches(node, ds); matchErr != nil {
		return false, false, false, matchErr
	} else if !matches {
-		shouldSchedule = false
-		shouldContinueRunning = false
+		// This matches the behavior in the ErrNodeSelectorNotMatch case above
+		return false, false, false, nil
	}

But that would make the status not accurately reflect the intent expressed in the DS spec.
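To make that trade-off concrete, here is a toy stand-in for the per-node should-run predicate (simplified types and made-up helper and parameter names, not the real controller code). The first result plays the role of the value desiredNumberScheduled is counted from; with the early return proposed above, nodes excluded by the project selector stop being counted, so desired would drop from 3 to 2, at the cost of the status no longer expressing the DS's "run on every node" intent.

package main

import "fmt"

// matches: an empty selector matches every node, like an empty nodeSelector.
func matches(sel, labels map[string]string) bool {
    for k, v := range sel {
        if labels[k] != v {
            return false
        }
    }
    return true
}

// shouldRun is a simplified stand-in for the controller's per-node predicate.
// The first result is the "want to run" value that desiredNumberScheduled is
// counted from; the second is whether a pod should actually be scheduled.
func shouldRun(labels, dsSel, projectSel map[string]string, proposedPatch bool) (wantToRun, shouldSchedule bool) {
    if !matches(dsSel, labels) {
        return false, false
    }
    if !matches(projectSel, labels) {
        if proposedPatch {
            return false, false // proposed: count the node out entirely
        }
        return true, false // current carry patch: skip scheduling, but the node still counts as desired
    }
    return true, true
}

func main() {
    // Hypothetical labels mirroring the cluster above: two compute nodes, one master.
    nodes := []map[string]string{
        {"node-role.kubernetes.io/compute": "true"},
        {"node-role.kubernetes.io/compute": "true"},
        {"node-role.kubernetes.io/master": "true"},
    }
    dsSel := map[string]string{}                                               // the DS targets all nodes
    projectSel := map[string]string{"node-role.kubernetes.io/compute": "true"} // hypothetical project default

    for _, proposed := range []bool{false, true} {
        desired := 0
        for _, labels := range nodes {
            if want, _ := shouldRun(labels, dsSel, projectSel, proposed); want {
                desired++
            }
        }
        fmt.Printf("proposed patch applied=%v -> desiredNumberScheduled=%d\n", proposed, desired)
    }
}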
The current behavior seems reasonable to me. You tried to place yourself on every node and only got two. I don't think I'm concerned enough about revealing the number of nodes in the cluster to adjust it.