Bug 1571111 - The desiredNumberScheduled of DS is incorrect
Summary: The desiredNumberScheduled of DS is incorrect
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Tomáš Nožička
QA Contact: Wang Haoran
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-24 06:34 UTC by DeShuai Ma
Modified: 2018-05-03 20:36 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-03 20:36:23 UTC
Target Upstream Version:
Embargoed:



Description DeShuai Ma 2018-04-24 06:34:49 UTC
Description of problem:
The desiredNumberScheduled of the DaemonSet is incorrect; see https://bugzilla.redhat.com/show_bug.cgi?id=1501514#c25

Version-Release number of selected component (if applicable):
openshift v3.10.0-0.27.0
kubernetes v1.10.0+b81c8f8

How reproducible:
Always

Steps to Reproduce:
1. Create a ds and check the desiredNumberScheduled
[root@ip-172-18-9-197 ~]# oc get no
NAME                            STATUS    ROLES     AGE       VERSION
ip-172-18-11-225.ec2.internal   Ready     compute   5h        v1.10.0+b81c8f8
ip-172-18-12-238.ec2.internal   Ready     compute   5h        v1.10.0+b81c8f8
ip-172-18-9-197.ec2.internal    Ready     master    5h        v1.10.0+b81c8f8
[root@ip-172-18-9-197 ~]# oc get ds -n dma
NAME              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
hello-daemonset   3         2         2         2            2           <none>          1h
[root@ip-172-18-9-197 ~]# oc get ds hello-daemonset -n dma -o yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: 2018-04-24T05:13:28Z
  generation: 1
  labels:
    name: hello-daemonset
  name: hello-daemonset
  namespace: dma
  resourceVersion: "37816"
  selfLink: /apis/extensions/v1beta1/namespaces/dma/daemonsets/hello-daemonset
  uid: 3935778e-477e-11e8-8311-0e11fb53aa4e
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: hello-daemonset
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: hello-daemonset
    spec:
      containers:
      - image: openshift/hello-openshift
        imagePullPolicy: Always
        name: registry
        ports:
        - containerPort: 80
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 10
  templateGeneration: 1
  updateStrategy:
    type: OnDelete
status:
  currentNumberScheduled: 2
  desiredNumberScheduled: 3
  numberAvailable: 2
  numberMisscheduled: 0
  numberReady: 2
  numberUnavailable: 1
  observedGeneration: 1
  updatedNumberScheduled: 2

Actual results:
1. desiredNumberScheduled is 3

Expected results:
1. desiredNumberScheduled is 2

Additional info:

Comment 1 Wang Haoran 2018-04-25 02:42:23 UTC
upstream tracked issue:
https://github.com/kubernetes/kubernetes/issues/53023

Comment 2 Tomáš Nožička 2018-04-25 05:35:33 UTC
Not sure that's it. I think in this case this is caused by the carry patch we have for targeting the right nodes when a project default node selector is present; we didn't patch the part that counts the status, but I'd have to check.

Comment 3 Jordan Liggitt 2018-05-03 15:49:10 UTC
> I think in this case this is caused by the carry patch we have for targeting the right nodes when a project default node selector is present; we didn't patch the part that counts the status, but I'd have to check

Adding David.

There was no node selector on the DS, indicating it wanted to run on all nodes; running on all nodes is not allowed by the project's node selector.

I think the current behavior is accurate.
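
To make this concrete, here is a minimal, self-contained sketch of how desiredNumberScheduled follows from the DS spec alone. The types and role labels below are illustrative, not the real controller code: with no nodeSelector in the DS spec, all three nodes count as "desired", even though the project default node selector keeps the pod off the master.

package main

import "fmt"

// Illustrative stand-in for v1.Node; the label values are hypothetical.
type Node struct {
    Name   string
    Labels map[string]string
}

// selectorMatches reports whether every key/value pair in selector is present
// in labels. A nil/empty selector matches every node.
func selectorMatches(selector, labels map[string]string) bool {
    for k, v := range selector {
        if labels[k] != v {
            return false
        }
    }
    return true
}

func main() {
    nodes := []Node{
        {Name: "ip-172-18-11-225.ec2.internal", Labels: map[string]string{"node-role.kubernetes.io/compute": "true"}},
        {Name: "ip-172-18-12-238.ec2.internal", Labels: map[string]string{"node-role.kubernetes.io/compute": "true"}},
        {Name: "ip-172-18-9-197.ec2.internal", Labels: map[string]string{"node-role.kubernetes.io/master": "true"}},
    }

    // The DS in this bug has no spec.template.spec.nodeSelector.
    var dsNodeSelector map[string]string

    // The status is derived from the DS spec, not from what the project
    // default node selector later allows at admission time.
    desiredNumberScheduled := 0
    for _, n := range nodes {
        if selectorMatches(dsNodeSelector, n.Labels) {
            desiredNumberScheduled++
        }
    }
    fmt.Println("desiredNumberScheduled:", desiredNumberScheduled) // 3, matching the reported status
}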

Comment 4 Jordan Liggitt 2018-05-03 15:51:12 UTC
If we wanted to act as though the DS limited itself to the nodes allowed by the project selector, we could do this:

if matches, matchErr := dsc.namespaceNodeSelectorMatches(node, ds); matchErr != nil {
  return false, false, false, matchErr
} else if !matches {
-  shouldSchedule = false
-  shouldContinueRunning = false
+  // This matches the behavior in the ErrNodeSelectorNotMatch case above
+  return false, false, false, nil
}


But that would make the status not accurately reflect the intent expressed in the DS spec.
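
For comparison, a sketch of the numeric effect the diff above would have, assuming the 1.10-era per-node decision returns (wantToRun, shouldSchedule, shouldContinueRunning) and that desiredNumberScheduled counts the nodes where wantToRun is true. The decision functions here are illustrative, not the real controller code:

package main

import "fmt"

type node struct {
    name                   string
    matchesProjectSelector bool // hypothetical flag standing in for the real label check
}

// currentDecision models the carry patch as shipped: the node is still
// "wanted" by the DS spec, it just cannot be scheduled there.
func currentDecision(n node) (wantToRun, shouldSchedule, shouldContinueRunning bool) {
    if !n.matchesProjectSelector {
        return true, false, false
    }
    return true, true, true
}

// proposedDecision models the diff above: treat a project-selector mismatch
// like ErrNodeSelectorNotMatch and drop the node entirely.
func proposedDecision(n node) (wantToRun, shouldSchedule, shouldContinueRunning bool) {
    if !n.matchesProjectSelector {
        return false, false, false
    }
    return true, true, true
}

func main() {
    nodes := []node{
        {"ip-172-18-11-225.ec2.internal", true},
        {"ip-172-18-12-238.ec2.internal", true},
        {"ip-172-18-9-197.ec2.internal", false}, // master node, excluded by the project selector
    }

    count := func(decide func(node) (bool, bool, bool)) int {
        desired := 0
        for _, n := range nodes {
            if wantToRun, _, _ := decide(n); wantToRun {
                desired++
            }
        }
        return desired
    }

    fmt.Println("current  desiredNumberScheduled:", count(currentDecision))  // 3
    fmt.Println("proposed desiredNumberScheduled:", count(proposedDecision)) // 2
}

That would bring the status in line with the "expected" value of 2 in this bug, at the cost described above: the status would no longer reflect the intent expressed in the DS spec itself.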

Comment 5 David Eads 2018-05-03 16:25:01 UTC
The current behavior seems reasonable to me.  You tried to place yourself on every node and only got two.  I don't think I'm concerned enough about revealing the number of nodes in the cluster to adjust it.

