Bug 1203054

Summary: [origin_infrastructure_372] Unschedulable intent is not fulfilled by k8s
Product: OKD
Component: Containers
Version: 3.x
Reporter: Jianwei Hou <jhou>
Assignee: Ravi Sankar <rpenta>
QA Contact: libra bugs <libra-bugs>
CC: lxia, mmccomas, rpenta
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2015-07-07 23:44:54 UTC

Description Jianwei Hou 2015-03-18 02:52:42 UTC
Description of problem:
After updating a node's spec to unschedulable, k8s does not make the node unschedulable; when retrieving node information, the node always shows 'Ready' status.

Version-Release number of selected component (if applicable):
Client Version: version.Info{Major:"0", Minor:"12+", GitVersion:"v0.12.0-705-g879bc3a677fcf9-dirty", GitCommit:"879bc3a677fcf92d70bea982e891560158ffbe95", GitTreeState:"dirty"}
Server Version: version.Info{Major:"0", Minor:"12+", GitVersion:"v0.12.0-705-g879bc3a677fcf9-dirty", GitCommit:"879bc3a677fcf92d70bea982e891560158ffbe95", GitTreeState:"dirty"}


How reproducible:
Always

Steps to Reproduce:
1. Start Kubernetes with cluster/local-up-cluster.sh, built from the latest upstream source code
2. Update the node spec to unschedulable (a rough equivalent for later Kubernetes releases is sketched after these steps):
cluster/kubectl.sh update nodes 127.0.0.1 --patch='{"apiVersion": "v1beta1", "unschedulable": true}'
3. Wait for some time (more than 10 minutes in my case), then retrieve the node information
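For reference, on later Kubernetes releases the same intent can be expressed with kubectl cordon or a spec patch; this is only a rough sketch, the exact syntax depends on the kubectl version and is not from the version under test here:

# Rough equivalents on later Kubernetes releases (not the v0.12 kubectl used in this report):
kubectl cordon 127.0.0.1                                            # mark the node unschedulable
kubectl patch node 127.0.0.1 -p '{"spec":{"unschedulable":true}}'   # same effect via a spec patch
kubectl uncordon 127.0.0.1                                          # allow scheduling again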


Actual results:
After step 3: The node status is not updated to 'NotSchedulable'

# cluster/kubectl.sh get nodes
current-context: "local"
Running: cluster/../cluster/gce/../../cluster/../_output/local/bin/linux/amd64/kubectl get nodes
NAME                LABELS              STATUS
127.0.0.1           <none>              Ready

# curl http://localhost:8080/api/v1beta3/nodes/127.0.0.1
{
  "kind": "Node",
  "apiVersion": "v1beta3",
  "metadata": {
    "name": "127.0.0.1",
    "selfLink": "/api/v1beta3/nodes/127.0.0.1",
    "uid": "bf84232d-cd15-11e4-9e78-4437e66a7eb3",
    "resourceVersion": "45",
    "creationTimestamp": "2015-03-18T10:23:21+08:00"
  },
  "spec": {
    "capacity": {
      "cpu": "4",
      "memory": "7811Mi"
    },
    "unschedulable": true
  },
  "status": {
    "conditions": [
      {
        "type": "Ready",
        "status": "Full",
        "lastProbeTime": "2015-03-18T10:35:10+08:00",
        "lastTransitionTime": null,
        "reason": "kubelet is posting ready status"
      }
    ],
    "addresses": [
      {
        "type": "LegacyHostIP",
        "address": "127.0.0.1"
      }
    ],
    "nodeInfo": {
      "machineID": "2513208d79ec42e9af88a2ff0cc0c094",
      "systemUUID": "FE1B5514-004D-5590-086A-55BC232355EC"
    }
  }
}

Expected results:
The node status should show 'NotSchedulable' in `kubectl get nodes`, and the 'NodeSchedulable' condition should be updated to 'ConditionFull' in the output of the above curl command.

Additional info:

Comment 1 Ravi Sankar 2015-03-19 00:58:41 UTC
Recent k8s changes broke the node 'unschedulable' feature.

kube-controller-manager now has a --sync_node_status flag to periodically sync node status, and this flag is disabled by default.
If you start kube-controller-manager with the options below, you should not see the issue.
/usr/local/bin/kube-controller-manager --master=127.0.0.1:8080 --minion_regexp=.* --cloud_provider=vagrant --sync_node_status=true --v=2

Either we should not expose a flag to control syncing node status, or we need a better way to satisfy the intent expressed in the node spec.

Comment 2 Ravi Sankar 2015-04-07 23:34:46 UTC
Fixed in k8s upstream.
The 'sync_node_status' flag will be removed from kube-controller-manager once this PR gets merged: https://github.com/GoogleCloudPlatform/kubernetes/pull/6058
Node status will be periodically updated by the kubelet.
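One rough way to confirm the kubelet is posting status after that change (a sketch, assuming a build that includes the fix; on the v1beta3 API the timestamp field is lastProbeTime, later API versions call it lastHeartbeatTime):

# Fetch the node, wait, and fetch again; the Ready condition's lastProbeTime
# should advance with each kubelet sync once the kubelet owns status updates
curl http://localhost:8080/api/v1beta3/nodes/127.0.0.1
sleep 60
curl http://localhost:8080/api/v1beta3/nodes/127.0.0.1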

Comment 3 Jianwei Hou 2015-04-14 06:45:22 UTC
Couldn't successfully run `make release` for the kubernetes project. I'll verify this bug once that problem is resolved.

Comment 4 Jianwei Hou 2015-04-15 08:09:35 UTC
Using the latest deployed vagrant cluster environment, I'm still unable to get the expected result. 

Running 'sudo journalctl -r -u kube-controller-manager', I discovered the following errors:
<----------------------------------------------->
Apr 15 08:02:31 kubernetes-master kube-controller-manager[3393]: E0415 08:02:31.681160    3393 nodecontroller.go:178] Error syncing cloud: Post http://127.0.0.1:8000/login: EOF
Apr 15 08:01:50 kubernetes-master kube-controller-manager[3393]: E0415 08:01:50.089736    3393 nodecontroller.go:178] Error syncing cloud: Post http://127.0.0.1:8000/login: read tcp 127.0.0.1:8000: connection reset by peer
Apr 15 08:01:08 kubernetes-master kube-controller-manager[3393]: E0415 08:01:08.553348    3393 nodecontroller.go:178] Error syncing cloud: Post http://127.0.0.1:8000/login: EOF
Apr 15 08:00:48 kubernetes-master kube-controller-manager[3393]: E0415 08:00:48.029594    3393 nodecontroller.go:178] Error syncing cloud: Post http://127.0.0.1:8000/login: EOF
Apr 15 08:00:06 kubernetes-master kube-controller-manager[3393]: E0415 08:00:06.503249    3393 nodecontroller.go:178] Error syncing cloud: Post http://127.0.0.1:8000/login: EOF
Apr 15 07:59:36 kubernetes-master kube-controller-manager[3393]: I0415 07:59:36.732702    3393 nodecontroller.go:451] Creating timestamp entry for newly observed Node 10.245.1.3
<----------------------------------------------->

My running processes:
root      3393     1  0 07:08 ?        00:00:27 /usr/local/bin/kube-controller-manager --master=127.0.0.1:8080 --cluster_name=kubernetes --minion_regexp=.* --cloud_provider=vagrant --sync_nodes=true --v=2
root      3496     1  0 07:08 ?        00:00:22 /usr/local/bin/kube-scheduler --master=127.0.0.1:8080 --v=2
root      4290     1  0 07:09 ?        00:00:00 /bin/bash /etc/kubernetes/kube-addons.sh
root      4353  4290  0 07:09 ?        00:00:00 /bin/bash /etc/kubernetes/kube-addons.sh
vagrant  10963 10705  0 08:05 pts/1    00:00:00 grep --color=auto kube
root     27477     1  1 07:09 ?        00:00:57 /usr/local/bin/kubelet --api_servers=https://10.245.1.2:6443 --config=/etc/kubernetes/manifests --allow_privileged=False --v=2 --cluster_dns=10.247.0.10 --cluster_domain=kubernetes.local
root     32518     1  0 07:23 ?        00:00:00 sudo /usr/local/bin/kube-apiserver --logtostderr=true --v=0 --etcd_servers=http://127.0.0.1:4001 --address=0.0.0.0 --port=8080 --allow_privileged=false --portal_net=10.254.0.0/16
root     32519 32518  1 07:23 ?        00:00:35 /usr/local/bin/kube-apiserver --logtostderr=true --v=0 --etcd_servers=http://127.0.0.1:4001 --address=0.0.0.0 --port=8080 --allow_privileged=false --portal_net=10.254.0.0/16


No idea what went wrong here

Comment 5 Jianwei Hou 2015-04-17 09:28:33 UTC
After talking with @ravips: the expected result is that when spec.unschedulable is true, newly created pods will not be scheduled on that node. This works correctly. Verified.
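
For the record, a minimal check along those lines (a sketch only; the node name is taken from the local repro above, and pod.json is a placeholder for any simple pod definition):

# Mark the node unschedulable, then try to create a pod
cluster/kubectl.sh update nodes 127.0.0.1 --patch='{"apiVersion": "v1beta1", "unschedulable": true}'
cluster/kubectl.sh create -f pod.json       # pod.json: placeholder for any simple pod definition
# The new pod should stay unscheduled (Pending/Waiting) because no schedulable node is available
cluster/kubectl.sh get pods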

Tested on version:
Client Version: version.Info{Major:"0", Minor:"15+", GitVersion:"v0.15.0-248-g23f2401b459244-dirty", GitCommit:"23f2401b4592445c34e5d7d9bb05a8839fbf4161", GitTreeState:"dirty"}
Server Version: version.Info{Major:"0", Minor:"15+", GitVersion:"v0.15.0-248-g23f2401b459244-dirty", GitCommit:"23f2401b4592445c34e5d7d9bb05a8839fbf4161", GitTreeState:"dirty"}