Description of problem:
When a node reports DiskPressure=True, the scheduler should not schedule pods onto it.

Version-Release number of selected component (if applicable):
openshift v3.4.0.12
kubernetes v1.4.0+776c994
etcd 3.1.0-alpha.1

How reproducible:
Always

Steps to Reproduce:
1. Create a pod on the node and write a large file inside the pod (a sketch of this pod spec follows the report):
$ oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/hello-pod-tmp-hostpath.yaml
$ oc exec hello-pod -- dd if=/dev/zero of=/tmp/test1 bs=10M count=1024
2. Once the node reports 'DiskPressure=True', create another pod:
$ oc describe node openshift-128.lab.sjc.redhat.com | grep DiskPressure
$ oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/hello-pod.yaml
3. Check the second pod's status:
$ oc describe pod hello-pod

Actual results:
3. The second pod becomes 'Evicted':

[root@openshift-105 ~]# oc describe pod hello-pod
Name:           hello-pod
Namespace:      default
Security Policy:    anyuid
Node:           openshift-128.lab.sjc.redhat.com/
Start Time:     Mon, 17 Oct 2016 22:07:59 -0400
Labels:         name=hello-pod
Status:         Failed
Reason:         Evicted
Message:        Pod The node was low on compute resources.
IP:
Controllers:        <none>
Containers:
  hello-pod:
    Image:      docker.io/deshuai/hello-pod:latest
    Port:       8080/TCP
    Volume Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-7t1xx (ro)
    Environment Variables:  <none>
Volumes:
  tmp:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-7t1xx:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-7t1xx
QoS Class:      BestEffort
Tolerations:    <none>
Events:
  FirstSeen  LastSeen  Count  From                                        SubobjectPath  Type     Reason     Message
  ---------  --------  -----  ----                                        -------------  -------- ------     -------
  18m        18m       1      {default-scheduler }                                       Normal   Scheduled  Successfully assigned hello-pod to openshift-128.lab.sjc.redhat.com
  18m        18m       1      {kubelet openshift-128.lab.sjc.redhat.com}                 Warning  Evicted    The node was low on compute resources.

Expected results:
3. The pod should stay Pending; the scheduler should not place pods on a node reporting 'DiskPressure=True'.

Additional info:
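For reference, the hostPath pod used in step 1 is presumably something along these lines (a hypothetical sketch, not the contents of the linked file; the image and hostPath path are assumptions). Mounting the node's /tmp via hostPath is what lets the dd in step 1 fill the node's filesystem and push it into DiskPressure:

# Hypothetical approximation of hello-pod-tmp-hostpath.yaml (the linked file may differ).
# Writes to /tmp inside the container land on the node's filesystem,
# driving nodefs.available down until the kubelet reports DiskPressure.
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    name: hello-pod
spec:
  containers:
  - name: hello-pod
    image: docker.io/deshuai/hello-pod:latest   # image taken from the pod description above
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    hostPath:
      path: /tmp   # assumption: any node path on the root filesystem works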
The latest Kubernetes doesn't have this issue.
Can you include the kubeletArguments snippet that you used to configure the node?
FWIW, I attempted a simple reproduction that just set nodefs.available<$(high_value) so the node would automatically report DiskPressure, and pods were not scheduled on it, as expected. It's possible the scheduler cache was stale, but it would be good to see the full node-config.yaml.
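A node-config.yaml change along the lines described above would look roughly like this (a sketch; the threshold value is illustrative and only needs to exceed the node's actual free space so that DiskPressure is reported without filling the disk):

# node-config.yaml (excerpt) -- illustrative threshold only
kubeletArguments:
  eviction-hard:
  - "nodefs.available<500Gi"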
I also tried this on the latest OSE (close to 3.4.0.12), cannot reproduce it, and it works as expected:

# oc describe node --config=./openshift.local.config/master/admin.kubeconfig | grep DiskPres
  DiskPressure   True   Thu, 20 Oct 2016 12:21:59 -0400   Thu, 20 Oct 2016 12:18:38 -0400   KubeletHasDiskPressure   kubelet has disk pressure
  3m   3m   2   {kubelet 192.168.124.61}   Normal   NodeHasNoDiskPressure   Node 192.168.124.61 status is now: NodeHasNoDiskPressure
  3m   3m   1   {kubelet 192.168.124.61}   Normal   NodeHasDiskPressure     Node 192.168.124.61 status is now: NodeHasDiskPressure

And the pod status is Pending with the following event:

Events:
  FirstSeen  LastSeen  Count  From                  SubobjectPath  Type     Reason            Message
  ---------  --------  -----  ----                  -------------  ----     ------            -------
  2m         7s        14     {default-scheduler }                 Warning  FailedScheduling  pod (hello-pod) failed to fit in any node
                                                                                              fit failure on node (192.168.124.61): NodeUnderDiskPressure
Hi DeShuai,

To simulate disk pressure in my setup, I had:

kubeletArguments:
  eviction-hard:
  - "nodefs.available<12Gi"

Neither my setup nor Derek's could reproduce the issue. As Derek said, one possibility is a latent scheduler cache. In any case, it would be good to look at your node-config.yaml to see what it has. I am closing this for the time being; please reopen if you see it consistently.