| Summary: | [infrastructure_public_371] Shouldn't schedule pod on node when node becomes 'DiskPressure=True' | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> |
| Component: | Node | Assignee: | Avesh Agarwal <avagarwa> |
| Status: | CLOSED WORKSFORME | QA Contact: | DeShuai Ma <dma> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.4.0 | CC: | aos-bugs, decarr, jokerman, mmccomas |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-10-20 16:35:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description DeShuai Ma 2016-10-18 02:37:04 UTC
The latest Kubernetes doesn't have this issue.
Can you include the kubeletArguments snippet that you used to configure the node? FWIW, I tried a simple reproduction that just set nodefs.available<$(high_value) so the node would automatically report DiskPressure, and pods were not scheduled, as expected. It's possible the scheduler cache was latent, but it would be good to see the full node-config.yaml.
I also tried on the latest OSE (close to 3.4.0.12) and cannot reproduce it; it works as expected:
#oc describe node --config=./openshift.local.config/master/admin.kubeconfig | grep DiskPres
DiskPressure True Thu, 20 Oct 2016 12:21:59 -0400 Thu, 20 Oct 2016 12:18:38 -0400 KubeletHasDiskPressure kubelet has disk pressure
3m 3m 2 {kubelet 192.168.124.61} Normal NodeHasNoDiskPressure Node 192.168.124.61 status is now: NodeHasNoDiskPressure
3m 3m 1 {kubelet 192.168.124.61} Normal NodeHasDiskPressure Node 192.168.124.61 status is now: NodeHasDiskPressure
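The same condition can also be checked directly with a jsonpath query; this is a sketch (the node name is taken from the output above, and this command is not part of the original report). While the node is under pressure, the expected output is True:
# oc get node 192.168.124.61 -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'
True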
And the pod status is pending with the following event:
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 7s 14 {default-scheduler } Warning FailedScheduling pod (hello-pod) failed to fit in any node
fit failure on node (192.168.124.61): NodeUnderDiskPressure
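The hello-pod spec itself is not included in the report; a minimal pod along these lines (a hypothetical sketch using the stock hello-openshift example image) is enough to exercise the scheduler's disk-pressure check:
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
spec:
  containers:
  - name: hello-openshift
    image: openshift/hello-openshift
With the only schedulable node reporting DiskPressure=True, this pod should stay Pending with the FailedScheduling event shown above.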
Hi DeShuai,
In my setup to simulate Disk Pressure, I had:
kubeletArguments:
  eviction-hard:
  - "nodefs.available<12Gi"
Neither my setup nor Derek's could reproduce it. As Derek said, one possibility is a latent scheduler cache. In any case, it would be good to look at your node-config.yaml to see what it has.
I am closing this for the time being. Please reopen it if you see the problem consistently.