Bug 1624697

Summary: [starter-us-west-1] The node was low on disk resource
Product: OpenShift Online
Reporter: Junqi Zhao <juzhao>
Component: Website
Assignee: Justin Pierce <jupierce>
Status: VERIFIED
QA Contact: Junqi Zhao <juzhao>
Severity: medium
Priority: unspecified
Version: 3.x
CC: aos-bugs
Keywords: OnlineStarter
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Junqi Zhao 2018-09-03 06:19:13 UTC
Description of problem:
# oc get pod -n openshift-devops-monitor | grep -i pending
prometheus-node-exporter-4frjj   0/1       Pending   0          1s
prometheus-node-exporter-68nx6   0/1       Pending   0          3s
prometheus-node-exporter-8dz2f   0/1       Pending   0          0s
prometheus-node-exporter-9sqwl   0/1       Pending   0          2s

Describing one of the pods shows the error: The node was low on resource: [DiskPressure].
Events:
  Type     Reason   Age   From                                                  Message
  ----     ------   ----  ----                                                  -------
  Warning  Evicted  25s   kubelet, ip-172-31-17-168.us-west-1.compute.internal  The node was low on resource: [DiskPressure].

Version-Release number of selected component (if applicable):
3.10.14

How reproducible:
Always

Steps to Reproduce:
1. List the pods in the openshift-devops-monitor namespace

Actual results:
The node was low on resource: [DiskPressure].

Expected results:
All prometheus-node-exporter pods should be in Running status

Additional info:

Comment 1 Paul Gier 2018-09-04 14:47:02 UTC
List of nodes with the problem:

$ oc get pods --field-selector status.phase=Pending -o go-template --template="{{range .items}}{{.metadata.name}} {{.spec.nodeName}}{{\"\n\"}}{{end}}"

prometheus-node-exporter-j4wrv ip-172-31-28-204.us-west-1.compute.internal
prometheus-node-exporter-kds8p ip-172-31-24-113.us-west-1.compute.internal
prometheus-node-exporter-nw699 ip-172-31-31-229.us-west-1.compute.internal
prometheus-node-exporter-w75rz ip-172-31-17-168.us-west-1.compute.internal
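
To go from a listing like this to the set of affected nodes, the node column can be deduplicated and each node's DiskPressure condition queried. The following is a minimal sketch, assuming the "<pod> <node>" line format shown above. The function name pending_nodes is hypothetical, and the cluster query is left commented out as a usage example because it needs a logged-in oc session.

```shell
#!/bin/sh
# pending_nodes: read "<pod> <node>" lines on stdin and print each node once.
# Hypothetical helper for illustration, not part of oc.
# Lines without exactly two fields (for example an unscheduled pod that
# prints "<no value>" for its node) are skipped.
pending_nodes() {
    awk 'NF == 2 { print $2 }' | sort -u
}

# Usage against a live cluster (assumes a logged-in oc session):
# oc get pods --field-selector status.phase=Pending \
#     -o go-template --template='{{range .items}}{{.metadata.name}} {{.spec.nodeName}}{{"\n"}}{{end}}' \
#   | pending_nodes \
#   | while read -r node; do
#       printf '%s DiskPressure=' "$node"
#       oc get node "$node" \
#         -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}'
#     done
```

A node that still reports DiskPressure=True in this loop would be expected to keep evicting pods until space is freed.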

Comment 2 Justin Pierce 2018-09-11 14:15:35 UTC
I've cleared /var on nodes in this cluster. Pods are no longer pending.
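
A quick way to confirm on a node that the cleanup freed enough space is to read the usage of the filesystem holding /var. This is only a sketch: the helper names fs_use_pct and check_disk are hypothetical, and the 90% threshold is an assumption that mirrors the upstream kubelet default hard eviction setting (nodefs.available<10%), which this cluster may have overridden.

```shell
#!/bin/sh
# fs_use_pct: print the used-space percentage (digits only) of the
# filesystem that holds the given path, parsed from POSIX `df -P` output.
fs_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_disk PATH THRESHOLD: return 0 if usage is below THRESHOLD percent,
# 1 if it is at or above it, 2 if the usage could not be read.
check_disk() {
    pct=$(fs_use_pct "$1")
    case $pct in
        ''|*[!0-9]*)
            echo "cannot read disk usage for $1" >&2
            return 2
            ;;
    esac
    if [ "$pct" -ge "$2" ]; then
        echo "$1: ${pct}% used, at or above ${2}%"
        return 1
    fi
    echo "$1: ${pct}% used, below ${2}%"
}

# Usage on a node (90 is an assumed threshold, see above):
# check_disk /var 90
```

Running the check before and after clearing /var shows whether usage actually dropped below the threshold.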

Comment 3 Junqi Zhao 2018-09-12 08:55:15 UTC
All prometheus-node-exporter pods are running now.
openshift v3.11.0-0.32.0