Bug 1624697

Summary:	[starter-us-west-1] The node was low on disk resource
Product:	OpenShift Online	Reporter:	Junqi Zhao <juzhao>
Component:	Website	Assignee:	Justin Pierce <jupierce>
Status:	VERIFIED ---	QA Contact:	Junqi Zhao <juzhao>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	3.x	CC:	aos-bugs
Target Milestone:	---	Keywords:	OnlineStarter
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Last Closed:		Type:	Bug

Description Junqi Zhao 2018-09-03 06:19:13 UTC

Description of problem:
# oc get pod -n openshift-devops-monitor | grep -i pending
prometheus-node-exporter-4frjj   0/1       Pending   0          1s
prometheus-node-exporter-68nx6   0/1       Pending   0          3s
prometheus-node-exporter-8dz2f   0/1       Pending   0          0s
prometheus-node-exporter-9sqwl   0/1       Pending   0          2s

Describing one of the pods shows the error: The node was low on resource: [DiskPressure].
Events:
  Type     Reason   Age   From                                                  Message
  ----     ------   ----  ----                                                  -------
  Warning  Evicted  25s   kubelet, ip-172-31-17-168.us-west-1.compute.internal  The node was low on resource: [DiskPressure].
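To confirm that the node named in the event is still reporting disk pressure, its DiskPressure condition can be read directly. This is a minimal sketch, assuming cluster-reader access to nodes; disk_pressure_status is a hypothetical helper name, not an oc subcommand.

```shell
# Print the DiskPressure condition status ("True" or "False") for one node.
# disk_pressure_status is a hypothetical helper, not part of oc.
disk_pressure_status() {
  node="$1"
  oc get node "$node" \
    -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'
}

# Example (node taken from the eviction event above):
# disk_pressure_status ip-172-31-17-168.us-west-1.compute.internal
```

A result of "True" means the kubelet will keep evicting pods on that node until disk space is freed.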

Version-Release number of selected component (if applicable):
3.10.14

How reproducible:
Always

Steps to Reproduce:
1. Check the pods in the openshift-devops-monitor namespace.

Actual results:
prometheus-node-exporter pods stay in Pending status and are evicted with: The node was low on resource: [DiskPressure].

Expected results:
All prometheus-node-exporter pods should be in Running status.

Additional info:

Comment 1 Paul Gier 2018-09-04 14:47:02 UTC

Nodes with Pending pods:

$ oc get pods --field-selector status.phase=Pending -o go-template --template="{{range .items}}{{.metadata.name}} {{.spec.nodeName}}{{\"\n\"}}{{end}}"

prometheus-node-exporter-j4wrv ip-172-31-28-204.us-west-1.compute.internal
prometheus-node-exporter-kds8p ip-172-31-24-113.us-west-1.compute.internal
prometheus-node-exporter-nw699 ip-172-31-31-229.us-west-1.compute.internal
prometheus-node-exporter-w75rz ip-172-31-17-168.us-west-1.compute.internal
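The pod/node pairs above can be reduced to a unique node list for follow-up checks. A small sketch, assuming the two-column "pod node" output format shown in this comment; affected_nodes is a hypothetical helper name.

```shell
# Read "pod node" pairs on stdin and print each node name once.
# affected_nodes is a hypothetical helper; lines without exactly
# two fields (for example blank lines) are skipped.
affected_nodes() {
  awk 'NF == 2 { print $2 }' | sort -u
}

# Example: pipe the go-template command from this comment into it:
# oc get pods --field-selector status.phase=Pending -o go-template \
#   --template="{{range .items}}{{.metadata.name}} {{.spec.nodeName}}{{\"\n\"}}{{end}}" \
#   | affected_nodes
```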

Comment 2 Justin Pierce 2018-09-11 14:15:35 UTC

I've cleared /var on nodes in this cluster. Pods are no longer pending.
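This comment does not say what filled /var or how it was cleared. As a hedged sketch for anyone hitting the same condition, the largest directories under a path can be listed on the node before deleting anything; largest_dirs is a hypothetical helper name and the default of 10 entries is arbitrary.

```shell
# List the largest directories one level below a path, sizes in KiB.
# largest_dirs is a hypothetical helper. -x keeps du on one filesystem.
largest_dirs() {
  dir="$1"
  count="${2:-10}"
  du -xk -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example, run as root on an affected node:
# largest_dirs /var
```

The first line of output is the total for the path itself; the lines after it are its subdirectories in descending size.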

Comment 3 Junqi Zhao 2018-09-12 08:55:15 UTC

All prometheus-node-exporter pods are running now.
openshift v3.11.0-0.32.0