Description of problem:

# oc get pod -n openshift-devops-monitor |grep -i pending
prometheus-node-exporter-4frjj   0/1   Pending   0   1s
prometheus-node-exporter-68nx6   0/1   Pending   0   3s
prometheus-node-exporter-8dz2f   0/1   Pending   0   0s
prometheus-node-exporter-9sqwl   0/1   Pending   0   2s

Described one pod, error is: The node was low on resource: [DiskPressure].

Events:
  Type     Reason   Age  From                                                  Message
  ----     ------   ---- ----                                                  -------
  Warning  Evicted  25s  kubelet, ip-172-31-17-168.us-west-1.compute.internal  The node was low on resource: [DiskPressure].

Version-Release number of selected component (if applicable):
3.10.14

How reproducible:
Always

Steps to Reproduce:
1. Check pods under openshift-devops-monitor

Actual results:
The node was low on resource: [DiskPressure].

Expected results:
prometheus-node-exporter pods should be in running status

Additional info:
List of nodes having the problem:

$ oc get pods --field-selector status.phase=Pending -o go-template --template="{{range .items}}{{.metadata.name}} {{.spec.nodeName}}{{\"\n\"}}{{end}}"
prometheus-node-exporter-j4wrv ip-172-31-28-204.us-west-1.compute.internal
prometheus-node-exporter-kds8p ip-172-31-24-113.us-west-1.compute.internal
prometheus-node-exporter-nw699 ip-172-31-31-229.us-west-1.compute.internal
prometheus-node-exporter-w75rz ip-172-31-17-168.us-west-1.compute.internal
I've cleared /var on nodes in this cluster. Pods are no longer pending.
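
For anyone who wants to check a node before or after clearing /var, here is a minimal sketch to run on the node itself. The function names fs_usage_pct and check_path and the 85% THRESHOLD are assumptions for illustration only; the value that actually triggers DiskPressure is whatever eviction threshold the kubelet on that node is configured with, and this script does not read it.

```shell
#!/bin/sh
# Sketch only: report how full the filesystem holding a path is.
# THRESHOLD is a hypothetical warning level, not the kubelet's
# configured eviction threshold.
THRESHOLD=${THRESHOLD:-85}

# Print the usage percentage (digits only) of the filesystem
# containing the given path; defaults to /var.
fs_usage_pct() {
    df -P "${1:-/var}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print OK or WARN for the given path, comparing against THRESHOLD.
# Returns 2 if the usage could not be read.
check_path() {
    path=${1:-/var}
    pct=$(fs_usage_pct "$path")
    case "$pct" in
        ''|*[!0-9]*)
            echo "ERROR: could not read usage for $path" >&2
            return 2
            ;;
    esac
    if [ "$pct" -ge "$THRESHOLD" ]; then
        echo "WARN: $path is ${pct}% full"
    else
        echo "OK: $path is ${pct}% full"
    fi
}
```

After sourcing the file, calling check_path /var prints a single OK or WARN line. From a client machine, the node's own view can be cross-checked with oc describe node <node-name>, which lists DiskPressure under Conditions.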
prometheus-node-exporter pods are all running now.

openshift v3.11.0-0.32.0