Bug 1248662 - Scheduler failing to launch pod when node is running low on disk space
Keywords:
Status: CLOSED DUPLICATE of bug 1252520
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Paul Morie
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks: 1267746
Reported: 2015-07-30 14:26 UTC by Steve Speicher
Modified: 2019-10-10 10:01 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-03 16:51:42 UTC
Target Upstream Version:
Embargoed:


Description Steve Speicher 2015-07-30 14:26:18 UTC
Description of problem:
Only one node was close to running out of disk space (about 200 MB free); all the others had about 7 GB free. If disk space really was the cause, this bug should be filed against the scheduler so that it accounts for free disk space when selecting a node.

(from Wesley Hearn)

> > Here is the end of the output about the builder pod:
> >
> > $ oc get pods nodejs-example-3-build -o json
> >
> >     "status": {
> >         "phase": "Failed",
> >         "message": "Pod cannot be started due to lack of disk space.",
> >         "startTime": "2015-07-28T20:49:57Z"
> >     }
> >
> > Good ole out of disk space.
> >
> > To be clear, build 1 failed clearly. Builds 2 & 3 failed but are reported
> > as succeeded.
> >
> > $ oc get builds
> > NAME               TYPE      STATUS     POD
> > nodejs-example-1   Source    Failed     nodejs-example-1-build
> > nodejs-example-2   Source    Complete   nodejs-example-2-build
> > nodejs-example-3   Source    Complete   nodejs-example-3-build
> >
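
The mismatch quoted above (pod phase "Failed" while the build shows STATUS "Complete") can be checked mechanically. The following is only a minimal sketch, assuming the rows of `oc get builds` and each pod's `status.phase` have already been collected into plain Python structures. The function name `find_misreported_builds` and the data layout are hypothetical and not part of OpenShift; only the phase of pod 3 is included because that is the only pod status shown in this report.

```python
def find_misreported_builds(builds, pod_phases):
    """Return names of builds marked Complete whose pod phase is Failed.

    builds     -- list of dicts with "name", "status" and "pod" keys
                  (hypothetical layout mirroring the `oc get builds` columns)
    pod_phases -- dict mapping pod name to its status.phase string
    """
    suspicious = []
    for build in builds:
        phase = pod_phases.get(build["pod"])  # None if the phase is unknown
        if build["status"] == "Complete" and phase == "Failed":
            suspicious.append(build["name"])
    return suspicious


# Rows copied from the `oc get builds` output quoted above.
builds = [
    {"name": "nodejs-example-1", "status": "Failed", "pod": "nodejs-example-1-build"},
    {"name": "nodejs-example-2", "status": "Complete", "pod": "nodejs-example-2-build"},
    {"name": "nodejs-example-3", "status": "Complete", "pod": "nodejs-example-3-build"},
]

# Only the phase of pod 3 appears in the report; the others are left unknown.
pod_phases = {"nodejs-example-3-build": "Failed"}

misreported = find_misreported_builds(builds, pod_phases)
print(misreported)
```

Builds whose pod phase is unknown are not flagged, so build 2 would only show up here once its pod status had been fetched as well.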

This is running on https://console.stg.openshift.com/console/

Version-Release number of selected component (if applicable): 3.0.0


How reproducible:
Create a project, select the sample (say nodejs-ex) and attempt to build it


Steps to Reproduce:

Actual results: fails immediately, due to inability to start a pod


Expected results: pod starts, build runs, result image pushed


Additional info:
The project showing this behavior is https://console.stg.openshift.com/console/project/ldpjs/overview

Comment 2 Ben Parees 2015-07-30 15:47:56 UTC
Not clear why you assigned this to me, Paul?

Comment 3 Paul Weil 2015-07-30 15:55:42 UTC
Two things I see on this:

1. The build status doesn't seem to be properly conveyed. The build failed but was marked as completed.
2. A possible bug with the scheduler not taking node disk space into account - I pinged pmorie about that portion
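
To illustrate item 2, here is a minimal sketch of the kind of check being asked for: drop nodes with too little free disk before picking one. It assumes each node reports its free disk space in bytes. The names `Node`, `filter_nodes_by_disk` and `min_free_bytes`, the node names, and the 1 GB threshold are all hypothetical; this is not the actual scheduler API, only the 200 MB and 7 GB figures come from the report.

```python
from dataclasses import dataclass

MB = 1024 ** 2
GB = 1024 ** 3


@dataclass
class Node:
    """Hypothetical view of a node as seen by a scheduler."""
    name: str
    free_disk_bytes: int


def filter_nodes_by_disk(nodes, min_free_bytes):
    """Keep only nodes that have at least min_free_bytes of free disk."""
    return [node for node in nodes if node.free_disk_bytes >= min_free_bytes]


# One node with ~200 MB free, the others with ~7 GB free, as in the report.
nodes = [
    Node("node-a", 200 * MB),
    Node("node-b", 7 * GB),
    Node("node-c", 7 * GB),
]

# The 1 GB threshold is an arbitrary value chosen for illustration.
candidates = filter_nodes_by_disk(nodes, min_free_bytes=1 * GB)
print([node.name for node in candidates])
```

With a filter like this applied before node selection, the nearly full node would never be offered the builder pod, so the pod would not fail at start for lack of disk space.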

Comment 4 Ben Parees 2015-07-30 17:07:55 UTC
The build status issue was already fixed here:
https://github.com/openshift/origin/pull/3936

This bug was opened for the scheduler issue, which is really an RFE more than a bug, imho.

Comment 5 Derek Carr 2016-02-03 16:51:42 UTC

*** This bug has been marked as a duplicate of bug 1252520 ***

