Bug 844765 - [RFE] In Beaker add support to auto-retry job with a different server
Status: CLOSED WONTFIX
Product: Beaker
Classification: Community
Component: scheduler
Version: 0.9
Assigned To: Bill Peck
Reported: 2012-07-31 13:24 EDT by Scott Poore
Modified: 2012-09-28 17:24 EDT
Doc Type: Bug Fix
Last Closed: 2012-09-27 21:17:14 EDT
Type: Bug
Description Scott Poore 2012-07-31 13:24:00 EDT
From time to time we run into cases where jobs are aborted. It would be useful if these could be retried automatically. I understand that power events are retried, but when that fails, the job has to be aborted.

When a Beaker job fails (at least in that way), it would be helpful if the scheduler could automatically try a different server. This should also take into account multihost jobs where only one server isn't responding. Ideally it would not be necessary to retry the entire job, just the one recipe that is failing to start.
Comment 1 Dan Callaghan 2012-09-27 21:17:14 EDT
It doesn't always make sense to retry a job which has aborted. Often a human needs to intervene to figure out what went wrong.

Note that there is a beaker-jobwatch script floating around, which tries to automate that process. You could give that a go.
Comment 2 Scott Poore 2012-09-28 17:24:05 EDT
I don't know how beaker-jobwatch works, but I will look into that.

Would you consider adding an option to allow a retry similar to how beaker-jobwatch does it? Maybe just something set in the job XML, with a maximum (or selectable) number of retry attempts before it gives up completely?
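
To illustrate the kind of behaviour being asked for, here is a rough sketch in Python. It is not Beaker code and does not reflect how beaker-jobwatch works; run_with_retries, start_recipe, candidates and max_retries are all hypothetical names made up for this example. The idea is simply: try a recipe on one system, and if it fails to start, pick a system that has not been tried yet, up to a configurable limit.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class RecipeAttempt:
    """One attempt to start a recipe on a given system (hypothetical type)."""
    system: str
    started: bool


@dataclass
class RetryResult:
    """Outcome of the whole retry loop: the system that worked, if any."""
    system: Optional[str]
    attempts: List[RecipeAttempt] = field(default_factory=list)


def run_with_retries(
    candidates: Sequence[str],
    start_recipe: Callable[[str], bool],
    max_retries: int = 2,
) -> RetryResult:
    """Try to start a single recipe, moving to a different system on failure.

    start_recipe is an assumed callback that returns True when the recipe
    started on the given system and False when it had to be aborted.
    """
    tried = set()
    result = RetryResult(system=None)
    # One initial attempt plus up to max_retries further attempts.
    for _ in range(max_retries + 1):
        remaining = [s for s in candidates if s not in tried]
        if not remaining:
            break  # every candidate system has already failed
        system = remaining[0]
        tried.add(system)
        started = start_recipe(system)
        result.attempts.append(RecipeAttempt(system, started))
        if started:
            result.system = system
            break
    return result
```

With three candidate systems of which the first two fail to start, max_retries=2 would end up on the third system, while max_retries=1 would give up after two attempts. Only the failing recipe is retried here; the rest of a multihost job is not modelled.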
