Bug 1270649 - broken system detection logic fires if *any* task is Aborted, rather than *all* tasks Aborted
Status: CLOSED CURRENTRELEASE
Product: Beaker
Classification: Community
Component: scheduler
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 21.1
Target Release: ---
Assigned To: Roman Joost
QA Contact: tools-bugs
Keywords: NeedsTestCase, Patch, Regression
Depends On:
Blocks:
Reported: 2015-10-12 00:05 EDT by Dan Callaghan
Modified: 2015-10-20 23:25 EDT (History)
Doc Type: Bug Fix
Last Closed: 2015-10-20 23:25:28 EDT
Type: Bug


Description Dan Callaghan 2015-10-12 00:05:15 EDT
Description of problem:
Since the fix for bug 714937 in 21.0, a recipe is Aborted if any task in the recipe is Aborted. Previously it was Aborted only if all tasks in the recipe were Aborted.

As a consequence, the broken system detection logic (which is currently triggered based on the recipe status) will consider a recipe to be a "suspicious abort" if any task in the recipe is Aborted. It should only consider recipes where every task is Aborted. 
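
To make the difference concrete, here is a minimal sketch of the two recipe status rules and of the check this report asks for. It is a hypothetical model, not Beaker's actual code: the function names are invented for illustration, statuses are plain strings, and every non-Aborted outcome is collapsed into "Completed".

```python
# Hypothetical model, not Beaker's actual code.
# Assumption: task statuses are plain strings and anything that is not
# "Aborted" is reported as "Completed" to keep the sketch short.

def recipe_status_before_21_0(task_statuses):
    """Pre-21.0 rule from this report: Aborted only if every task is Aborted."""
    if task_statuses and all(s == "Aborted" for s in task_statuses):
        return "Aborted"
    return "Completed"


def recipe_status_since_21_0(task_statuses):
    """Rule since 21.0 (bug 714937): Aborted if any task is Aborted."""
    if any(s == "Aborted" for s in task_statuses):
        return "Aborted"
    return "Completed"


def is_suspicious_abort(task_statuses):
    """Check requested here: inspect the tasks, not the recipe status."""
    return bool(task_statuses) and all(s == "Aborted" for s in task_statuses)


# /distribution/install completed, /distribution/reservesys was aborted
tasks = ["Completed", "Aborted"]
print(recipe_status_before_21_0(tasks))
print(recipe_status_since_21_0(tasks))
print(is_suspicious_abort(tasks))
```

A detection rule keyed on the recipe status inherits the 21.0 behaviour change, while a rule keyed on the individual task statuses does not.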

Version-Release number of selected component (if applicable):
21.0

How reproducible:
somewhat easily

Steps to Reproduce:
1. Schedule a recipe for a particular system, using a released distro, with /distribution/install and /distribution/reservesys (use a small value for the RESERVETIME parameter to make testing easier)
2. Schedule a second such recipe for the same system so that the two run consecutively
3. Wait for each recipe to start and then for its watchdog timer to expire

Actual results:
System is marked as broken due to two consecutive Aborted recipes.

Expected results:
System should not be marked broken because the /distribution/install task completes successfully.
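
The actual and expected results can be compared with a small simulation of the scenario above. This is only a sketch under stated assumptions: the helper names are hypothetical, the "two consecutive suspicious aborts" threshold is taken from the actual results in this report, and any other conditions the real scheduler checks are left out.

```python
# Hypothetical sketch of the reproduction scenario; names are illustrative
# and do not correspond to Beaker's real code.

def any_task_aborted(tasks):
    """Behaviour observed in 21.0: one Aborted task is enough."""
    return any(status == "Aborted" for status in tasks.values())


def all_tasks_aborted(tasks):
    """Behaviour expected by this report: every task must be Aborted."""
    return bool(tasks) and all(status == "Aborted" for status in tasks.values())


def marked_broken(recipes, is_suspicious):
    """True if two consecutive recipes on the system are suspicious aborts."""
    return any(
        is_suspicious(first) and is_suspicious(second)
        for first, second in zip(recipes, recipes[1:])
    )


# Each recipe installs successfully, then the reservation hits the watchdog.
recipe = {
    "/distribution/install": "Completed",
    "/distribution/reservesys": "Aborted",
}
recipes = [recipe, dict(recipe)]

print(marked_broken(recipes, any_task_aborted))   # actual result in 21.0
print(marked_broken(recipes, all_tasks_aborted))  # expected result
```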

Additional info:
This has a high impact because /distribution/reservesys is quite commonly Aborted: that happens whenever the job owner does not explicitly return the system before the reservation time runs out.
Comment 1 Roman Joost 2015-10-14 02:03:22 EDT
Patch available on gerrit:

https://gerrit.beaker-project.org/#/c/4432/
Comment 4 Dan Callaghan 2015-10-20 23:25:28 EDT
Beaker 21.1 has been released.
