Bug 1123249 - beaker-watchdog aborting guest recipes 4 hours after host recipe finishes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: lab controller
Version: 0.17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 0.17.2
Assignee: Dan Callaghan
QA Contact: matt jia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-25 07:16 UTC by Dan Callaghan
Modified: 2018-02-06 00:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-30 02:35:05 UTC
Embargoed:



Description Dan Callaghan 2014-07-25 07:16:48 UTC
Description of problem:
beaker-watchdog is aborting recipes too early.

Version-Release number of selected component (if applicable):
0.17.1

Steps to Reproduce:
currently unknown

Additional info:
Have not confirmed this yet, but from looking at this commit:
https://git.beaker-project.org/cgit/beaker/commit/Server/bkr/server/model/scheduler.py?id=525ef3e29d83dd2dcdba24facd0669826ae49912
it seems that Beaker will report a watchdog as 'expired' if *any* recipe in the recipe set has a kill time in the past. It should do so only if *all* recipes in the set have a kill time in the past. However, this doesn't yet fully explain the behaviour we are seeing.
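To make the any/all distinction concrete, here is a minimal sketch in plain Python. It is not Beaker's actual query: the function names and the sample times are hypothetical, and the real check lives in a database query rather than in list comprehensions.

```python
from datetime import datetime, timedelta

def expired_if_any(kill_times, now):
    # Suspected current behaviour: one stale kill time expires the whole set.
    return any(kt < now for kt in kill_times)

def expired_if_all(kill_times, now):
    # Intended behaviour: the set expires only once every kill time has passed.
    return all(kt < now for kt in kill_times)

# Hypothetical recipe set: one kill time is an hour in the past,
# the other is still 20 hours in the future.
now = datetime(2014, 7, 25, 12, 0)
kill_times = [now - timedelta(hours=1), now + timedelta(hours=20)]

print(expired_if_any(kill_times, now))
print(expired_if_all(kill_times, now))
```

With these sample values the "any" rule reports the set as expired while the "all" rule does not, which is the difference described above.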

Comment 3 Dan Callaghan 2014-07-27 23:17:18 UTC
I noticed that the guest recipes are always aborted exactly 4 hours after the host recipe finished /distribution/virt/start. I feel like that must be a clue as to what exactly is going wrong...

Comment 4 Dan Callaghan 2014-07-28 00:26:36 UTC
Okay, so the mystery of the 4 hours is solved: beah extends the watchdog by 4 hours when a task finishes (I don't know what for).

The issue here is really just with the Watchdog.by_status query.

Comment 5 Dan Callaghan 2014-07-28 01:09:48 UTC
On Gerrit: http://gerrit.beaker-project.org/3218

Comment 6 Dan Callaghan 2014-07-29 00:31:44 UTC
Steps to reproduce:

1. Submit a job with a host recipe containing a guest recipe, with the following tasks:
Host
  /distribution/install
  /distribution/virt/install
  /distribution/virt/start
Guest
  /distribution/install
  /distribution/reservesys

Actual results:
Job runs up until reservesys in the guest, but instead of staying reserved for 24 hours, the job is terminated by the external watchdog 4 hours after /distribution/virt/start completed.

Expected results:
Job runs successfully. The reservesys task in the guest reserves the system for 24 hours.
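The timing in the actual and expected results can be sketched with some simple arithmetic. This is only an illustration under stated assumptions: the timestamps are made up, the helper name is hypothetical, and it assumes the host watchdog sits exactly 4 hours after /distribution/virt/start finishes and the guest watchdog exactly 24 hours after reservesys starts, ignoring any extra margin the real tasks may add.

```python
from datetime import datetime, timedelta

BEAH_EXTENSION = timedelta(hours=4)   # extension after a task finishes (comment 4)
RESERVE_TIME = timedelta(hours=24)    # reservation length from the expected results

def set_expiry_time(kill_times, rule):
    # Under "any" the set expires when the earliest kill time passes;
    # under "all" it expires only when the latest one passes.
    return min(kill_times) if rule == "any" else max(kill_times)

# Hypothetical timeline for the host and guest recipes.
virt_start_done = datetime(2014, 7, 28, 9, 0)
reservesys_started = virt_start_done + timedelta(minutes=30)

host_kill = virt_start_done + BEAH_EXTENSION
guest_kill = reservesys_started + RESERVE_TIME

for rule in ("any", "all"):
    expiry = set_expiry_time([host_kill, guest_kill], rule)
    print(rule, "->", expiry - virt_start_done, "after virt/start finished")
```

Under the "any" rule the set is treated as expired 4 hours after /distribution/virt/start finished, matching the actual results; under the "all" rule it would survive until the guest's own kill time.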

Comment 10 Amit Saha 2014-07-30 02:35:05 UTC
beaker-0.17.2 has been released.

