745960 – [RFE] When task grows over limit (time, size...), reserve machine as-is for user to investigate

Keywords:
Status:	CLOSED DUPLICATE of bug 639938
Alias:	None
Product:	Beaker
Classification:	Retired
Component:	beah
Sub Component:
Version:	0.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Nick Coghlan
QA Contact:
Docs Contact:
URL:
Whiteboard:	Misc
Depends On:
Blocks:

Reported:	2011-10-13 14:13 UTC by David Kutálek
Modified:	2014-08-12 04:34 UTC
CC List:	7 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-04-15 05:17:08 UTC
Embargoed:


Description David Kutálek 2011-10-13 14:13:57 UTC

Description of problem:

When a test fails in a way that makes it loop indefinitely and/or grow its log files past the limits, the watchdogs handle it. I do not know exactly how these watchdogs work, but often the result is that the complete job ends in a warning state. In the better case the rest of the job is processed, but the system may be left in an unexpected state.

I propose a new (optional) behaviour for the watchdog(s):
 - stop the problematic task
 - hold the system as is and run reservesys

This way I will be able to catch bugs in my tasks immediately and save Beaker machine resources by not having to run the whole job again.

Comment 1 Nick Coghlan 2012-10-17 04:34:37 UTC

Bulk reassignment of issues as Bill has moved to another team.

Comment 2 Min Shin 2012-11-07 07:22:38 UTC

This bug is closed as it is either not in the current Beaker scope or we could not find sufficient data in the bug report for consideration.
Please feel free to reopen the bug with additional information and/or business cases behind it.

Comment 3 David Kutálek 2012-11-07 10:00:03 UTC

Either out of scope or insufficient data?

Please tell me more:
 - which one applies? 
 - if scope, why?
 - if data, what more data do you need?

David

(In reply to comment #2)
> This bug is closed as it is either not in the current Beaker scope or we
> could not find sufficient data in the bug report for consideration.
> Please feel free to reopen the bug with additional information and/or
> business cases behind it.

Comment 4 Dan Callaghan 2012-11-07 23:04:07 UTC

(In reply to comment #3)

This bug might have been miscategorized. Your suggestion sounds reasonable. The main obstacle is that it's not possible to change the tasks in a recipe after it is scheduled, so this would have to be a feature of the harness: when the local watchdog is triggered, the current task is suspended and its run time is extended for some amount of time (24 hours?). The remaining question is how the user would be notified, since the reservation e-mail is sent by /distribution/reservesys. The answer to this might be bug 639938: treating reservation differently than other tasks.

We would definitely also want this behaviour to be opt-in, since we wouldn't want every local watchdog to hold onto the machine for 24 hours. That would create a huge amount of waste.
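
A minimal Python sketch of the opt-in behaviour described above. Every name here (`Task`, `hold_on_watchdog`, `HOLD_SECONDS`, `on_local_watchdog`) is hypothetical and not part of beah or Beaker; it only illustrates the decision being discussed: abort unless the task has opted in, otherwise suspend the task and extend its run time.

```python
from dataclasses import dataclass

# Hypothetical hold period: the 24 hours floated in this comment.
HOLD_SECONDS = 24 * 60 * 60


@dataclass
class Task:
    name: str
    remaining_seconds: int          # time left before the local watchdog fires
    hold_on_watchdog: bool = False  # assumed opt-in flag, not a real Beaker option
    suspended: bool = False


def on_local_watchdog(task: Task) -> str:
    """Decide what to do when the local watchdog fires for a task."""
    if not task.hold_on_watchdog:
        # Not opted in: no implicit hold, so the machine is not kept idle.
        return "abort"
    # Opted in: suspend the task and extend its run time for investigation.
    task.suspended = True
    task.remaining_seconds = HOLD_SECONDS
    return "hold"
```

Notifying the user is deliberately left out of the sketch, since that is the open question raised above.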

Comment 5 David Kutálek 2012-11-08 13:21:56 UTC

Thank you for the response. Yes, it should most probably be implemented in the harness and should be configurable: what should happen when the local watchdog expires?

a) recipe is cancelled
b) task is cancelled and recipe execution continues
c) machine is reserved by harness and e-mail is sent

The reservation time should also be configurable; something like 2 hours is usually sufficient.
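
The three options and the configurable reservation time could be expressed roughly as follows. This is only a sketch: the setting names `on_watchdog` and `reserve_seconds`, the `WatchdogAction` enum, and the choice of recipe cancellation as the default are all assumptions made for illustration, not existing harness options.

```python
from enum import Enum


class WatchdogAction(Enum):
    CANCEL_RECIPE = "cancel_recipe"  # option (a)
    CANCEL_TASK = "cancel_task"      # option (b)
    RESERVE = "reserve"              # option (c)


# Hypothetical default: the 2 hours suggested above.
DEFAULT_RESERVE_SECONDS = 2 * 60 * 60


def parse_watchdog_policy(settings: dict) -> tuple:
    """Return (action, reserve_seconds) from a dict of harness settings."""
    # An unknown action name raises ValueError from the Enum lookup.
    action = WatchdogAction(settings.get("on_watchdog", "cancel_recipe"))
    reserve_seconds = None
    if action is WatchdogAction.RESERVE:
        reserve_seconds = int(settings.get("reserve_seconds", DEFAULT_RESERVE_SECONDS))
        if reserve_seconds <= 0:
            raise ValueError("reserve_seconds must be positive")
    return action, reserve_seconds
```

The reservation time is only read for option (c), so a stray `reserve_seconds` value cannot cause a machine to be held when the user asked for cancellation.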

Comment 6 Nick Coghlan 2013-04-15 05:17:08 UTC

Closing this as a duplicate of #639938.

We won't be adding any implicit reservation behaviour, but we will be adding the capability to request post-execution reservation of the system independent of the execution of the tasks.

*** This bug has been marked as a duplicate of bug 639938 ***
