Description of problem:
When a test fails in a way that makes it loop indefinitely and/or grow its log files beyond the limits, it is handled by the watchdogs. I do not know exactly how these watchdogs work, but often this ends the whole job in a warning state. In the better case the rest of the job is processed, but the system may be left in an unexpected state.

I propose a new (optional) behaviour for the watchdog(s):
- stop the problematic task
- hold the system as is and run reservesys

This way I will be able to catch bugs in my tasks immediately and save Beaker machine resources by not having to run the whole job again.
Bulk reassignment of issues as Bill has moved to another team.
This bug is closed as it is either not in the current Beaker scope or we could not find sufficient data in the bug report for consideration. Please feel free to reopen the bug with additional information and/or business cases behind it.
Either out of scope or insufficient data? Please tell me more:
- which one applies?
- if scope, why?
- if data, what more data do you need?

David

(In reply to comment #2)
> This bug is closed as it is either not in the current Beaker scope or we
> could not find sufficient data in the bug report for consideration.
> Please feel free to reopen the bug with additional information and/or
> business cases behind it.
(In reply to comment #3)
This bug might have been miscategorized.

Your suggestion sounds reasonable. The only problem is that it's not possible to change the tasks in a recipe after it is scheduled. So it would have to be a feature of the harness: when the local watchdog is triggered, the current task is suspended and its run time is extended for some amount of time (24 hours?).

The remaining question is how the user would be notified, since the reservation e-mail is sent by /distribution/reservesys. The answer to this might be bug 639938: treating reservation differently than other tasks.

We would definitely also want this behaviour to be opt-in, since we wouldn't want every local watchdog to hold onto the machine for 24 hours. That would create a huge amount of waste.
Thank you for the response. Yes, it should most probably be implemented in the harness, and it should be configurable.

What to do when the local watchdog expires?
a) the recipe is cancelled
b) the task is cancelled and recipe execution continues
c) the machine is reserved by the harness and an e-mail is sent

The reservation time should also be configurable; something like 2 hours is usually sufficient.
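
To make the three options concrete, here is a minimal sketch of how such a policy could be modelled. It is not Beaker or harness code: the names `WatchdogAction`, `WatchdogPolicy` and `on_local_watchdog_expired` are hypothetical, and the default action is an assumption chosen only so that reservation stays opt-in.

```python
from dataclasses import dataclass
from enum import Enum


class WatchdogAction(Enum):
    """Hypothetical names for options a), b) and c) above."""
    CANCEL_RECIPE = "cancel_recipe"  # a) the recipe is cancelled
    CANCEL_TASK = "cancel_task"      # b) the task is cancelled, the recipe continues
    RESERVE = "reserve"              # c) the machine is reserved, an e-mail is sent


@dataclass
class WatchdogPolicy:
    # Assumed default: reservation must be requested explicitly (opt-in).
    action: WatchdogAction = WatchdogAction.CANCEL_RECIPE
    # Only used for RESERVE; 2 hours as suggested in this comment.
    reserve_hours: float = 2.0


def on_local_watchdog_expired(policy, remaining_tasks):
    """Decide what happens when the local watchdog fires.

    remaining_tasks[0] is the task that triggered the watchdog.
    Returns (tasks still to run, hours to hold the machine, notify user?).
    """
    if policy.action is WatchdogAction.CANCEL_RECIPE:
        return [], 0.0, False
    if policy.action is WatchdogAction.CANCEL_TASK:
        # Drop only the offending task and carry on with the rest.
        return list(remaining_tasks[1:]), 0.0, False
    # RESERVE: stop running tasks, keep the system as is, notify the user.
    return [], policy.reserve_hours, True
```

For example, `on_local_watchdog_expired(WatchdogPolicy(WatchdogAction.RESERVE), ["my-task", "cleanup"])` would stop the remaining tasks and ask for the machine to be held for the configured reservation time.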
Closing this as a duplicate of #639938. We won't be adding any implicit reservation behaviour, but we will be adding the capability to request post-execution reservation of the system independent of the execution of the tasks.

*** This bug has been marked as a duplicate of bug 639938 ***