(In reply to Dan Callaghan from comment #1)
...
> EWD on a task in the middle of the recipe means that the harness didn't
> correctly mark the task as finished in Beaker after it was done running it.
> So it continued on with the rest of the tasks and finished the recipe, but
> the EWD saw the task as still "Running" so after the watchdog timer expires
> it records an EWD.

Is this a valid use case for the external watchdog? Since the Beaker recipe is a sequential list of tasks and the subsequent tasks have been reported back to Beaker, the machine clearly isn't completely stuck.
(In reply to Ales Zelinka from comment #2)
> Is this a valid use case for the external watchdog? Since the Beaker recipe
> is a sequential list of tasks and the subsequent tasks have been reported
> back to Beaker, the machine clearly isn't completely stuck.

Yes, it's debatable... Currently the harness will of course always run tasks sequentially, in order, but Beaker itself doesn't actually assume or care about that. All it enforces is that a task cannot be restarted after it has finished. The recipe is therefore finished when all of its tasks are finished. Beaker could be stricter and enforce that all tasks run in sequence, but that wouldn't really help anything.

I guess what you are really suggesting is that when the next task starts, Beaker could automatically mark the previous ones as Completed, as a workaround for the fact that the Completed call was missed. But I think that might lead to more problems.

I think the real problem is that Beah doesn't retry indefinitely, and/or doesn't retry at all on a 500 response; it just continues with the rest of the recipe. That is arguably a valid approach (favour progress over completeness of results), but I think a better approach is what restraint does: it enters an infinite retry loop with a delay, so that the recipe does not progress until the call succeeds. That means if Beaker is really broken the recipe will just EWD, but if it is a transient error like this one, the recipe will eventually proceed and no results will be lost or out of order.

Either way, I don't think Beaker should be enforcing or working around this on the server side at all.
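To make the "infinite retry loop with delay" idea concrete, here is a rough Python sketch of that behaviour. It is not taken from restraint or Beah; `TransientServerError`, `call_until_success`, and the `delay_seconds` and `sleep` parameters are all hypothetical names made up for illustration, and the 5 second default is an arbitrary assumption.

```python
import time


class TransientServerError(Exception):
    """Hypothetical stand-in for a failed status call, e.g. an HTTP 500."""


def call_until_success(call, delay_seconds=5.0, sleep=time.sleep):
    """Invoke `call` repeatedly until it stops raising TransientServerError.

    Returns a (result, failed_attempts) tuple. There is deliberately no
    retry limit: if the server never recovers, this blocks forever and the
    external watchdog is what eventually ends the recipe.
    """
    failed_attempts = 0
    while True:
        try:
            return call(), failed_attempts
        except TransientServerError:
            failed_attempts += 1
            # Wait before trying again so a struggling server is not hammered.
            sleep(delay_seconds)
```

A harness built this way would wrap the "mark task as Completed" call in `call_until_success` and only start the next task once it returns, so a transient 500 delays the recipe instead of leaving a task stuck in "Running".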