Bug 974636 - Provide and document better version of /distribution/virt/taskwait that waits for guest recipes to finish in case they Panic
Summary: Provide and document better version of /distribution/virt/taskwait that waits...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Beaker
Classification: Retired
Component: reports
Version: 0.12
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: ---
Assignee: beaker-dev-list
QA Contact: tools-bugs
URL:
Whiteboard:
Duplicates: 1306031
Depends On:
Blocks:
Reported: 2013-06-14 17:10 UTC by PaulB
Modified: 2020-10-21 14:14 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-21 14:14:27 UTC
Embargoed:



Description PaulB 2013-06-14 17:10:17 UTC
Description of problem:
 During an automated Beaker job, the system PANIC'd. Beaker reported a result of PASS rather than PANIC. After cloning the job, I see that the XML contains <watchdog panic="None"/>.

Version-Release number of selected component (if applicable):
 Beaker Version - 0.12.1

How reproducible:
 Unknown

Steps to Reproduce:
1.
2.
3.

Actual results:
 Beaker reports a result of PASS for a system that PANIC'd.
 https://beaker.engineering.redhat.com/jobs/431309

Expected results:
 Beaker reports the PANIC

Additional info:

Comment 2 Bill Peck 2013-06-14 18:01:52 UTC
ignore my previous comment. 

If you look at the job Paul references, click on the first recipe, and go to the /virt/start task, you will see that the watchdog did in fact report the panic. But the recipe had already completed, so the watchdog could not change the finished recipe's result from Pass to Panic.

Comment 3 PaulB 2013-06-14 18:40:28 UTC
All,
Note the job "Result" says Pass for R:906661 , when the system PANIC'd:
https://beaker.engineering.redhat.com/recipes/906661

Best,
-pbunyan

Comment 4 Dan Callaghan 2013-06-17 05:42:48 UTC
This was a change in Beaker 0.12. Beaker is now stricter about not allowing the result of a task to change after it's finished. Beaker 0.13 (unreleased) will be even stricter in this regard: it will no longer even allow the new result to be recorded when the task is finished...

It's a problem for virt recipes because of the way the host recipe just drops off the end and "finishes" even while the guests are still running.

I vaguely recall that Gurhan had written a task which you can run in the host recipe which will keep it running while the guest recipes are still running. That way the Panic will be recorded correctly. Gurhan, do you have anything like that?

Comment 5 Gurhan Ozen 2013-06-17 14:40:48 UTC
(In reply to Dan Callaghan from comment #4)
> This was a change in Beaker 0.12. Beaker is now stricter about not allowing
> the result of a task to change after it's finished. Beaker 0.13 (unreleased)
> will actually be getting even stricter in this regard, it will no longer
> even allow the new result to be recorded when the task is finished...
> 
> It's a problem for virt recipes because of the way the host recipe just
> drops off the end and "finishes" even while the guests are still running.
> 
> I vaguely recall that Gurhan had written a task which you can run in the
> host recipe which will keep it running while the guest recipes are still
> running. That way the Panic will be recorded correctly. Gurhan, do you have
> anything like that?

  Yes, there is a wait4guesttasks script that waits for the tasks in the guests. If no tasks are given, it waits for all tasks inside the guest.

  However, is there a particular reason why Beaker is doing this? So long as a recipeset is run, results should be recorded, no? This is actually a good test case, because it looks like the host panicked while the guests were running.

Comment 6 Nick Coghlan 2013-06-18 01:15:59 UTC
There are various assumptions in Beaker that break if a task/recipe/etc ever goes "backwards" in the state machine, which is why we've been progressively tightening these rules (so any failure happens promptly at the time of the error rather than causing more obscure errors later on).

However, it may be possible to tweak the rules to allow "Completed Pass" to be converted to a different *result*, since that's a sideways shuffle rather than actually going backwards.
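
The distinction between going "backwards" and a "sideways" result change can be sketched roughly as follows. This is only an illustration, not Beaker's actual code: the `Status` and `Task` classes, the `allow_result_change_after_finish` flag, and the result strings are all hypothetical names invented for this example.

```python
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical lifecycle, ordered so a larger value is "later".
    NEW = 0
    RUNNING = 1
    COMPLETED = 2


class Task:
    """Hypothetical task model; not taken from Beaker's source."""

    def __init__(self, allow_result_change_after_finish=False):
        self.status = Status.NEW
        self.result = "New"
        self.allow_result_change_after_finish = allow_result_change_after_finish

    def set_status(self, new_status):
        # Moving backwards in the state machine is always rejected.
        if new_status < self.status:
            raise ValueError(
                f"cannot move from {self.status.name} to {new_status.name}"
            )
        self.status = new_status

    def record_result(self, new_result):
        # Strict rule: a finished task keeps its result.
        # Relaxed rule: the result may change while the status stays COMPLETED.
        if (self.status is Status.COMPLETED
                and not self.allow_result_change_after_finish):
            raise ValueError("task already finished; result is frozen")
        self.result = new_result
```

Under the strict rule, a late Panic report against a completed task raises an error and the result stays Pass. With the hypothetical relaxed flag set, the same report changes the result to Panic while the status remains COMPLETED, which is the sideways move described above.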

Comment 7 Gurhan Ozen 2013-06-18 04:58:09 UTC
(In reply to Nick Coghlan from comment #6)
> There are various assumptions in Beaker that break if a task/recipe/etc ever
> goes "backwards" in the state machine, which is why we've been progressively
> tightening these rules (so any failure happens promptly at the time of the
> error rather than causing more obscure errors later on).
> 
> However, it may be possible to tweak the rules to allow "Completed Pass" to
> be converted to a different *result*, since that's a sideways shuffle rather
> than actually going backwards.

Ok, we have a situation with picking our favorite poison here. 
If we have to work around this issue from the test side, then we have to create a new task that uses the wait4guesttasks script and ask every tester to append that task as the very last task in the host/dom0 of every virtual workflow. I think this will be a major headache.
 We can't do this as part of the virtinstall or virt/start tasks because some jobs need synchronization between tasks in the hosts and guests, which was the reason for writing that script in the first place.

I think it's a cleaner and more logical solution to tweak the rules on the Beaker side to allow the result to change to a different result, especially since this is a very valid test case. If something happening inside the guest causes the host to panic, Beaker should be able to handle and report it correctly.

Comment 9 Dan Callaghan 2016-02-11 03:28:43 UTC
*** Bug 1306031 has been marked as a duplicate of this bug. ***

Comment 10 Dan Callaghan 2016-02-11 03:33:07 UTC
I still think it would be cleanest to have a task in the host recipe which waits for the guest recipes to finish. That way the Panic will be recorded in a sensible place. It doesn't really make sense if the host has /distribution/virt/start (meaning, start all the guests) which completes successfully and then 30 minutes after it finishes there is a Panic result. The Panic didn't happen while the guests were starting, it happened while the host was letting them run.

It looks like we have /distribution/virt/taskwait, written by Gurhan in Feb 2013, which should do this. It requires a GUESTNAME parameter though, which is a bit unfortunate. Ideally it would just wait on all guest recipes by default.
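
As a rough sketch of the "wait on all guest recipes by default" behaviour, a polling loop could look like the one below. It does not reflect how /distribution/virt/taskwait is actually implemented. The `wait_for_guests` function and its `get_finished` callback are hypothetical, and how the host would really learn whether a guest recipe has finished is left as an assumption.

```python
import time


def wait_for_guests(get_finished, guest_names=None, poll_interval=30.0,
                    timeout=None, sleep=time.sleep, clock=time.monotonic):
    """Block until the selected guest recipes have finished.

    get_finished: hypothetical callable returning {guest_name: bool}.
    guest_names: guests to wait for; None means every guest reported.
    Returns True once all selected guests have finished, False on timeout.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        states = get_finished()
        if guest_names is None:
            # Default: wait on every guest the callback reports.
            selected = states
        else:
            # A named guest that is not reported counts as unfinished.
            selected = {name: states.get(name, False) for name in guest_names}
        if all(selected.values()):
            return True
        if deadline is not None and clock() >= deadline:
            return False
        sleep(poll_interval)
```

Placed as the last task of the host recipe, a loop like this would keep the host recipe running while the guests run, so a Panic would be reported against a task that is still in progress. Note that in this sketch an empty guest list returns immediately.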

If we cleaned up, published, and documented this task would it be enough to consider this solved?

Comment 11 Jeff Burke 2016-03-18 14:54:51 UTC
Hi Dan,
 I think that would be a good solution/workaround. 

Regards,
Jeff

