Bug 603908 - external watchdog on reboot (plymouth related)
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Beaker
Classification: Retired
Component: beah
Version: 0.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Marian Csontos
QA Contact:
URL:
Whiteboard:
Depends On: 598631 599003 613756 613768
Blocks:
Reported: 2010-06-14 20:09 UTC by Martin Jenner
Modified: 2012-01-23 16:35 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-01-23 16:35:40 UTC
Embargoed:



Description Martin Jenner 2010-06-14 20:09:33 UTC
Description of problem:

I have seen this a few times with jobs queued from the same XML. An example job is:

  https://beaker.engineering.redhat.com/jobs/3118


The task before a reserve task hits a /LOCALWATCHDOG event. Instead of just moving on to the reserve task and reserving the system for debugging, the recipe then hits an External watchdog and the system is taken back into Beaker.

It may be worth noting that the failing test does do a reboot of the host under test.


Version-Release number of selected component (if applicable):


How reproducible:

I have seen this happen a few times using the XML from job 3118.


Steps to Reproduce:
1. Clone the job listed above and watch the job.

Comment 2 Marian Csontos 2010-06-15 03:51:57 UTC
The problem is that 'shutdown -r now' does not work due to a bug in plymouth.

Comment 3 Martin Jenner 2010-06-15 16:25:09 UTC
So I understand the reboot does not work because of a plymouth bug, and this causes a localwatchdog timeout on my task, which seems reasonable. But shouldn't the harness still move on to the next task (in my case, a reserve system task) and run it? The net result should be that the system does not hit the external watchdog.

Just trying to understand the event sequence...

Comment 4 Bill Peck 2010-06-15 16:32:12 UTC
I believe the problem is that we currently reboot after a localwatchdog to try to get the system into a "clean" state. This reboot never finishes.

Comment 5 Marian Csontos 2010-06-15 16:39:34 UTC
The local watchdog is most likely caused by Bug 599003, which I am trying to fix now. As Bill said, this asks for a reboot. The next task is not run because the machine is not in a clean state, and we are waiting for a reboot which never happens thanks to Bug 598631, so the External watchdog kills the recipe.

Once these two are fixed, I hope to close this one as a duplicate.
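
For anyone following the event sequence, here is a minimal sketch in Python of the flow described above. It is an illustration only, not beah's actual code: the function `run_recipe`, the task names and the event strings are all hypothetical.

```python
def run_recipe(tasks, hanging_task, reboot_completes):
    """Return the ordered list of events for a simplified recipe run.

    tasks            -- task names, in recipe order (hypothetical names)
    hanging_task     -- the task that exceeds its local watchdog
    reboot_completes -- False models the plymouth reboot hang (Bug 598631)
    """
    events = []
    for name in tasks:
        if name != hanging_task:
            events.append(f"{name}: completed")
            continue

        # The task overruns, so the local watchdog fires.
        events.append(f"{name}: /LOCALWATCHDOG")

        # The harness asks for a reboot to get back to a "clean" state
        # and does not start the next task until the machine returns.
        events.append("harness: reboot requested")
        if not reboot_completes:
            # Nothing else runs; the external watchdog ends the recipe.
            events.append("recipe: External watchdog")
            return events
        events.append("harness: reboot finished")
    return events


if __name__ == "__main__":
    recipe = ["reboot-test", "reserve"]
    for outcome in (False, True):
        print(f"reboot_completes={outcome}")
        for event in run_recipe(recipe, "reboot-test", outcome):
            print("  " + event)
```

With `reboot_completes=False` the reserve task is never reached, which matches what was seen in the job. With `True` the recipe carries on to the reserve task after the local watchdog.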

Comment 6 Marian Csontos 2010-06-21 13:23:01 UTC
NAKing the beaker-blocker as this does not look like our bug.

Comment 7 Marian Csontos 2011-01-26 11:42:19 UTC
Is this still an issue?

It seems the recent problems were purely with infrastructure and this now works fine.

Comment 8 Martin Jenner 2011-03-04 22:00:49 UTC
Close out this bz; I have not seen it for a long time.

Comment 9 Martin Jenner 2012-01-23 16:35:40 UTC
Closing; not seen for a long time. Will re-open if it re-occurs.

