Bug 603908 - external watchdog on reboot (plymouth related)
Status: CLOSED NOTABUG
Product: Beaker
Classification: Community
Component: beah
Version: 0.5
Hardware: All Linux
Priority: high
Severity: high
Assigned To: Marian Csontos
Depends On: 598631 599003 613756 613768
Blocks:
Reported: 2010-06-14 16:09 EDT by Martin Jenner
Modified: 2012-01-23 11:35 EST (History)
CC List: 5 users

Doc Type: Bug Fix
Last Closed: 2012-01-23 11:35:40 EST


Attachments: None
Description Martin Jenner 2010-06-14 16:09:33 EDT
Description of problem:

I have seen this a few times with jobs queued from the same XML. An example job is:

  https://beaker.engineering.redhat.com/jobs/3118


The task before a reserve task hits a /LOCALWATCHDOG event. Instead of just moving on to the reserve task and reserving the system for debugging, the recipe then hits an External watchdog and the system is taken back into Beaker.

It may be worth noting that the failing test does do a reboot of the host under test.


Version-Release number of selected component (if applicable):


How reproducible:

I have seen this happen a few times using the XML from job 3118.


Steps to Reproduce:
1. Clone the job listed above and watch it.
Comment 2 Marian Csontos 2010-06-14 23:51:57 EDT
The problem is that 'shutdown -r now' does not work due to a bug in plymouth.
Comment 3 Martin Jenner 2010-06-15 12:25:09 EDT
So I understand the reboot does not work because of a plymouth bug, and this causes a localwatchdog timeout on my task, which seems reasonable. But shouldn't the harness still move on to the next task (in my case, reserving the system) and run it? The net result should be that the system does not hit an external watchdog.

Just trying to understand the event sequence...
Comment 4 Bill Peck 2010-06-15 12:32:12 EDT
I believe the problem is that we currently reboot after a localwatchdog to try to get the system into a "clean" state. This reboot never finishes.
Comment 5 Marian Csontos 2010-06-15 12:39:34 EDT
The local watchdog is most likely caused by Bug 599003, which I am trying to fix now. As Bill said, the local watchdog triggers a reboot request. The next task is not run because the machine is not in a clean state and we are waiting for a reboot, which never happens thanks to Bug 598631, so the External watchdog kills the recipe.

Once these two are fixed, I hope to close this one as a duplicate.
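
The event sequence described in this comment can be sketched as a toy model. This is only an illustration under stated assumptions, not beah or Beaker code: the `Recipe` class, the `run_recipe` function, the `reboot_works` flag, and the task names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    """Hypothetical stand-in for a recipe; not a Beaker data structure."""
    tasks: list            # task names, in execution order
    reboot_works: bool     # False models the reboot that never completes
    log: list = field(default_factory=list)


def run_recipe(recipe, hanging_task):
    """Walk the tasks and return 'completed' or 'aborted'."""
    for task in recipe.tasks:
        if task == hanging_task:
            recipe.log.append(f"{task}: local watchdog")
            # After a local watchdog the harness asks for a reboot
            # to get the machine back into a clean state.
            recipe.log.append("reboot requested")
            if not recipe.reboot_works:
                # The reboot never finishes, so no later task starts and
                # only the external watchdog can end the recipe.
                recipe.log.append("external watchdog")
                return "aborted"
            recipe.log.append("reboot completed")
            continue
        recipe.log.append(f"{task}: ran")
    return "completed"
```

With `reboot_works=False` the model never reaches the reserve task, which matches the behaviour reported here; with `reboot_works=True` the reserve task runs after the reboot.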
Comment 6 Marian Csontos 2010-06-21 09:23:01 EDT
NAKing the beaker-blocker as this does not look like our bug.
Comment 7 Marian Csontos 2011-01-26 06:42:19 EST
Is this still an issue?

It seems the recent problems were purely with infrastructure and this now works fine.
Comment 8 Martin Jenner 2011-03-04 17:00:49 EST
Close out this BZ; I have not seen it for a long time.
Comment 9 Martin Jenner 2012-01-23 11:35:40 EST
Closing; not seen for a long time. Will re-open if it recurs.
