618528 – Test watchdog does not work properly (coming tests might be affected by previous timeout)

Keywords:
Status:	CLOSED DUPLICATE of bug 618123
Alias:	None
Product:	Beaker
Classification:	Retired
Component:	beah
Sub Component:
Version:	0.5
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Bill Peck
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-07-27 08:04 UTC by Frantisek Reznicek
Modified:	2015-11-16 01:12 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2010-07-27 10:57:50 UTC
Embargoed:

Description Frantisek Reznicek 2010-07-27 08:04:51 UTC

Description of problem:

I have experienced multiple issues with watchdog here:
https://beaker.engineering.redhat.com/jobs/8851
https://beaker.engineering.redhat.com/jobs/8798
https://beaker.engineering.redhat.com/jobs/8801

The behavior is as follows: if a test hangs and its maximum duration is reached, the test may end with Warn status, and all tests scheduled after it also time out, BUT with the watchdog time of the first timed-out test (2010-07-26 12:38:28).
It is clearly visible here: https://beaker.engineering.redhat.com/jobs/8851
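
To make the expectation concrete, here is a minimal sketch of what I would expect: every test gets its own watchdog deadline, derived from its own start time and maximum duration. This is not Beaker's or beah's actual code. The function name, the recipe start time and the list of durations are all hypothetical; only the 12:38:28 timestamp is taken from the job above.

```python
from datetime import datetime, timedelta


def per_task_deadlines(recipe_start, max_durations):
    """Return one watchdog deadline per task (hypothetical helper).

    Worst case is assumed: each task runs until its own deadline,
    so the next task starts exactly when the previous one expires.
    """
    deadlines = []
    start = recipe_start
    for limit in max_durations:
        deadline = start + limit
        deadlines.append(deadline)
        start = deadline
    return deadlines


# Illustrative values only: the start time and the durations are assumed.
recipe_start = datetime(2010, 7, 26, 11, 38, 28)
durations = [timedelta(hours=1), timedelta(hours=1), timedelta(minutes=30)]
deadlines = per_task_deadlines(recipe_start, durations)
for deadline in deadlines:
    print(deadline.isoformat(sep=" "))
```

In the failing jobs, the later tests instead report the same watchdog time as the first timed-out test, which is what a stale deadline carried over from the earlier test would look like.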

Moreover, a test may time out with a lot of "./" lines shown in the WebUI, which is quite odd behavior; I tend to think this is an invalid test watchdog trigger.

Let's look closely at
https://beaker.engineering.redhat.com/recipes/16074#task199644
(/distribution/MRG/Messaging/qpid_test_rhm_docs_examples test)

The test is set to use a 1h watchdog timeout, but as you can see in the job/recipe, it took 4:45:50 instead of 1:00:00 +- a minute.
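
For reference, a quick calculation of how far the task ran past its limit, using only the two durations quoted above (the helper name is just for illustration):

```python
from datetime import timedelta


def parse_duration(text):
    """Parse an H:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


limit = parse_duration("1:00:00")    # configured watchdog timeout
actual = parse_duration("4:45:50")   # duration shown in the recipe

overrun = actual - limit
print(overrun)                    # 3:45:50
print(round(actual / limit, 2))   # 4.76
```

So the task ran 3:45:50 beyond its limit, nearly five times the configured timeout.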

Version-Release number of selected component (if applicable):
0.5.50 (WebUI version)

How reproducible:
quite frequently

Steps to Reproduce:
1. Look at https://beaker.engineering.redhat.com/jobs/8851 or clone that test
2. See the tests that time out after the first timed-out test
  
Actual results:
The Beaker test watchdog does not behave properly, which results in test data loss.

Expected results:
The Beaker test watchdog should time out each test at its own limit, without affecting the tests that follow.

Additional info:

Comment 1 Marian Csontos 2010-07-27 10:57:50 UTC

The long timeout is caused by the harness allowing a longer timeout for submitting results to the LC (lab controller). This avoids the watchdog killing the job when the task has already finished and the harness is waiting for the LC to submit results.
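
As a rough illustration of the idea (not the harness's actual code), the sketch below treats the task limit and the result-submission allowance as two separate windows. The names and the 30-minute grace value are assumptions made up for this example; only the 1h task limit comes from the report.

```python
from datetime import timedelta

# Assumed values: only the 1 h task limit is taken from this report.
TASK_LIMIT = timedelta(hours=1)
SUBMIT_GRACE = timedelta(minutes=30)  # hypothetical, not a real beah setting


def watchdog_state(elapsed, task_finished, results_submitted):
    """Classify a task as seen by a watchdog with a submission grace period."""
    if results_submitted:
        return "done"
    if elapsed <= TASK_LIMIT:
        return "submitting" if task_finished else "running"
    if task_finished and elapsed <= TASK_LIMIT + SUBMIT_GRACE:
        # Past the task limit, but the task is done and results are pending.
        return "submitting"
    return "expired"


print(watchdog_state(timedelta(minutes=70), True, False))
print(watchdog_state(timedelta(minutes=70), False, False))
```

With such a split, a finished task that is still submitting results is not killed at the task limit, while a task that is still running at that point is.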

For the rest see Bug 618123.

*** This bug has been marked as a duplicate of bug 618123 ***
