Bug 755562

Summary: External watchdog kills /distribution/install test in jobs
Product: [Retired] Beaker
Reporter: Jan Ščotka <jscotka>
Component: scheduler
Assignee: Nick Coghlan <ncoghlan>
Status: CLOSED INSUFFICIENT_DATA
Severity: high
Priority: high
Version: 0.6
CC: azelinka, bpeck, dcallagh, mcsontos, mishin, rmancy, stl
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: MC
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2012-11-07 07:22:36 UTC

Description Jan Ščotka 2011-11-21 13:34:18 UTC
Description of problem:
It seems that the test does not finish in the expected time. Killing it is then fine, but it is killed by the external watchdog instead of the internal (local) one.
I don't know what causes it, but it does not seem to be caused by one particular test.
It also appeared in the /distribution/install test, which is very unpleasant (it is impossible to see what is wrong).

Only RHEL5 (5.7, 5.8) is affected, not RHEL6.
See jobs:
https://beaker.engineering.redhat.com/jobs/159678
https://beaker.engineering.redhat.com/jobs/159125
https://beaker.engineering.redhat.com/jobs/158535

I'm not sure whether it is a bug in Beaker or whether some other component (ntp, anaconda) causes this strange behaviour.
   Honza

Comment 1 Marian Csontos 2011-11-21 14:10:39 UTC
To me this looks like an infrastructure problem: I went through a few of the links and have not found any aborted recipe where the installation had completed.

Contact eng-ops to investigate, please.

Comment 3 Marian Csontos 2011-11-21 15:28:28 UTC
Out of these:

- two s390 systems did not survive the reboot after the local watchdog:

00: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from 
 CPU 01. 
01: Storage cleared - system reset. 
01: zIPL v1.8.1-16.el5 interactive boot menu 
01: 
01:  0. default (linux-1) 
01: 
01:  1. linux-1 
01:  2. linux-2 
01: 
01: Note: VM users please use '#cp vi vmsg <input>' 
01: 
01: Please choose: 

Not sure what the issue is, but to me it looks like a system settings problem. Please take it to eng-ops.


- the ia64 ended with very strange errors on the console. The output is a mess and I have no clue what happened. Could it be an external storage outage?

http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2011/11/1591/159125/332668/console.log

First there are some "Text file busy" messages, followed by a few syntax errors, and at the end a Bus error is reported:

/usr/bin/rhts-test-runner.sh: line 48: /bin/mv: Text file busy 
/usr/bin/rhts-test-runner.sh: line 51: 30913 Bus error               rm -f /mnt/testarea/_TESTOUT.log


- and the remaining one gives no clues at all.

Please clone and report whether it is reproducible.
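
For anyone re-checking a console log from a cloned job, a minimal sketch for counting the error signatures quoted above might look like the following. It is an illustration only and not part of Beaker or rhts: the function name scan_console_log and the labels are hypothetical, and the "syntax error" pattern assumes the shell's usual wording, since those lines are not quoted in this report.

```python
import re
from collections import Counter

# Signatures taken from the console output quoted in this comment.
# The labels are hypothetical names chosen for this sketch, and the
# "syntax error" wording is an assumption (those lines are not quoted here).
SIGNATURES = {
    "text_file_busy": re.compile(r"Text file busy"),
    "bus_error": re.compile(r"\bBus error\b"),
    "syntax_error": re.compile(r"syntax error", re.IGNORECASE),
}


def scan_console_log(lines):
    """Count how many lines match each error signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# The two lines quoted from console.log above serve as sample input.
sample = [
    "/usr/bin/rhts-test-runner.sh: line 48: /bin/mv: Text file busy",
    "/usr/bin/rhts-test-runner.sh: line 51: 30913 Bus error"
    "               rm -f /mnt/testarea/_TESTOUT.log",
]
counts = scan_console_log(sample)
print(dict(counts))
```

To scan a downloaded log instead of the sample, pass an open file object, since iterating over it yields lines. Running it on logs from both a passing and an aborted recipe would show whether these signatures are specific to the failures.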

Comment 4 Nick Coghlan 2012-10-17 04:34:10 UTC
Bulk reassignment of issues as Bill has moved to another team.

Comment 5 Min Shin 2012-11-07 07:22:36 UTC
This bug is closed as it is either not in the current Beaker scope or we could not find sufficient data in the bug report for consideration.
Please feel free to reopen the bug with additional information and/or business cases behind it.