Bug 975486

Summary: beaker-watchdog fails to detect a panic if it crosses block boundaries
Product: [Retired] Beaker
Reporter: Gurhan Ozen <gozen>
Component: lab controller
Assignee: Dan Callaghan <dcallagh>
Status: CLOSED CURRENTRELEASE
QA Contact: tools-bugs <tools-bugs>
Severity: unspecified
Priority: unspecified
Version: 0.12
CC: aigao, asaha, bpeck, dcallagh, jburke, jstancek, llim, pbunyan, qwan, rmancy, xjia
Target Milestone: 0.15.3
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2014-02-03 04:51:48 UTC
Type: Bug

Comment 1 Dan Callaghan 2013-06-20 03:44:29 UTC
In this case beaker-watchdog was definitely reading the console log correctly (because it was uploading it to the job). But there is no log line "Panic detected for..." in watchdog.log, so beaker-watchdog never noticed the line "Kernel panic" in the console log.

One thing I noticed is that beaker-watchdog reads the log in chunks of 64KB, and if the "Kernel panic" phrase happened to span two chunks it would never be picked up. That's the only explanation I can come up with so far.
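
The chunk-boundary problem described above can be illustrated with a minimal sketch. This is not Beaker's actual code: the function names and the overlap-based workaround are assumptions made for illustration, and the real fix in the Gerrit change may take a different approach. Only the 64KB chunk size and the "Kernel panic" phrase come from this comment.

```python
import io

PATTERN = b"Kernel panic"  # phrase quoted in the comment above


def naive_detect(stream, chunk_size=65536):
    """Hypothetical scanner: checks each chunk on its own.

    A match that straddles two chunks is never seen.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False
        if PATTERN in chunk:
            return True


def overlap_detect(stream, chunk_size=65536):
    """Hypothetical workaround: carry the last len(PATTERN) - 1 bytes
    of the previous window into the next search."""
    tail = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False
        window = tail + chunk
        if PATTERN in window:
            return True
        # Keep just enough bytes to complete a match split across reads.
        tail = window[-(len(PATTERN) - 1):]


# Build a log in which "Kernel" ends exactly at the 64KB boundary
# and " panic" begins the next chunk.
log = b"x" * (65536 - 6) + PATTERN + b" - not syncing\n"

naive_result = naive_detect(io.BytesIO(log))
overlap_result = overlap_detect(io.BytesIO(log))
```

With this input the naive scanner reports no panic while the overlapping scanner finds it, which matches the symptom of a console log that was uploaded correctly but never triggered panic detection.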

Comment 4 Dan Callaghan 2014-01-16 03:34:40 UTC
I'm going to assume that this was indeed caused by the problem described in comment 1. That bug definitely exists and can cause missed panics (although hitting it should be pretty rare), and no other theories have emerged.

Comment 5 Dan Callaghan 2014-01-16 06:33:16 UTC
I fixed this while working on bug 952661.

http://gerrit.beaker-project.org/2692

Comment 7 Nick Coghlan 2014-01-21 07:33:55 UTC
Given that this is a probabilistic bug based on when a panic occurs relative to the internal buffering in Beaker's console log processing, I don't believe it's practical to test it explicitly on a live system.

Instead, the new automated tests added as part of the patch deliberately provoke the misbehaviour in the log processing by injecting data directly.
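
A test of that kind might look roughly like the sketch below. It does not reproduce the tests from the patch: `LineBufferedDetector`, its `feed()` method, and the sample log text are hypothetical stand-ins, and line buffering is just one assumed way of making detection independent of where the data is split. The test injects the same log in two pieces at every possible split point.

```python
import unittest


class LineBufferedDetector:
    """Hypothetical detector that buffers partial lines between feed() calls."""

    def __init__(self, phrase=b"Kernel panic"):
        self.phrase = phrase
        self.pending = b""
        self.panic_seen = False

    def feed(self, data):
        # Scan only complete lines; keep the unfinished remainder for later.
        lines = (self.pending + data).split(b"\n")
        self.pending = lines.pop()
        for line in lines:
            if self.phrase in line:
                self.panic_seen = True


class SplitInjectionTest(unittest.TestCase):
    def test_panic_found_for_every_split_point(self):
        # Made-up console output, injected directly instead of read from a host.
        log = b"boot ok\nKernel panic - not syncing: Fatal exception\n"
        for cut in range(len(log) + 1):
            detector = LineBufferedDetector()
            detector.feed(log[:cut])
            detector.feed(log[cut:])
            self.assertTrue(detector.panic_seen, "missed at cut=%d" % cut)
```

Saved as a module, this can be run with `python -m unittest`. Sweeping over every split point makes the boundary case deterministic rather than dependent on timing.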

Comment 9 Nick Coghlan 2014-02-03 04:51:48 UTC
This change is included in the Beaker 0.15.3 maintenance release:

http://beaker-project.org/docs/whats-new/release-0.15.html#beaker-0-15-3