Bug 213542 - x460 4 node system reboots while running the pounder test suite
Summary: x460 4 node system reboots while running the pounder test suite
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Konrad Rzeszutek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-11-01 21:51 UTC by Jeff Burke
Modified: 2007-11-17 01:14 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-12-14 18:07:54 UTC
Target Upstream Version:
Embargoed:



Description Jeff Burke 2006-11-01 21:51:19 UTC
Description of problem:
 While running the pounder test suite the system will reboot.

Version-Release number of selected component (if applicable):
 RHEL4-U4 x86_64

How reproducible:
 Always

Steps to Reproduce:
1. Install the RHTS pounder test suite on ibm-heremes-n1
  
Actual results:
 Nothing to add; I can't gather any data on this. I have netdump enabled, but about
20 min into the test the system will just reboot.

Expected results:
 System should be able to run the test and pass

Additional info:

Comment 1 RHEL Program Management 2006-11-01 22:05:14 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux release.  Product Management has requested further review
of this request by Red Hat Engineering.  This request is not yet committed for
inclusion in release.

Comment 2 Konrad Rzeszutek 2006-11-02 15:24:34 UTC
Let's try configuring it as a 2-node and see if that triggers the problem. It
could be related to electrical power drain - I have caused the power breaker to
trip a couple of times when utilizing that 4-node to its full potential.

The 4-node in Beaverton does not exhibit this problem.

Comment 3 Konrad Rzeszutek 2006-11-03 17:00:38 UTC
Problem traced down to a bad CPU board or a bad CPU. Not sure which yet; working
on swapping components to determine which is at fault.

Comment 4 Konrad Rzeszutek 2006-11-13 19:02:43 UTC
Updating the CPLD did not fix it either.

Comment 5 Peter Martuccelli 2006-11-14 15:04:51 UTC
I cannot make this a beta blocker when we only see the issue on one lab system.
 IBM needs to continue their work on resolving the HW issues on the x460 in the
Westford lab.

Comment 6 Konrad Rzeszutek 2006-12-11 17:38:56 UTC
Back-leveling the BIOS (to 154B) fixed the problem.

The system is currently running the pounder21 test on the 1767381 test kernel in a
4-node configuration.


Comment 8 Konrad Rzeszutek 2006-12-14 18:07:54 UTC
Closing BZ as NOTABUG since this was a BIOS issue.

