Bug 455310 - LS21 locks up booting MRG RT kernel
Status: CLOSED NOTABUG
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.0
Hardware: All Linux
Priority: low
Severity: low
Assigned To: Red Hat Real Time Maintenance
Reported: 2008-07-14 15:29 EDT by Clark Williams
Modified: 2008-08-14 17:31 EDT
Doc Type: Bug Fix
Last Closed: 2008-08-14 17:31:18 EDT

Attachments
Boot log for LS21 lockup (4.74 KB, text/plain)
2008-07-14 15:29 EDT, Clark Williams

Description Clark Williams 2008-07-14 15:29:41 EDT
Description of problem:
The RT kernel fails to boot on blade1 of the HSV bladecenter (blade2 boots the
RT kernel and runs fine). Blade1 does boot and run the RHEL5.2 kernel.

Version-Release number of selected component (if applicable):

kernel-rt-2.6.24.7-72.el5rt

How reproducible:

Every time.

Steps to Reproduce:
1. Install RHEL5.2
2. Install MRG RT kernel
3. Boot RT kernel
  
Actual results:

Kernel hangs after reporting amount of memory available (see attached console
output).


Expected results:

Running kernel

Additional info:

Debugging printk's indicate that the hang is occurring in
calibrate_delay_direct(). Jiffies are not incrementing, so the calibration loop
never terminates.
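
The failure mode can be illustrated with a small user-space model. This is only
a sketch, not the kernel's code: TickCounter, wait_for_next_tick() and the
max_polls guard are hypothetical names invented here, and the counter simply
stands in for jiffies. It shows why a loop that waits for the tick counter to
change spins forever when no timer ticks arrive.

```python
class TickCounter:
    """Hypothetical stand-in for jiffies: advances only when 'ticks' arrive."""

    def __init__(self, ticking, polls_per_tick=1000):
        self.ticking = ticking            # False models a stalled timer tick
        self.polls_per_tick = polls_per_tick
        self.value = 0
        self._polls = 0

    def read(self):
        # Model the timer interrupt: bump the counter once every
        # polls_per_tick reads, but only if ticks are being delivered.
        self._polls += 1
        if self.ticking and self._polls % self.polls_per_tick == 0:
            self.value += 1
        return self.value


def wait_for_next_tick(counter, max_polls=None):
    """Spin until the counter changes.

    Returns the number of polls spent waiting, or None if the (hypothetical)
    max_polls guard trips. With max_polls=None and a stalled counter this
    never returns, which is the hang described above.
    """
    start = counter.read()
    polls = 0
    while counter.read() == start:
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return None
    return polls


if __name__ == "__main__":
    print("healthy:", wait_for_next_tick(TickCounter(ticking=True), max_polls=10_000))
    print("stalled:", wait_for_next_tick(TickCounter(ticking=False), max_polls=10_000))
```

The guard exists only so the model can report the stall instead of hanging;
it is not a claim about how the kernel's calibration code behaves.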
Comment 1 Clark Williams 2008-07-14 15:29:41 EDT
Created attachment 311759
Boot log for LS21 lockup
Comment 2 Clark Williams 2008-07-14 17:17:04 EDT
I swapped the two LS21s that were in slots 1 & 2. The failing blade
(formerly in slot 1) then reported double bit errors on DIMM slots 5 & 6,
disabled the two slots, and booted up.

Here's a cut-n-paste from the web interface to the event log:

1  E  BLADE_02  07/14/08, 21:07:55  (SN#YK10A269W03L) DIMM number 5 failed.
2  E  BLADE_02  07/14/08, 21:07:55  (SN#YK10A269W03L) POSTBIOS: 289 Board 1 DIMM Pair 3 Double Bit Error.
3  E  BLADE_02  07/14/08, 21:07:54  (SN#YK10A269W03L) DIMM number 6 failed.
4  E  BLADE_02  07/14/08, 21:07:54  (SN#YK10A269W03L) POSTBIOS: 289 Board 1 DIMM Pair 3 Double Bit Error.
5  I  BLADE_02  07/14/08, 21:07:25  (SN#YK10A269W03L) System Reboot
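
For anyone checking a longer event log for the same symptom, a rough sketch of
pulling the failed DIMM numbers out of entries like these follows. The field
layout (index, severity letter, source, date, time, serial number, message) is
an assumption inferred only from the five entries in this comment, and
parse_events() and failed_dimms() are hypothetical helper names. The sample
text repeats the entries from this comment, with the POSTBIOS entries split
across two lines to exercise the continuation handling.

```python
import re

EVENT_LOG = """\
1  E  BLADE_02  07/14/08, 21:07:55  (SN#YK10A269W03L) DIMM number 5 failed.
2  E  BLADE_02  07/14/08, 21:07:55  (SN#YK10A269W03L) POSTBIOS: 289 Board 1
DIMM Pair 3 Double Bit Error.
3  E  BLADE_02  07/14/08, 21:07:54  (SN#YK10A269W03L) DIMM number 6 failed.
4  E  BLADE_02  07/14/08, 21:07:54  (SN#YK10A269W03L) POSTBIOS: 289 Board 1
DIMM Pair 3 Double Bit Error.
5  I  BLADE_02  07/14/08, 21:07:25  (SN#YK10A269W03L) System Reboot
"""

# Assumed entry layout: index, severity, source, date, time, (SN#serial), message
EVENT_RE = re.compile(
    r"^(\d+)\s+([A-Z])\s+(\S+)\s+(\d{2}/\d{2}/\d{2}),\s+(\d{2}:\d{2}:\d{2})"
    r"\s+\(SN#(\w+)\)\s+(.*)$"
)


def parse_events(text):
    """Split the pasted log into entries, joining wrapped continuation lines."""
    events = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = EVENT_RE.match(line)
        if m:
            index, severity, source, date, time, serial, message = m.groups()
            events.append({
                "index": int(index), "severity": severity, "source": source,
                "date": date, "time": time, "serial": serial,
                "message": message,
            })
        elif events:
            # A line that does not start a new entry continues the previous one.
            events[-1]["message"] += " " + line
    return events


def failed_dimms(events):
    """Return the sorted DIMM numbers reported as failed in error entries."""
    found = set()
    for event in events:
        m = re.search(r"DIMM number (\d+) failed", event["message"])
        if event["severity"] == "E" and m:
            found.add(int(m.group(1)))
    return sorted(found)


if __name__ == "__main__":
    print(failed_dimms(parse_events(EVENT_LOG)))
```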

Comment 3 Clark Williams 2008-08-14 17:31:18 EDT
Closing due to confirmed h/w error