Bug 438571 - [5.2] hp-bl460c-01 not installable since RHEL5.2-Server-20080320.0
Summary: [5.2] hp-bl460c-01 not installable since RHEL5.2-Server-20080320.0
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-03-22 03:30 UTC by Qian Cai
Modified: 2008-03-24 15:05 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-03-24 13:23:44 UTC
Target Upstream Version:
Embargoed:


Attachments
sosreport (2.15 MB, application/octet-stream)
2008-03-22 03:30 UTC, Qian Cai

Description Qian Cai 2008-03-22 03:30:54 UTC
Description of problem:
It is not possible to install hp-bl460c-01.rhts.boston.redhat.com with
RHEL5.2-Server-20080320.0 tree in RHTS, although it is fine with
RHEL5.2-Server-20080313.1.

Running anaconda, the Red Hat Enterprise Linux Server system installer - please
wait...
Probing for video card:   ATI Technologies Inc ES1000
CPU 3: Machine Check Exception: 0000000000000005
CPU 2: Machine Check Exception: 0000000000000004
Uhhuh. NMI received for unknown reason b1 on CPU 0.
You probably have a hardware problem with your RAM chips
Bank 4: b200000000060151
Dazed and confused, but trying to continue
Bank 5: b20000300c000e0f
Kernel panic - not syncing: CPU context corrupt
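
The two bank values in the log above can be unpacked with a short script. This is a minimal sketch, assuming each value is a raw IA32_MCi_STATUS register using the architectural bit layout from the Intel SDM; the log itself does not state this. The helper name decode_mci_status is hypothetical and not part of any existing tool.

```python
# Sketch only: assumes the "Bank N:" values are raw IA32_MCi_STATUS registers.
# Flag bit positions follow the architectural layout in the Intel SDM.
FLAGS = {
    63: "VAL",    # register holds valid error information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the matching MCi_MISC register is valid
    58: "ADDRV",  # the matching MCi_ADDR register is valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit status value into flag names and error-code fields."""
    return {
        "flags": [
            name
            for bit, name in sorted(FLAGS.items(), reverse=True)
            if (status >> bit) & 1
        ],
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


# Values copied verbatim from the console output in this report.
for bank, raw in ((4, "b200000000060151"), (5, "b20000300c000e0f")):
    decoded = decode_mci_status(int(raw, 16))
    print(
        f"Bank {bank}: flags={','.join(decoded['flags'])} "
        f"mca=0x{decoded['mca_error_code']:04x} "
        f"model=0x{decoded['model_specific_code']:04x}"
    )
```

Under that assumption, both values decode to the flags VAL, UC, EN and PCC. An uncorrected error with PCC set would be consistent with the "CPU context corrupt" panic message, though the sketch does not identify which component reported the error.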

How reproducible:
Always

Comment 1 Qian Cai 2008-03-22 03:30:54 UTC
Created attachment 298819
sosreport

Comment 7 Prarit Bhargava 2008-03-24 13:23:44 UTC
Jeff Burke cannot reproduce this issue.

NOTABUG.

P.

Comment 8 Don Zickus 2008-03-24 14:20:23 UTC
Well, NOTABUG is not true.  This is a bug, just not in the software but probably
in the hardware.  In fact it is probably transient, which is why people can't
reproduce it.  I mean, corrupted memory bits only happen once in a blue moon, so
unless you test this a billion times in a row you may never reproduce this problem.

The odd thing about this problem is that EDAC should have diagnosed and possibly
fixed this issue (the whole reason for its existence is to catch and handle
memory problems).  Perhaps that is where the software bug is.

I'll talk to Aris about this.  But unfortunately I don't expect much to come out
of it.  

Cai, please continue to file these types of reports because problems like this
are trappable by the kernel and should be recoverable too, I think.

Cheers,
Don


Comment 9 Prarit Bhargava 2008-03-24 14:33:14 UTC
Really this should have been closed as INSUFFICIENT_DATA -- I was being lazy ;)

P.

Comment 11 Tony Camuso 2008-03-24 15:05:45 UTC
Please try re-seating the memory DIMMs.

This system was working fine when I installed it and for at least a few days
afterwards. 

I don't think the different kernel versions matter as much as what may be an
intermittent contact on the DIMMs or even a flakey DIMM.


