Description of problem:
It is not possible to install hp-bl460c-01.rhts.boston.redhat.com with RHEL5.2-Server-20080320.0 tree in RHTS, although it is fine with RHEL5.2-Server-20080313.1.

Running anaconda, the Red Hat Enterprise Linux Server system installer - please wait...
Probing for video card: ATI Technologies Inc ES1000
CPU 3: Machine Check Exception: 0000000000000005
CPU 2: Machine Check Exception: 0000000000000004
Uhhuh. NMI received for unknown reason b1 on CPU 0.
You probably have a hardware problem with your RAM chips
Bank 4: b200000000060151
Dazed and confused, but trying to continue
Bank 5: b20000300c000e0f
Kernel panic - not syncing: CPU context corrupt

How reproducible:
Always
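For reference, the two Bank values in the console output above can be decoded by hand. The following is only a rough sketch: it assumes the values are raw machine-check bank status registers (MCi_STATUS) and uses the architectural bit positions for the top flag bits. The function name decode_mci_status is made up for illustration and is not part of any kernel or RHTS tool.

```python
# Sketch only: assumes each "Bank N:" value is a raw 64-bit MCi_STATUS register.
# Flag bit positions follow the architectural MCi_STATUS layout.
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled for this bank
    59: "MISCV",  # MCi_MISC holds additional information
    58: "ADDRV",  # MCi_ADDR holds the failing address
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Split a raw status value into flag names and the two error-code fields."""
    flags = [name for bit, name in sorted(FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


if __name__ == "__main__":
    # Values copied from the console output in this report.
    for bank, raw in ((4, 0xB200000000060151), (5, 0xB20000300C000E0F)):
        d = decode_mci_status(raw)
        print("Bank %d: flags=%s mca=0x%04x model=0x%04x" % (
            bank, ",".join(d["flags"]),
            d["mca_error_code"], d["model_specific_code"]))
```

Under that assumption, both banks decode to VAL, UC, EN and PCC with no valid address recorded (ADDRV clear). PCC being set would be consistent with the "CPU context corrupt" panic, and the missing address means the log alone does not point at a particular DIMM.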
Created attachment 298819 [details] sosreport
Jeff Burke cannot reproduce this issue. NOTABUG. P.
Well, NOTABUG is not true. This is a bug, just not in the software but probably in the hardware. In fact it is probably transient, which is why people can't reproduce it. I mean, corrupted memory bits only happen once in a blue moon, so unless you test this a billion times in a row you may never reproduce this problem. The odd thing about this problem is that EDAC should have diagnosed and possibly fixed this issue (its whole reason for existence is to catch and handle memory problems). Perhaps that is where the software bug is. I'll talk to Aris about this, but unfortunately I don't expect much to come out of it. Cai, please continue to file these types of reports, because problems like this are trappable by the kernel and should be recoverable too, I think. Cheers, Don
Really this should have been closed as INSUFFICIENT_DATA -- I was being lazy ;) P.
Please try re-seating the memory DIMMs. This system was working fine when I installed it and for at least a few days afterwards. I don't think the difference in kernel versions matters as much as what may be an intermittent contact on the DIMMs or even a flaky DIMM.