Bug 223374 - LargeSMP Kernel Tainted w/ 16-Cores
Summary: LargeSMP Kernel Tainted w/ 16-Cores
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Jason Baron
QA Contact: Brian Brock
URL: http://www.fabric7.com/products_q80.php
Whiteboard:
Depends On:
Blocks:
Reported: 2007-01-18 23:36 UTC by James Sodini
Modified: 2013-03-06 05:59 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-01-29 22:45:07 UTC
Target Upstream Version:
Embargoed:



Description James Sodini 2007-01-18 23:36:16 UTC
Description of problem:
Booting RHEL4u3 or RHEL4u4 on eight sockets of dual-core processors (16 cores
total) shows a tainted kernel.


Version-Release number of selected component (if applicable):
Standard installation kernel for both U3 (2.6.9-34.ELlargesmp) & U4
(2.6.9-42.ELlargesmp)

How reproducible:
100% with 8 sockets
0% with fewer than 8 sockets

Steps to Reproduce:
1. Perform a fresh installation on an 8-socket system
2. Boot & login to system
3. cat /proc/sys/kernel/tainted
  
Actual results:
[root@localhost ~]# cat /proc/sys/kernel/tainted
16

Expected results:
[root@localhost ~]# cat /proc/sys/kernel/tainted
0

Additional info:
This is preventing us from passing Red Hat Hardware Certification. The system
in question is the Fabric7 Q80, which is linked above. We have a machine at
Red Hat for debugging this very problem.

Comment 2 Jason Baron 2007-01-29 19:20:02 UTC
Hi James, any input on the cause of this is welcome :) Anyway, we apparently
have one of these boxes in our lab; however, I'm not able to log in to the box.
We were able to connect to the management port (via serial), but I couldn't
figure out how to get to the main system console. Can you please advise? Thanks.

Comment 3 Jason Baron 2007-01-29 19:44:47 UTC
Apparently the MCE records (an MCE is causing the taint to be set) are posted in
/sys/class/misc/mcelog. Do we have the contents of this directory from a system
exhibiting this problem?
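
The reported value of 16 in /proc/sys/kernel/tainted is a bitmask, and bit 4
(value 16) is the machine check flag, which matches the MCE explanation above.
The following is a minimal sketch of decoding such a value. The bit
descriptions are taken from the upstream kernel's taint flag documentation and
are assumed to apply to the RHEL4 2.6.9 kernels named in this report; the
function names `decode_taint` and `read_taint` are hypothetical helpers, not
part of any existing tool.

```python
# Taint bits as described in upstream kernel documentation.
# Assumption: the RHEL4 2.6.9 kernels use the same meanings for these bits.
TAINT_FLAGS = {
    0: "proprietary module loaded",
    1: "module force-loaded",
    2: "SMP kernel on hardware not certified as SMP-safe",
    3: "module force-unloaded",
    4: "machine check exception occurred",
    5: "bad page referenced",
}


def decode_taint(value):
    """Return a description for every bit set in the taint bitmask."""
    reasons = []
    known_mask = 0
    for bit, description in sorted(TAINT_FLAGS.items()):
        known_mask |= 1 << bit
        if value & (1 << bit):
            reasons.append(description)
    # Report any set bits that this table does not cover.
    unknown = value & ~known_mask
    if unknown:
        reasons.append("unknown bits: 0x%x" % unknown)
    return reasons


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value from procfs (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    # Decode the value shown under "Actual results" in this report.
    for reason in decode_taint(16):
        print(reason)
```

On a live system, `decode_taint(read_taint())` would decode the current value
instead of the hard-coded 16.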

Comment 4 James Sodini 2007-01-29 21:47:40 UTC
The bug has been tracked down to bad memory. This was extremely difficult to
isolate because it could only be seen with all eight sockets and the LargeSMP
kernel.

Thank you for your effort!

Concerning use of the Q80 (which will probably be needed in the future for
support issues), there should be a printed copy of the usage manual. If not,
please send me an email with instructions on where to send a PDF.

