Red Hat Bugzilla – Bug 118564
[PATCH] RHEL3 cannot boot on 8-way Opteron systems
Last modified: 2013-08-05 21:04:55 EDT
Description of problem:
/usr/src/linux/arch/x86_64/mm/k8topology.c uses the wrong mask to get
the number of nodes when booting on an 8-way system. The kernel
crashes when booted on such a system.
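To illustrate how a too-narrow mask can undercount nodes, here is a minimal sketch. The report does not quote the masks or the register layout, so everything here is an assumption made for illustration: the helper name node_count, a "node count minus one" field starting at bit 4, a 2-bit mask standing in for the wrong mask, and a 3-bit mask standing in for the corrected one.

```c
#include <assert.h>

/* Hypothetical helper, not taken from k8topology.c.
 * Assumes the register holds "node count minus one" in a field
 * that starts at bit 4; the caller supplies the mask for that field. */
static unsigned int node_count(unsigned int reg, unsigned int mask)
{
    assert(mask != 0);              /* a zero mask would always report 1 node */
    return ((reg >> 4) & mask) + 1; /* decode the field, then undo the minus-one */
}
```

Under these assumptions, a register value of 0x70 (field value 7, meaning eight nodes) decodes to 4 with the 2-bit mask 0x3 and to 8 with the 3-bit mask 0x7, while systems with four or fewer nodes decode identically with either mask. That would be consistent with the problem only appearing on 8-way systems.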
Version-Release number of selected component (if applicable):
Steps to Reproduce:
A simple two-change patch that fixes the problem is attached.
Created attachment 98611
patch to correctly read processor masks for 8p system
This patch has been tested by AMD and verified to work.
Patch submitted, should be included in U3.
Jim's patch has been posted to our internal review list. When it
is finally checked in to our CVS patch pool (for U3), I'll change
the status of this bug back to "modified" to confirm this.
Further testing found an additional bug in unusual memory configurations
(i.e., some but not all processors have memory). Patch below:
diff -u linux/arch/x86_64/mm/k8topology.c-o
--- linux/arch/x86_64/mm/k8topology.c-o 2004-01-29 16:15:07.000000000
+++ linux/arch/x86_64/mm/k8topology.c 2004-04-10 01:20:41.000000000
@@ -196,7 +196,7 @@
 	if ((nodes_present >> rr) == 0)
 		rr = 0;
-	rr = ffz(~nodes_present >> rr);
+	rr += ffz(~nodes_present >> rr);
 	PLAT_NODE_DATA(i) = PLAT_NODE_DATA(rr);
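The reason for changing "rr =" to "rr +=" can be sketched outside the kernel. Because the mask is shifted right by rr before the search, the bit index that comes back is relative to rr and has to be added to it. The sketch below is an illustration only: ffz_sketch is a portable stand-in for the kernel's ffz(), and next_node with its "fixed" flag is a hypothetical wrapper around the three lines touched by the patch, not code from k8topology.c.

```c
#include <assert.h>

/* Portable stand-in for the kernel's ffz(): index of the first zero bit. */
static unsigned long ffz_sketch(unsigned long word)
{
    unsigned long bit = 0;
    while (word & 1UL) {   /* skip over set bits from the low end */
        word >>= 1;
        bit++;
    }
    return bit;
}

/* Hypothetical wrapper: pick the first node with memory at or after
 * position rr, wrapping to bit 0 when none remains.
 * fixed != 0 uses "rr +=" (patched), fixed == 0 uses "rr =" (original). */
static unsigned long next_node(unsigned long nodes_present,
                               unsigned long rr, int fixed)
{
    assert(nodes_present != 0);     /* at least one node must have memory */

    if ((nodes_present >> rr) == 0)
        rr = 0;
    if (fixed)
        rr += ffz_sketch(~nodes_present >> rr); /* offset is relative to rr */
    else
        rr = ffz_sketch(~nodes_present >> rr);  /* drops the rr offset */
    return rr;
}
```

With nodes_present = 0x29 (nodes 0, 3 and 5 have memory) and rr = 4, the unpatched form returns 1, a node without memory, while the patched form returns 5. When every node has memory the two forms happen to agree, which matches the report that the bug only shows up when some processors have no memory.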
The patch in comment #1 has just been committed to the RHEL3 U3
patch pool (in kernel version 2.4.21-15.2.EL).
Jim, could you please validate/test/post the additional patch in
comment #4? If/when that is committed to U3, I'll change the state
of this Bugzilla report to "modified".
Mark, the patch in comment #4 doesn't apply to the current
(in-progress) RHEL3 U3 patch pool nor to the (released) RHEL3
U2 source tree. I'm not sure where it's from, but let me just
ask: has the original problem described in this report been
resolved for you?
Mark - Have you been able to try a recent RHEL3 kernel on an 8-way
Opteron to see if this issue is indeed resolved for you?
Sorry for the delay in answering. We've only got a few of the
relevant systems so scheduling tests takes time.
I tried 2.4.21-15.0.3 (RHEL3-U2) with the following grub.conf boot
line and the system failed to load the kernel:
title Red Hat Enterprise Linux AS (2.4.21-15.0.3.ELsmp)
kernel /vmlinuz-2.4.21-15.0.3.ELsmp ro root=LABEL=/
apm=power-off hdc=ide-scsi noexec=off
Adding numa=off allowed the system to boot into run level 3 as
I can retest on RHEL3 U3 later this week, but we're still getting
back from OLS.
2.4.21-18 (U3 beta kernel) works on the 8-way systems we have
available, with both NUMA enabled and disabled. There is a 60%
performance boost with NUMA enabled on the 8-way systems.
Thanks for resolving this issue.
I'm reverting this to MODIFIED state. The Errata System will
automatically change the state to CLOSED/ERRATA when U3 is
released (most likely tomorrow).
An errata has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.