Bug 178977
Summary: | NUMA hash table lookup fails on dual opteron 252 system w/16GB of RAM | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Jarod Wilson <jarodwilson> |
Component: | kernel | Assignee: | Peter Martuccelli <peterm> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Priority: | medium |
Version: | 4.0 | CC: | jarod, jbaron |
Hardware: | x86_64 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2007-06-15 21:11:47 UTC |
Description
Jarod Wilson
2006-01-25 21:48:27 UTC
I'll see if I can't isolate the patch SUSE added to their SP3 kernel and slap it on top of 2.6.9-22.0.2.EL later tonight or tomorrow.

Dead simple patch, if this is really all that's needed. Will get this applied later today and see if the issue is resolved...

--------
From: ak
Subject: Increase NUMA node hash size
Suse-bugzilla: 106287
Patch-mainline: yes

This is needed on some systems with AMD E stepping CPUs which have memory hoisting enabled. The memory map is not uniform enough for the 256 entry hash table. Enlarge to 0xfff

diff -u linux-2.6.5-hack/include/asm-x86_64/mmzone.h-o linux-2.6.5-hack/include/asm-x86_64/mmzone.h
--- linux-2.6.5-hack/include/asm-x86_64/mmzone.h-o 2004-04-04 05:38:00.000000000 +0200
+++ linux-2.6.5-hack/include/asm-x86_64/mmzone.h 2005-09-30 13:46:17.000000000 +0200
@@ -13,7 +13,7 @@
 #include <asm/smp.h>
 #define MAXNODE 8
-#define NODEMAPSIZE 0xff
+#define NODEMAPSIZE 0xfff
 /* Simple perfect hash to map physical addresses to node numbers */
 extern int memnode_shift;

Well, apparently, that is NOT all that is required to fix this. I've verified that the kernel I'm running now does have this patch applied, but the problem still exists. Back to the drawing board...

The NUMA hash function was re-implemented in RHEL4 Update 2. Please upgrade to Update 2 or later and inform us if the problem persists.

The problem still exists with kernel-smp-2.6.9-22.0.2.EL, as well as with a kernel built from the same sources w/the extra hash size patch (the reimplemented NUMA hash function may explain why that patch didn't help). All released updates have been applied to this system. I have yet to try out a U3 beta kernel though.

If that is the case, then please provide a console log of an affected system running the most recent kernel you have. Printouts of the form:

<6>node 1 shift 29 addr 204000000 conflict 0

were eliminated in the re-implementation of the NUMA hash function for U2. Any boot log that shows lines like this must be prior to U2.
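For readers unfamiliar with the "simple perfect hash to map physical addresses to node numbers" mentioned in the patched header, here is a minimal user-space sketch of the general idea: pick a shift so that `addr >> shift` indexes a small table without two nodes sharing a slot and without running off the end of the table. This is an illustration only, not the RHEL4 or SUSE kernel code. The names `mem_range`, `find_hash_shift`, and `lookup_node`, the starting shift of 24, and the `0xff` empty-slot marker are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NO_NODE 0xff            /* empty-slot marker (assumption) */

struct mem_range {              /* hypothetical: one physical range per node */
    uint64_t start;             /* inclusive, in bytes */
    uint64_t end;               /* exclusive, in bytes */
};

/*
 * Search for the smallest shift (from 24 up to 47) for which every node's
 * range maps to table slots that no other node uses and that all fit in
 * mapsize entries. Returns the shift, or -1 if no shift works.
 */
int find_hash_shift(const struct mem_range *nodes, int numnodes,
                    uint8_t *map, size_t mapsize)
{
    assert(mapsize > 0 && numnodes < NO_NODE);

    for (int shift = 24; shift < 48; shift++) {
        int ok = 1;

        memset(map, NO_NODE, mapsize);
        for (int i = 0; i < numnodes && ok; i++) {
            if (nodes[i].start == nodes[i].end)
                continue;                       /* node has no memory */

            uint64_t first = nodes[i].start >> shift;
            uint64_t last = (nodes[i].end - 1) >> shift;

            for (uint64_t slot = first; slot <= last; slot++) {
                if (slot >= mapsize ||
                    (map[slot] != NO_NODE && map[slot] != i)) {
                    ok = 0;                     /* table overflow or slot conflict */
                    break;
                }
                map[slot] = (uint8_t)i;
            }
        }
        if (ok)
            return shift;
    }
    return -1;
}

/* Map a physical address to its node number using a populated table. */
int lookup_node(const uint8_t *map, int shift, uint64_t addr)
{
    return map[addr >> shift];
}
```

With a hypothetical layout whose node boundary is only 16 MB aligned (say node 0 at 0-0x201000000 and node 1 at 0x201000000-0x420000000), a 0xff-entry table is too small at shift 24 and every larger shift puts both nodes in the same slot at the boundary, so the search fails. A 0xfff-entry table succeeds at shift 24. That is the kind of failure the SUSE patch description attributes to memory hoisting, although this report shows the larger table alone did not fix the system in question.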
I believe the initial console log was from an earlier kernel, but 'numactl --hardware' on 2.6.9-22.0.2.EL does still show only a single memory controller. I'll grab current console output a bit later this afternoon.

Here's the console output w/kernel-smp-2.6.9-22.0.2.EL:

Scanning NUMA topology in Northbridge 24
Number of nodes 2 (10010)
Node 0 using interleaving mode 1/0
No NUMA configuration found
Faking a node at 0000000000000000-0000000420000000
Bootmem setup node 0 0000000000000000-0000000420000000
No mptable found.
On node 0 totalpages: 4325376
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 4321280 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1

Does this problem persist with the most recent kernel?

Unfortunately, I don't have access to the hardware to test this on anymore... Lemme see if I can ping someone back at my former employer to take a look though.

User jparadis's account has been closed

No access to hardware and nobody else has reported a problem in over a year. Closing INSUFFICIENT_DATA.
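As a sanity check on the console output quoted above, the page counts can be tied back to the faked node's address range. The sketch below assumes 4 KB pages (`PAGE_SHIFT` of 12, the usual x86_64 value); `range_pages` is a hypothetical helper written for this example, not a kernel function.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KB pages; assumed for this x86_64 kernel */

/* Number of whole pages in the physical range [start, end). */
uint64_t range_pages(uint64_t start, uint64_t end)
{
    return (end - start) >> PAGE_SHIFT;
}
```

Under that assumption, the range 0x0-0x420000000 (16.5 GB of address space) comes to 4325376 pages, which matches the reported totalpages and the sum of the DMA (4096) and Normal (4321280) zones. The single faked node therefore spans the whole address range, consistent with numactl showing only one memory controller.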