Description of problem:
Not a problem, but a feature request. Some guy from IBM stated that with NUMA enabled on an Opteron SMP system, memory performance is sped up by up to 30%. Please set

    CONFIG_K8_NUMA=y
    CONFIG_ACPI_NUMA=y

in /usr/src/redhat/SOURCES/kernel-2.4.21-x86_64-smp.config

Version-Release number of selected component (if applicable):
2.4.21-1.1931.2.393

How reproducible:
N/A

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
The kernel cannot be built with NUMA support turned on. The reason is a loop in the #includes. First, asm/mmzone.h needs a definition from linux/mmzone.h. Including linux/mmzone.h before asm/mmzone.h then leads to an error because some wait_q_type is not defined. The reason is that mmzone.h and wait.h include each other.
I wonder what the IBM person bases this on, especially since 1) Opteron systems are hardly NUMA, and 2) the kernel by default (including ours) does not have a lot of NUMA support. (Some other Linux distros patch in a lot of this, which slows down the normal case; gaining 30% relative to that is easier, of course :)
AFAIK Opteron SMP systems are NUMA by design. Each part of the memory is assigned to a CPU. When a CPU wants to access memory that is assigned to the other CPU, the access passes through the memory controller of the other CPU, which makes it slower than a local access. I would really like to benchmark this, but as explained, I can't get the kernel built due to type definition problems :-(
They are NUMA; however, the NUMA factor is very, very close to 1 (unlike some other boxes where the NUMA factor is on the order of 15). Academic research has shown that a NUMA factor of 2 is about the break-even point for the OS doing special NUMA hacks. A performance improvement of 30% is certainly believable for systems with a large NUMA factor (like the IBM x440), but I have a hard time believing it for real-world use on AMD Opteron machines.
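To put rough numbers on that argument, here is a back-of-the-envelope sketch in Python. It is only a toy model, not a benchmark: it takes the NUMA factor to be the ratio of remote to local memory latency, assumes that a NUMA-unaware kernel on a two-CPU box ends up with about half of all accesses going to the remote node, and assumes that a NUMA-aware kernel could make every access local. The function names and the factors in the loop are made up for illustration and are not measurements of any machine.

```python
def average_latency(local_ns, numa_factor, remote_fraction):
    """Mean memory latency when remote_fraction of accesses hit the other node.

    numa_factor is the assumed ratio of remote latency to local latency.
    """
    remote_ns = local_ns * numa_factor
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


def best_case_gain(numa_factor, remote_fraction=0.5):
    """Upper bound on the latency reduction from perfect NUMA placement.

    Compares a NUMA-unaware layout (remote_fraction of accesses are remote)
    with an idealised layout where every access is local. The result is a
    fraction of the NUMA-unaware average latency, between 0.0 and 1.0.
    """
    unaware = average_latency(1.0, numa_factor, remote_fraction)
    aware = average_latency(1.0, numa_factor, 0.0)
    return (unaware - aware) / unaware


if __name__ == "__main__":
    # Illustrative factors only; none of these is a measured value.
    for factor in (1.0, 1.2, 2.0, 15.0):
        gain = best_case_gain(factor)
        print(f"NUMA factor {factor:5.1f}: best-case latency gain {gain:.1%}")
```

Under these assumptions the bound is zero at a factor of 1 and only reaches one third at a factor of 2, so a 30% gain would need a NUMA factor close to 2 even if memory latency were the only thing that mattered. Real workloads also spend time in cache and in computation, so the achievable gain would be smaller still.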