Description of problem:
Not a problem, but a feature request.
Some guy from IBM stated that with NUMA enabled on
an Opteron SMP system, memory performance is sped up
by up to 30%. Please set
Version-Release number of selected component (if applicable):
Steps to Reproduce:
The kernel cannot be built with NUMA support turned on.
The reason is a loop in the #includes. First, asm/mmzone.h
needs a definition from linux/mmzone.h. Including
linux/mmzone.h before asm/mmzone.h leads to an error
because some wait_q_type is not defined. The reason is that
mmzone.h and wait.h include each other.
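
To illustrate the kind of mutual dependency described above, here is a minimal sketch in plain C. It does not reproduce the kernel headers: struct demo_zone, struct demo_wait_head and demo_zone_init() are hypothetical names made up for this example, and the real contents of mmzone.h and wait.h are not shown. The sketch only demonstrates the general rule that a forward declaration is enough when a type is used through a pointer, while embedding a type by value needs its full definition first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */

/* Forward declaration: sufficient because only a pointer is used below. */
struct demo_zone;

/* Stand-in for a wait-queue head that points back at its owning zone. */
struct demo_wait_head {
    struct demo_zone *owner;   /* pointer only, incomplete type is fine */
    int waiters;
};

/* Stand-in for a zone. It embeds the wait head by value, so
 * struct demo_wait_head must be fully defined before this point. */
struct demo_zone {
    struct demo_wait_head wq;
    unsigned long free_pages;
};

/* Initialise a zone and link its wait head back to it. */
static void demo_zone_init(struct demo_zone *z, unsigned long free_pages)
{
    assert(z != NULL);
    z->wq.owner = z;
    z->wq.waiters = 0;
    z->free_pages = free_pages;
}
```

If the two struct definitions were swapped, the compiler would reject struct demo_zone because struct demo_wait_head would still be incomplete at that point. That is the same class of error as a type not being defined yet when two headers include each other.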
I wonder what the IBM person bases this on, especially since
1) Opteron systems are hardly NUMA
2) The kernel by default (including ours) does not have a lot of NUMA support
(some other Linux distros patch in a lot of this, which slows down the normal
case; gaining 30% relative to that is easier, of course :)
AFAIK Opteron SMP systems are NUMA by design. Each part of the memory is
assigned to a CPU. When a CPU wants to access memory that is assigned
to the other CPU, the access passes through the memory controller of the
other CPU, which makes the access slower than a local one.
I really would like to benchmark this, but as explained I can't get
the kernel to build due to type definition problems :-(
They are NUMA; however, the NUMA factor is very close to 1 (unlike some
other boxes where the NUMA factor is on the order of 15). Academic research has
shown that a NUMA factor of 2 is about the point at which it starts to pay off
for the OS to do special NUMA hacks. A performance improvement of 30% is
certainly believable for systems with a large NUMA factor (like the IBM x440),
but I have a hard time believing it for real-world use on AMD Opteron machines.
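
For reference, the argument above can be written down as a small sketch. It assumes the usual meaning of "NUMA factor" as the ratio of remote to local memory access latency. The function names and the latency figures in the usage note are made up for illustration and are not measurements of any machine, and the threshold of 2 is simply the rule of thumb quoted in the comment above.

```c
#include <assert.h>

/* Ratio of remote to local access latency.
 * 1.0 means remote memory costs the same as local memory. */
static double numa_factor(double local_ns, double remote_ns)
{
    assert(local_ns > 0.0);
    return remote_ns / local_ns;
}

/* Rule of thumb from the comment above: special NUMA handling in the OS
 * starts to pay off at a factor of about 2. Assumption, not a hard limit. */
static int numa_tuning_worthwhile(double factor)
{
    return factor >= 2.0;
}
```

With made-up latencies of 100 ns local and 150 ns remote, the factor is 1.5 and the check says the extra NUMA work is not worth it. With 100 ns local and 1500 ns remote, the factor is 15 and it clearly is.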