Red Hat Bugzilla – Bug 149502
performance drop in SMP
Last modified: 2007-11-30 17:07:16 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4
Description of problem:
Some memory-intensive, single-threaded number-crunching programs run significantly slower with the SMP kernel than with the UP kernel on the same (dual-CPU)
hardware. The problem seems to pertain to the most recent RH kernels -- I can reproduce
it on FC3 with the most recent kernel (kernel-smp-2.6.10-1.766_FC3.i686.rpm).
My last recorded benchmark which does not show this problem is from June 2004
(whatever kernels were current for RHEL3 and FC at the time).
Version-Release number of selected component (if applicable):
Steps to Reproduce:
This is the simplest example I can come up with:
1. boot SMP kernel
2. run octave and execute the following at the prompt:
(this will create a 3000x3000 matrix filled with zeros)
octave:2> tic; w=s'; toc
(this will transpose it and time the procedure)
3. boot UP kernel and repeat the procedure.
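For anyone without Octave at hand, the same measurement (allocate a zero matrix, transpose it, report wall-clock time) can be approximated with a short script. This is only a sketch under stated assumptions: it uses Python rather than Octave, the function names are made up for illustration, and interpreter overhead means the absolute numbers will not match the Octave timings. Only the SMP/UP comparison on the same machine is meaningful.

```python
import time


def make_zero_matrix(n):
    """Return an n x n matrix of zeros as a list of row lists."""
    return [[0.0] * n for _ in range(n)]


def transpose(matrix):
    """Return a new matrix that is the transpose of `matrix`."""
    return [list(column) for column in zip(*matrix)]


def time_transpose(n):
    """Transpose an n x n zero matrix; return (wall-clock seconds, result)."""
    s = make_zero_matrix(n)
    # perf_counter measures elapsed wall-clock time, like tic/toc,
    # not CPU time consumed by the process.
    start = time.perf_counter()
    w = transpose(s)
    elapsed = time.perf_counter() - start
    return elapsed, w
```

Calling `time_transpose(3000)` once under the SMP kernel and once under the UP kernel gives two numbers to compare, in the same spirit as the tic/toc lines above.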
Actual Results: On my computer (2xAthlonMP / 1GB RAM) I get approximately 2 seconds with the SMP kernel and 0.8 seconds in UP mode.
Expected Results: Since the tic/toc timer counts wall-clock time, I expect (and used to observe) slightly shorter times in SMP mode.
2x Athlon MP (2000 MHz) on Tyan Tiger MP S2460 (Bios 1.05, the latest). 1 GB RAM.
The problem may be hardware dependent. Around 2000 (kernel 2.0.30 or thereabouts) I had a very similar problem on an Intel 440LX m/b, but not on an Intel 440BX...
Unfortunately, there are not many Athlon MP chipsets around.
Created attachment 111441 [details]
plot of times vs size for SMP and UP
Transposition times for a DxD matrix as a
function of size D, comparing SMP and UP modes.
I did some additional testing. The figure
shows times obtained in SMP mode and in uni-processor
(UP) mode on the same hardware (2xAthlon 2000MHz / 1 GB RAM).
The swap was turned off for the test.
One can see a curious region around D=2000 to 4000 where
UP outperforms SMP by almost a factor of 3.
The reason for the deviation from the O(D^2) law at high D is not
clear to me either.
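A size sweep like the one behind the plot can be sketched as follows. Again this is a Python stand-in with hypothetical function names, not the script used to produce the attachment. Dividing each time by D^2 makes a departure from the O(D^2) law visible directly: the last column should stay roughly constant where the law holds.

```python
import time


def transpose_seconds(d):
    """Wall-clock seconds needed to transpose a d x d matrix of zeros."""
    matrix = [[0.0] * d for _ in range(d)]
    start = time.perf_counter()
    transposed = [list(column) for column in zip(*matrix)]
    elapsed = time.perf_counter() - start
    assert len(transposed) == d  # keep the result alive and sanity-check it
    return elapsed


def sweep(sizes):
    """Return one (d, seconds, seconds / d**2) row per positive size d."""
    rows = []
    for d in sizes:
        seconds = transpose_seconds(d)
        rows.append((d, seconds, seconds / (d * d)))
    return rows


def format_rows(rows):
    """Render the sweep as plain text, one line per matrix size."""
    lines = ["D seconds seconds/D^2"]
    for d, seconds, per_element in rows:
        lines.append(f"{d} {seconds:.4f} {per_element:.3e}")
    return "\n".join(lines)
```

For example, `print(format_rows(sweep(range(500, 5001, 500))))` run under each kernel gives two tables that can be compared row by row. The size range here is an arbitrary choice that brackets the D=2000 to 4000 region.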
Created attachment 120499 [details]
NFS locking fixes that release the kernel_lock in do_unlk()
Ingo, the cause of this appears to be the missing unlock_kernel() in
do_unlk(). It's part of the NFS locking changes targeted for RHEL4-U3.
Created attachment 120512 [details]
The correct patch
Note that the patch Larry posted does not have the needed fix
in it and also breaks lock tests (i.e. F_TEST) by passing
back the wrong value when a lock does not exist.
Please try the one I just attached.
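A quick user-space check of the F_TEST case might look like the sketch below. It assumes a Unix system where Python's `os.lockf` is available. The helper name is hypothetical, and the check only exercises the NFS code path if the file lives on an NFS mount. On a file with no locks, the test should report the region as free.

```python
import errno
import os


def region_is_unlocked(path):
    """Return True if lockf(F_TEST) finds no conflicting lock on `path`."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Length 0 means "from the current offset to the end of the file".
        # os.lockf returns None on success and raises OSError if another
        # process holds a conflicting lock.
        os.lockf(fd, os.F_TEST, 0)
        return True
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise  # any other error is unexpected, so do not hide it
    finally:
        os.close(fd)
```

Pointing `region_is_unlocked` at a freshly created, unlocked file on the NFS mount and getting anything other than `True` would suggest that the lock test is returning the wrong value.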