From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4

Description of problem:
Some memory-intensive, single-threaded number-crunching programs run significantly slower with the SMP kernel than with the UP kernel on the same (dual-CPU) hardware. The problem seems to pertain to the most recent RH kernels: I also reproduce it on FC3 with the most recent kernel (kernel-smp-2.6.10-1.766_FC3.i686.rpm). My last recorded benchmark which does not show this problem is from June 2004 (whatever kernels were current for RHEL3 and FC at the time).

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-5.0.3.EL

How reproducible:
Always

Steps to Reproduce:
This is the simplest example I can come up with:
1. Boot the SMP kernel.
2. Run octave and execute the following at the prompt:
   octave:1> s=zeros(3000);
   (this creates a 3000x3000 matrix filled with zeros)
   octave:2> tic; w=s'; toc
   (this transposes it and times the procedure)
3. Boot the UP kernel and repeat the procedure.

Actual Results:
On my computer (2x Athlon MP / 1 GB RAM) I get approximately 2 seconds with the SMP kernel and 0.8 seconds in UP mode.

Expected Results:
Since the tic/toc timer counts wall time, I expect (and used to observe) slightly shorter timings in SMP mode.

Additional info:
2x Athlon MP (2000 MHz) on a Tyan Tiger MP S2460 (BIOS 1.05, the latest), 1 GB RAM. The problem may be hardware dependent. In about 2000 (kernel 2.0.30 or thereabouts) I had a very similar problem on an Intel 440LX motherboard, but not on an Intel 440BX... Unfortunately there are not many Athlon MP chipsets around.
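For anyone without Octave at hand, the timing test from the steps above can be approximated with the rough Python sketch below. It is not the reporter's procedure: the function names `transpose_flat` and `time_transpose` are made up for illustration, and the strided slicing only imitates the memory access pattern of a transpose, so absolute timings will not match Octave's.

```python
import time
from array import array


def transpose_flat(s, d):
    """Transpose a d x d matrix stored row by row in a flat array of doubles."""
    w = array("d")
    for j in range(d):
        # s[j::d] walks column j with a stride of d elements (8*d bytes),
        # which is the cache-unfriendly part of a transpose.
        w.extend(s[j::d])
    return w


def time_transpose(d):
    """Return wall-clock seconds needed to transpose a d x d zero matrix."""
    s = array("d", [0.0]) * (d * d)      # like s=zeros(d) in Octave
    start = time.perf_counter()          # wall-clock timer, like tic
    transpose_flat(s, d)
    return time.perf_counter() - start   # like toc


if __name__ == "__main__":
    # D=3000 as in the report; run once under each kernel and compare.
    print(f"D=3000: {time_transpose(3000):.3f} s")
```

Running the script once under the SMP kernel and once under the UP kernel gives two numbers that can be compared the same way as the tic/toc results.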
Created attachment 111441
plot of times vs size for SMP and UP
Time to transpose a DxD matrix as a function of the size D. Compare SMP and UP modes.
I did some additional testing. The figure ftp://coffee.phys.unm.edu/pub/dima/octave/cpuscale.png shows the times obtained in SMP mode and in uniprocessor (UP) mode on the same hardware (2x Athlon 2000 MHz / 1 GB RAM). Swap was turned off for the test. One can see a curious region around D=2000 to 4000 where UP outperforms SMP by almost a factor of 3. The reason for the deviation from the O(D^2) law at high D is not clear to me either.
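To put numbers on the D=2000 to 4000 region and on the O(D^2) expectation, here is a small Python sketch. It assumes 8 bytes per matrix element (Octave's default double precision). The helper names and the timing values in the usage lines are hypothetical, not measurements from this report. The second helper estimates the scaling exponent from two (D, time) points; a value near 2 is what the O(D^2) law predicts.

```python
import math


def matrix_bytes(d, bytes_per_element=8):
    """Memory taken by one d x d matrix of doubles."""
    return d * d * bytes_per_element


def scaling_exponent(d1, t1, d2, t2):
    """Slope of log(time) versus log(D) between two measurements."""
    return math.log(t2 / t1) / math.log(d2 / d1)


if __name__ == "__main__":
    for d in (2000, 3000, 4000):
        print(f"D={d}: {matrix_bytes(d) / 1e6:.0f} MB per matrix")
    # Hypothetical timings that follow t ~ D^2, for illustration only.
    print(scaling_exponent(1000, 0.1, 2000, 0.4))
```

The transpose holds both the source and the result in memory, so the working set is twice the per-matrix figure: 64 MB at D=2000 and 256 MB at D=4000.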
Created attachment 120499
NFS locking fixes that release the kernel_lock in do_unlk()
Ingo, the cause of this appears to be the missing unlock_kernel() in do_unlk(). It's part of the NFS locking changes targeted for RHEL4-U3.
Larry Woodman
Created attachment 120512
The correct patch
Note that the patch Larry posted does not have the needed fix in it and also breaks lock tests (i.e. F_TEST) by passing back the wrong value when a lock does not exist. Please try the one I just attached.
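The F_TEST behaviour referred to above can be checked from user space. The sketch below uses Python's `os.lockf` with `os.F_TEST` on a scratch file; the helper name `region_is_free` is made up here, and this is not the test suite used to validate either patch. On a file that nobody has locked, the test is expected to report that no conflicting lock exists.

```python
import errno
import os
import tempfile


def region_is_free(fd):
    """Return True if F_TEST finds no lock held by another process."""
    try:
        # Length 0 means "from the current offset to the end of the file".
        os.lockf(fd, os.F_TEST, 0)
        return True
    except OSError as exc:
        # POSIX reports a conflicting lock as EACCES or EAGAIN.
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise  # anything else is a real error, not a lock conflict


if __name__ == "__main__":
    # To exercise the NFS code path, create the file on an NFS mount instead
    # (the dir= argument of TemporaryFile can point there).
    with tempfile.TemporaryFile() as f:
        print(region_is_free(f.fileno()))
```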