Bug 149502 - performance drop in SMP
Summary: performance drop in SMP
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel   
Version: 4.0
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
Depends On:
Reported: 2005-02-23 17:42 UTC by Dmitri A. Sergatskov
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHSA-2006:0132
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2006-07-14 19:43:42 UTC

Attachments
plot of times vs size for SMP and UP (6.33 KB, image/png)
2005-02-25 19:52 UTC, Dmitri A. Sergatskov
NFS locking fixes that release the kernel_lock in do_unlk() (4.12 KB, patch)
2005-10-28 10:32 UTC, Larry Woodman
The correct patch (362 bytes, patch)
2005-10-28 18:03 UTC, Steve Dickson

Description Dmitri A. Sergatskov 2005-02-23 17:42:09 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4

Description of problem:
Some memory-intensive, single-threaded number-crunching programs run significantly
slower with the SMP kernel than with the UP kernel on the same (dual-CPU) hardware.
The problem seems to pertain to the most recent RH kernels: I can also reproduce
it on FC3 with the most recent kernel (kernel-smp-2.6.10-1.766_FC3.i686.rpm).
My last recorded benchmark that does not show this problem was from June 2004
(whatever kernels were current for RHEL3 and FC at the time).

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
This is the simplest example I can come up with:

1. boot SMP kernel
2. run octave and execute the following at the prompt:
   octave:1> s=zeros(3000);
 (this creates a 3000x3000 matrix filled with zeros)
   octave:2> tic; w=s'; toc
 (this transposes it and times the operation)
3. boot UP kernel and repeat the procedure. 
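
For readers without Octave at hand, the following is a rough Python analog of the steps above. It is not the reporter's method: it assumes a flat row-major buffer of doubles can stand in for Octave's matrix, and the names `transpose_flat` and `time_transpose` are made up for this sketch. Absolute times will differ from Octave's; only the SMP versus UP comparison on the same machine is meaningful.

```python
import time
from array import array


def transpose_flat(src, d):
    """Return the transpose of a d x d matrix stored row-major in a flat array."""
    dst = array('d', [0.0]) * (d * d)
    for j in range(d):
        # Column j of src (a strided read) becomes row j of dst (a contiguous write).
        dst[j * d:(j + 1) * d] = src[j::d]
    return dst


def time_transpose(d):
    """Wall-clock seconds to transpose a d x d matrix of zeros (like tic/toc)."""
    s = array('d', [0.0]) * (d * d)   # counterpart of s = zeros(d)
    start = time.perf_counter()
    transpose_flat(s, d)              # counterpart of w = s'
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"D=3000: {time_transpose(3000):.3f} s")
```

Run it once under each kernel and compare the two printed times. `time.perf_counter()` measures wall-clock time, as tic/toc does.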

Actual Results:  On my computer (2x Athlon MP / 1 GB RAM) I get approximately 2 seconds with the SMP kernel and 0.8 seconds with the UP kernel.

Expected Results:  Since the tic/toc timer counts wall-clock time, I expect (and used to observe) slightly shorter times in SMP mode.

Additional info:

2x Athlon MP (2000 MHz) on Tyan Tiger MP S2460 (Bios 1.05, the latest). 1 GB RAM.

The problem may be hardware dependent. Around 2000 (kernel 2.0.30 or thereabouts) I had a very similar problem on an Intel 440LX motherboard, but not on an Intel 440BX...
Unfortunately, there are not many Athlon MP chipsets around.

Comment 1 Dmitri A. Sergatskov 2005-02-25 19:52:38 UTC
Created attachment 111441 [details]
plot of times vs size for SMP and UP

Time to transpose a DxD matrix as a
function of size D, comparing SMP and UP modes.

Comment 2 Dmitri A. Sergatskov 2005-02-25 19:53:18 UTC
I did some additional testing. The figure
shows the times obtained in SMP mode and in uniprocessor
(UP) mode on the same hardware (2x Athlon 2000 MHz / 1 GB RAM).
Swap was turned off for the test.
One can see a curious region around D=2000 to 4000 where
UP outperforms SMP by almost a factor of 3.
The reason for the deviation from the O(D^2) law at high D is not
clear to me either.
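
One way to examine a deviation from O(D^2) is to divide each measured time by D^2: under a purely quadratic law the cost per element stays constant across sizes. A minimal Python sketch follows; the helper names are hypothetical, and the only figures used are the D=3000 timings quoted in the description (about 2 s for SMP, 0.8 s for UP).

```python
def per_element_ns(d, seconds):
    """Time per matrix element in nanoseconds; constant in d if cost is O(d^2)."""
    return seconds / (d * d) * 1e9


def smp_slowdown(smp_seconds, up_seconds):
    """How many times slower the SMP run is than the UP run."""
    return smp_seconds / up_seconds


if __name__ == "__main__":
    # Timings for D=3000 as reported in the bug description.
    d, smp, up = 3000, 2.0, 0.8
    print(f"SMP: {per_element_ns(d, smp):.0f} ns/element")
    print(f"UP:  {per_element_ns(d, up):.0f} ns/element")
    print(f"SMP/UP ratio: {smp_slowdown(smp, up):.2f}")
```

Applying the same two helpers to every point in the plot would show where the per-element cost starts to climb and over which range of D the SMP/UP ratio peaks.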

Comment 7 Larry Woodman 2005-10-28 10:32:11 UTC
Created attachment 120499 [details]
NFS locking fixes that release the kernel_lock in do_unlk()

Ingo, the cause of this appears to be the missing unlock_kernel() in
do_unlk(). It's part of the NFS locking changes targeted for RHEL4-U3.

Larry Woodman

Comment 8 Steve Dickson 2005-10-28 18:03:04 UTC
Created attachment 120512 [details]
The correct patch

Note that the patch Larry posted does not have the needed fix
in it and also breaks lock tests (i.e. F_TEST) by passing
back the wrong value when a lock does not exist.

Please try the one I just attached.
