Red Hat Bugzilla – Bug 139318
Slow system calls on Pentium 4
Last modified: 2007-11-30 17:10:54 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Description of problem:
A while ago I noticed that system calls on my Fedora Core 1/2 system
were taking a lot longer than I had expected. I wrote a blog (and later
an article) showing some numbers,
I've spent a little time trying to figure out why my "stock" Linux
build is so much faster on syscalls compared to Fedora. I'm not 100%
sure, but I suspect something is messing up the vsyscall stuff? Maybe
one of the ~65 patches is doing something to cause this?
As I said, I'm not 100% sure exactly what causes this; it might be
some build option that triggers it, but I can't figure out which one
it would be. Building a "stock" Linux kernel with the Fedora kernel
configuration gives a kernel with good performance as far as I can tell.
If there is anything else I can do to help, please just let me know.
As you can see from the numbers in my benchmarks, it's quite a
significant difference on system calls.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run the test program I have on
2. A "good" kernel will report numbers in the range of 1000 clock
cycles / system call. A "bad" kernel will show numbers in the range of
10,000 (an order of magnitude more).
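
For readers without access to the test program, the kind of measurement described in step 2 can be approximated with a short script. This is only a rough sketch and not the original benchmark: it reports wall-clock nanoseconds per call rather than clock cycles, the numbers include Python interpreter overhead, and the names `ns_per_call` and `run_benchmarks` are made up for illustration. It assumes a Linux system where `/dev/null` exists.

```python
import os
import time


def ns_per_call(fn, iterations=20_000):
    """Return the average wall-clock nanoseconds spent per call of fn."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations


def run_benchmarks(iterations=20_000):
    """Time a few cheap calls; the labels mirror the ones in this report."""
    calls = {
        # Assumption: time.time() stands in for gettimeofday(); CPython may
        # use clock_gettime() (possibly via the vDSO) underneath instead.
        "gettimeofday()": time.time,
        "uname()": os.uname,
        "chdir()": lambda: os.chdir("."),
        # The open() figure also includes the matching close().
        "open()": lambda: os.close(os.open("/dev/null", os.O_RDONLY)),
    }
    return {name: ns_per_call(fn, iterations) for name, fn in calls.items()}


if __name__ == "__main__":
    for name, ns in run_benchmarks().items():
        print(f"{name}: ns/call: {ns:.0f}")
```

Absolute values from a script like this are not comparable to the cycle counts quoted in this report; only the relative difference between two kernels on the same machine is meaningful.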
This is due to an incompatibility between exec-shield and sysenter.
The P4 sadly is affected quite badly by this.
The only way to get both to work would be to add an expensive
MSR write to the context switch path.
Please retest with current Fedora kernels which have the 3:1 split.
(i.e. not hugemem.)
I tested kernel-smp-2.6.10-1.1074_FC4, and it has the same (actually,
even worse) problem. The timings from my simple tests with this kernel are:
gettimeofday(): clock ticks/call: 8504
uname(): clock ticks/call: 1720
chdir(): clock ticks/call: 107518
open(): clock ticks/call: 71137
The same test on a 2.6.9 "stock" kernel (no patches) yields:
gettimeofday(): clock ticks/call: 859
uname(): clock ticks/call: 1258
chdir(): clock ticks/call: 4665
open(): clock ticks/call: 1392
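
To put the two sets of numbers side by side, the per-call slowdown can be worked out directly from the figures quoted above. The snippet below is just that arithmetic; the dictionary and function names are illustrative.

```python
# Clock ticks per call, copied from the two runs quoted above.
fc4 = {"gettimeofday": 8504, "uname": 1720, "chdir": 107518, "open": 71137}
stock = {"gettimeofday": 859, "uname": 1258, "chdir": 4665, "open": 1392}


def slowdown(patched, baseline):
    """Return how many times slower each call is on the patched kernel."""
    return {name: patched[name] / baseline[name] for name in baseline}


ratios = slowdown(fc4, stock)
for name, ratio in ratios.items():
    print(f"{name}(): {ratio:.1f}x")
# gettimeofday(): 9.9x
# uname(): 1.4x
# chdir(): 23.0x
# open(): 51.1x
```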
Hmm, I was just trying to add myself to the CC list, not sure why the
QA contact disappeared. And now I'm not allowed to put the QA contact
I did some more tests with the latest FC4 kernel, it's still slow (and
surprisingly, chdir() and open() times are way worse):
gettimeofday(): clock ticks/call: 8568
uname(): clock ticks/call: 1750
chdir(): clock ticks/call: 110082
open(): clock ticks/call: 73196
I also tried a bunch of things to see if there was any way around
this, seeing that it might be related to exec-shield. I did
1) Booted the kernel with exec-shield=0 ==> No difference
2) Tried setting /proc/sys/kernel/exec-shield to 0 ==> No difference
3) Compiled the test program with -Wa,--execstack ==> No difference
4) Ran the binary under setarch i386 ==> No difference
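
When repeating checks like these, it helps to confirm which exec-shield setting is actually in effect before each run. A minimal sketch, assuming the `/proc/sys/kernel/exec-shield` file mentioned in item 2 is present (it only exists on kernels carrying the exec-shield patch); the helper name `read_exec_shield` is hypothetical.

```python
from pathlib import Path


def read_exec_shield(path="/proc/sys/kernel/exec-shield"):
    """Return the exec-shield setting as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # The file is missing, unreadable, or does not hold a number.
        return None


if __name__ == "__main__":
    value = read_exec_shield()
    print("exec-shield:", "not available" if value is None else value)
```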
I also tried the RHEL4 kernel, and to my surprise, out of the box (on
the same Fedora installation), it's running a lot better. The numbers
are slightly longer (slower) than my "stock" kernel, but many times
faster than the FC4 candidate kernel.
Are we sure this is due to exec-shield, and not some other patch?
The latest FC4 kernel (2.6.11-1.1234) has some exec-shield patches to make it
work with vdso again (and hence: sysenter).
*** Bug 148839 has been marked as a duplicate of this bug. ***
Hmmm, still pretty slow on my system (about 7x - 10x slower than, say, the RHEL4 kernel):
thor (08:20) 29/0 $ uname -a; ./cpu-bench
Linux thor.ogre.com 2.6.11-1.1240_FC4smp #1 SMP Wed Apr 13 08:57:15 EDT 2005
i686 i686 i386 GNU/Linux
gettimeofday(): clock ticks/call: 8628
uname(): clock ticks/call: 1784
chdir(): clock ticks/call: 40391
open(): clock ticks/call: 34638
root@leifh 27/0 # uname -a; ./cpu-bench
Linux leifh.corp.yahoo.com 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686
i686 i386 GNU/Linux
gettimeofday(): clock ticks/call: 1662
uname(): clock ticks/call: 1577
chdir(): clock ticks/call: 3821
open(): clock ticks/call: 2280
Not to mention the "unpatched" Linus kernel, which is still faster than RHEL4.
Because of the exec-shield implementation, you can only use sysenter when the
new NX page protection is available. There should be a boot-time message that
says either "NX (Execute Disable) protection: active" or "Using x86 segment
limits to approximate NX protection". If you get the latter, then you are not
using sysenter. Current P4 production chips have NX support, but that is pretty