Red Hat Bugzilla – Bug 119903
nfs performance very bad on EL3
Last modified: 2007-11-30 17:07:01 EST
Description of problem:
I have just moved over from using a redhat 7.3 nfs server to using an
EL3 AS server on much better hardware. (previously the /home
filesystem was mounted on ide disks with Adaptec 2400, now HP DL360,
storageworks 4400 SCSI on 642 smart array). The nfs performance on EL3
is absolutely terrible: I see high loads on the server, and nfs write
speeds are at best 900kbytes/s. I tried tcp and version 2 nfs; that
doesn't help. Samba is fast, so it is not a network or disk io problem.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. mount nfs on EL3 box
2. copy file to nfs share
3. very slow
This happens from EL3 WS clients, debian with a 2.6 kernel, gentoo,
and suse clients.
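For anyone wanting a number rather than "very slow", the copy in the steps above can be timed with a small script. This is only a rough sketch, not something from the original report: the function name `measure_write_throughput` and its default sizes are made up for illustration, and the directory you pass in is assumed to be the nfs mount point on the client.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, size_bytes=8 * 1024 * 1024,
                             chunk_bytes=64 * 1024):
    """Write size_bytes into a temp file under directory; return bytes/second.

    Hypothetical helper for illustration only. The timing includes an
    fsync so that data still sitting in the client cache is not counted
    as already written.
    """
    chunk = b"\0" * chunk_bytes
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        written = 0
        with os.fdopen(fd, "wb") as f:
            while written < size_bytes:
                n = min(chunk_bytes, size_bytes - written)
                f.write(chunk[:n])
                written += n
            f.flush()
            os.fsync(f.fileno())  # wait until the data has been pushed out
        elapsed = time.perf_counter() - start
    finally:
        os.unlink(path)  # always remove the test file
    return written / elapsed
```

Calling it once on the nfs mount and once on a local directory on the same client gives two figures to compare; dividing the result by 1e6 gives MB/s.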
Please address this issue ASAP. An upgrade which was supposed to
sort out performance, and at the same time deal with the EOL of 7.3 on
my old server, has been very difficult and embarrassing.
Interestingly I have just tried the nfs from a solaris client and I am
getting about 6Mbytes/s.
So, it seems that any linux client struggles but Solaris 8 is ok.
Even when the client activity is very low (it's midnight here),
stopping the nfs service lowers the load on my nfs server from about
1.5 quiescent loading to about 0.01.
OK, more info....
If I export the filesystem with async (as was the old default) I get
much better read and write speeds of 11MB/s.
So why, with sync, is the performance about 15 times worse than with
async?
Because with sync, writes are committed to stable storage, which
takes much longer.
Using async is very dangerous wrt data integrity because if
the server goes down before it syncs out the data, there will
be data corruption...
On the other hand, async shows how fast nfsd could go when it
does not have to wait for the underlying fs to sync out the data...
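The cost of waiting for stable storage can be seen without nfs at all. The sketch below is a loose local analogy, not what nfsd actually does internally: it writes the same chunks to a local file twice, once calling fsync after every write (roughly the "sync" case) and once leaving the data in the page cache (roughly the "async" case). The name `timed_writes` and its parameters are hypothetical.

```python
import os
import tempfile
import time


def timed_writes(directory, sync_each_write, count=200, chunk_bytes=4096):
    """Write count chunks to a temp file in directory; return elapsed seconds.

    Hypothetical helper for illustration only. With sync_each_write=True
    every chunk is forced to stable storage before the next one is
    written; with False the kernel may keep the data cached.
    """
    chunk = b"x" * chunk_bytes
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        try:
            for _ in range(count):
                os.write(fd, chunk)
                if sync_each_write:
                    os.fsync(fd)  # block until the device reports the data stored
        finally:
            os.close(fd)
        return time.perf_counter() - start
    finally:
        os.unlink(path)  # always remove the test file
```

Comparing `timed_writes(d, True)` with `timed_writes(d, False)` on the exported filesystem gives a feel for how much of the sync penalty comes from the disk and driver rather than from nfs; the ratio will vary a lot with hardware and write cache settings.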
Well, with sync, an HP DL360, 1GB, dual 3GHz Xeon, Compaq smart array
642 with RAID V of 10krpm U320 disks couldn't even serve 5 clients
adequately over nfs! (lots of nfs timeouts, very slow clients)
I moved to async and it became usable.
This morning I booted the latest beta kernel (with lots of scsi work)
and I now get about 5 times the performance out of my disks. They
actually look like U320 disks now, rather than mediocre IDE type
I might try going back to sync at some point, in case the SCSI fix has
sorted things out.
Do you really expect such a huge performance difference with sync vs
async? 15 times? I'm curious. I can't see myself ever using sync if the
performance penalty is really so great. A factor of 2 might be
acceptable, but 15 takes some swallowing.
Cheers for your comments so far.
No... sync should not be 15 times slower... if it is, it's
probably not an NFS issue, it's more likely a local filesystem
or driver issue... or even a network issue... Remember
the NFS server is just a middle guy... very dependent on
both the network and the local filesystem functioning well...
OK - this problem has gone away since running the latest beta errata
kernel with the reworked SCSI drivers.
Thanks. Sorry for barking up the wrong tree.
Closing this per info in last comment. Thanks for your time
investigating this, Paul. -ernie