Bug 119903 - nfs performance very bad on EL3
Summary: nfs performance very bad on EL3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Steve Dickson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-04-02 23:46 UTC by Paul D. Mitcheson
Modified: 2007-11-30 22:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-04-05 22:14:01 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Product Errata
ID: RHSA-2004:188
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 2
Last Updated: 2004-05-11 04:00:00 UTC

Description Paul D. Mitcheson 2004-04-02 23:46:38 UTC
Description of problem:
I have just moved from a Red Hat 7.3 NFS server to an
EL3 AS server on much better hardware.  (Previously the /home
filesystem was on IDE disks with an Adaptec 2400; now it is an HP DL360,
StorageWorks 4400 SCSI on a 642 Smart Array.)  The NFS performance on EL3
is absolutely terrible: I see high loads on the server, and NFS write
speeds are at best 900 kbytes/s.  I tried TCP and NFS version 2; neither
helps.  Samba is fast, so this is not a network or disk I/O problem.

Version-Release number of selected component (if applicable):
kernel-2.4.21-9.0.1.EL


How reproducible:
always

Steps to Reproduce:
1. mount nfs on EL3 box
2. copy file to nfs share
3. very slow
  
Actual results:
700 kbytes/s typical.

Expected results:
11 Mbytes/s
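
For anyone trying to reproduce the numbers above, here is a minimal sketch of one way to measure write throughput into a directory. It is not what the reporter ran (the report only says a file was copied to the share). The function name `measure_write_throughput`, the 8 MB default size, and the 64 KB chunk size are all assumptions made for illustration.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, size_bytes=8 * 1024 * 1024,
                             chunk_bytes=64 * 1024):
    """Write size_bytes to a scratch file in directory; return bytes per second.

    Hypothetical helper: the sizes are arbitrary defaults, not values
    taken from the bug report.
    """
    chunk = b"\0" * chunk_bytes
    fd, path = tempfile.mkstemp(dir=directory)
    written = 0
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written < size_bytes:
                n = min(chunk_bytes, size_bytes - written)
                f.write(chunk[:n])
                written += n
            # Flush and fsync so the timing includes pushing the data out,
            # not just filling the local page cache.
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file
    return written / max(elapsed, 1e-9)
```

Calling `measure_write_throughput("/home")` on a client, with /home being the NFS mount, and then calling it on a local directory on the server gives two figures that can be compared against the 700 kbytes/s and 11 Mbytes/s quoted here.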

Additional info:
This happens from EL3 WS clients, Debian with a 2.6 kernel, Gentoo,
and SUSE clients.

Please address this issue ASAP.   An upgrade that was supposed to
sort out performance, and at the same time deal with the EOL of 7.3 on
my old server, has been very difficult and embarrassing.

Comment 1 Paul D. Mitcheson 2004-04-03 00:10:43 UTC
Interestingly, I have just tried NFS from a Solaris client and I am
getting about 6 Mbytes/s.

So it seems that every Linux client struggles, but Solaris 8 is OK.

Very bizarre.

Paul

Comment 2 Paul D. Mitcheson 2004-04-03 00:23:16 UTC
Even when client activity is very low (it's midnight here),
stopping the nfs service lowers the load on my NFS server from a
quiescent load of about 1.5 to about 0.01.

Paul

Comment 3 Paul D. Mitcheson 2004-04-04 11:59:19 UTC
OK, more info....

If I export the filesystem with async (as was the old default) I get
much better read and write speeds of 11MB/s.

So why, with sync, is the performance about 15 times worse than with
async?

Paul
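
As a quick check of the "about 15 times" figure, the ratio can be worked out from the numbers quoted in this report (11 MB/s with async, 700 to 900 kbytes/s with sync). The helper name `slowdown_factor` is made up for this sketch, and it assumes 1 MB = 1000 kbytes.

```python
def slowdown_factor(async_mb_s, sync_kb_s):
    """Return how many times faster the async rate is than the sync rate.

    Assumes 1 MB = 1000 kbytes; both inputs are throughput figures
    quoted in the report.
    """
    return (async_mb_s * 1000) / sync_kb_s


print(round(slowdown_factor(11, 700), 1))  # typical case: 15.7
print(round(slowdown_factor(11, 900), 1))  # best case: 12.2
```

So the gap is roughly 12x at best and close to 16x in the typical case, which is consistent with the factor of 15 mentioned above.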

Comment 4 Steve Dickson 2004-04-05 12:17:09 UTC
Because with sync, writes are committed to stable storage, which
takes much longer.

Using async is very dangerous wrt data integrity because if
the server goes down before it syncs out the data, there will
be data corruption...

On the other hand, async shows how fast nfsd could go when it
does not have to wait for the underlying fs to sync out the data...
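
The cost of waiting for stable storage can be seen with a purely local analogy. The sketch below does not use NFS or nfsd at all: it times writing the same data to a local file once with an fsync after every block (loosely like a server that must commit before replying) and once with a single fsync at the end. The names `timed_write` and `compare`, the block count, and the 8 KB block size are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def timed_write(path, blocks, block_bytes, sync_each):
    """Write `blocks` blocks to path and return the elapsed seconds.

    When sync_each is True, every block is flushed and fsynced before
    the next one is written; otherwise data is synced once at the end.
    """
    block = b"\0" * block_bytes
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())
        f.flush()
        os.fsync(f.fileno())  # final sync so both runs end with data on disk
    return time.perf_counter() - start


def compare(directory, blocks=200, block_bytes=8192):
    """Time both write styles in `directory`; return seconds keyed by label."""
    results = {}
    for label, sync_each in (("sync-like", True), ("async-like", False)):
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            results[label] = timed_write(path, blocks, block_bytes, sync_each)
        finally:
            os.remove(path)
    return results
```

Running `compare()` against a directory on the server's own filesystem gives a rough idea of how expensive per-write commits are on that storage, independent of NFS. If the "sync-like" time is dramatically worse there too, the local filesystem or driver is a more likely suspect than nfsd.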

Comment 5 Paul D. Mitcheson 2004-04-05 12:25:31 UTC
Hi Steve,

Well, with sync, an HP DL360, 1GB, dual 3GHz Xeon, Compaq smart array
642 with RAID V of 10krpm U320 disks couldn't even serve 5 clients
adequately over nfs!  (lots of nfs timeouts, very slow clients)

I moved to async and it became usable.

This morning I booted the latest beta kernel (with lots of scsi work)
and I now get about 5 times the performance out of my disks.  They
actually look like U320 disks now, rather than mediocre IDE type
performance.

I might try going back to sync at some point, in case the SCSI fix has
sorted things out.

Do you really expect such a huge performance difference with sync vs
async?  15 times?  I'm curious.  I can't see myself ever using sync if the
performance penalty is really so great.  A factor of 2 might be
acceptable, but 15 takes some swallowing.

Cheers for your comments so far.

Thanks,

Paul

Comment 6 Steve Dickson 2004-04-05 18:07:38 UTC
No... sync should not be 15 times slower... if it is, it's
probably not an NFS issue; it's more likely a local filesystem
or driver issue... or even a network issue. Remember,
the NFS server is just a middleman... very dependent on
both the network and the local filesystem functioning well...

Comment 7 Paul D. Mitcheson 2004-04-05 21:35:53 UTC
OK - this problem has gone away since running the latest beta errata
kernel with the reworked SCSI drivers.

Thanks.  Sorry for barking up the wrong tree.

Paul

Comment 8 Ernie Petrides 2004-04-05 22:14:01 UTC
Closing this per info in last comment.  Thanks for your time
investigating this, Paul.  -ernie


