Bug 125167 - NFS performance very bad
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386 Linux
Priority: medium  Severity: medium
Assigned To: Larry Woodman
Reported: 2004-06-03 07:47 EDT by Michael Bischof
Modified: 2007-11-30 17:07 EST
CC: 3 users

Doc Type: Bug Fix
Last Closed: 2007-10-19 15:24:55 EDT

Description Michael Bischof 2004-06-03 07:47:59 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114

Description of problem:
Reading and writing a single file works fine (writing ~7.5 MB/s,
reading ~11.5 MB/s, tested with file sizes of 100, 200 and 500 MB).

But operations involving many files, such as extracting a tar
(compressed or uncompressed, with the tar stored locally or on NFS),
take ages. Extracting an 8 MB .tar.gz (containing ~32 MB,
178 directories, 1452 files) takes ~135 s; deleting the just-extracted
tree takes about 22 s.

- A similar installation running on the SAME host under VMware GSX
  works as expected (extracting the same tar.gz archive as above
  takes ~30 s, deletion takes ~3.5 s).

- All filesystems are ext3 on software RAID 5 (3 disks). Suspecting
  the software RAID as the problem, I created a plain ext3 filesystem
  (no RAID) and exported that one instead: same problem, still slow.

- nfsstat doesn't show any errors.

- The network works fine:
  - tested with scp, ~10 MB/s
  - ifconfig doesn't show any errors

- If I export the filesystem with the async option, the performance
  is fine (I know sync/async makes a difference, but I don't think
  the difference should be that big); see the /etc/exports sketch
  after this list.

- Tested with RHEL 3 and Fedora Core 1 as NFS clients: same result.

- Mounted the NFS share locally (on the server itself): same result,
  terribly slow.
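For reference, a minimal /etc/exports sketch of the two configurations
being compared (the export path and client subnet are hypothetical;
sync is typically the default on this era of nfs-utils):

    # sync: the server commits data to disk before replying to the client
    /export  192.168.1.0/24(rw,sync)

    # async: the server replies before data reaches the disk; fast,
    # but data can be lost if the server crashes
    /export  192.168.1.0/24(rw,async)

After editing /etc/exports, re-export with 'exportfs -ra'.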



Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-15.EL, kernel-smp-2.4.21-9.EL, ...

How reproducible:
Always

Steps to Reproduce:
1. Export a filesystem on the server
2. Mount it on the client
3. Extract a .tar
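A minimal shell sketch of the reproduction (server name, mount point
and tarball path are hypothetical; the timings are the ones reported
in the description):

    # on the client
    mkdir -p /mnt/test
    mount -t nfs server:/export /mnt/test
    cd /mnt/test
    time tar xzf /tmp/test.tar.gz   # ~135 s against a sync export
    time rm -rf tree                # ~22 s; 'tree' is the extracted dir

With the async export option the same commands finish in a fraction
of that time.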
    

Additional info:


Hardware:

Dell PowerEdge 2650
Dual Intel Xeon, 2.80GHz
4 GB RAM
Adaptec AIC 7899
3x Maxtor SCSI Disk
Comment 1 Michael Bischof 2004-06-03 07:49:26 EDT
Did some more tests...

With Solaris 8 as the client, performance is poor on both servers.
Comment 2 Steve Dickson 2004-06-03 21:21:24 EDT
This could also be an ext3 issue, since the NFS server really doesn't
do much to stand in the way of progress...

Does the data=journal ext3 mode help?
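One way to try that (a minimal sketch; /dev/md0 and /export are
hypothetical, and ext3 does not allow changing the data mode on a
live remount, so the filesystem has to be unmounted first):

    umount /export
    mount -t ext3 -o data=journal /dev/md0 /export
    exportfs -ra   # re-export, then repeat the tar test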
Comment 3 Michael Bischof 2004-06-04 07:30:16 EDT
No, that doesn't help. I also tried to tweak bdflush a little:

    echo 40 0 0 0 60 300 60 0 0 > /proc/sys/vm/bdflush

but that didn't help either.
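Roughly what those fields mean, going by the 2.4 kernel's
Documentation/sysctl/vm.txt (field names assumed from that document;
the zeroed fields are unused or left alone here):

    # field 1 (40):  nfract, % of dirty buffers that wakes bdflush
    # field 5 (60):  interval, jiffies between bdflush wakeups
    # field 6 (300): age_buffer, jiffies before a dirty buffer is written
    # field 7 (60):  nfract_sync, % dirty at which writers are throttled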


Comment 4 Michael Bischof 2004-06-04 07:32:13 EDT
Sorry, I was wrong... it did get a little better: 110 s this time
(135 s before). But that is still much too long.
Comment 5 John Hodrien 2004-07-16 05:20:00 EDT
Interesting that you mention this. I've found the performance of a
Dell PowerEdge 1400 (dual P3 800, ServerWorks chipset) with an AIC
7899 identical to this. Since an upgrade from Red Hat 7.2 to RH 9,
I've seen performance drop in this manner: local file operations are
nice and fast, but remote NFS performance suffers from 'high latency',
if you see what I mean. I'm getting on the order of 10 MB/s across the
100 Mbit switched network, which is as expected, but anything that
involves lots of files suffers. No errors are reported anywhere, and
nfsstat is happy. I've tried it with a 3Com card and the built-in e100
with no difference.

I also see a load-average increase on the server when this occurs,
especially, as you point out, with untar-type operations. I have two
identical machines, so I've been able to test one with JFS, and while
that improves performance, it doesn't make *that* much difference.
Async almost removes the problem, but I'm not convinced it does so
entirely.

I've tried putting a Promise ATA 133 card in the machine and testing
with a fast IDE disk, but there's not a great deal of difference there
either... which left me deciding it was the motherboard chipset. I
suspect it's no coincidence that yours is a ServerWorks chipset too.
Is there any tweaking we're not doing that we should be?

A 1 GHz Celeron machine with a crappy slow IDE disk gives better
performance than this machine, and that's an out-of-the-box RH 9 setup.
Comment 6 Steve Dickson 2004-10-14 17:22:53 EDT
There were some VM issues that we recently found in the RHEL 3
kernels, so I'm going to reassign this to our VM guy to see what he
thinks.
Comment 7 Larry Woodman 2004-10-14 17:28:24 EDT
Please collect "vmstat 1" and "top" output while the system is
running in this "bad state" so I can see exactly what's going on.
Also, please use the appropriate pre-RHEL3-U4 kernel for the run, from
here:

http://people.redhat.com/coughlan/RHEL3-perf-test/

Thanks, Larry Woodman
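One way to capture both while reproducing the slowdown (a minimal
sketch; the log file names are made up):

    vmstat 1 > /tmp/vmstat.log &
    VMSTAT_PID=$!
    top -b -d 1 -n 180 > /tmp/top.log &   # ~3 minutes of samples
    # ...run the slow tar extraction from the client here...
    kill $VMSTAT_PID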
Comment 8 RHEL Product and Program Management 2007-10-19 15:24:55 EDT
This bug is filed against RHEL 3, which is in its maintenance phase.
During the maintenance phase, only security errata and select
mission-critical bug fixes will be released for enterprise products.
Since this bug does not meet those criteria, it is now being closed.

For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/

If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.
