Bug 745648 - 5.7 nfs client perf regression on 1MB writes to medium-sized (16MB-1GB) files
Summary: 5.7 nfs client perf regression on 1MB writes to medium-sized (16MB-1GB) files
Keywords:
Status: CLOSED DUPLICATE of bug 728508
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.7
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Jeff Layton
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-10-12 22:11 UTC by Ben England
Modified: 2011-10-25 10:20 UTC
CC: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-10-25 10:20:43 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
data, statistics and graph showing regression, needs OpenOffice Calc to open (23.74 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-10-12 22:11 UTC, Ben England
spreadsheet showing problem on both NetApp and Linux NFS server (33.42 KB, application/octet-stream)
2011-10-13 18:03 UTC, Ben England
updated spreadsheet showing problem fixed in -293 kernel (42.89 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-10-24 15:27 UTC, Ben England

Description Ben England 2011-10-12 22:11:37 UTC
Created attachment 527794 [details]
data, statistics and graph showing regression, needs OpenOffice Calc to open

Description of problem:

There is a performance regression of as much as 50% from RHEL5.6 to RHEL5.7 for single-threaded buffered writes of medium-sized files.

What is peculiar about this regression is that it goes away when you do fsync() before close().  In fact, RHEL5.7 is significantly faster than RHEL5.6 (by as much as 25%) in that case.

The problem goes away with a sufficiently large file.

It's too late for 5.7, but perhaps this can be fixed in 5.8?

 
Version-Release number of selected component (if applicable):

RHEL5.6, 2.6.18-238 kernel  =>  RHEL5.7, 2.6.18-274 kernel
NetApp OnTap 7.3.5.1 (Steve Dickson's filer may have it)
The client is perf36.lab.bos.redhat.com (16 GB RAM), writing over a 10-Gbps link to a NetApp NFS export.  The configuration is identical except for the OS version in these tests.
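
To double-check that the two boots really do see the same configuration, something like the following can be run under each kernel and the output compared (a sketch, assuming the mount point used below):

uname -r                          # which kernel is running
nfsstat -m                        # negotiated NFS mount options
grep /mnt/big /proc/mounts        # mount options as recorded by the kernel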

How reproducible:

Every time so far.  I ran 4 samples at every file size to verify that the variance in the results was well below the size of the regression (data in the attachment).


Steps to Reproduce:
1. install both the RHEL5.6 and RHEL5.7 kernels into the same /boot, so that everything else stays the same
2. mount -t nfs -o nfsvers=3 perf-na1-10ge:/vol/vol0/home /mnt/big
3. run the script below; set the FSYNC environment variable to "conv=fsync" to do an fsync before close, or delete/blank the variable to do only a close at the end.  Do this for both OS versions.

script try-fsizes.sh:

# 4 samples at each file size
for m in `seq 1 4` ; do
  # file sizes in MB, written as 1 MB dd blocks
  for n in 16 64 256 1024 4096 ; do
    rm -fv /mnt/big/x.dd          # start from a fresh file each run
    sync
    sleep 2
    # FSYNC is either empty (plain close) or "conv=fsync" (fsync before close)
    cmd="dd $FSYNC if=/dev/zero of=/mnt/big/x.dd bs=1024k count=$n"
    echo "$cmd"
    $cmd
  done
done 2>&1


Actual results:

The attached graph shows how, on a RHEL5.7 system, adding an fsync() call speeds up (huh?) single-threaded writes to an NFS file.

Expected results:

This seems wrong to me.  I did some straces: without the fsync call, throughput was determined by the time it took to close the file, whereas with the fsync call, throughput was determined by the elapsed time of the fsync call itself.  For example, with a 1 GB file and no fsync(), the close() call took 7 seconds(!), whereas with fsync() the close() took 0 seconds and the fsync() took 2.5 seconds.  I think writes without an fsync at the end should be no slower than writes with one.  For NFS, when you close a file, the client has to flush all the dirty pages to the NFS server (not necessarily to disk) because of NFS close-to-open semantics.  But why should that flush be slower when driven by close() than when driven by fsync()?
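
The per-syscall timings can be seen with something along these lines (illustrative; not necessarily the exact strace invocation used here).  strace -T appends the time spent in each syscall to every trace line:

# with fsync before close: the flush time shows up on the fsync() line
strace -T -e trace=fsync,close dd conv=fsync if=/dev/zero of=/mnt/big/x.dd bs=1024k count=1024

# without fsync: the flush time shows up on the close() of the output file
strace -T -e trace=close dd if=/dev/zero of=/mnt/big/x.dd bs=1024k count=1024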

Additional info:

Unsure at this time whether it happens on a Linux NFS server, on a 1-Gbps connection, etc.

Comment 1 Ben England 2011-10-12 22:12:46 UTC
adding perfbz to cc list

Comment 2 Ben England 2011-10-13 18:03:02 UTC
Created attachment 528075 [details]
spreadsheet showing problem on both NetApp and Linux NFS server

An additional Calc worksheet shows that the problem happens when you use a Linux NFS server.  The regression is not as extreme as it was with the NetApp but it is still very significant (~30%).  

The disk and network must be fast enough, or you will not reproduce the problem.  The configuration used in the Linux NFS server tests is as follows (a sketch of the export/mount setup appears after the list):
- same NFS client
- same 10-Gbps network
- Linux RHEL5.7 NFS server
- ext4 filesystem
- LVM volume
- hardware RAID 0+1 volume
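
For anyone trying to reproduce against a Linux server, the export/mount setup would look roughly like this (hostname, device, and paths are hypothetical; the real hardware is as listed above):

# on the RHEL5.7 NFS server: ext4 on an LVM volume over hardware RAID 0+1
mount /dev/vg_test/lv_export /export
echo "/export *(rw,no_root_squash)" >> /etc/exports
exportfs -ra
service nfs start

# on the client, the same style of mount as in the NetApp tests
mount -t nfs -o nfsvers=3 linux-server-10ge:/export /mnt/big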

Comment 3 Jeff Moyer 2011-10-17 16:55:42 UTC
Hi, Ben,

First, it would be good to define "medium file sizes"; it looks like you mean anything from 16MB to 1GB.  Second, it seems this is an NFS-only regression.  Just to be sure, I tried on my enterprise storage array and found no performance loss.  I'm updating the summary to reflect this; please change it if you feel I have misrepresented anything.

Comment 4 Jeff Layton 2011-10-21 00:57:13 UTC
There's a known performance regression in rhel5.7 kernels. See bug 728508 and the associated 5.7.z bug. Can you test a more recent kernel with the fix for that bug and let me know if the regression goes away?

Comment 5 Ben England 2011-10-24 15:27:03 UTC
Created attachment 529901 [details]
updated spreadsheet showing problem fixed in -293 kernel

I collected data with the 2.6.18-293 kernel from brew, and the regression went away.  See the 3rd worksheet, titled 2.6.18-293-netapp.  I flipped the kernel back to -274 and the problem reappeared.  I think you can mark it fixed in 5.8.
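
For the record, flipping between the two kernels can be done with grubby along these lines (package and image names are approximate; the exact NVRs may differ):

# install the brew test kernel alongside the existing ones
rpm -ivh kernel-2.6.18-293.el5.x86_64.rpm

# boot the -293 kernel
grubby --set-default=/boot/vmlinuz-2.6.18-293.el5
reboot

# flip back to the -274 kernel to re-check
grubby --set-default=/boot/vmlinuz-2.6.18-274.el5
reboot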

Comment 6 Jeff Layton 2011-10-25 10:20:43 UTC
Excellent. Thanks for confirming it.

*** This bug has been marked as a duplicate of bug 728508 ***

