Bug 745648

Summary: 5.7 nfs client perf regression on 1MB writes to medium-sized (16MB-1GB) files
Product: Red Hat Enterprise Linux 5
Reporter: Ben England <bengland>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: high
Version: 5.7
CC: esandeen, jlayton, jmoyer, perfbz, rwheeler, steved
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-10-25 10:20:43 UTC
Attachments:
- data, statistics and graph showing regression, needs OpenOffice Calc to open
- spreadsheet showing problem on both NetApp and Linux NFS server
- updated spreadsheet showing problem fixed in -293 kernel

Description Ben England 2011-10-12 22:11:37 UTC
Created attachment 527794 [details]
data, statistics and graph showing regression, needs OpenOffice Calc to open

Description of problem:

There is a performance regression of as much as 50% from RHEL5.6 to RHEL5.7 for single-threaded buffered writes of medium-sized files.

What is peculiar about this regression is that it goes away when you do fsync() before close().  In fact, RHEL5.7 is significantly faster than RHEL5.6 (by as much as 25%) in this situation.

The problem goes away with a sufficiently large file.

It's too late to fix this for 5.7, but perhaps it can be fixed for 5.8?

 
Version-Release number of selected component (if applicable):

RHEL5.6, 2.6.18-238 kernel  =>  RHEL5.7, 2.6.18-274 kernel
NetApp OnTap 7.3.5.1 (Steve Dickson's filer may have it)
This is perf36.lab.bos.redhat.com (16 GB RAM) over a 10-Gbps link to a NetApp NFS export.  The configuration is identical except for the OS version in these tests.
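
To confirm that the negotiated client mount options are identical under both kernels (a suggested sanity check, not part of the original procedure), the options can be dumped before each run:

nfsstat -m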

How reproducible:

Every time so far.  I took 4 samples at every file size to verify that the variance in the results was well below the size of the regression (data in attachment).


Steps to Reproduce:
1. install both the RHEL5.6 and RHEL5.7 kernels in the same /boot, so everything else is the same
2. mount -t nfs -o nfsvers=3 perf-na1-10ge:/vol/vol0/home /mnt/big
3. run the script below; set the FSYNC environment variable to "conv=fsync" to do fsync() before close(), or delete/blank the variable to just do close() at the end (see the example invocation after the script).  Do this for both OS versions.

script try-fsizes.sh:

# 4 samples per file size; file sizes are in MB, from 16 MB up to 4 GB
for m in `seq 1 4` ; do
  for n in 16 64 256 1024 4096 ; do
    # remove the previous test file and let dirty pages drain before each run
    rm -fv /mnt/big/x.dd
    sync
    sleep 2
    # $FSYNC is either empty (plain close) or "conv=fsync" (fsync before close)
    cmd="dd $FSYNC if=/dev/zero of=/mnt/big/x.dd bs=1024k count=$n"
    echo "$cmd"
    $cmd
  done
done 2>&1
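
Example invocation for both cases (illustrative; the log file names are arbitrary):

FSYNC="conv=fsync" sh try-fsizes.sh > with-fsync.log   # fsync() before close()
FSYNC="" sh try-fsizes.sh > close-only.log             # plain close() at end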


Actual results:

The attached graph shows, on a RHEL5.7 system, how fsync() speeds up (huh?) the performance of single-threaded writes to an NFS file.

Expected results:

This seems wrong to me.  I did some straces: without the fsync call, throughput was determined by the time it takes to close the file, whereas with the fsync call, throughput was determined by the elapsed time of the fsync call.  For example, using a 1 GB file, without an fsync() the close() call took 7 seconds(!), whereas with fsync() the close() call took 0 seconds and the fsync() call took 2.5 seconds.  I think writes without fsync at the end should be no slower than writes with fsync at the end.  For NFS, when you close a file, the NFS client has to flush all the dirty pages to the NFS server (not necessarily to disk) because of NFS close-to-open semantics.  But why should this flush be slower when it happens in close() than in fsync()?
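
For reference, the per-syscall timings above can be reproduced with something like the following (illustrative commands using the same mount point; strace -T prints the time spent in each system call):

strace -T -e trace=fsync,close dd if=/dev/zero of=/mnt/big/x.dd bs=1024k count=1024             # flush happens in close()
strace -T -e trace=fsync,close dd conv=fsync if=/dev/zero of=/mnt/big/x.dd bs=1024k count=1024  # flush happens in fsync(), close() is near-instant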

Additional info:

Unsure at this time whether it happens with a Linux NFS server, over a 1-Gbps connection, etc.

Comment 1 Ben England 2011-10-12 22:12:46 UTC
adding perfbz to cc list

Comment 2 Ben England 2011-10-13 18:03:02 UTC
Created attachment 528075 [details]
spreadsheet showing problem on both NetApp and Linux NFS server

An additional Calc worksheet shows that the problem also happens when you use a Linux NFS server.  The regression is not as extreme as it was with the NetApp, but it is still very significant (~30%).

The disk and network must be fast enough or else you will not reproduce the problem.  The configuration used in the Linux NFS server tests is:
- same NFS client
- same 10-Gbps network
- Linux RHEL5.7 NFS server
- ext4 filesystem
- LVM volume
- hardware RAID 0+1 volume
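
For completeness, the server-side export and client mount for these runs looked roughly like the following (the export path and server name are placeholders, not the actual lab configuration):

# on the RHEL5.7 NFS server, /etc/exports contains an entry like:
#   /export/big  *(rw,sync,no_root_squash)
exportfs -ra

# on the client:
mount -t nfs -o nfsvers=3 <nfs-server>:/export/big /mnt/big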

Comment 3 Jeff Moyer 2011-10-17 16:55:42 UTC
Hi, Ben,

First, it would be good to define "medium file sizes"; it looks like you mean anything from 16MB to 1GB.  Second, it seems this is an NFS-only regression.  Just to be sure, I tried on my enterprise storage array and found no performance loss.  I'm updating the summary to reflect this; please change it if you feel I have misrepresented anything.

Comment 4 Jeff Layton 2011-10-21 00:57:13 UTC
There's a known performance regression in rhel5.7 kernels. See bug 728508 and the associated 5.7.z bug. Can you test a more recent kernel with the fix for that bug and let me know if the regression goes away?

Comment 5 Ben England 2011-10-24 15:27:03 UTC
Created attachment 529901 [details]
updated spreadsheet showing problem fixed in -293 kernel

I collected data on the 2.6.18-293 kernel from brew, and the regression went away.  See the third worksheet, titled 2.6.18-293-netapp.  I flipped the kernel back to -274 and the problem reappeared.  I think you can mark it fixed in 5.8.

Comment 6 Jeff Layton 2011-10-25 10:20:43 UTC
Excellent. Thanks for confirming it.

*** This bug has been marked as a duplicate of bug 728508 ***