Bug 800181 - NFSv4 on RHEL 6.2, 6.3 over six times slower than 5.7, 5.8
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: x86_64 Linux
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: nfs-maint
QA Contact: Red Hat Kernel QE team
Keywords: Reopened
Reported: 2012-03-05 17:04 EST by mark roth
Modified: 2013-07-26 09:32 EDT (History)
Last Closed: 2013-07-26 08:53:18 EDT
Description mark roth 2012-03-05 17:04:10 EST
Description of problem:
On a 6.2 system, unpacking a large archive from an NFSv4 mounted directory from a 6.2 server, to the local disk takes seconds. cd'ing into an NFS mounted directory, even with that directory exported from, and mounted to, the same server, takes 7.5 minutes.

The same command performed on a 6.2 system, into an NFSv3 mounted directory, takes 1.5 minutes.

Version-Release number of selected component (if applicable):
nfs-utils & nfs-utils-libs 1.2.3.15.el6

How reproducible:

100%

Steps to Reproduce:
1. mkdir /tmp/foo, export /tmp/foo to same server as /mnt/foo(rw,sync,no_wdelay)
2. fstab entry: canary:/scratch/foo /mnt/foo nfs rw,hard,intr,async 0 0
3. cd /tmp, time <unpack large archive>
4. cd /mnt/foo, time <unpack large archive>
  
Actual results:
Approximately 7 min 35 sec.

Expected results:
Approximately 1.5 minutes

Additional info:
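The reproduction steps above can be sketched as a small script. This is only an illustration: time_unpack is a hypothetical helper name, file.tar.gz stands in for the unnamed archive, and the reporter's own unpack script is not shown in this report, so plain tar is assumed here.

```shell
# Sketch only: time_unpack is a hypothetical helper, not part of nfs-utils.
# It extracts ARCHIVE into a fresh scratch directory under DEST, prints the
# elapsed wall-clock time in whole seconds, and removes the scratch directory.
time_unpack() {
    local archive="$1" dest="$2" workdir start end
    workdir=$(mktemp -d "$dest/unpack.XXXXXX") || return 1
    start=$(date +%s)
    tar -xzf "$archive" -C "$workdir" || { rm -rf "$workdir"; return 1; }
    end=$(date +%s)
    rm -rf "$workdir"
    echo $((end - start))
}

# Example (directories from the steps above; the archive path is a placeholder):
#   time_unpack /path/to/file.tar.gz /tmp        # local disk
#   time_unpack /path/to/file.tar.gz /mnt/foo    # NFS mount
```

Running both calls against the same archive on the same host keeps the archive itself out of the comparison, so the only difference is where the files land.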
Comment 2 RHEL Product and Program Management 2012-05-03 01:19:26 EDT
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.
Comment 3 mark roth 2012-07-10 09:52:37 EDT
This is still a show stopper. Red Hat has never acknowledged this report nor looked at it.

Note that I work for a US government agency, and this impacts us: we cannot migrate to 6.x from 5.x for home directory servers.
Comment 4 J. Bruce Fields 2012-07-10 10:42:17 EDT
Apologies for not looking at this before, but: could you talk to support?  That would help make progress.

"cd'ing into an NFS mounted directory... takes 7.5 minutes."

I'm taking that to mean "cd'ing to an NFS mounted directory and then unpacking a large archive in that directory... takes 7.5 minutes"?

Assuming so: how many files and directories does the archive contain?  And what kind of filesystem and disk is the export (/tmp/foo) stored on?
Comment 5 mark roth 2012-07-10 12:00:26 EDT
I tried using the support email address my manager was using on another issue - I sent that a couple of weeks ago, and got no response. I've tried emailing nfs-main@redhat.com, and it *fails*, with a "too many hops".

Yes, I cd into the NFS mounted directory, and use my manager's unpack script to unpack the tar.gz file, from 28M to 92M. I get the time by running the unpack within the time command. I'm going to ext3; tried on ext4, and it's the same. The final straw is that on our RHEL test server, I'm exporting a directory, and mounting it *on* *the* *same* server, and get these results.

find /mnt/foo/<unpacked tarfile directory> | wc -l says 4626.

     mark
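The figures reported so far can be turned into a rough per-entry cost. This is a back-of-the-envelope sketch, assuming the 4626-entry count belongs to the same archive that took about 7 min 35 s over NFS and about 1.5 min locally; per_entry_ms is a hypothetical helper name.

```shell
# per_entry_ms is a hypothetical helper: average milliseconds spent per file
# or directory, given total elapsed seconds and the number of entries.
per_entry_ms() {
    awk -v secs="$1" -v n="$2" 'BEGIN { printf "%.1f\n", secs * 1000 / n }'
}

per_entry_ms 455 4626   # 7 min 35 s over NFS   -> 98.4
per_entry_ms 90 4626    # about 1.5 min locally -> 19.5
```

Under those assumptions the NFS run spends roughly 80 ms more per entry, while moving only about 92M in total, which points toward a per-file cost rather than a bandwidth limit. That reading is an inference from the numbers above, not something measured in this report.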
Comment 6 Ric Wheeler 2012-07-10 14:25:53 EDT
Hi Mark,

nfs-maint@redhat.com is an alias for developers.

What you need to do is to contact Red Hat Support and work with them to debug your configuration if you are a customer. Our support team is *really* good at this.

Red Hat developers (who are on this bz and behind that email alias) work with support once they gather the data, debug common issues, etc.

Please open up a formal support ticket if you have a Red Hat subscription.

Thanks!
Comment 7 Ric Wheeler 2012-07-11 11:19:44 EDT
What you might be seeing is the impact of RHEL6 enabling write barriers - that makes your data safe in the face of a power outage.

That would explain the performance difference (your RHEL5 config is fast, but not power failure safe).

Again, you need to work this with Red Hat support, not developers.
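One way to test the write-barrier theory is to look at the mount options of the filesystem backing the export. The sketch below is an assumption-laden illustration: barriers_disabled is a hypothetical helper, and it only detects barriers that were switched off explicitly with the ext3/ext4 options "nobarrier" or "barrier=0". If neither option is listed, the filesystem's default applies, and that default depends on the release.

```shell
# barriers_disabled is a hypothetical helper. It succeeds (exit status 0)
# when a comma-separated mount-option string explicitly turns write barriers
# off, and fails (exit status 1) otherwise.
barriers_disabled() {
    case ",$1," in
        *,nobarrier,*|*,barrier=0,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: fetch the options for the filesystem under the export from
# /proc/mounts (fields: device, mount point, type, options, dump, pass).
# The mount point /scratch is an assumption about the reporter's layout.
#   opts=$(awk '$2 == "/scratch" { print $4 }' /proc/mounts)
#   if barriers_disabled "$opts"; then echo "barriers off"; fi
```

As an experiment only, remounting the backing filesystem with "mount -o remount,barrier=0 <mountpoint>" and repeating the timing would show how much of the gap barriers account for. Running that way is not safe against power loss.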
Comment 8 mark roth 2012-07-11 11:47:57 EDT
What I did:

mkdir /scratch/foo /mnt/foo
vi /etc/exports
/scratch/foo <thisserver>(rw,sync,no_wdelay)
/etc/nfsmount - all defaults, 
# RPCGSS security flavors
# [none, sys, krb5, krb5i, krb5p ]
# Sec=sys
so no Kerberos
service nfs start
mount <thisserver>:/scratch/foo /mnt/foo

test 0:
cd /tmp
time unpack file.tar.gz

test 1:
cd /mnt/foo
time unpack file.tar.gz

Time for test 0 is approx 1.5 min
Time for test 1 is approx 7.5 min

Please identify any additional information required

       mark
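The mount command above does not name an NFS version, so one piece of additional information worth recording is which version the client actually negotiated. A minimal sketch, assuming the option string comes from the fourth field of the mount's line in /proc/mounts (or from nfsstat -m); nfs_vers is a hypothetical helper name.

```shell
# nfs_vers is a hypothetical helper: print the value of the "vers=" option
# from a comma-separated mount-option string, or nothing if it is absent.
nfs_vers() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^vers=//p'
}

# Example: options for the test mount point used in this report.
#   opts=$(awk '$2 == "/mnt/foo" { print $4 }' /proc/mounts)
#   echo "negotiated NFS version: $(nfs_vers "$opts")"
```

Recording this alongside each timing makes it clear whether a given run exercised NFSv3 or NFSv4.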
Comment 9 Ric Wheeler 2012-08-02 09:16:52 EDT
You seem to be comparing untarring something in /tmp to the performance over NFS.

It will certainly be much slower over NFS; exact speeds and ratios depend on what is under /tmp and what kind of NFS you use.

You still need to find out how to engage Red Hat support officially. If you can send me (rwheeler@redhat.com) details about your agency I will try to hook you up with the correct channel.

Thanks!
Comment 10 Jeff Layton 2013-07-26 08:53:18 EDT
No response on this for almost a year. I'm going to close it on the basis that the problem report was not an apples-to-apples comparison.
Comment 11 Jeff Layton 2013-07-26 09:32:58 EDT
Pasting in Mark's reply that he sent via email:

> No, nothing's changed. I AM comparing apples to apples. If my NFS export
> server is 5.x, w/ rw,sync, it's 6-7 times faster untarring, while in an
> NFS-mounted directory, than it is if the NFS export server is running 6.x
> with *exactly* the same parms in exports.
> 
> As far as I can tell, RH never even tested the bug I complained about.

These sorts of problems take a fair bit of effort to chase down. Performance problems come down to numbers and you need to ensure that as many variables as possible are taken out of the equation. It's also heavily dependent on things like the server involved, latency between client and server, etc...

Our engineering team relies heavily on our support organization to help us do that legwork. In comment #4 and comment #6, we asked whether you could open a case with our support group so they could help you do that. Almost one year later, and it appears that that was not done.

If you do wish to do that at this time, then please refer them to this bug and they can reopen it.
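To take run-to-run noise out of a comparison like this, each configuration can be timed several times and summarized. The following is a minimal sketch rather than an established procedure; summarize_runs is a hypothetical helper that reads one elapsed time per line.

```shell
# summarize_runs is a hypothetical helper: read one elapsed time (seconds)
# per line on stdin and print the run count, minimum, median and maximum.
summarize_runs() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            mid = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "runs=%d min=%s median=%s max=%s\n", NR, v[1], mid, v[NR]
        }'
}

# Example with made-up timings, for illustration only:
#   printf '%s\n' 92 88 95 90 91 | summarize_runs
```

Comparing medians from a RHEL 5 server and a RHEL 6 server, with the same client, archive, export options and backing disk, would give the kind of numbers this sort of report needs.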
