Bug 131011 (IT_41790) - poor NFS performance when nfs_uncached_io set due to excessive FSSTAT messages
Summary: poor NFS performance when nfs_uncached_io set due to excessive FSSTAT messages
Keywords:
Status: CLOSED ERRATA
Alias: IT_41790
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: fileutils
Version: 2.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tim Waugh
QA Contact: Mike McLean
URL:
Whiteboard:
Depends On:
Blocks: 123573
 
Reported: 2004-08-26 17:04 UTC by Neil Horman
Modified: 2007-11-30 22:06 UTC (History)
1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-12-13 21:35:54 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
patch to correct excessive FSTATFS NFS transactions (3.04 KB, patch)
2004-08-26 17:07 UTC, Neil Horman
fileutils-131011.patch (2.81 KB, patch)
2004-09-01 13:15 UTC, Tim Waugh


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2004:459 0 normal SHIPPED_LIVE Updated fileutils package 2004-12-13 05:00:00 UTC

Description Neil Horman 2004-08-26 17:04:08 UTC
Description of problem:
As described in Issue Tracker ticket 41790, the utilities in the
fileutils package call fstatfs for every block of data they transfer
between two filesystems, which is not always necessary.  This
results in a large number of FSSTAT messages being sent when one of
the filesystems happens to be NFS-based, leading to poor performance.

Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to reproduce:
As taken from the IT ticket...
===============
1. enable NFS O_DIRECT (add "options nfs nfs_uncached_io=1" into
/etc/modules.conf)
2. mount NFS filesystem
3. execute cp/dd command for NFS filesystem

Actual Result
===========
A lot of FSSTAT/FSINFO requests are generated, and cp/dd performance
goes down.

Expected Result
==============
Performance does not go down, or is improved.

Additional Info
============
This problem does not happen with fileutils-4.1-10.1 or earlier,
because those versions do not include the *O_DIRECT.patch.
O_DIRECT support in the dd/cp commands is provided by
fileutils-[dd_]O_DIRECT.patch.

I've tested copying 50Mbyte file on NFS filesystem to the local
filesystem with fileutils-4.1-10.1 and fileutils-4.1-10.4.
(I used RHEL3 as NFS server)
The following are the results for each case.
$ cp /exports/testfile /home/test/

a) fileutils-4.1-10.4
    real  0m10.380s
    user 0m0.050s
    sys  0m2.050s  
b) fileutils-4.1-10.1
    real 0m4.732s
    user 0m0.010s
    sys  0m1.230s

Comment 1 Neil Horman 2004-08-26 17:07:06 UTC
Created attachment 103126 [details]
patch to correct excessive FSTATFS NFS transactions

This is the version of the patch that was discussed/approved in the IT ticket. 
It reduces the number of fstatfs calls by caching the last call and only
updating it when the filesystem device changes.

Comment 2 Tim Waugh 2004-08-31 15:39:20 UTC
This patch seems to cause *more* fstatfs() calls in the non-O_DIRECT
case, for instance here:

   if (flags & O_DIRECT)
     {
       o_direct_opened = 1;
-      fstatfs (desc, &fs);
     }
 
+  if(buf.st_dev != cached_dev)
+    {
+      fstatfs(desc, &fs);
+      cached_dev = buf.st_dev;
+    }

In the original, fstatfs() would not be called at all unless O_DIRECT
was specified; in the new, fstatfs() is called unconditionally for
each fs touched.

Am I reading that right?

Comment 3 Neil Horman 2004-08-31 16:43:48 UTC
More or less.  You are correct that with the patch we wind up calling
fstatfs once for every filesystem we traverse in a copy operation.
However, if O_DIRECT is specified, then without the patch fstatfs gets
called for every block of data that is copied, since the affected
functions are called on a per-block basis.  Under NFS this is
unbearably slow (see Issue Tracker 41790 for specific stats).  So this
patch trades making the call once for every block of data copied (when
O_DIRECT is specified) for once for every filesystem traversed.  We
could probably remove the additional calls in the non-direct case as
well by adding a conditional such as:
+  if((buf.st_dev != cached_dev) && (flags & O_DIRECT))
+    {
+      fstatfs(desc, &fs);
+      cached_dev = buf.st_dev;
+    }

It's trading an extra comparison for the occasional extra fstatfs
call.  I'll happily add it if you like.


Comment 4 Tim Waugh 2004-08-31 16:59:18 UTC
Well, since the comparison is already done in the previous block,
couldn't we move that whole cached_dev = buf.st_dev block into the
previous block in place of the fstatfs() call?

Comment 5 Neil Horman 2004-08-31 17:25:23 UTC
Honestly I don't know that the few extra calls to fstatfs are worth
keeping the O_DIRECT check, but if you prefer to hold on to it I think
moving the cached_dev check inside the O_DIRECT check would be just as
good as what we have here.

Comment 6 Neil Horman 2004-08-31 21:39:01 UTC
Wait, scratch that last comment.  I wasn't looking at the code when I
said that.  The patch needs to be just the way it is attached to this
bugzilla.  If you look at the package, the return data structure from
fstatfs gets used in this line:
if ((fs.f_type == OCFS) && (o_direct_opened == 1))

fs.f_type is being tested regardless of whether or not O_DIRECT is
specified, so the patch as it currently stands fixes the improper use
of garbage data as well.

I knew there was a reason I did that... :)

Comment 7 Tim Waugh 2004-09-01 09:00:54 UTC
Aha, okay.  I think that if those two tests are switched around, i.e.:

if ((o_direct_opened == 1) && (fs.f_type == OCFS))

then fs won't be touched unless initialised, even if you make the
change I suggested in comment #4 -- do you agree?

Just trying to make sure I understand the patch!
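The reordering in comment #7 works because C's && operator short-circuits: the right operand is evaluated only when the left one is true. A minimal sketch (hypothetical names, not the fileutils code, with a counter standing in for the fs.f_type access):

```c
#include <assert.h>

/* Sketch of short-circuit evaluation: when the left operand of &&
   is false, the right operand is never evaluated.  Here right_side()
   stands in for the fs.f_type test, and the flag records whether it
   actually ran. */
static int right_side_ran;

static int
right_side (void)
{
  right_side_ran = 1;
  return 1;
}

static int
guarded_test (int o_direct_opened)
{
  right_side_ran = 0;
  /* With the guard first, the right side runs only under O_DIRECT. */
  return o_direct_opened == 1 && right_side ();
}
```

With the guard first, the (possibly uninitialised) struct is never read in the non-O_DIRECT path.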


Comment 8 Neil Horman 2004-09-01 11:15:30 UTC
Agreed, I think adding the change in #7 to the change in #4 would work
just fine.

Comment 9 Tim Waugh 2004-09-01 13:15:18 UTC
Created attachment 103337 [details]
fileutils-131011.patch

Similarly for the other file, and in fact the fstat() can be moved inside the
test for O_DIRECT as well.

Here is the patch I'm about to go ahead with.  Does it look okay?

Comment 10 Neil Horman 2004-09-01 13:31:54 UTC
Yeah, that looks good to me.  I suppose if you wanted to, you could
replace this:
if ((o_direct_opened == 1) && (fs.f_type == OCFS))

with this:
if ((flags & O_DIRECT) && (fs.f_type == OCFS))

And then remove the o_direct_opened variable to save yourself 4 bytes
on the stack in both full_write and safe_read, but thats getting
pretty pedantic.  I'd just go with what you have right now.  Thanks!

Comment 11 Tim Waugh 2004-10-07 10:40:00 UTC
Tests are showing that this is actually about 24% *slower* over NFS
than before the change. :-(

Comment 12 Tim Waugh 2004-10-07 11:18:37 UTC
Hmm, how about if we make this change?  This seems to have been a typo
in the original patch:

--- fileutils-4.1/lib/safe-read.c       2004-09-01 14:07:58.371301840
+0100
+++ fileutils-4.1/lib/safe-read.c       2004-09-01 14:07:58.371301840
+0100
@@ -90,7 +90,7 @@
       fstat(desc,&buf);
       if (cached_dev != buf.st_dev)
        {
-         statfs(desc,&fs);
+         fstatfs(desc,&fs);
          cached_dev = buf.st_dev;
        }
     }


Comment 13 Neil Horman 2004-10-07 11:27:11 UTC
24% slower?  That doesn't make any sense to me.  The only thing this
patch should be doing is reducing the number of calls to fstatfs.  Do
you by any chance have comparative tcpdumps showing the traffic
differences between the systems with and without the patch?

As for the change you have in your last comment, it certainly looks
sensible to me.  In fact, I'm sorry I missed it before, as I'm not sure
how this would have worked using statfs at all (since desc is an int
rather than a string).  Do you think that this had something to do
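The distinction behind the typo can be sketched as follows; these helper names are invented for illustration. statfs() names the filesystem by pathname, fstatfs() by an already-open descriptor, so passing an int where statfs() expects a path cannot work. Both fill the same struct statfs and should agree on f_type for the same object:

```c
#include <fcntl.h>
#include <sys/statfs.h>   /* Linux-specific statfs()/fstatfs() */
#include <unistd.h>

/* statfs() takes a pathname; fstatfs() takes an open descriptor.
   Both fill a struct statfs, so they should report the same
   filesystem type for the same underlying object. */
static long
fs_type_by_path (const char *path)
{
  struct statfs fs;
  return statfs (path, &fs) == 0 ? (long) fs.f_type : -1L;
}

static long
fs_type_by_fd (int fd)
{
  struct statfs fs;
  return fstatfs (fd, &fs) == 0 ? (long) fs.f_type : -1L;
}
```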

Comment 14 Tim Waugh 2004-10-07 11:35:21 UTC
I don't have a test set-up here, and have been unable to obtain one. 
Brock Organ is performing testing on this package.

I will build a new package with this fix (since I think it must be
needed anyway).

Comment 15 Tim Waugh 2004-10-07 11:40:23 UTC
Do you happen to have a test set-up there?  I can give you a pointer
to a package to test.

Comment 16 Neil Horman 2004-10-07 12:46:17 UTC
I agree, that change is definitely needed.

As for the testing, I'll wander over and talk to Brock.  Perhaps I can
get some tcpdumps off his environment directly.  Go ahead and send me
a pointer to the test package, and I'll also test on my boxes here.
Thanks!


Comment 23 John Flanagan 2004-12-13 21:35:54 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-459.html


