Description of problem:
As described in issue tracker ticket 41790, the utilities in the fileutils package call fstatfs for every block of data they transfer between two filesystems, which is not always necessary. This results in a large number of FSSTAT messages being sent when one of the filesystems happens to be NFS based, leading to poor performance.

Version-Release number of selected component (if applicable):

How reproducible:
always

Steps to Reproduce (as taken from the IT ticket):
1. Enable NFS O_DIRECT (add "options nfs nfs_uncached_io=1" to /etc/modules.conf).
2. Mount an NFS filesystem.
3. Execute a cp/dd command against the NFS filesystem.

Actual results:
A lot of FSSTAT/FSINFO requests are generated, and cp/dd performance goes down.

Expected results:
Performance does not go down, or is improved.

Additional info:
This problem does not happen with fileutils-4.1-10.1 or earlier, because those versions do not include the *O_DIRECT.patch changes. (O_DIRECT support in the dd/cp commands is added by fileutils-[dd_]O_DIRECT.patch.)

I've tested copying a 50 Mbyte file on an NFS filesystem to the local filesystem with fileutils-4.1-10.1 and fileutils-4.1-10.4. (I used RHEL3 as the NFS server.) Following is the result for each case.

$ cp /exports/testfile /home/test/

a) fileutils-4.1-10.4
real    0m10.380s
user    0m0.050s
sys     0m2.050s

b) fileutils-4.1-10.1
real    0m4.732s
user    0m0.010s
sys     0m1.230s
Created attachment 103126 [details]
patch to correct excessive FSTATFS NFS transactions

This is the version of the patch that was discussed/approved in the IT ticket. It reduces the number of fstatfs calls by caching the result of the last call and updating it only when the filesystem device changes.
This patch seems to cause *more* fstatfs() calls in the non-O_DIRECT case, for instance here:

      if (flags & O_DIRECT)
        {
          o_direct_opened = 1;
-         fstatfs (desc, &fs);
        }
+     if(buf.st_dev != cached_dev)
+       {
+         fstatfs(desc, &fs);
+         cached_dev = buf.st_dev;
+       }

In the original, fstatfs() would not be called at all unless O_DIRECT was specified; in the new version, fstatfs() is called unconditionally for each filesystem touched. Am I reading that right?
More or less. You are correct that with the patch we wind up calling fstatfs once for every filesystem we traverse in a copy operation. However, if O_DIRECT is specified, then fstatfs gets called for every block of data that is copied, since the affected functions are called on a per-block basis. Under NFS this is unbearably slow (see issue tracker 41790 for specific stats). So this patch trades one fstatfs call per block of data copied (when O_DIRECT is specified) for one call per filesystem traversed.

We could probably remove the additional calls in the non-direct case as well by adding a conditional such as:

+     if((buf.st_dev != cached_dev) && (flags & O_DIRECT))
+       {
+         fstatfs(desc, &fs);
+         cached_dev = buf.st_dev;
+       }

It's trading an extra comparison for the occasional extra fstatfs call. I'll happily add it if you like.
Well, since the comparison is already done in the previous block, couldn't we move that whole cached_dev = buf.st_dev block into the previous block in place of the fstatfs() call?
Honestly, I don't know that the few extra calls to fstatfs are worth keeping the O_DIRECT check, but if you prefer to hold on to it, I think moving the cached_dev check inside the O_DIRECT check would be just as good as what we have here.
Wait, scratch that last comment; I wasn't looking at the code when I said that. The patch needs to be just the way it is attached to this bugzilla. If you look at the package, the data structure returned from fstatfs gets used in this line:

    if ((fs.f_type == OCFS) && (o_direct_opened == 1))

fs.f_type is being tested regardless of whether or not O_DIRECT is specified, so the patch as it currently stands fixes the improper use of garbage data as well. I knew there was a reason I did that... :)
Aha, okay. I think that if those two tests are switched around, i.e.:

    if ((o_direct_opened == 1) && (fs.f_type == OCFS))

then fs won't be touched unless initialised, even if you make the change I suggested in comment #4 -- do you agree? Just trying to make sure I understand the patch!
Agreed, I think adding the change in #7 to the change in #4 would work just fine.
Created attachment 103337 [details] fileutils-131011.patch Similarly for the other file, and in fact the fstat() can be moved inside the test for O_DIRECT as well. Here is the patch I'm about to go ahead with. Does it look okay?
Yeah, that looks good to me. I suppose if you wanted to, you could replace this:

    if ((o_direct_opened == 1) && (fs.f_type == OCFS))

with this:

    if ((flags & O_DIRECT) && (fs.f_type == OCFS))

and then remove the o_direct_opened variable to save yourself 4 bytes on the stack in both full_write and safe_read, but that's getting pretty pedantic. I'd just go with what you have right now. Thanks!
Tests are showing that this is actually about 24% *slower* over NFS than before the change. :-(
Hmm, how about if we make this change? This seems to have been a typo in the original patch:

--- fileutils-4.1/lib/safe-read.c	2004-09-01 14:07:58.371301840 +0100
+++ fileutils-4.1/lib/safe-read.c	2004-09-01 14:07:58.371301840 +0100
@@ -90,7 +90,7 @@
       fstat(desc,&buf);
       if (cached_dev != buf.st_dev)
       {
-        statfs(desc,&fs);
+        fstatfs(desc,&fs);
         cached_dev = buf.st_dev;
       }
     }
24% slower? That doesn't make any sense to me. The only thing this patch should be doing is reducing the number of calls to fstatfs. Do you by any chance have comparative tcpdumps showing the traffic differences before and after the patch?

As for the change in your last comment, it certainly looks sensible to me. In fact, I'm sorry I missed it before, as I'm not sure how this would have worked using statfs at all (since desc is an int rather than a string). Do you think this had something to do with your performance difference?
I don't have a test set-up here, and have been unable to obtain one. Brock Organ is performing testing on this package. I will build a new package with this fix (since I think it must be needed anyway).
Do you happen to have a test set-up there? I can give you a pointer to a package to test.
I agree, that change is definitely needed. As for the testing, I'll wander over and talk to Brock; perhaps I can get some tcpdumps off his environment directly. Go ahead and send me a pointer to the test package, and I'll also test on my boxes here. Thanks!
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-459.html