Red Hat Bugzilla – Bug 435443
Add patch to prevent NFS cache invalidation after write calls
Last modified: 2008-09-22 10:53:39 EDT
During a discussion on the phone the other day, it was mentioned that we should
not be trying to invalidate the cache after a write call completes.
With a cursory look, I think we'll likely need this patch backported for RHEL4
(commitid is from Trond's nfs-2.6 git tree):
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Sun Sep 30 15:21:24 2007 -0400
NFS: Fake up 'wcc' attributes to prevent cache invalidation after write
NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls.
In NFSv3, returning wcc data is optional. In all cases, we want to prevent
the client from invalidating our cached data whenever ->write_done()
attempts to update the inode attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
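For anyone following along, here is a rough user-space sketch of what the commit message describes. It is not the kernel code. The names (Attrs, WriteReply, CachedInode, update_after_write) are made up for illustration, and the model assumes the client drops its cached data whenever a WRITE reply carries no pre-op attributes, or carries ones that differ from what it has cached.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attrs:
    mtime: int
    size: int


@dataclass
class WriteReply:
    post: Attrs                    # attributes after the WRITE
    pre: Optional[Attrs] = None    # wcc pre-op attributes; may be absent


@dataclass
class CachedInode:
    attrs: Attrs
    data_valid: bool = True

    def update_after_write(self, reply: WriteReply, fake_wcc: bool) -> None:
        pre = reply.pre
        if pre is None and fake_wcc:
            # Fake up the pre-op attributes from what is already cached.
            pre = Attrs(self.attrs.mtime, self.attrs.size)
        if pre is None or pre != self.attrs:
            # Cannot show the change came only from our own write:
            # treat the cached data as stale.
            self.data_valid = False
        # Either way, adopt the post-op attributes from the reply.
        self.attrs = reply.post


def demo(fake_wcc: bool) -> bool:
    """Return True if cached data survives a WRITE reply without wcc data."""
    inode = CachedInode(Attrs(mtime=100, size=4096))
    reply = WriteReply(post=Attrs(mtime=101, size=8192))
    inode.update_after_write(reply, fake_wcc)
    return inode.data_valid
```

In this toy model, demo(False) loses the cached data after the write and demo(True) keeps it, which is the behaviour change the patch is after.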
We'll probably need to come up with some sort of reproducer for this so that we
can quantify whether this actually helps things.
For RHEL5, it looks like the above patch went in as part of Peter's big
performance update in bug 321111.
I suppose we can start by trying this test. Presumably the first read should be
as fast as the "reread" if the patch is working correctly. Sniffing traffic
should also tell us (and maybe looking at nfsstat -c).
# /opt/iozone/bin/iozone -azc -f /mnt/test/testfile -s 64k -r 64 -i 0 -i 1
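If iozone isn't handy, the same write/read/reread comparison can be roughed out in a few lines of Python. This is only a sketch: the function names are made up, the path is whatever file you point it at on the NFS mount, and with a 64k file the timings will be noisy, so repeat the run and cross-check against a sniffer trace or nfsstat -c rather than trusting a single number.

```python
import os
import time
from typing import Tuple


def time_read(path: str, record_size: int = 64 * 1024) -> float:
    """Read the whole file in record_size chunks; return elapsed seconds."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(record_size):
            pass
    return time.perf_counter() - start


def write_then_read_twice(path: str, size: int = 64 * 1024) -> Tuple[float, float]:
    """Write `size` bytes to path, then time a first read and a reread."""
    with open(path, "wb") as f:
        f.write(os.urandom(size))
        f.flush()
        os.fsync(f.fileno())   # push the data out before the file is closed
    first = time_read(path)
    reread = time_read(path)
    return first, reread
```

Calling write_then_read_twice("/mnt/test/testfile") on the NFS mount gives the two timings. If the cache survives the write, the first read should land close to the reread; a first read that is consistently much slower suggests the data was fetched from the server again.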
Created attachment 296373
patch1 -- NFSv4: Add post-op attributes to NFSv4 write and commit callbacks.
Created attachment 296374
patch2 -- NFS: Clean up inode metadata updates
Created attachment 296376
patch3 -- NFS: Fake up 'wcc' attributes to prevent cache invalidation after write
I've attached a set of 3 experimental patches to implement what Peter suggested
the other day in our meeting. This seems to correct the iozone slowdowns in the
testing I've done.
I'm building a new set of test kernels now and will post them on my people page
once they're done...
I've put this on 4.8 proposed, but I'm not opposed to considering this for 4.7
if it's deemed important enough. This of course presumes no regressions show up
That said, it seems like this might cause a few more problems with cache
client writes to file and has up to date cache
client writes to file again and doesn't invalidate cache since we've faked up the wcc preattrs
local timestamp is set to mtime of file
server races in with a write from local process or another client within the
client doesn't realize the file has changed
...before this patchset, the client would have probably invalidated the cache
after the second write. Given the other possible races due to coarse mtime
granularity, this probably isn't a huge issue but it's something we should keep in mind.
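To make the race concrete, here is a toy simulation of it. It is a sketch under stated assumptions, not a model of the real client: Server, Client and run are hypothetical names, mtime is assumed to have one-second granularity, and the client is assumed to revalidate its cache by comparing mtime only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    data: bytes = b""
    mtime: int = 0   # whole seconds: the assumed coarse granularity

    def write(self, data: bytes, now: float) -> int:
        self.data = data
        self.mtime = int(now)   # sub-second changes collapse to one value
        return self.mtime


@dataclass
class Client:
    cached_data: Optional[bytes] = None
    cached_mtime: int = 0

    def write(self, server: Server, data: bytes, now: float, fake_wcc: bool) -> None:
        post_mtime = server.write(data, now)
        # With faked wcc the client keeps its cache; without it, it drops it.
        self.cached_data = data if fake_wcc else None
        self.cached_mtime = post_mtime

    def read(self, server: Server) -> bytes:
        # Revalidate by mtime only; refetch on mismatch or empty cache.
        if self.cached_data is None or server.mtime != self.cached_mtime:
            self.cached_data = server.data
            self.cached_mtime = server.mtime
        return self.cached_data


def run(fake_wcc: bool) -> bool:
    """Return True if the client ends up reading stale data."""
    server, client = Server(), Client()
    client.write(server, b"v1", now=10.1, fake_wcc=fake_wcc)
    client.write(server, b"v2", now=10.2, fake_wcc=fake_wcc)
    server.write(b"other", now=10.3)   # another writer, same mtime second
    return client.read(server) != server.data
```

In this model run(True) ends with the client serving stale data, while run(False) does not, because the cache was already dropped after the client's own write. That matches the "probably invalidated" behaviour described above, and the window only exists while the foreign write lands in the same mtime second.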
I've put some patches up on my people page with this patchset:
Would it be possible for your customer to test
kernel-2.6.9-68.16.EL.jtltest.31 someplace non-critical and let us know if this
helps them at all?
Note that this kernel *also* contains a patch to fix a lockd race that can
seriously affect performance as well, so if it does help we'll still likely need
to have them verify whether it's this patch that actually helps...
Created attachment 296811
patch1 -- NFSv4: Add post-op attributes to NFSv4 write and commit callbacks
Fixed patch1 -- the original one had a bad merge of nops changes for write and
commit. So much of this file looks alike that it confuses the patch program and
it ends up merging changes into the wrong place.
Building a new test kernel now. The problem should only have affected NFSv4, so
it's doubtful this will make any difference on NFSv2/3.
Updating PM score.
*** This bug has been marked as a duplicate of bug 427385 ***