Red Hat Bugzilla – Bug 113905
NFS doesn't honor nsec timestamp values
Last modified: 2007-11-30 17:07:00 EST
Description of problem:
RHEL nfs clients don't honor nsec portion of file timestamp fields.
RHEL nfs client can miss file updates, and remain out-of-sync.
Version-Release number of selected component (if applicable):
(also reported under RHL 8, likely present in AS2.1)
Customer reports that it's load/timing related.
Will post a test case if/when available.
A Windows client updates a file via multiple write operations in under 1
second on a NetApp filer (file size remains constant), while the RHEL NFS
client reads the same file.
(Yes, it has been observed that file locking would correct this
problem. The issue is that once out of sync, the linux nfs client
*stays* out of sync.)
Steps to Reproduce:
1. Update file multiple times in under 1 second
2. Trigger linux nfs client read between write operations
Linux nfs client gets out of sync with the nfs server and stays out of
sync. Subsequently touching the file (from a third nfs client) brings
the rhel nfs client's cache back into sync.
Due to asynchronous read/write operations it is expected that the
linux nfs client will occasionally be out of sync for a short period.
It should not remain out of sync, though.
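To illustrate the failure mode described above, here is a minimal user-space sketch in C. It is not the kernel code. It assumes the client decides whether its cached data is stale by comparing a cached mtime against a freshly fetched one; the struct and function names are hypothetical and exist only for this illustration. With a seconds-only comparison, two writes that land in the same second and leave the file size unchanged look identical, which would match the behaviour reported here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a file timestamp with a sub-second part.
 * Not a kernel or NFS library type. */
struct sketch_time {
    uint32_t seconds;
    uint32_t nseconds;
};

/* Seconds-only check: cannot distinguish two updates made within
 * the same second. */
bool changed_sec_only(const struct sketch_time *cached,
                      const struct sketch_time *fresh)
{
    return cached->seconds != fresh->seconds;
}

/* Full check: also compares the nanosecond field, so a sub-second
 * update is detected. */
bool changed_with_nsec(const struct sketch_time *cached,
                       const struct sketch_time *fresh)
{
    return cached->seconds != fresh->seconds ||
           cached->nseconds != fresh->nseconds;
}
```

With a cached mtime of (1074589950 s, 100000000 ns) and a fresh mtime of (1074589950 s, 600000000 ns), the first check reports no change while the second reports one.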
Escalated per Riel's request.
This was originally reported in IT 108088
This shows the same symptoms as reported in BZ 108088.
The customer reports that it's reproducible with v2 as well as v3.
A first cut at honoring sub-second mtime updates. (Now if I could
reproduce the problem I could test this...)
--- fs/nfs/inode.c.orig 2004-01-19 17:00:41.000000000 -0800
+++ fs/nfs/inode.c 2004-01-20 00:52:30.000000000 -0800
@@ -1101,6 +1101,8 @@
/* Ugh... */
if (cdif == 0 && fattr->size > NFS_CACHE_ISIZE(inode))
+ if (fattr->mtime > NFS_CACHE_MTIME(inode))
+ goto out_valid;
Does this increase the overall operation count when
running something like the connectathon test suite?
If so, by what percentage?
Well, the patch in comment #3 has no effect on the traffic
for the simple reason that it's never executed, at least when running
the connectathon test suite. So it's not clear to me that this patch
will help at all with the client seeing changes on the server on
a more timely basis.
The main issue is that a file is being updated on the server
and the clients are not noticing it on a timely basis. Looking
over all that has been said, I don't see where changing the
default cache timeouts (i.e. acregmax, acdirmax or actimeo)
has been tried to solve this problem. The defaults
range anywhere from 60 to 30 seconds; knocking them down
to, say, 30 or 20 seconds would help the issue....
*** Bug 156307 has been marked as a duplicate of this bug. ***
Undid dup from bug 156307, since that was against RHEL2.1.
Also removed this one from U5 blocker list, since U5 is now closed.