113905 – NFS doesn't honor nsec timestamp values

Summary: NFS doesn't honor nsec timestamp values

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Steve Dickson
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-01-20 00:40 UTC by Don Howard
Modified:	2007-11-30 22:07 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2005-09-06 12:40:58 UTC
Target Upstream Version:
Embargoed:

Description Don Howard 2004-01-20 00:40:03 UTC

Description of problem:

RHEL nfs clients don't honor the nsec portion of file timestamp fields.
As a result, a RHEL nfs client can miss file updates and remain out-of-sync.


Version-Release number of selected component (if applicable):

RHEL 3 
(also reported under RHL 8; likely present in AS 2.1)

How reproducible:

Customer reports that it's load/timing related.
Will post a test case if/when available.

A Windows client updates a file via multiple write operations in under 1
second on a netapp filer (filesize remains constant), while a rhel nfs
client reads the same file.

(Yes, it has been observed that file locking would correct this
problem.  The issue is that once out of sync, the linux nfs client
*stays* out of sync.)
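
To illustrate why sub-second updates can go unnoticed, here is a minimal sketch of the two comparisons. It is not the kernel's code: the helper names changed_sec_only and changed_sec_nsec are hypothetical, and it assumes the cached and server timestamps are available as struct timespec values.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: the check a seconds-only cache effectively makes.
 * Two writes inside the same second look identical to it. */
bool changed_sec_only(const struct timespec *cached,
                      const struct timespec *server)
{
    return server->tv_sec != cached->tv_sec;
}

/* Hypothetical helper: also compares the nanosecond field, so a second
 * write within the same second is still seen as a change. */
bool changed_sec_nsec(const struct timespec *cached,
                      const struct timespec *server)
{
    return server->tv_sec != cached->tv_sec ||
           server->tv_nsec != cached->tv_nsec;
}
```

With a constant file size, the mtime is the main signal left to the client, so if the last write lands in the same second as the one the client cached, a seconds-only check has nothing to trigger on until something else (such as a later touch) moves the seconds field.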



Steps to Reproduce:
1. Update file multiple times in under 1 second
2. Trigger a linux nfs client read between write operations
  
Actual results:

Linux nfs client gets out of sync with the nfs server and stays out of
sync.  Subsequently touching the file (from a third nfs client) brings
the rhel nfs client's cache back into sync.

Expected results:

Due to asynchronous read/write operations it is expected that the 
linux nfs client will occasionally be out of sync for a short period.

It should not remain out of sync, though.



Additional info:

Escalated per Riel's request.

Comment 1 Don Howard 2004-01-20 00:41:05 UTC

This was originally reported in IT 108088

Comment 2 Don Howard 2004-01-20 00:44:48 UTC

This shows the same symptoms as reported in BZ 108088.  
The customer reports that it's reproducible with v2 as well as v3.

Comment 3 Don Howard 2004-01-21 02:10:07 UTC

A first cut at honoring sub-second mtime updates.  (Now if I could 
reproduce the problem I could test this...) 
 
--- fs/nfs/inode.c.orig 2004-01-19 17:00:41.000000000 -0800 
+++ fs/nfs/inode.c      2004-01-20 00:52:30.000000000 -0800 
@@ -1101,6 +1101,8 @@ 
        /* Ugh... */ 
        if (cdif == 0 && fattr->size > NFS_CACHE_ISIZE(inode)) 
                goto out_valid; 
+       if (fattr->mtime > NFS_CACHE_MTIME(inode)) 
+               goto out_valid; 
        return -1; 
  out_valid: 
        return 0;
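
The patch above compares the two mtime values with a single ">", which only picks up sub-second changes if both sides are scalars that carry the nsec part. The following is a sketch under that assumption, with seconds in the upper 32 bits and nanoseconds in the lower 32 bits of a 64-bit value. Whether fattr->mtime and NFS_CACHE_MTIME() use this layout in the RHEL 3 kernel is not confirmed here, and pack_nfs_time and packed_secs are hypothetical names.

```c
#include <stdint.h>

/* Assumed encoding: seconds in the upper 32 bits, nanoseconds in the
 * lower 32 bits. With this layout a plain integer comparison orders
 * timestamps by seconds first, then by nanoseconds. */
uint64_t pack_nfs_time(uint32_t sec, uint32_t nsec)
{
    return ((uint64_t)sec << 32) | nsec;
}

/* Seconds-only view of a packed value; the nsec part is discarded. */
uint32_t packed_secs(uint64_t t)
{
    return (uint32_t)(t >> 32);
}
```

Under this assumption, two updates in the same second give equal packed_secs() results but different packed values, so the added "fattr->mtime > NFS_CACHE_MTIME(inode)" test would distinguish them.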

Comment 4 Steve Dickson 2004-11-12 11:16:40 UTC

Does this increase the overall operation count when
running something like the connectathon test suite?
If so, by what percentage?

Comment 9 Steve Dickson 2004-12-20 20:42:53 UTC

Well, the patch in comment #3 has no effect on the traffic
for the simple reason that it's never executed, at least when running
the connectathon test suite. So it's not clear to me that this patch
will help at all with the client seeing changes on the server on
a more timely basis.

The main issue is that a file is being updated on the server
and the clients are not noticing it in a timely manner. Looking
over all that has been said, I don't see where changing the
default cache timeouts (i.e. acregmax, acdirmax or actimeo)
has been tried to solve this problem.  The defaults
range anywhere from 60 to 30 seconds; knocking them down
to, say, 30 or 20 seconds would help the issue....
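
As a rough illustration of what lowering these timeouts changes, here is a simplified model of an attribute cache entry. It is an assumption-laden sketch, not the client's implementation: struct attr_cache and both functions are hypothetical, times are whole seconds, and the "double the window while nothing changes" policy is assumed for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of one cached attribute entry. */
struct attr_cache {
    long fetched_at;  /* time (s) the attributes were last fetched */
    long timeo;       /* current validity window (s) */
    long timeo_min;   /* lower bound, in the role of acregmin */
    long timeo_max;   /* upper bound, in the role of acregmax */
};

/* Cached attributes are trusted until the window has elapsed. */
bool attrs_expired(const struct attr_cache *c, long now)
{
    return now - c->fetched_at >= c->timeo;
}

/* After a refetch: grow the window if nothing changed (capped at the
 * maximum), otherwise reset it to the minimum. */
void attrs_refreshed(struct attr_cache *c, long now, bool changed)
{
    if (changed) {
        c->timeo = c->timeo_min;
    } else {
        c->timeo *= 2;
        if (c->timeo > c->timeo_max)
            c->timeo = c->timeo_max;
    }
    c->fetched_at = now;
}
```

In this model a smaller timeo_max bounds how long the client goes between attribute refetches. It only shortens the stale period if the refetch then recognizes the change, which is where the sub-second mtime comparison comes back in.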

Comment 21 Don Howard 2005-04-29 17:20:39 UTC

*** Bug 156307 has been marked as a duplicate of this bug. ***

Comment 22 Ernie Petrides 2005-04-29 19:10:46 UTC

Undid dup from bug 156307, since that was against RHEL2.1.

Also removed this one from U5 blocker list, since U5 is now closed.
