Bug 135688 - NFS ESTALES returned on open [IT50092]
Summary: NFS ESTALES returned on open [IT50092]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-10-14 13:57 UTC by Neil Horman
Modified: 2007-11-30 22:07 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-18 13:28:16 UTC
Target Upstream Version:
Embargoed:


Attachments
patch to retry fs operations that result in ESTALE errors (1.41 KB, patch)
2004-10-14 13:59 UTC, Neil Horman


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2005:294 0 normal SHIPPED_LIVE Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 5 2005-05-18 04:00:00 UTC

Description Neil Horman 2004-10-14 13:57:12 UTC
Description of problem:
If a file is modified on an NFS server that exports the file's
directory to NFS clients, the clients may receive numerous ESTALE
errors when they access that file through cached file handles.
The number of ESTALE errors received and reported to user space appears
to be related to how many directories above that file may have been removed.

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1) NFS mount a share from a server on a client.  Let's say we mount
server:/tmp on the client at /mnt/tmp. You may want to mount it with a
large timeo value to ensure that you get ESTALE on open from the cached
file handles.

2) on the server in /tmp, create the following file:
a/b/c

3) tar the contents of directory a recursively, preserving timestamps
on the file.  I use the command:
 tar -c --file ./test.tar a
run from the /tmp directory on the server.

4) on the client run:
cd /mnt/tmp
cat a/b/c
This should get the file handle for c in the NFS cache.

5) on the server run:
rm -rf a
tar -x -v --file ./test.tar
This will recreate the file tree on the server and make the cached
file handles on the client stale.

6) on the client run:
cat a/b/c
If you are running without the patch that I posted, you will of course
get an ESTALE error.  If you capture the connection with tcpdump, you
will find three ESTALE errors returned: two in lookup responses and one
in a getattr response.
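
Step 6 can also be checked programmatically instead of reading the
output of cat. The following is a minimal sketch, assuming Python 3 is
available on the client; the helper name probe_read is hypothetical and
is not part of the attached patch. It reads the file once, with no
retry, and reports the symbolic errno name on failure, which would be
ESTALE for a stale file handle.

```python
import errno


def probe_read(path):
    """Read `path` once and return (data, None) or (None, errno_name).

    Hypothetical helper for illustration only. It performs no retry, so a
    stale NFS file handle surfaces to the caller as "ESTALE".
    """
    try:
        with open(path, "rb") as f:
            return f.read(), None
    except OSError as exc:
        # errno.errorcode maps numeric errno values to names like "ESTALE".
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
```

On the client from the steps above, this would be called as
probe_read("/mnt/tmp/a/b/c").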


Actual results:
ESTALE errors are returned to the calling userspace application.

Expected results:
Arguably, since ESTALE is largely a transient error on network file
systems, the operation should be silently retried.
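
The retry idea can be illustrated from user space. This is only a
sketch of the general technique, not the kernel patch attached to this
bug: the names retry_on_estale and read_file and the attempts and delay
values are assumptions made for illustration. Whether a user-space retry
succeeds depends on the client revalidating its cached handles on the
next lookup.

```python
import errno
import time


def retry_on_estale(operation, attempts=3, delay=0.0):
    """Call `operation()` and retry it while it fails with ESTALE.

    `attempts` and `delay` are illustrative values, not taken from the patch.
    Any other error, or ESTALE on the final attempt, is raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            if exc.errno != errno.ESTALE or attempt == attempts:
                raise
            time.sleep(delay)  # optional pause before the next attempt


def read_file(path):
    # Re-open by path on every attempt so that a fresh lookup is performed.
    with open(path, "rb") as f:
        return f.read()
```

With the mount from the reproduction steps, a caller would use
retry_on_estale(lambda: read_file("/mnt/tmp/a/b/c")).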

Additional info:

Comment 1 Neil Horman 2004-10-14 13:59:31 UTC
Created attachment 105200
patch to retry fs operations that result in ESTALE errors

Patch to prevent transient ESTALE errors from being reported to user space by
retrying them.

Comment 4 Ernie Petrides 2004-11-16 02:21:29 UTC
A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-25.1.EL).


Comment 5 Tim Powers 2005-05-18 13:28:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html


