Red Hat Bugzilla – Bug 130165
nfs mounts "disappear" randomly
Last modified: 2015-01-04 17:08:55 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7)
Description of problem:
NFS mounts from one Fedora Core 2 machine on another Fedora Core 2
machine all running the same versions of the respective software
"disappear" occasionally. Basically, the mount point disappears and
any access returns "no such file or directory" with no errors in the
logfiles on either machine.
Occasionally (but not always) there will be an "nfs_statfs: error = 2"
in the client logs.
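For reference, the number in that log message can be decoded as an errno value. This is a minimal Python sketch, assuming the kernel message reports an ordinary Linux errno; the helper name describe_errno is made up for illustration and is not part of any NFS tool.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for a numeric errno value."""
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# The value from the "nfs_statfs: error = 2" line.
print(describe_errno(2))
# The code behind the "Stale NFS file handle" variant seen on other clients.
print(describe_errno(errno.ESTALE))
```

On Linux, 2 is ENOENT, which matches the "no such file or directory" returned to processes on the client.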
Logging in to the NFS server and having a shell open in the affected
directory solves the problem on the client. This is with both NFSv2
and NFSv3 and any combinations of mount and export options I tried.
NFSv4 is not an option because of the bug where all user ids are
mapped to "root:bin". (There is an entry for this in Bugzilla, but I
could not find it just now).
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Mount a remote directory over NFS.
2. Log in and use it like normal.
3. Eventually, every access to the mount point will return "no such
file or directory"
Actual Results: Files that are present return "no such file or directory"
Expected Results: The files that exist on the NFS server should
appear on the NFS client.
What OS(es) are the NFS clients?
Both client and server are Fedora Core 2, both completely updated.
If the client is an updated FC1 machine, the error gets reported as
"Stale NFS File Handle" but the same workaround solves the problem.
If you use the touch command on the server in the directory that you
are trying to access via NFS (i.e. run touch . in the affected
directory), can the clients see it again?
Are you by any chance running a program on your server that might
reset the modification time on various directories of your server
(rsync with the -a option is an example of this)?
Yes, running touch on the affected directory allows clients to see it
again briefly. There are some scripts running which may reset
modification times, but they aren't running in the affected
directories. Additionally, the mtime is unchanged between touching
the directory and the next time it fails.
Unchanged modification times are commonly what lead to these problems.
If the contents of the directory were updated, but the modification
time was set to a previous value, the NFS client can be "fooled" into
thinking its cached file handles are still good. A similar problem
was fixed for RHEL3 in BZ 113636. What version of NFS are you using,
2, 3 or 4? I believe the RHEL3 problem did not occur if NFS v2 was used.
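The mtime effect described above can be reproduced on a local directory. This is only a sketch of the general idea, assuming a cache that revalidates by comparing directory mtime; it is not the actual NFS client logic, and the names looks_unchanged and demo are hypothetical.

```python
import os
import tempfile


def looks_unchanged(path, cached_mtime_ns):
    """Mimic a cache that trusts its entries while the directory mtime matches."""
    return os.stat(path).st_mtime_ns == cached_mtime_ns


def demo():
    with tempfile.TemporaryDirectory() as d:
        before = os.stat(d)
        cached = before.st_mtime_ns

        # Change the directory contents, which normally bumps its mtime.
        open(os.path.join(d, "new_file"), "w").close()

        # Put the old timestamps back, as rsync -a or a script might do.
        os.utime(d, ns=(before.st_atime_ns, before.st_mtime_ns))
        fooled = looks_unchanged(d, cached)

        # Stand-in for a later "touch .": move mtime forward explicitly
        # (10 seconds) so the result does not depend on clock granularity.
        os.utime(d, ns=(before.st_atime_ns, cached + 10 * 10**9))
        after_touch = looks_unchanged(d, cached)

        return fooled, after_touch


fooled, after_touch = demo()
print("stale view kept after mtime reset:", fooled)
print("stale view kept after touch:", after_touch)
```

In this sketch the mtime comparison misses the new file once the old timestamp is restored, and notices a change again after the directory is touched, which is consistent with touch being a temporary workaround.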
With kernel 2.6.5-1.358 (the original kernel) the file ownerships are
correct with NFSv4; any later kernel has the root:bin problem.
So I take it then that you were using NFSv3 when the stale file handle
issues came up?
Assuming the nfs(5) man page is correct that v2 is the default, then
it occurs with both NFSv2 and NFSv3. It happens both with nfsvers=3
and whatever the default is.
IIRC, v3 is actually the current default. Can you try explicitly
mounting with V2?
Even when explicitly mounting as V2 the problem persists; only the error
message changes to "Stale NFS file handle."
I may be having a similar problem. kernel 2.6.9-1.3_FC2smp, Nexsan
atabeast carved up with LVM2 and formatted with reiserfs, then shared
with nfs. Processes will start complaining about stale file handles;
generally the affected mounts can be umounted and remounted manually. In at
least one case a user did an 'ls' on a directory 2 times and received a
"stale" error, then on the third try the filesystem came back. Happens both to
manual mounts and autofs mounts. I'm thinking something screwy in
Not sure how to go about documenting this; it happens maybe once a
day. Logging? What is available?
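One way to capture an intermittent failure like this is to poll the mount point from the client and log each error with a timestamp. The following is a rough Python sketch under stated assumptions: the functions probe and monitor, the polling interval and the log format are all made up for illustration, and it records only what the client sees, not the cause.

```python
import errno
import os
import time
from datetime import datetime


def probe(path):
    """Return None if the path is accessible, else the errno name."""
    try:
        os.statvfs(path)   # filesystem statistics for the mount
        os.listdir(path)   # forces a directory lookup as well
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def monitor(path, interval=60.0, rounds=None, log=print):
    """Probe path every `interval` seconds and log failures.

    Runs forever when rounds is None. Returns the number of failed probes.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        err = probe(path)
        if err is not None:
            failures += 1
            log(f"{datetime.now().isoformat()} {path}: {err}")
        done += 1
        # Sleep only if another probe is still to come.
        if rounds is None or done < rounds:
            time.sleep(interval)
    return failures
```

For example, monitor("/mnt/data", interval=60) would check a mount once a minute, where /mnt/data is a placeholder path. Comparing the logged timestamps with cron jobs or other activity on the server may help narrow down what triggers the stale handles.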
Are you still having this problem with the 2.6.10 updates?
I'm going to tentatively say it has been fixed.
I haven't experienced any problems today ;-)
Given the random nature of the problem, I can't quite give an unreserved "this
bug is fixed", but so far, it appears to be.
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat. The Fedora legacy project will be producing further kernel
updates for security problems only.
If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.