Red Hat Bugzilla – Full Text Bug Listing
|Summary:||FC3T1: Mount Fails with "mount: Stale NFS file handle" message|
|Product:||[Fedora] Fedora||Reporter:||Thomas J. Baker <tjb>|
|Component:||nfs-utils||Assignee:||Steve Dickson <steved>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2004-10-12 15:53:42 EDT||Type:||---|
Description Thomas J. Baker 2004-07-14 16:33:10 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040510 Galeon/1.3.16

Description of problem:
I just installed FC3T1 and mounting an nfs directory from an FC2 server fails:

[root@doolittle /]# mkdir /xxx
[root@doolittle /]# mount wintermute:/home/tjb /xxx
mount: Stale NFS file handle
[root@doolittle /]#

Nothing is logged on the client but the FC2 server logs say this:

Jul 14 15:50:27 wintermute rpc.mountd: authenticated mount request from doolittle.sr.unh.edu:842 for /home/tjb (/home)

I left selinux on the default, which I believe is enforcing, but /var/log/messages isn't filled with avc messages about the failure, just the stale nfs handle messages.

The FC3T1 system can mount nfs directories from an RHEL3 machine and from a system running rawhide. (Along the same lines, the system running rawhide can mount some but not all of the FC2 system's exports. The ones that fail, fail with the Stale NFS file handle error message. This started happening right after I upgraded it from FC2 to rawhide.)

Version-Release number of selected component (if applicable):
kernel-2.6.7-1.478, nfs-utils-1.0.6-30

How reproducible:
Sometimes

Steps to Reproduce:
1. install fc3t1
2. try to mount an exported directory from an fc2 system
3.

Actual Results: Fails with stale nfs handle.

Expected Results: works.

Additional info:
Comment 1 Steve Dickson 2004-07-14 17:41:46 EDT
Is it possible to get an ethereal trace of the estale? Also, there were some recent mounting issues that seem to be cleared up in the 480 kernel... Does it help to boot with selinux=0?
Comment 2 Thomas J. Baker 2004-07-16 10:03:53 EDT
Created attachment 101965
ethereal dump

selinux=0 doesn't help, kernel 2.6.7-1.488 doesn't help. Here is the ethereal dump. I'll be trying the 492 kernel today.
Comment 3 Thomas J. Baker 2004-07-16 15:35:23 EDT
This looks like it could be due to a strange interaction with how the filesystem is exported from the server. If the FC3T1 host is not listed explicitly and instead is part of a netgroup, it fails with the Stale NFS handle. If it is listed explicitly, it works as expected. I'm trying to isolate it down to a specific case for easier debugging.
Comment 4 Thomas J. Baker 2004-07-20 14:29:16 EDT
It's related to how the file systems are exported. If the FC3T1 client is listed explicitly in the exports file for a given filesystem, nfs mounting seems to work fine. If the file system is exported with the client as part of a netgroup it gives the stale nfs handle problem. If I have the FC3T1 client explicitly listed, do an nfs mount, and then remove the explicit listing from the server exports, leaving the netgroup listing which should still match the client, exportfs again, the client then behaves like the server has disappeared completely:

nfs: server wintermute.sr.unh.edu not responding, still trying

It's really quite a strange bug.

Another oddity I've run into is that between two FC3T1 systems, wildcard matching in the export file doesn't work. If I have a file system exported to *.unh.edu, and try to mount it, it gives

mount doolittle:/space /xxx
mount: doolittle:/space failed, reason given by server: Permission denied
[tjb@katratzi tjb]# showmount -e doolittle
Export list for doolittle:
/home @rcc_linux
/space *.unh.edu
[tjb@katratzi tjb]#

I don't know if they're related or not.
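
For readers following the wildcard case, here is a minimal Python sketch of the two ways a pattern like *.unh.edu can be interpreted against a fully qualified client name such as katratzi.sr.unh.edu. The function name wildcard_matches and the star_crosses_dots parameter are hypothetical, chosen for illustration. Which interpretation rpc.mountd applies on these systems is not established in this thread, so this is not a diagnosis of the "Permission denied" above.

```python
import re


def wildcard_matches(pattern, host, star_crosses_dots):
    """Match host against a pattern containing * and ? wildcards.

    star_crosses_dots is an illustrative switch: when False, a wildcard
    stays within one dot-separated label of the host name.
    """
    any_char = "." if star_crosses_dots else "[^.]"
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(any_char + "*")
        elif ch == "?":
            parts.append(any_char)
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), host, re.IGNORECASE) is not None


# Compare both interpretations for the client named in this report.
for crosses in (True, False):
    print(crosses, wildcard_matches("*.unh.edu", "katratzi.sr.unh.edu", crosses))
```

If wildcards are allowed to span dots, the pattern covers hosts in any subdomain of unh.edu. If they are not, only hosts directly under unh.edu match, and a client in sr.unh.edu would need a pattern such as *.sr.unh.edu.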
Comment 5 Steve Dickson 2004-07-30 10:03:48 EDT
Yes.. I agree with the strangeness... The "Permission denied" could have to do with rpc.idmapd messing things up... make sure you have the latest nfs-utils (1.0.6-22 I think). WRT the estale, is there anything in /var/log/messages on why the server is failing the fsinfo? Also, can you post the exports tab that works and the one that does not work?
Comment 6 Thomas J. Baker 2004-08-02 14:07:00 EDT
Here is the exports file:

/home @rcc_linux(rw) \
 katratzi(rw,no_root_squash) \
 doolittle(rw,no_root_squash)
/space katratzi(rw,no_root_squash) \
 doolittle(rw,no_root_squash)
/space/tmp *.unh.edu(ro,insecure,all_squash)
/space/ftp/redhat *.unh.edu(ro,insecure,all_squash) \
 220.127.116.11/255.255.0.0(ro,insecure,all_squash)
/temp/music @rcc_linux(ro)
/temp/games @rcc_linux(ro)

Both katratzi and doolittle are members of yp netgroup file rcc_linux yet only exports that list them explicitly work. I've got the latest nfs utils (nfs-utils-1.0.6-30). The only thing the server logs is this:

Aug 2 14:05:28 wintermute rpc.mountd: authenticated mount request from doolittle.sr.unh.edu:683 for /space/ftp/redhat/rcc (/space/ftp/redhat)
Aug 2 14:05:28 wintermute rpc.mountd: authenticated mount request from doolittle.sr.unh.edu:690 for /space/ftp/redhat/rcc (/space/ftp/redhat)

The client logs this:

Aug 2 14:05:15 doolittle kernel: SELinux: initialized (dev 0:1a, type nfs), uses genfs_contexts
Aug 2 14:05:28 doolittle automount: >> mount: Stale NFS file handle
Aug 2 14:05:28 doolittle automount: mount(nfs): nfs: mount failure redhat-mirror:/space/ftp/redhat/rcc on /net/redhat/rcc
Aug 2 14:05:28 doolittle automount: failed to mount /net/redhat/rcc
Aug 2 14:05:28 doolittle automount: >> mount: Stale NFS file handle
Aug 2 14:05:28 doolittle automount: mount(nfs): nfs: mount failure redhat-mirror:/space/ftp/redhat/rcc on /net/redhat/rcc
Aug 2 14:05:28 doolittle automount: failed to mount /net/redhat/rcc

I'm now running the 2.6.7-1.499 kernel.
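
To see at a glance which exports name a client explicitly and which rely on a netgroup or wildcard, the entries quoted above can be broken down mechanically. The following is a rough Python sketch, not the parser nfs-utils uses. The helper names parse_exports and client_kind are hypothetical, the classification is by syntax only, and quoted paths and other corner cases of the exports format are ignored.

```python
def parse_exports(text):
    """Return (path, client, options) tuples from exports-style text."""
    # Join backslash-continued lines into one logical line each.
    joined = text.replace("\\\n", " ")
    entries = []
    for line in joined.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        path, *clients = line.split()
        for spec in clients:
            client, _, opts = spec.partition("(")
            options = opts.rstrip(")").split(",") if opts else []
            entries.append((path, client, options))
    return entries


def client_kind(client):
    """Classify a client spec by its syntax only (illustrative)."""
    if client.startswith("@"):
        return "netgroup"
    if "*" in client or "?" in client:
        return "wildcard"
    if "/" in client:
        return "network"
    return "host"


# Two entries copied from the exports listing in this comment.
SAMPLE = """\
/home @rcc_linux(rw) \\
      katratzi(rw,no_root_squash) \\
      doolittle(rw,no_root_squash)
/temp/games @rcc_linux(ro)
"""

for path, client, opts in parse_exports(SAMPLE):
    print(path, client, client_kind(client), ",".join(opts))
```

In this sample, /home lists katratzi and doolittle as explicit hosts alongside the netgroup, while /temp/games is reachable only through @rcc_linux, which is the kind of export reported as failing.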
Comment 7 Thomas J. Baker 2004-08-02 14:08:51 EDT
That should read 'are members of the yp netgroup rcc_linux'.
Comment 8 Steve Dickson 2004-08-10 10:15:22 EDT
Just curious... if you re-export (i.e. exportfs -arv) the filesystems, does the problem go away?
Comment 9 Thomas J. Baker 2004-08-12 07:50:27 EDT
No, I get the same stale nfs handle error message on the FC3T1 client.
Comment 10 Steve Dickson 2004-08-13 11:31:37 EDT
Just for grins.... add fsid=0 to one of your export options and see what happens...
Comment 11 Thomas J. Baker 2004-08-13 14:07:48 EDT
I added the fsid=0 to the /temp/games export, ran exportfs -arv, and when I tried to automount it from the FC3T1 client, the mount hangs. The cd command that triggered the automount hangs, a df from another window hangs, and I can't even log in as a normal user, though root works. I looked at /etc/mtab and it doesn't include the mount for /temp/games, so at least we know that the mount is hanging. Eventually, the 'nfs server not responding' message is logged, which explains why my other login attempts fail, as my home directory is automounted from the same server. The client didn't log anything and the server logged a normal mount request:

Aug 13 13:31:41 wintermute rpc.mountd: authenticated mount request from katratzi.sr.unh.edu:919 for /temp/games (/temp/games)

To make it really interesting, I tried to log into my other FC3T1 test system and it can't nfs mount my home directory either. The server logs normally:

Aug 13 13:56:55 wintermute rpc.mountd: authenticated mount request from doolittle.sr.unh.edu:927 for /home/tjb (/home)

BUT two other RHEL3U2 systems mount my home directory fine. After rebooting the first FC3T1 client, I tried mounting my home directory again and it hung again. It's as if, once I modify the exports file and re-export, my FC3T1 systems can't talk to the nfs server anymore, yet RHEL3 and FC2 seem fine.

BTW, kernels are 517 on the first and 515 on the second FC3T1 system.
Comment 12 Steve Dickson 2004-08-13 14:26:59 EDT
can you try the mount by hand (i.e. not using autofs) while running an ethereal trace... then post the trace....
Comment 13 Thomas J. Baker 2004-08-13 14:59:09 EDT
Created attachment 102711
ethereal dump taken from server with just server client traffic
Comment 14 Thomas J. Baker 2004-08-13 15:44:47 EDT
Created attachment 102716
second ethereal dump from server with just client server traffic

I rebooted the server and, without any other changes, the fc3t1 system came back (nfs server OK) from the hung mount request but gave a permission denied error. A second mount attempt gave the stale nfs handle and a third was captured in the included dump. It may or may not provide more info but I thought it couldn't hurt.
Comment 15 Steve Dickson 2004-08-28 11:04:41 EDT
hmm... it sure seems like it's a server export problem... If you simplify your exports to something like:

/home *(rw,sync,fsid=0)

does that work?
Comment 16 Thomas J. Baker 2004-09-28 14:19:34 EDT
I updated the server to FC3T2 and the nfs problem persists. Another problem is that if I ever run 'exportfs -av' on the server, the FC3T2 clients hang with "NFS server not responding" until I reboot the server. It makes testing changes rather tedious. Anyway, I exported a directory as you requested and that doesn't work either:

Client side:

[root@katratzi tjb]# mount wintermute:/raid /xxx
mount: wintermute:/raid failed, reason given by server: Permission denied
[root@katratzi tjb]# showmount -e wintermute
Export list for wintermute:
/raid *
/temp/games @rcc_linux
/space/ftp 18.104.22.168/255.255.0.0
/home @rcc_linux,katratzi.sr.unh.edu
[root@katratzi tjb]# mount wintermute:/raid /xxx
mount: Stale NFS file handle
[root@katratzi tjb]#

Server side:

Sep 28 13:54:52 wintermute rpc.mountd: authenticated mount request from katratzi.sr.unh.edu:676 for /raid (/raid)

I have no firewall, selinux disabled, and right now, due to another potential bug (#133906), tcp wrappers completely turned off too.
Comment 17 Carlo Wood 2004-10-08 18:19:04 EDT
This looks very much like a problem I have too. But as far as I know, I am not using FC3T2, heh. I am not really into testing fedora - I just run 'apt-get update; apt-get upgrade' sometimes. Last time I did that I suddenly couldn't mount anymore and had to downgrade nfs-tools. I have a VERY simple setup (and simple exports file). Doesn't get much simpler than this. In other words, it seems to ME that fedora's public, current, non-test version has a totally broken NFS (under certain circumstances I presume). Is there anything I can do to help to fix this?
Comment 18 Need Real Name 2004-10-09 14:16:37 EDT
It seems that the failure is in mountd.c (rpc.mountd), when the function cache_get_filehandle tries to open /proc/fs/nfs[d]/filehandle and fails (I'm running 2.6.9-rc1-mm4 and don't have this file there...). I'm doing some more research.
Comment 19 Need Real Name 2004-10-09 14:20:33 EDT
Got it - add these lines to /etc/fstab (on the NFS server):

nfsd /proc/fs/nfsd nfsd defaults 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs defaults 0 0

See also bug 125345.
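
Before editing /etc/fstab, it may help to confirm whether the two pseudo-filesystems named above are actually missing. The following is a small Python sketch that reads /proc/mounts-style text and reports which of the two mount points from this comment are absent. The function names mounted_fstypes and missing_nfs_mounts are hypothetical, and the check assumes the mount points and filesystem types are exactly those given in the fstab lines above.

```python
def mounted_fstypes(mounts_text):
    """Map mount point -> filesystem type from /proc/mounts-style text."""
    table = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Fields are: device, mount point, filesystem type, options, ...
            table[fields[1]] = fields[2]
    return table


def missing_nfs_mounts(mounts_text):
    """Return the mount points from the fstab lines above that are absent."""
    expected = {
        "/proc/fs/nfsd": "nfsd",
        "/var/lib/nfs/rpc_pipefs": "rpc_pipefs",
    }
    found = mounted_fstypes(mounts_text)
    return [mp for mp, fstype in expected.items() if found.get(mp) != fstype]


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc unavailable
    for mount_point in missing_nfs_mounts(text):
        print("not mounted:", mount_point)
```

Run on the NFS server, the script prints one line per missing mount point and prints nothing when both are mounted. It only inspects the mount table and does not verify that rpc.mountd can open the filehandle file mentioned in the previous comment.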
Comment 20 Thomas J. Baker 2004-10-12 15:32:04 EDT
Whatever was in the last bunch of updates (10/12/2004) seems to have fixed this problem. Since nfs-utils wasn't updated, I can only assume it's the kernel?