Created attachment 528888 [details]
sosreport for typical client

Description of problem:
As more users are moving from RHEL4 to RHEL5, some have noticed frequent
"Stale NFS file handle" messages when doing some simple NFS file manipulation.

mount arguments:
abel:/vol/cache477/verif_release_ro on /proj/verif_release type nfs
(rw,tcp,rsize=65536,wsize=65536,timeo=600,retrans=2,intr,addr=163.181.34.90)

Version-Release number of selected component (if applicable):
Fails on: RHEL5.5, 5.7, 6.0
No errors on: RHEL4.9

How reproducible:
It seems the issue is easier to reproduce:
* on a high-traffic mountpoint (though the immediate parent of the directory
  I'm manipulating is untouched)
* from a high-traffic client (a many-user interactive machine will repro much
  more often than an idle grid node)

Steps to Reproduce:
while true; do
  rm -rf /proj/something/test_dir/user
  mkdir /proj/something/test_dir/user
  echo blah > testfile
  cp testfile /proj/something/test_dir/testfile
done

Actual results:
open("/proj/verif_release/stale_nfs_handle_test/bpetrini/some_bogus_file.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ESTALE (Stale NFS file handle)
close(3)                                = 0
write(2, "Unable to copy file\nStale NFS fi"..., 42Unable to copy file
Stale NFS file handle
) = 42

Expected results:
No ESTALE.

Additional info:
Per the customer, this IS reproducible with a single NFS client (unlike bug
231143, which we originally thought this might be). In typical user workloads,
individual users mount different shares on the NetApp filers, so multiple
users should not be accessing the same files/dirs/inodes.

I have outstanding requests to the customer for:
A) NetApp filer version numbers (should have this soon)
B) more details on the actual application (the trivial reproducer above is
   "similar", but I've asked for the actual offending code)

I have network traces I'm attaching as well.

--pvn
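For what it's worth, here is the loop above wrapped into a standalone script that stops and reports on the first failed copy. This is only a sketch: TESTDIR is a placeholder for a directory on the affected mount, and the pass counter is just there to help correlate a failure with logs and captures.

#!/bin/sh
# Sketch reproducer -- TESTDIR is a placeholder; point it at a directory on
# the affected NFS mount before running.
TESTDIR=/proj/something/test_dir
pass=0
echo blah > testfile
while true; do
    pass=$((pass + 1))
    rm -rf "$TESTDIR/user"
    mkdir "$TESTDIR/user"
    if ! cp testfile "$TESTDIR/testfile"; then
        echo "copy failed on pass $pass (check the error above)" >&2
        break
    fi
done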
Created attachment 528889 [details]
network trace during reproduction
Well, now that is interesting... there's not a single NFSERR_STALE error in either capture. Typically when we see these sorts of problems, it means the server returned that error because it didn't recognize the filehandle. There are, however, some places in the NFS and generic VFS code that return ESTALE for other reasons. I'll need to look over the code and the capture and see if there is any reason this would occur...
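For anyone who wants to double-check the captures themselves, a tshark display filter along these lines should list any NFS replies carrying a non-zero status. This is just a sketch: nfs_trace.pcap is a placeholder for the attached capture, the nfs.status field name is from memory and may differ between Wireshark versions, and older tshark builds take -R instead of -Y for the filter.

# list any NFS replies in the capture with a non-zero status code
tshark -r nfs_trace.pcap -Y 'nfs.status != 0' -T fields -e frame.number -e nfs.status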
Question: when you run this test, do you ever see any kernel messages pop up in dmesg?
I ran the reproducer above for about 30000 passes against a Linux NFS server and never saw it fail. My suspicion at this point is that the server they're testing against is significant here.

What we may need to do is get them to run the same reproducer with NFS debugging turned up. On their test client machine, have them run this to turn up NFS debugging:

# rpcdebug -m nfs -s all

...then run the test. Once it fails, immediately run:

# rpcdebug -m nfs -c all

...to turn it off again. Then run dmesg and collect the output. With luck we may see the source of the ESTALE problem.

Ideally you'll want to do this on a client that's only running this test at the time. Any other NFS activity will just add extra chaff to the logs.
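In case it helps keep the noise down, the sequence above could be scripted so debugging is switched off immediately when the copy fails. This is only a sketch: TESTDIR and the loop body stand in for the customer's actual reproducer, and the output path is arbitrary.

#!/bin/sh
# Sketch: keep the NFS debug window as narrow as possible around the failure.
# TESTDIR is a placeholder for a directory on the affected mount.
TESTDIR=/proj/something/test_dir
echo blah > testfile

rpcdebug -m nfs -s all                 # turn NFS client debugging up
while cp testfile "$TESTDIR/testfile"; do
    rm -rf "$TESTDIR/user"
    mkdir "$TESTDIR/user"
done
rpcdebug -m nfs -c all                 # turn it back off immediately
dmesg > /tmp/nfs-debug.log             # collect the debug output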
Ok, that log helps a lot:

-----------------[snip]-------------------
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_lookup_revalidate(npavey/test_dir.gram0142) is valid
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_follow_mountpoint()
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_follow_mountpoint: enter
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS call lookup test_dir.gram0142
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_update_inode(0:26/14035749 ct=1 info=0x6)
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS reply lookup: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_do_submount()
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_do_submount: submounting on npavey/test_dir.gram0142
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_xdev_get_sb()
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_clone_server(,23475a4a:0,)
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_probe_fsinfo()
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS call fsinfo
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS reply fsinfo: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS call pathconf
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS reply pathconf: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_probe_fsinfo() = 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: Cloned FSID: 23475a4a:0
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_clone_server() = ffff810b8d847000
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_free_server()
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_put_client({2})
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_free_server()
Oct 19 16:03:59 s_kernel@gram0142 kernel: do_proc_get_root: call fsinfo
Oct 19 16:03:59 s_kernel@gram0142 kernel: do_proc_get_root: reply fsinfo: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_update_inode(0:26/14035757 ct=2 info=0x6)
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_fhget(0:26/14035757 ct=2)
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_xdev_get_sb() = -116 [splat]
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_do_submount: done
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_do_submount() = ffffffffffffff8c
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_follow_mountpoint: done, returned -116
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_follow_mountpoint() = -116
-----------------[snip]-------------------

-116 is ESTALE... so the reply from the server is causing the client to attempt a new submount.

I don't have a capture from this new reproducer, but looking at the older one, almost all of the mkdir calls have this in both the new object attributes and the dir_wcc info:

    fsid: 0x000000000d9c6279 (228352633)

...but I see this in frame 239 (just prior to the failure) in the dir_wcc info:

    fsid: 0x000000000d9c61c9 (228352457)

...while the new object attributes show the original fsid.

An fsid change implies that you've crossed a mountpoint on the server. RHEL5 in this situation will attempt to mount the new filesystem as a new submount. RHEL4 does not do this, which explains why they're only seeing this now.

Since you're only creating a directory in this situation, this is almost certainly a server bug. I'd contact NetApp and ask them whether this is a known problem and whether they have a fix.
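Two quick client-side sanity checks for anyone following along. This is just a sketch: the errno header path assumes the kernel/glibc headers are installed, and the grep pattern for the mount is taken from the mount output in the description.

# confirm that kernel return value -116 maps to ESTALE
grep -w ESTALE /usr/include/asm-generic/errno.h

# check whether the client has picked up any unexpected submounts under the
# affected mountpoint
grep verif_release /proc/mounts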
...and since the client is returning an ESTALE error when we attempt the submount, the info the server is providing is clearly bogus.
Thanks Paul. Going ahead and closing this as NOTABUG. Please reopen if you wish to discuss it further.