Created attachment 528888 [details]
sosreport for typical client

Description of problem:
As more users are moving from RHEL4 to RHEL5, some have noticed frequent
"Stale NFS file handle" messages when doing some simple NFS file manipulation.

mount arguments:
abel:/vol/cache477/verif_release_ro on /proj/verif_release type nfs
(rw,tcp,rsize=65536,wsize=65536,timeo=600,retrans=2,intr,addr=163.181.34.90)

Version-Release number of selected component (if applicable):
Fails on: RHEL5.5, 5.7, 6.0
No errors on: RHEL4.9

How reproducible:
It seems the issue is easier to reproduce:
* on a high-traffic mountpoint (though the immediate parent of the directory
  I'm manipulating is untouched)
* from a high-traffic client (a many-user interactive machine will repro much
  more often than an idle grid node)

Steps to Reproduce:
while true; do
  rm -rf /proj/something/test_dir/user
  mkdir /proj/something/test_dir/user
  echo blah > testfile
  cp testfile /proj/something/test_dir/testfile
done

Actual results:
open("/proj/verif_release/stale_nfs_handle_test/bpetrini/some_bogus_file.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ESTALE (Stale NFS file handle)
close(3)                                = 0
write(2, "Unable to copy file\nStale NFS fi"..., 42Unable to copy file
Stale NFS file handle
) = 42

Expected results:
No ESTALE.

Additional info:
Per the customer, this IS reproducible with a single NFS client (unlike bug
231143, which we originally thought this might be). In typical user workloads,
individual users mount different shares on the NetApp filers, so multiple
users should not be accessing the same files/dirs/inodes.

I have outstanding requests to the customer for:
A) NetApp filer version numbers (should have this soon)
B) more details on the actual application (the trivial reproducer above is
   "similar", but I've asked for the actual offending code)

I have network traces I'm attaching as well.

--pvn
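For what it's worth, here is the loop above wrapped into a standalone script that stops and reports on the first failed copy. This is only a sketch: TESTDIR is a placeholder for a directory on the affected mount, and the pass counter is just there to help correlate a failure with logs and captures.

#!/bin/sh
# Sketch reproducer -- TESTDIR is a placeholder; point it at a directory on
# the affected NFS mount before running.
TESTDIR=/proj/something/test_dir
pass=0
echo blah > testfile
while true; do
    pass=$((pass + 1))
    rm -rf "$TESTDIR/user"
    mkdir "$TESTDIR/user"
    if ! cp testfile "$TESTDIR/testfile"; then
        echo "copy failed on pass $pass (check the error above)" >&2
        break
    fi
done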
Created attachment 528889 [details]
network trace during reproduction
Well, now that is interesting... there's not a single NFSERR_STALE error in either capture. Typically when we see these sorts of problems, it means the server returned that error because it didn't recognize the filehandle. There are, however, some places in the NFS and generic VFS code that return ESTALE for other reasons. I'll need to look over the code and the capture and see if there is any reason this would occur...
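For anyone who wants to double-check the captures themselves, a tshark display filter along these lines should list any NFS replies carrying a non-zero status. This is just a sketch: nfs_trace.pcap is a placeholder for the attached capture, the nfs.status field name is from memory and may differ between Wireshark versions, and older tshark builds take -R instead of -Y for the filter.

# list any NFS replies in the capture with a non-zero status code
tshark -r nfs_trace.pcap -Y 'nfs.status != 0' -T fields -e frame.number -e nfs.status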
Question: when you run this test, do you ever see any kernel messages pop up in dmesg?
I ran the reproducer above for about 30000 passes against a Linux NFS server and never saw it fail. My suspicion at this point is that the server they're testing against is significant here.

What we may need to do is get them to run the same reproducer with NFS debugging turned up. On their test client machine, have them run this to turn up NFS debugging:

# rpcdebug -m nfs -s all

...then run the test. Once it fails, immediately run:

# rpcdebug -m nfs -c all

...to turn it off again. Then run dmesg and collect the output. With luck we may see the source of the ESTALE problem.

Ideally you'll want to do this on a client that's only running this test at the time. Any other NFS activity will just add extra chaff to the logs.
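In case it helps keep the noise down, the sequence above could be scripted so debugging is switched off immediately when the copy fails. This is only a sketch: TESTDIR and the loop body stand in for the customer's actual reproducer, and the output path is arbitrary.

#!/bin/sh
# Sketch: keep the NFS debug window as narrow as possible around the failure.
# TESTDIR is a placeholder for a directory on the affected mount.
TESTDIR=/proj/something/test_dir
echo blah > testfile

rpcdebug -m nfs -s all                 # turn NFS client debugging up
while cp testfile "$TESTDIR/testfile"; do
    rm -rf "$TESTDIR/user"
    mkdir "$TESTDIR/user"
done
rpcdebug -m nfs -c all                 # turn it back off immediately
dmesg > /tmp/nfs-debug.log             # collect the debug output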
Ok, that log helps a lot:

-----------------[snip]-------------------
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_lookup_revalidate(npavey/test_dir.gram0142) is valid
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_follow_mountpoint()
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_follow_mountpoint: enter
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS call lookup test_dir.gram0142
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_update_inode(0:26/14035749 ct=1 info=0x6)
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS reply lookup: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_do_submount()
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_do_submount: submounting on npavey/test_dir.gram0142
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_xdev_get_sb()
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_clone_server(,23475a4a:0,)
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_probe_fsinfo()
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS call fsinfo
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS reply fsinfo: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS call pathconf
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS reply pathconf: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_probe_fsinfo() = 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: Cloned FSID: 23475a4a:0
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_clone_server() = ffff810b8d847000
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_free_server()
Oct 19 16:03:59 s_kernel@gram0142 kernel: --> nfs_put_client({2})
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_free_server()
Oct 19 16:03:59 s_kernel@gram0142 kernel: do_proc_get_root: call fsinfo
Oct 19 16:03:59 s_kernel@gram0142 kernel: do_proc_get_root: reply fsinfo: 0
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_update_inode(0:26/14035757 ct=2 info=0x6)
Oct 19 16:03:59 s_kernel@gram0142 kernel: NFS: nfs_fhget(0:26/14035757 ct=2)
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_xdev_get_sb() = -116 [splat]
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_do_submount: done
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_do_submount() = ffffffffffffff8c
Oct 19 16:03:59 s_kernel@gram0142 kernel: nfs_follow_mountpoint: done, returned -116
Oct 19 16:03:59 s_kernel@gram0142 kernel: <-- nfs_follow_mountpoint() = -116
-----------------[snip]-------------------

-116 is ESTALE... so the reply from the server is causing the client to attempt a new submount.

I don't have a capture from this new reproducer, but looking at the older one, almost all of the mkdir calls have this in both the new object attributes and the dir_wcc info:

    fsid: 0x000000000d9c6279 (228352633)

...but I see this in frame 239 (just prior to the failure) in the dir_wcc info:

    fsid: 0x000000000d9c61c9 (228352457)

...while the new object attributes show the original fsid.

An fsid change implies that you've crossed a mountpoint on the server. RHEL5 in this situation will attempt to mount the new filesystem as a new submount. RHEL4 does not do this, which explains why they're only seeing this now.

Since you're only creating a directory in this situation, this is almost certainly a server bug. I'd contact NetApp and ask them whether this is a known problem and whether they have a fix.
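Two quick client-side sanity checks for anyone following along. This is just a sketch: the errno header path assumes the kernel/glibc headers are installed, and the grep pattern for the mount is taken from the mount output in the description.

# confirm that kernel return value -116 maps to ESTALE
grep -w ESTALE /usr/include/asm-generic/errno.h

# check whether the client has picked up any unexpected submounts under the
# affected mountpoint
grep verif_release /proc/mounts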
...and since the client is returning an ESTALE error when we attempt the submount, the info the server is providing is clearly bogus.
Thanks Paul. Going ahead and closing this as NOTABUG. Please reopen if you wish to discuss it further.