Bug 763434 (GLUSTER-1702)

Summary:	NFS client got "Stale NFS file handle" periodically
Product:	[Community] GlusterFS	Reporter:	Bernard Li <bernard>
Component:	nfs	Assignee:	Shehjar Tikoo <shehjart>
Status:	CLOSED CURRENTRELEASE	QA Contact:
Severity:	high	Docs Contact:
Priority:	low
Version:	nfs-alpha	CC:	gluster-bugs, vijay
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:		Type:	---
Regression:	RTP	Mount Type:	nfs
Documentation:	---	CRM:
Verified Versions:		Category:	---

Description Bernard Li 2010-09-24 22:00:19 UTC

Running GlusterFS nfs_beta_rc14 on CentOS 4, serving a replicate volume of 5 bricks from 5 nodes. The client is openSUSE 11.3 running kernel 2.6.34-12-desktop.

Occasionally, when running ls on a directory in the NFS-mounted volume, I get:

ls: reading directory .: Stale NFS file handle
total 0

and in the client's syslog:

Sep 24 14:05:52 vus-bli kernel: [2505162.929534] NFS: server gluster-nfs error: fileid changed
Sep 24 14:05:52 vus-bli kernel: [2505162.929537] fsid 0:14: expected fileid 0xbf4046, got 0x82280d5
Sep 24 14:05:52 vus-bli kernel: [2505163.614948] NFS: server gluster-nfs error: fileid changed
Sep 24 14:05:52 vus-bli kernel: [2505163.614952] fsid 0:14: expected fileid 0xbf4046, got 0x82280d5
Sep 24 14:05:52 vus-bli kernel: [2505163.637769] NFS: server gluster-nfs error: fileid changed
Sep 24 14:05:52 vus-bli kernel: [2505163.637773] fsid 0:14: expected fileid 0xbf4046, got 0x82280d5

On the server, the log shows:

[2010-09-24 14:05:45] E [rpcsvc.c:1230:rpcsvc_program_actor] rpc-service: RPC program not available

AFAIK, glusterfsd on the replicate brick servers stayed up the whole time, so this should have nothing to do with self-healing.
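
To see how often the mismatch recurs and whether it is always the same pair of file IDs, the client syslog can be tallied with a short script. This is only a sketch: the regular expression is written against the "expected fileid ..., got ..." lines quoted above, and the function name count_fileid_mismatches is made up for illustration, not part of any GlusterFS or kernel tooling.

```python
import re
from collections import Counter

# Pattern modeled on the kernel lines quoted in this report, e.g.
#   fsid 0:14: expected fileid 0xbf4046, got 0x82280d5
FILEID_RE = re.compile(
    r"fsid (?P<fsid>\S+): expected fileid (?P<expected>0x[0-9a-fA-F]+), "
    r"got (?P<got>0x[0-9a-fA-F]+)"
)


def count_fileid_mismatches(lines):
    """Count (fsid, expected, got) triples found in an iterable of log lines.

    Hypothetical helper: lines that do not match the pattern are ignored.
    """
    counts = Counter()
    for line in lines:
        match = FILEID_RE.search(line)
        if match:
            key = (
                match["fsid"],
                int(match["expected"], 16),
                int(match["got"], 16),
            )
            counts[key] += 1
    return counts
```

Passing it an open log file (the path depends on the distribution's syslog setup) returns a Counter. A single repeated triple, as in the excerpt above, suggests one directory whose file ID differs between what the client cached and what the server now returns.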

Comment 1 Shehjar Tikoo 2010-09-25 07:15:46 UTC

5 replicas have not been tested. I am looking into it.

Comment 2 Shehjar Tikoo 2010-09-25 08:33:40 UTC

Error is similar to the fileid changes Harsha experienced during recent tests.

Comment 3 Bernard Li 2010-09-25 14:00:04 UTC

I forgot to mention that the bricks are running GlusterFS version 3.0.5 and *not* nfs_beta_rc14, not sure if that matters...

Comment 4 Shehjar Tikoo 2010-09-27 03:19:12 UTC

(In reply to comment #3)
> I forgot to mention that the bricks are running GlusterFS version 3.0.5 and
> *not* nfs_beta_rc14, not sure if that matters...
That hasn't been tested. Please try with rc14 on all bricks and NFS servers.

Which kernel version is running on the NFS client machine where you saw this error?

Comment 5 Shehjar Tikoo 2010-10-05 08:22:15 UTC

Bernard, please re-open if ESTALEs occur even with nfs-beta on the bricks.