Bug 762086 (GLUSTER-354) - Stale file handle on unfs3 booster
Summary: Stale file handle on unfs3 booster
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-354
Product: GlusterFS
Classification: Community
Component: libglusterfsclient
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-11-03 11:47 UTC by Shehjar Tikoo
Modified: 2009-11-12 06:23 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Shehjar Tikoo 2009-11-03 09:05:43 UTC
Findings:

1. git-bisect shows that the error first appears with the following commit in release2:
commit 7f2da3aab0f32daca97176c3bfed76c70497f9b2
Author: Shehjar Tikoo <shehjart>
Date:   Thu Sep 24 00:59:50 2009 +0000

libglusterfsclient: Re-validate root inode on every path resolution


Reverting this commit eliminates the stale file handle errors. However, the underlying problem predates this patch; the patch only triggers the bug.

Every fop issued by libgf, for example stat through libgf_client_stat, updates the iattr cache for the inode with the latest stat. The root inode for glusterfs should always have ino 1, but libgf did not account for this when updating the stat in the inode context. That is, instead of blindly copying the received stat into the inode context, we need to ensure that when the root inode's context is being updated, the inode number in the stat is set to 1, not the number received from the child subvolume. In this case the child is posix, which returns the local file system's ino for the root. This was not a problem until now because, before the above patch, the root inode was looked up once and never revalidated, so the stat structure in the iattr cache always contained ino 1.
BUT
we changed the code to revalidate the root inode without ensuring that the iattr cache update routine does not clobber the inode number for the root inode.
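
The guard described above can be sketched roughly as follows. This is only an illustration and not the code from the actual patches: struct inode_ctx, ctx_update_stat and ROOT_INO are hypothetical names, and the real libglusterfsclient inode context is laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical names; these are not the libglusterfsclient definitions. */
#define ROOT_INO ((ino_t)1)

struct inode_ctx {
    int         is_root;  /* non-zero when this context belongs to the root inode */
    struct stat cached;   /* last stat stored for the inode (the "iattr cache") */
};

/* Store a freshly received stat, keeping the root inode number pinned to 1. */
static void
ctx_update_stat(struct inode_ctx *ctx, const struct stat *received)
{
    assert(ctx != NULL && received != NULL);

    /* Copy everything the child subvolume returned ... */
    ctx->cached = *received;

    /* ... but never let the child's ino replace the fixed root ino. */
    if (ctx->is_root)
        ctx->cached.st_ino = ROOT_INO;
}
```

With a guard of this kind, revalidating the root any number of times leaves the cached ino at 1, while non-root inodes keep whatever number the child reports.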

So when a file handle is received twice, unfs does a stat each time. Across the two stats, the root file handle can end up with two different inode numbers, 1 and X, where X is the value received from posix that clobbered inode number 1 during revalidation.
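
To illustrate why two different inode numbers for the same handle surface as ESTALE, here is a simplified sketch. It assumes the server compares the inode number recorded in the handle with the one from a fresh stat. struct fh and fh_check are hypothetical names, not unfs3 code, and a real unfs3 handle carries more fields than this.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical handle layout, reduced to the one field that matters here. */
struct fh {
    ino_t ino;  /* inode number recorded when the handle was issued */
};

/* Return 0 if the handle matches the inode number from a fresh stat,
 * ESTALE otherwise. */
static int
fh_check(const struct fh *handle, ino_t current_ino)
{
    assert(handle != NULL);
    return handle->ino == current_ino ? 0 : ESTALE;
}
```

Under this assumption, a root handle issued with ino 1 fails the check as soon as a later stat reports X instead.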

Comment 1 Shehjar Tikoo 2009-11-03 11:47:21 UTC
On release2 branch and on mainline, on exporting a simple posix volume through
unfs3-booster, the server returns an ESTALE on any operation after restarting it.

For example:

<START SERVER>
root@indus:/# mount 127.0.0.1:/testpath /mnt
root@indus:/# ls /mnt
fstatcache.c  statcache.c    statcache_rewinddir.c  test           testfile2.2.0  testfile4.4.0
statcache2.c  statcache_r.c  statcache_seekdir.c    testfile1.1.0  testfile3.3.0

<RESTART SERVER>
root@indus:/# ls /mnt
ls: cannot access /mnt: Stale NFS file handle
root@indus:/#

Comment 2 Anand Avati 2009-11-04 03:09:51 UTC
PATCH: http://patches.gluster.com/patch/2127 in release-2.0 (libglusterfsclient: Prevent root inode number clobbering)

Comment 3 Anand Avati 2009-11-04 03:09:55 UTC
PATCH: http://patches.gluster.com/patch/2128 in release-2.0 (libglusterfsclient: Dont alloc root inode context)

Comment 4 Anand Avati 2009-11-04 03:21:45 UTC
PATCH: http://patches.gluster.com/patch/2125 in master (libglusterfsclient: Prevent root inode number clobbering)

Comment 5 Anand Avati 2009-11-04 03:21:49 UTC
PATCH: http://patches.gluster.com/patch/2126 in master (libglusterfsclient: Dont alloc root inode context)

