Findings:

1. git-bisect shows that the error first appears with the following commit in release2:

commit 7f2da3aab0f32daca97176c3bfed76c70497f9b2
Author: Shehjar Tikoo <shehjart>
Date: Thu Sep 24 00:59:50 2009 +0000

    libglusterfsclient: Re-validate root inode on every path resolution

Reverting this commit eliminates the stale file handle errors. However, the underlying problem was present before this patch; the patch is only responsible for triggering the bug.

In all the fops issued by libgf, for example stat through libgf_client_stat, the iattr cache for an inode is updated with the latest stat. The root inode for glusterfs should always have ino 1, but libgf was not accounting for this while updating the stat in the inode context. That is, instead of blindly copying the received stat into the inode context, we need to ensure that when the root inode's context is being updated, the inode number in the stat is set to 1, not the number received from the child subvolume. In this case the child is posix, which returns the local file system's ino for the root.

This was not a problem until now because, before the above patch, the root inode was looked up once and never revalidated, so the stat structure in the iattr cache always contained ino 1. But we changed the code to revalidate the root inode without ensuring that the iattr cache update routine does not clobber the inode number for the root inode.

So when a file handle is received twice, unfs does a stat each time. Across the two stats, the root file handle can end up with two different inode numbers, 1 and X, where X is what was received from posix and with which inode number 1 was clobbered during revalidation.
On the release2 branch and on mainline, when a simple posix volume is exported through unfs3-booster, the server returns ESTALE on any operation after it is restarted. For example:

<START SERVER>
root@indus:/# mount 127.0.0.1:/testpath /mnt
root@indus:/# ls /mnt
fstatcache.c statcache.c statcache_rewinddir.c test testfile2.2.0 testfile4.4.0
statcache2.c statcache_r.c statcache_seekdir.c testfile1.1.0 testfile3.3.0
<RESTART SERVER>
root@indus:/# ls /mnt
ls: cannot access /mnt: Stale NFS file handle
root@indus:/#
PATCH: http://patches.gluster.com/patch/2127 in release-2.0 (libglusterfsclient: Prevent root inode number clobbering)
PATCH: http://patches.gluster.com/patch/2128 in release-2.0 (libglusterfsclient: Dont alloc root inode context)
PATCH: http://patches.gluster.com/patch/2125 in master (libglusterfsclient: Prevent root inode number clobbering)
PATCH: http://patches.gluster.com/patch/2126 in master (libglusterfsclient: Dont alloc root inode context)