Description of problem:
=======================
After a snapshot restore, if the user is inside any snapshot directory under .snaps, ls on that snapshot directory fails with a "No such file or directory" error.

Version-Release number of selected component (if applicable):
============================================================
glusterfs 3.6.0.35

How reproducible:
================
3/3

Steps to Reproduce:
===================
1. Create a 2x2 distributed-replicated volume. FUSE and NFS mount the volume.
2. Enable USS.
3. Create some IO.
4. Take a snapshot (snap1_vol1) and activate it.
5. Create more IO.
6. Take 2 more snapshots (snap2_vol1, snap3_vol1) and activate them.
7. cd to .snaps and list the snapshots.

Fuse mount:
==========
[root@dhcp-0-97 vol1_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
d---------. 0 root root 0 Jan  1  1970 snap1_vol1
d---------. 0 root root 0 Jan  1  1970 snap2_vol1
d---------. 0 root root 0 Jan  1  1970 snap3_vol1
[root@dhcp-0-97 .snaps]# cd snap1_vol1
[root@dhcp-0-97 snap1_vol1]# ls
fuse1  nfs1

NFS mount:
=========
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 5 root root  92 Dec  3 16:36 snap1_vol1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap2_vol1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap3_vol1
[root@dhcp-0-97 .snaps]# cd snap1_vol1/
[root@dhcp-0-97 snap1_vol1]# ls
fuse1  nfs1

8. Stop the volume and restore it to the snapshot snap3_vol1.
9. Start the volume.
10. Go back to the terminal from step 7 and run ls inside snap1_vol1 under .snaps; it fails with "No such file or directory".

Actual results:
===============
If the user is already inside any snapshot directory under .snaps when a snapshot restore operation is performed, ls on that snapshot directory fails.

Expected results:
=================
The user should be able to list the snapshots under .snaps irrespective of any operation performed on the volume.

Additional info:
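For reference, a minimal sketch of the commands behind steps 1-6 and 8-9 (hostnames, brick paths, and mount points below are placeholders, not the ones used in this report):

# Step 1: create and start a 2x2 distributed-replicated volume, then mount it
gluster volume create vol1 replica 2 host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/b3 host4:/bricks/b4
gluster volume start vol1
mount -t glusterfs host1:/vol1 /mnt/vol1_fuse
mount -t nfs -o vers=3 host1:/vol1 /mnt/vol1_nfs

# Step 2: enable User Serviceable Snapshots (USS)
gluster volume set vol1 features.uss enable

# Steps 4 and 6: take the snapshots and activate them
gluster snapshot create snap1_vol1 vol1
gluster snapshot activate snap1_vol1
gluster snapshot create snap2_vol1 vol1
gluster snapshot activate snap2_vol1
gluster snapshot create snap3_vol1 vol1
gluster snapshot activate snap3_vol1

# Steps 8-9: stop the volume, restore it to snap3_vol1, start it again
gluster volume stop vol1
gluster snapshot restore snap3_vol1
gluster volume start vol1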
Tried the workaround mentioned in bz 1169790:

Workaround:
-----------
Drop the VFS cache; the command to do that is "echo 3 > /proc/sys/vm/drop_caches".

After the restore operation, ls on the remaining snapshot directories under .snaps still fails:

Before restore:
---------------
[root@dhcp-0-97 S1]# ls
fuse_etc.1  fuse_etc.2  nfs_etc.1  nfs_etc.2

After restore:
--------------
[root@dhcp-0-97 S1]# ls
ls: cannot open directory .: No such file or directory
[root@dhcp-0-97 S1]# echo 3 > /proc/sys/vm/drop_caches
[root@dhcp-0-97 S1]# ls
ls: cannot open directory .: No such file or directory
[root@dhcp-0-97 S1]# pwd
/mnt/vol0_fuse/.snaps/S1
When a volume is restored, there is a graph change in snapd. With this, the gfid of '.snaps' is regenerated, and in this case snapd sends back ESTALE. So you need to cd out of .snaps and cd back in, and this should work.

The problem in this case is that ESTALE is converted to ENOENT by the protocol server before being sent to the client. On ESTALE the VFS drops its cache, but on ENOENT the VFS marks the entry negative and caches it.

Workaround:
1) cd out of .snaps
2) Drop the cache: 'echo 3 > /proc/sys/vm/drop_caches'
3) Now cd to .snaps; ls under .snaps should work
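To make the workaround concrete, a minimal sketch assuming the volume is FUSE-mounted at /mnt/vol1_fuse (the mount point is a placeholder):

# Step 1: move out of the .snaps tree
cd /mnt/vol1_fuse

# Step 2: drop the VFS caches so the negatively-cached (ENOENT) entry
#         for the snapshot directory is discarded
echo 3 > /proc/sys/vm/drop_caches

# Step 3: re-enter .snaps; the lookup is sent to snapd afresh
cd /mnt/vol1_fuse/.snaps
ls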
Tried the workaround mentioned in comment 3; it works fine.
As per the previous comment, the workaround works fine, so removing the blocker flag.
Hi Vijai, I see this bug listed as a known issue in the known issues tracker bug for 3.0.3. Can you please fill out the doc text after changing the doc type to known issue?
Changing the doc type to known issue
Hi Vijai, Can you please review the edited doc text for technical accuracy and sign off?
Doc-text looks good to me
While reproducing this issue with the latest glusterfs-3.7.5-18 build, below are the observations:

Steps to Reproduce:
===================
1. Create a tiered volume. FUSE and NFS mount the volume.
2. Enable USS.
3. Create some IO.
4. Take a snapshot (snap1) and activate it.
5. Create more IO.
6. Take 2 more snapshots (snap2, snap3) and activate them.
7. cd to .snaps and list the snapshots.

Fuse mount:
==========
[root@dhcp-0-97 vol1_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
d---------. 0 root root 0 Jan  1  1970 snap1
d---------. 0 root root 0 Jan  1  1970 snap2
d---------. 0 root root 0 Jan  1  1970 snap3
[root@dhcp-0-97 .snaps]# cd snap1
[root@dhcp-0-97 snap1]# ls
fuse1  nfs1

NFS mount:
=========
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 5 root root  92 Dec  3 16:36 snap1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap2
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap3
[root@dhcp-0-97 .snaps]# cd snap1
[root@dhcp-0-97 snap1]# ls
fuse1  nfs1

8. Stop the volume and restore it to the snapshot snap3.
9. Start the volume.
10. Go back to the terminal from step 7 and run ls inside snap1 under .snaps; it now shows a "Stale file handle" error message.

[root@dhcp35-63 snap1]# ls
ls: cannot open directory .: Stale file handle

11. However, coming out of snap1 and cd-ing into it again works fine without any issues.

[root@dhcp35-63 snap1]# cd ..
[root@dhcp35-63 .snaps]# ls
snap1  snap2
[root@dhcp35-63 .snaps]# cd snap1
[root@dhcp35-63 snap1]# ls
fuse1  nfs1
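For the tiered-volume setup in step 1, a rough sketch (hosts and brick paths are placeholders; the attach-tier form shown is the one from the 3.7 series, so treat the exact syntax as an assumption):

# Create and start the base (cold) volume, then attach a hot tier
gluster volume create vol1 replica 2 host1:/bricks/cold1 host2:/bricks/cold2
gluster volume start vol1
gluster volume attach-tier vol1 replica 2 host1:/bricks/hot1 host2:/bricks/hot2

# Enable USS as before
gluster volume set vol1 features.uss enable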
Hi Shashank,

If you are getting an ESTALE, then it is expected behavior. Please see comment# 3.

Thanks,
Vijay
As per comment# 13, it looks like the problem is fixed in glusterfs-3.7, so closing the bug.