Bug 1170145 - [USS]: After snapshot restore, if user is under any snap directory in .snaps , ls on the snap directory fails
Status: CLOSED NEXTRELEASE
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: snapshot
Version: 3.0
Hardware: Unspecified   OS: Unspecified
Priority: unspecified   Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Vijaikumar Mallikarjuna
QA Contact: Anoop
Whiteboard: USS
Keywords: Triaged
Depends On:
Blocks: 1153907
 
Reported: 2014-12-03 06:46 EST by senaik
Modified: 2016-09-17 08:59 EDT
CC: 9 users

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
After the restore operation completes, if a volume is restored while you are inside the .snaps directory, a "No such file or directory" error is displayed from the mount point. Workaround: 1) cd out of the .snaps directory. 2) Drop the VFS cache: 'echo 3 > /proc/sys/vm/drop_caches'. 3) cd back into .snaps; it should now work.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-03 02:23:47 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description senaik 2014-12-03 06:46:04 EST
Description of problem:
=======================
After snapshot restore, if the user is inside any snap directory under .snaps, ls on the snap directory fails with a "No such file or directory" error


Version-Release number of selected component (if applicable):
============================================================
glusterfs 3.6.0.35 

How reproducible:
================
3/3


Steps to Reproduce:
===================
1. Create a 2x2 dist-rep volume. Fuse and NFS mount the volume

2. Enable USS

3. Create some IO

4. Take a snapshot (snap1_vol1) and activate it

5. Create more IO

6. Take 2 more snapshots (snap2_vol1, snap3_vol1) and activate them

7. cd to .snaps and list the snapshots

Fuse mount :
==========
[root@dhcp-0-97 vol1_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
d---------. 0 root root 0 Jan  1  1970 snap1_vol1
d---------. 0 root root 0 Jan  1  1970 snap2_vol1
d---------. 0 root root 0 Jan  1  1970 snap3_vol1
[root@dhcp-0-97 .snaps]# cd snap1_vol1
[root@dhcp-0-97 snap1_vol1]# ls
fuse1  nfs1


NFS mount :
=========
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 5 root root  92 Dec  3 16:36 snap1_vol1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap2_vol1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap3_vol1
[root@dhcp-0-97 .snaps]# cd snap1_vol1/
[root@dhcp-0-97 snap1_vol1]# ls
fuse1  nfs1

8. Stop the volume and restore it to snapshot snap3_vol1

9. Start the volume

10. Go back to the terminal from Step 7 and perform ls inside snap1_vol1 under .snaps; it fails with "No such file or directory"
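
For reference, a rough gluster CLI sketch of steps 1-9, assuming a volume named vol1, placeholder server/brick paths and mount points, and the snapshot names used above (exact snapshot syntax may vary slightly between glusterfs releases):

# Step 1: 2x2 distributed-replicate volume (server names and brick paths are placeholders)
gluster volume create vol1 replica 2 \
    server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/b3 server4:/bricks/b4
gluster volume start vol1
mount -t glusterfs server1:/vol1 /mnt/vol1_fuse        # FUSE mount
mount -t nfs -o vers=3 server1:/vol1 /mnt/vol1_nfs     # NFS mount

# Step 2: enable User Serviceable Snapshots so .snaps is visible on the mounts
gluster volume set vol1 features.uss enable

# Steps 4 and 6: snapshots taken and activated (the IO from steps 3 and 5 is omitted here)
for s in snap1_vol1 snap2_vol1 snap3_vol1; do
    gluster snapshot create $s vol1
    gluster snapshot activate $s
done

# Steps 8-9: stop the volume, restore it to snap3_vol1, start it again
gluster volume stop vol1
gluster snapshot restore snap3_vol1
gluster volume start vol1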


Actual results:
===============
If the user is already inside any snap directory under .snaps when a snapshot restore operation is performed, ls on that snap directory fails.


Expected results:
=================
The user should be able to list the snapshots under .snaps irrespective of any operation performed


Additional info:
Comment 2 senaik 2014-12-04 04:51:11 EST
Tried the workaround mentioned in bz 1169790:

Workaround:
-----------
Drop the VFS cache; the command to do that is "echo 3 > /proc/sys/vm/drop_caches"

After the restore operation, ls on the remaining snapshot directories under .snaps still fails

[root@dhcp-0-97 S1]# ls
fuse_etc.1  fuse_etc.2  nfs_etc.1  nfs_etc.2

After restore :
--------------
[root@dhcp-0-97 S1]# ls
ls: cannot open directory .: No such file or directory

[root@dhcp-0-97 S1]# echo 3 >/proc/sys/vm/drop_caches
[root@dhcp-0-97 S1]# ls
ls: cannot open directory .: No such file or directory
[root@dhcp-0-97 S1]# pwd
/mnt/vol0_fuse/.snaps/S1
Comment 3 Vijaikumar Mallikarjuna 2014-12-05 05:00:39 EST
When a volume is restored, there is a graph change in 'snapd', and with this the gfid of '.snaps' is re-generated. In this case snapd sends an ESTALE back, so you need to cd out of .snaps and cd back in, and this should work.

The problem in this case is that ESTALE is converted to ENOENT by the protocol server before it is sent to the client. With ESTALE, the VFS drops its cached entry, but with ENOENT the VFS marks the entry negative and caches it, so subsequent lookups keep failing.

Workaround:
1) cd out of .snaps
2) Drop the VFS cache: 'echo 3 >/proc/sys/vm/drop_caches'
3) Now cd to .snaps and ls under .snaps should work
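
Put together, a minimal sketch of this workaround from the client side, assuming the FUSE mount point /mnt/vol1_fuse from this reproduction (note that dropping caches this way affects the whole host, not just this mount):

# 1) cd out of the .snaps tree; the shell's cwd still references the stale gfid
cd /mnt/vol1_fuse

# 2) Drop the VFS caches (pagecache, dentries and inodes) so the cached
#    negative ENOENT entry is discarded
echo 3 > /proc/sys/vm/drop_caches

# 3) Re-enter .snaps; the lookup is resolved afresh against the new snapd graph
cd /mnt/vol1_fuse/.snaps
ls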
Comment 4 senaik 2014-12-05 06:24:56 EST
Tried the workaround mentioned in Comment 3; it works fine.
Comment 5 senaik 2014-12-05 07:06:23 EST
As per the previous comment, the workaround works fine, so removing the blocker flag.
Comment 6 Pavithra 2014-12-08 05:27:12 EST
Hi Vijai,

I see this bug listed as a known issue in the known issues tracker bug for 3.0.3. Can you please fill out the doc text after changing the doc type to known issue?
Comment 8 Pavithra 2014-12-16 00:32:19 EST
Changing the doc type to known issue
Comment 9 Pavithra 2014-12-16 00:34:36 EST
Hi Vijai, 

Can you please review the edited doc text for technical accuracy and sign off?
Comment 10 Vijaikumar Mallikarjuna 2014-12-16 00:36:30 EST
Doc-text looks good to me
Comment 13 Shashank Raj 2016-02-03 02:17:27 EST
While reproducing this issue with the latest glusterfs-3.7.5-18 build, below are the observations:

Steps to Reproduce:
===================
1. Create a tiered volume. Fuse and NFS mount the volume

2. Enable USS

3. Create some IO

4. Take a snapshot (snap1) and activate it

5. Create more IO

6. Take 2 more snapshots (snap2, snap3) and activate them

7. cd to .snaps and list the snapshots

Fuse mount :
==========
[root@dhcp-0-97 vol1_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
d---------. 0 root root 0 Jan  1  1970 snap1
d---------. 0 root root 0 Jan  1  1970 snap2
d---------. 0 root root 0 Jan  1  1970 snap3
[root@dhcp-0-97 .snaps]# cd snap1
[root@dhcp-0-97 snap1]# ls
fuse1  nfs1


NFS mount :
=========
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 5 root root  92 Dec  3 16:36 snap1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap2
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap3
[root@dhcp-0-97 .snaps]# cd snap1
[root@dhcp-0-97 snap1]# ls
fuse1  nfs1

8. Stop the volume and restore it to snapshot snap3

9. Start the volume

10. Go back to the terminal from Step 7 and perform ls inside snap1 under .snaps; it fails with a "Stale file handle" error.

[root@dhcp35-63 snap1]# ls
ls: cannot open directory .: Stale file handle

11. However, cd'ing out of snap1 and back into it again works fine without any issues.

[root@dhcp35-63 snap1]# cd ..

[root@dhcp35-63 .snaps]# ls
snap1  snap2

[root@dhcp35-63 .snaps]# cd snap1

[root@dhcp35-63 snap1]# ls
fuse1  nfs1
Comment 14 Vijaikumar Mallikarjuna 2016-02-03 02:22:25 EST
Hi Shashank,

If you are getting an ESTALE, then it is expected behavior. Please see comment #3.

Thanks,
Vijay
Comment 15 Vijaikumar Mallikarjuna 2016-02-03 02:23:47 EST
As per comment #13, it looks like the problem is fixed in glusterfs-3.7, so closing the bug.
