Bug 1170145 - [USS]: After snapshot restore, if user is under any snap directory in .snaps , ls on the snap directory fails
Summary: [USS]: After snapshot restore, if user is under any snap directory in .snaps, ls on the snap directory fails
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Vijaikumar Mallikarjuna
QA Contact: Anoop
URL:
Whiteboard: USS
Depends On:
Blocks: 1153907
 
Reported: 2014-12-03 11:46 UTC by senaik
Modified: 2016-09-17 12:59 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
If a volume is restored while you are inside the .snaps directory, subsequent access from that directory on the mount point fails with a "No such file or directory" error. Workaround: 1) cd out of the .snaps directory. 2) Drop the VFS cache: 'echo 3 > /proc/sys/vm/drop_caches'. 3) cd back into .snaps; listing should now work.
Clone Of:
Environment:
Last Closed: 2016-02-03 07:23:47 UTC
Embargoed:



Description senaik 2014-12-03 11:46:04 UTC
Description of problem:
=======================
After a snapshot restore, if the user is inside any snap directory under .snaps, ls on that snap directory fails with a "No such file or directory" error.


Version-Release number of selected component (if applicable):
============================================================
glusterfs 3.6.0.35 

How reproducible:
================
3/3


Steps to Reproduce:
===================
1. Create a 2x2 dist-rep volume; FUSE and NFS mount the volume (a CLI sketch of these steps follows after step 10)

2. Enable USS

3. Create some IO

4. Take a snapshot (snap1_vol1) and activate it

5. Create more IO

6. Take 2 more snapshots (snap2_vol1, snap3_vol1) and activate them

7. cd to .snaps and list the snapshots

Fuse mount :
==========
[root@dhcp-0-97 vol1_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
d---------. 0 root root 0 Jan  1  1970 snap1_vol1
d---------. 0 root root 0 Jan  1  1970 snap2_vol1
d---------. 0 root root 0 Jan  1  1970 snap3_vol1
[root@dhcp-0-97 .snaps]# cd snap1_vol1
[root@dhcp-0-97 snap1_vol1]# ls
fuse1  nfs1


NFS mount :
=========
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 5 root root  92 Dec  3 16:36 snap1_vol1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap2_vol1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap3_vol1
[root@dhcp-0-97 .snaps]# cd snap1_vol1/
[root@dhcp-0-97 snap1_vol1]# ls
fuse1  nfs1

8. Stop the volume and restore it to snapshot snap3_vol1

9. Start the volume

10. Go back to the terminal from step 7 and run ls inside snap1_vol1 under .snaps; it fails with "No such file or directory"
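
For reference, steps 1-9 above correspond roughly to the following gluster CLI sequence (a minimal sketch; host names, brick paths and mount points are hypothetical, syntax as in glusterfs 3.6):

gluster volume create vol1 replica 2 server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b2 server4:/bricks/b2
gluster volume start vol1
mount -t glusterfs server1:/vol1 /mnt/vol1_fuse        # FUSE mount
mount -t nfs -o vers=3 server1:/vol1 /mnt/vol1_nfs     # gluster NFS mount (NFSv3)
gluster volume set vol1 features.uss enable            # step 2: enable USS
gluster snapshot create snap1_vol1 vol1                # step 4
gluster snapshot activate snap1_vol1
gluster snapshot create snap2_vol1 vol1                # step 6
gluster snapshot activate snap2_vol1
gluster snapshot create snap3_vol1 vol1
gluster snapshot activate snap3_vol1
gluster volume stop vol1                               # step 8: restore requires a stopped volume
gluster snapshot restore snap3_vol1
gluster volume start vol1                              # step 9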


Actual results:
===============
If the user is already inside a snap directory under .snaps when a snapshot restore is performed, ls on that snap directory fails.


Expected results:
=================
The user should be able to list the snapshots under .snaps irrespective of any operation performed.


Additional info:

Comment 2 senaik 2014-12-04 09:51:11 UTC
Tried the workaround mentioned in bz 1169790:

Workaround:
-----------
Drop the VFS cache; the command to do that is "echo 3 > /proc/sys/vm/drop_caches"

After the restore operation, ls on the remaining snapshot directories under .snaps still fails.

[root@dhcp-0-97 S1]# ls
fuse_etc.1  fuse_etc.2  nfs_etc.1  nfs_etc.2

After restore :
--------------
[root@dhcp-0-97 S1]# ls
ls: cannot open directory .: No such file or directory

[root@dhcp-0-97 S1]# echo 3 >/proc/sys/vm/drop_caches
[root@dhcp-0-97 S1]# ls
ls: cannot open directory .: No such file or directory
[root@dhcp-0-97 S1]# pwd
/mnt/vol0_fuse/.snaps/S1

Comment 3 Vijaikumar Mallikarjuna 2014-12-05 10:00:39 UTC
When a volume is restored, there is a graph change in 'snapd', and with it the gfid of '.snaps' is re-generated. In this case snapd sends an ESTALE back, so you need to cd out of .snaps and cd in again, and this should work.

The problem in this case is that the ESTALE is converted to ENOENT by the protocol server before being sent to the client. With ESTALE the VFS drops its cache, but with ENOENT the VFS marks the file as a negative entry and caches it.
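
One way to confirm which errno the client actually receives is to trace the directory open from the affected shell (a minimal sketch; the exact syscall and flags can differ, output abbreviated):

strace -e trace=open,openat ls
# on the affected mount the open of "." fails with the converted errno, e.g.:
#   openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
# without the conversion this would show ESTALE (Stale file handle) instead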

Workaround:
1) cd out of .snaps
2) Drop the cache: 'echo 3 > /proc/sys/vm/drop_caches'
3) Now cd into .snaps; ls under .snaps should work
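
Put together as a shell sequence (a minimal sketch; /mnt/vol1_fuse is a hypothetical mount point):

cd /mnt/vol1_fuse                      # 1) cd out of .snaps
echo 3 > /proc/sys/vm/drop_caches      # 2) drop pagecache, dentries and inodes
cd /mnt/vol1_fuse/.snaps               # 3) re-enter .snaps
ls                                     # listing the snapshots should now work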

Comment 4 senaik 2014-12-05 11:24:56 UTC
Tried the workaround mentioned in comment 3; it works fine.

Comment 5 senaik 2014-12-05 12:06:23 UTC
As per the previous comment, the workaround works fine; removing the blocker flag.

Comment 6 Pavithra 2014-12-08 10:27:12 UTC
Hi Vijai,

I see this bug listed as a known issue in the known issues tracker bug for 3.0.3. Can you please fill out the doc text after changing the doc type to known issue?

Comment 8 Pavithra 2014-12-16 05:32:19 UTC
Changing the doc type to known issue

Comment 9 Pavithra 2014-12-16 05:34:36 UTC
Hi Vijai, 

Can you please review the edited doc text for technical accuracy and sign off?

Comment 10 Vijaikumar Mallikarjuna 2014-12-16 05:36:30 UTC
Doc-text looks good to me

Comment 13 Shashank Raj 2016-02-03 07:17:27 UTC
While reproducing this issue with the latest glusterfs-3.7.5-18 build, below are the observations:

Steps to Reproduce:
===================
1. Create a tiered volume; FUSE and NFS mount the volume (see the tier CLI sketch after step 11)

2. Enable USS

3. Create some IO

4. Take a snapshot (snap1) and activate it

5. Create more IO

6. Take 2 more snapshots (snap2, snap3) and activate them

7. cd to .snaps and list the snapshots

Fuse mount :
==========
[root@dhcp-0-97 vol1_fuse]# cd .snaps
[root@dhcp-0-97 .snaps]# ll
total 0
d---------. 0 root root 0 Jan  1  1970 snap1
d---------. 0 root root 0 Jan  1  1970 snap2
d---------. 0 root root 0 Jan  1  1970 snap3
[root@dhcp-0-97 .snaps]# cd snap1
[root@dhcp-0-97 snap1]# ls
fuse1  nfs1


NFS mount :
=========
[root@dhcp-0-97 .snaps]# ll
total 0
drwxr-xr-x. 5 root root  92 Dec  3 16:36 snap1
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap2
drwxr-xr-x. 7 root root 138 Dec  3 16:38 snap3
[root@dhcp-0-97 .snaps]# cd snap1
[root@dhcp-0-97 snap1]# ls
fuse1  nfs1

8. Stop the volume and restore it to snapshot snap3

9. Start the volume

10. Go back to the terminal from step 7 and run ls inside snap1 under .snaps; it fails with a "Stale file handle" error message.

[root@dhcp35-63 snap1]# ls
ls: cannot open directory .: Stale file handle

11. However, coming out of snap1 and cd'ing into it again works fine without any issues.

[root@dhcp35-63 snap1]# cd ..

[root@dhcp35-63 .snaps]# ls
snap1  snap2

[root@dhcp35-63 .snaps]# cd snap1

[root@dhcp35-63 snap1]# ls
fuse1  nfs1
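
For step 1, the tiered volume can be set up by creating a regular volume and attaching a hot tier (a minimal sketch; host names and brick paths are hypothetical, and the tier-attach syntax may vary across 3.7 minor releases):

gluster volume create tiervol replica 2 server1:/bricks/cold1 server2:/bricks/cold1
gluster volume start tiervol
gluster volume tier tiervol attach replica 2 server1:/bricks/hot1 server2:/bricks/hot1
mount -t glusterfs server1:/tiervol /mnt/tiervol_fuse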

Comment 14 Vijaikumar Mallikarjuna 2016-02-03 07:22:25 UTC
Hi Shashank,

If you are getting an ESTALE, then it is expected behavior. Please see comment 3.

Thanks,
Vijay

Comment 15 Vijaikumar Mallikarjuna 2016-02-03 07:23:47 UTC
As per comment 13, it looks like the problem is fixed in glusterfs-3.7, so closing the bug.

