Bug 1246183

Summary: Snapd crashed while listing .snaps directory (EC Volume)
Product: Red Hat Gluster Storage
Reporter: Bhaskarakiran <byarlaga>
Component: snapshot
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WONTFIX
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: asriram, atumball, rcyriac, rhs-bugs, sankarshan, smohan
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version:
Doc Type: Known Issue
Doc Text:
User Serviceable Snapshots is not supported on Erasure Coded (EC) volumes.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-20 04:54:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1216951
Attachments:
core file (flags: none)

Description Bhaskarakiran 2015-07-23 15:56:52 UTC
Created attachment 1055451
core file

Description of problem:

Created a snapshot, activated it, and listed the contents of .snaps/<snapshot> on the client (FUSE) mount. Deleted the snapshot while the shell was still inside the .snaps directory, then ran 'ls'.

Created a new snapshot with the same name and tried to list the contents of .snaps/<snapshot>; at that point snapd crashed.

On trying to reproduce, cd to the .snaps directory hangs forever. After re-mounting the volume on the client, cd to .snaps fails with 'Transport endpoint is not connected', though no new crash is seen.

[root@rhs-client29 ~]# cd /mnt/fuse
[root@rhs-client29 fuse]# ls -l
total 768
drwxr-xr-x.  2 root root 40960 Jul 23 16:03 20k
drwxr-xr-x.  2 root root 49152 Jul 23 16:02 80k
drwxr-xr-x.  2 root root     6 Jul 23 16:02 dbench
drwxr-xr-x.  3 root root    18 Jul 23 16:02 dirs
drwxr-xr-x.  7 root root    66 Jul 23 16:42 files
drwxr-xr-x.  2 root root     6 Jul 23 12:50 iozone
drwxr-xr-x.  3 root root    24 Jul 23 12:50 linux
drwxr-xr-x. 16 root root  8192 Jul 23 16:24 renames
drwxr-xr-x.  3 root root    24 Jul 23 16:02 tarball
[root@rhs-client29 fuse]# cd .snaps
-bash: cd: .snaps: Transport endpoint is not connected
[root@rhs-client29 fuse]# 
[root@rhs-client29 fuse]# cd .snaps
-bash: cd: .snaps: Transport endpoint is not connected
[root@rhs-client29 fuse]# 


(gdb) bt
#0  inode_unref (inode=0x7f4a390350c8) at inode.c:521
#1  0x00007f4a588a6cfc in pub_glfs_h_close (object=0x7f4a44042390)
    at glfs-handleops.c:1333
#2  0x00007f4a58ab4ed0 in svs_forget (this=<value optimized out>, 
    inode=0x7f4a5249f88c) at snapview-server.c:1163
#3  0x00007f4a661dd2e9 in __inode_ctx_free (inode=0x7f4a5249f88c) at inode.c:337
#4  0x00007f4a661df25c in __inode_destroy (table=<value optimized out>)
    at inode.c:358
#5  inode_table_prune (table=<value optimized out>) at inode.c:1501
#6  0x00007f4a661df81c in inode_unref (inode=0x7f4a44feadec) at inode.c:529
#7  0x00007f4a661f1ebb in gf_dirent_entry_free (entry=0x7f4a44efc190)
    at gf-dirent.c:183
#8  0x00007f4a661f1f28 in gf_dirent_free (entries=0x7f4a5280ebe0)
    at gf-dirent.c:202
#9  0x00007f4a58ab726e in svs_readdirp (frame=0x7f4a63d76184, 
    this=<value optimized out>, fd=<value optimized out>, size=22, off=42903, 
    dict=0x0) at snapview-server.c:1483
#10 0x00007f4a661c5003 in default_readdirp_resume (frame=0x7f4a63d760d8, 
    this=0x7f4a54009750, fd=0x7f4a5402175c, size=4096, off=42903, xdata=0x0)
    at defaults.c:1657
#11 0x00007f4a661e5640 in call_resume (stub=0x7f4a637fcee0) at call-stub.c:2576
#12 0x00007f4a53bda541 in iot_worker (data=0x7f4a5401c8c0) at io-threads.c:215
#13 0x00007f4a652a5a51 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f4a64c0f9ad in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):
RC build.

[root@transformers ~]# gluster --version
glusterfs 3.7.1 built on Jul 19 2015 02:16:40
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@transformers ~]# 

How reproducible:
Seen once. On the next attempt to reproduce, cd to the .snaps directory hangs.

Steps to Reproduce:
As in the description.
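
The steps from the description could be scripted roughly as follows. This is a sketch under assumptions, not the reporter's actual commands. The names vol1, snap3 and /mnt/fuse are taken from the output elsewhere in this report. The `no-timestamp` keyword and the `--mode=script` flag are assumed to be available in this build. `repro_steps` and `RUN` are hypothetical helpers introduced only for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; VOL, SNAP and MNT defaults are assumptions
# based on the names that appear in this report.
VOL="${VOL-vol1}"
SNAP="${SNAP-snap3}"
MNT="${MNT-/mnt/fuse}"
# Default is a dry run that only prints each command.
# Export RUN as an empty string to execute the commands for real.
RUN="${RUN-echo}"

repro_steps() {
    # 1. Create and activate a snapshot, then list it through .snaps.
    $RUN gluster snapshot create "$SNAP" "$VOL" no-timestamp
    $RUN gluster snapshot activate "$SNAP"
    $RUN ls "$MNT/.snaps/$SNAP"
    # 2. Delete the snapshot and list .snaps again.
    #    (The reporter did this with the shell's cwd inside .snaps.)
    $RUN gluster --mode=script snapshot delete "$SNAP"
    $RUN ls "$MNT/.snaps"
    # 3. Recreate a snapshot with the same name and list it; the crash was seen here.
    $RUN gluster snapshot create "$SNAP" "$VOL" no-timestamp
    $RUN gluster snapshot activate "$SNAP"
    $RUN ls "$MNT/.snaps/$SNAP"
}

repro_steps
```

The sketch lists by absolute path, whereas the report ran 'ls' from inside .snaps, so a manual run on the client mount is closer to the original sequence.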

Actual results:

Expected results:

Additional info:
Corefile will be attached.

Comment 2 Bhaskarakiran 2015-07-23 15:58:14 UTC
Volume info:

[root@transformers core]# gluster  v info vol1
Volume Name: vol1
Type: Disperse
Volume ID: ef3820fa-45e5-4650-9e65-9f04c916649c
Status: Started
Number of Bricks: 1 x (8 + 4) = 12
Transport-type: tcp
Brick1: transformers:/rhs/brick1/b1
Brick2: interstellar:/rhs/brick1/b2
Brick3: transformers:/rhs/brick2/b3
Brick4: interstellar:/rhs/brick2/b4
Brick5: transformers:/rhs/brick3/b5
Brick6: interstellar:/rhs/brick3/b6
Brick7: transformers:/rhs/brick4/b7
Brick8: interstellar:/rhs/brick4/b8
Brick9: transformers:/rhs/brick5/b9
Brick10: interstellar:/rhs/brick5/b10
Brick11: transformers:/rhs/brick6/b11
Brick12: interstellar:/rhs/brick6/b12
Options Reconfigured:
features.barrier: disable
cluster.disperse-self-heal-daemon: enable
server.event-threads: 2
client.event-threads: 2
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: on
performance.readdir-ahead: on
[root@transformers core]#

Comment 3 Bhaskarakiran 2015-07-23 16:09:39 UTC
Saw another crash while trying to reproduce.


(gdb) bt
#0  0x00007f2e8e814937 in __glfs_resolve_inode (fs=0x7f2e8421bd50, 
    subvol=0x7f2e6001a200, object=0x7f2e84024510) at glfs-resolve.c:976
#1  0x00007f2e8e8150f5 in glfs_resolve_inode (fs=0x7f2e8421bd50, 
    subvol=0x7f2e6001a200, object=0x7f2e84024510) at glfs-resolve.c:1000
#2  0x00007f2e8e81700e in pub_glfs_h_opendir (fs=0x7f2e8421bd50, 
    object=0x7f2e84024510) at glfs-handleops.c:1084
#3  0x00007f2e8ea290b7 in svs_opendir (frame=0x7f2e99ce5184, this=0x7f2e88005e40, 
    loc=0x7f2e9976b06c, fd=0x7f2e800046bc, xdata=<value optimized out>)
    at snapview-server.c:675
#4  0x00007f2e9c13704a in default_opendir_resume (frame=0x7f2e99ce50d8, 
    this=0x7f2e88009750, loc=0x7f2e9976b06c, fd=0x7f2e800046bc, xdata=0x0)
    at defaults.c:1360
#5  0x00007f2e9c154640 in call_resume (stub=0x7f2e9976b02c) at call-stub.c:2576
#6  0x00007f2e8db68541 in iot_worker (data=0x7f2e8801c8c0) at io-threads.c:215
#7  0x00007f2e9b214a51 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f2e9ab7e9ad in clone () from /lib64/libc.so.6

Snippet:

[root@rhs-client29 .snaps]# ls -al
total 0
drwxr-xr-x. 13 root root 148 Jul 23 12:50 snap3
[root@rhs-client29 .snaps]# cd snap3
[root@rhs-client29 snap3]# ls
ls: cannot open directory .: Transport endpoint is not connected
[root@rhs-client29 snap3]# 
[root@rhs-client29 snap3]# 
[root@rhs-client29 snap3]# ls -al
ls: cannot open directory .: Transport endpoint is not connected
[root@rhs-client29 snap3]# 

[root@transformers ~]# gluster v status vol1
Status of volume: vol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
Brick transformers:/rhs/brick1/b1           49152     0          Y       12563
Brick interstellar:/rhs/brick1/b2           49152     0          Y       12246
Brick transformers:/rhs/brick2/b3           49153     0          Y       12581
Brick interstellar:/rhs/brick2/b4           49153     0          Y       11163
Brick transformers:/rhs/brick3/b5           49154     0          Y       11249
Brick interstellar:/rhs/brick3/b6           49154     0          Y       11181
Brick transformers:/rhs/brick4/b7           49155     0          Y       11267
Brick interstellar:/rhs/brick4/b8           49155     0          Y       11199
Brick transformers:/rhs/brick5/b9           49156     0          Y       11285
Brick interstellar:/rhs/brick5/b10          49156     0          Y       11217
Brick transformers:/rhs/brick6/b11          49157     0          Y       11303
Brick interstellar:/rhs/brick6/b12          49157     0          Y       11240
Snapshot Daemon on localhost                N/A       N/A        N       50076
NFS Server on localhost                     2049      0          Y       50373
Self-heal Daemon on localhost               N/A       N/A        Y       50381
Quota Daemon on localhost                   N/A       N/A        Y       50389
Snapshot Daemon on interstellar             49194     0          Y       47947
NFS Server on interstellar                  2049      0          Y       47955
Self-heal Daemon on interstellar            N/A       N/A        Y       43413
Quota Daemon on interstellar                N/A       N/A        Y       43422
Task Status of Volume vol1
There are no active volume tasks
[root@transformers ~]#
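
The status output above shows the Snapshot Daemon on localhost with Online = N. A small filter like the one below could pick out such nodes. It is only a sketch: `snapd_offline_nodes` is a hypothetical helper, and it assumes the column layout shown in this comment, where Online is the second-to-last field.

```shell
#!/bin/sh
# Hypothetical helper: read `gluster v status` output on stdin and print the
# hosts whose Snapshot Daemon is reported as not online.
# Assumes the layout seen in this report: "... <Online> <Pid>" as the last two fields.
snapd_offline_nodes() {
    awk '/^Snapshot Daemon on / && $(NF-1) == "N" { print $4 }'
}

# Sample input copied from the status output in this comment.
snapd_offline_nodes <<'EOF'
Snapshot Daemon on localhost                N/A       N/A        N       50076
Snapshot Daemon on interstellar             49194     0          Y       47947
EOF
```

On a live system the same function could be fed from `gluster v status vol1 | snapd_offline_nodes`.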

Comment 4 Anjana Suparna Sriram 2015-07-27 08:40:43 UTC
Please review the doc text and sign off so that it can be included in the Known Issues chapter.

Comment 5 Avra Sengupta 2015-07-27 08:41:47 UTC
Doc text looks good. Verified.