Bug 1201820 - CIFS: [USS]: snapd got crashed after deleting and creating/activating 256 snapshots
Summary: CIFS: [USS]: snapd got crashed after deleting and creating/activating 256 snapshots
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: 1191838 1204329
 
Reported: 2015-03-13 15:03 UTC by ssamanta
Modified: 2018-04-04 09:56 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
When a snapshot is deleted, the corresponding file system object in the User Serviceable Snapshots view is also deleted. Any subsequent file system access to it results in the snapshot daemon becoming unresponsive. Workaround: Ensure that you do not perform any file system operations on the snapshot that is about to be deleted.
Clone Of:
Environment:
Last Closed: 2018-04-04 09:56:24 UTC
Embargoed:
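
As an illustration of the workaround in the Doc Text above, deleting a snapshot while no client is touching its tree under .snaps might look like the following sketch. The volume name (testvol1) and snapshot name (snap42) are hypothetical placeholders, and deactivating first is an extra precaution, not part of the documented workaround:

# Make sure no client is inside <cifs-mount>/.snaps/snap42 before removing it.
# Deactivating first (optional) keeps USS clients from reaching the snapshot.
gluster --mode=script snapshot deactivate snap42
gluster --mode=script snapshot delete snap42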




Links
System ID: Red Hat Bugzilla 1343821
Private: 0
Priority: high
Status: CLOSED
Summary: SNAPSHOT-VSS : Snapd crashed while deactivate and activate operations
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1343821

Description ssamanta 2015-03-13 15:03:31 UTC
Description of problem:
snapd crashed after the 256 snapshots were deleted and created again.


Version-Release number of selected component (if applicable):

[root@gqas005 core]# rpm -qa | grep gluster
gluster-nagios-common-0.1.4-1.el6rhs.noarch
glusterfs-api-3.6.0.51-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.51-1.el6rhs.x86_64
samba-vfs-glusterfs-4.1.17-4.el6rhs.x86_64
glusterfs-debuginfo-3.6.0.51-1.el6rhs.x86_64
glusterfs-cli-3.6.0.51-1.el6rhs.x86_64
glusterfs-libs-3.6.0.51-1.el6rhs.x86_64
glusterfs-3.6.0.51-1.el6rhs.x86_64
glusterfs-server-3.6.0.51-1.el6rhs.x86_64
gluster-nagios-addons-0.1.14-1.el6rhs.x86_64
rhs-tests-rhs-tests-beaker-rhs-gluster-qe-libs-dev-bturner-2.37-0.noarch
vdsm-gluster-4.14.7.3-1.el6rhs.noarch
glusterfs-geo-replication-3.6.0.51-1.el6rhs.x86_64
glusterfs-fuse-3.6.0.51-1.el6rhs.x86_64
[root@gqas005 core]# 

How reproducible:
Tried once


Steps to Reproduce:
1. Create a 6*2 volume on the RHS 3.0.3 async build (glusterfs-3.6.0.42.1-1) and start it.
2. Enable USS and set the "show-snapshot-directory" option.
3. Create some files and directories at the CIFS mount point.
4. Create 256 snapshots on the volume and access them through the CIFS client.
5. Deactivate the snapshots before upgrading to the RHS 3.0.4 build (glusterfs-3.6.0.51-1), as per the workaround suggested by dev for bug 1196557.
6. After the upgrade, activate the snapshots and access the .snaps directories
through the Windows, NFS and CIFS clients.
7. Upgrade the samba rpms to samba-4.1.17-4.
8. Delete the 256 snapshots.
9. Create the snapshots again, activate them, and watch the memory usage of snapd (a CLI sketch of these steps follows the backtrace below).
10. snapd crashed with the following backtrace:

[2015-03-13 07:27:42.101336] I [server-helpers.c:290:do_fd_cleanup] 0-testvol1-server: fd cleanup on <gfid:9fcce174-19dd-4dd1-bd2c-164c755ad1a8>
pending frames:
frame : type(0) op(27)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-03-13 07:27:42
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.51
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f0de75a9c16]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f0de75c4daf]
/lib64/libc.so.6[0x3dc80326a0]
/usr/lib64/libgfapi.so.0(glfs_closedir+0x17)[0x7f0ddc3765e7]
/usr/lib64/glusterfs/3.6.0.51/xlator/features/snapview-server.so(svs_releasedir+0x5f)[0x7f0ddc5920df]
/usr/lib64/libglusterfs.so.0(fd_unref+0x221)[0x7f0de75d6ed1]
/usr/lib64/libglusterfs.so.0(call_stub_destroy+0xe6)[0x7f0de75cc576]
/usr/lib64/glusterfs/3.6.0.51/xlator/performance/io-threads.so(iot_worker+0x158)[0x7f0ddc165348]
/lib64/libpthread.so.0[0x3dc84079d1]
/lib64/libc.so.6(clone+0x6d)[0x3dc80e88fd]
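
For reference, a rough shell sketch of the CLI sequence behind the reproduction steps above. This is an illustration rather than the exact commands used in the test run: the server names, brick paths, and snapshot names are placeholders, and it assumes the standard gluster volume/snapshot CLI with --mode=script used to suppress confirmation prompts.

# Step 1: create and start a 6x2 distributed-replicate volume
# (12 bricks in total; only the first four placeholder bricks are shown).
gluster volume create testvol1 replica 2 \
    server1:/rhs/brick1/br1 server2:/rhs/brick2/br2 \
    server3:/rhs/brick3/br3 server4:/rhs/brick4/br4
gluster volume start testvol1

# Step 2: enable USS and expose the .snaps directory to clients
gluster volume set testvol1 features.uss enable
gluster volume set testvol1 features.show-snapshot-directory on

# Step 4: create and activate 256 snapshots
for i in $(seq 1 256); do
    gluster snapshot create snap$i testvol1
    gluster snapshot activate snap$i
done

# Step 5: deactivate all snapshots before the upgrade
for i in $(seq 1 256); do
    gluster --mode=script snapshot deactivate snap$i
done

# Steps 8-9: after the upgrade, delete all 256 snapshots, then create
# and activate them again while watching snapd's memory usage
for i in $(seq 1 256); do
    gluster --mode=script snapshot delete snap$i
done
for i in $(seq 1 256); do
    gluster snapshot create snap$i testvol1
    gluster snapshot activate snap$i
done
# assumes the snapd process command line contains "snapd"
top -p "$(pgrep -f snapd | head -1)"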

Actual results:
snapd crashed.

Expected results:
snapd should not crash.

Additional info:
sosreports: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/sosreport-gqas005-20150313104758-652f.tar.xz

[root@gqas005 core]# gluster volume status
Status of volume: testvol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gqas009.sbu.lab.eng.bos.redhat.com:/r
hs/brick1/br1                               49152     0          Y       28153
Brick gqas012.sbu.lab.eng.bos.redhat.com:/r
hs/brick2/br2                               49152     0          Y       27032
Brick gqas006.sbu.lab.eng.bos.redhat.com:/r
hs/brick3/br3                               49152     0          Y       26922
Brick gqas005.sbu.lab.eng.bos.redhat.com:/r
hs/brick4/br4                               49152     0          Y       7947 
Brick gqas005.sbu.lab.eng.bos.redhat.com:/r
hs/brick5/br5                               49153     0          Y       7955 
Brick gqas006.sbu.lab.eng.bos.redhat.com:/r
hs/brick6/br6                               49153     0          Y       26931
Brick gqas009.sbu.lab.eng.bos.redhat.com:/r
hs/brick7/br7                               49153     0          Y       28161
Brick gqas012.sbu.lab.eng.bos.redhat.com:/r
hs/brick8/br8                               49153     0          Y       27040
Brick gqas006.sbu.lab.eng.bos.redhat.com:/r
hs/brick9/br9                               49154     0          Y       26939
Brick gqas005.sbu.lab.eng.bos.redhat.com:/r
hs/brick10/br10                             49154     0          Y       7963 
Brick gqas009.sbu.lab.eng.bos.redhat.com:/r
hs/brick11/br11                             49154     0          Y       28170
Brick gqas012.sbu.lab.eng.bos.redhat.com:/r
hs/brick12/br12                             49154     0          Y       27048
Snapshot Daemon on localhost                N/A       N/A        N       7971

Comment 1 ssamanta 2015-03-16 14:36:35 UTC
The core and sosreports are available at the location below:
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1201820/
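
A short sketch of how the backtrace in the description could be regenerated from that core. It assumes the core was dumped by the snapd process (PID 7971 in the volume status above), that snapd runs from the glusterfsd binary (/usr/sbin/glusterfs is typically a symlink to it), and that the matching glusterfs-debuginfo package from the rpm list above is installed. The core file path is a placeholder.

# Load the snapd core and dump the stacks of all threads
gdb /usr/sbin/glusterfsd /var/core/core.7971
(gdb) bt
(gdb) thread apply all bt full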

Comment 4 Vivek Agarwal 2015-03-23 06:56:47 UTC
*** Bug 1204329 has been marked as a duplicate of this bug. ***

Comment 5 Pavithra 2015-03-23 07:02:19 UTC
Hi Rajesh,

I see this bug being added as a known issue for the 3.0.4 release. Please fill out the doc text.

Comment 6 Pavithra 2015-03-24 04:54:19 UTC
Rajesh,

Could you please review the edited doc text for technical accuracy and sign off?

Comment 7 rjoseph 2015-03-24 08:00:28 UTC
The doc text looks ok to me.

