Bug 1235582
| Summary: | snapd crashed due to stack overflow | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | krishnan parthasarathi <kparthas> |
| Component: | protocol | Assignee: | krishnan parthasarathi <kparthas> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | mainline | CC: | bugs, gluster-bugs, nsathyan, rgowdapp |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8rc2 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1235571 | | |
| | 1253212 (view as bug list) | Environment: | |
| Last Closed: | 2016-06-16 13:16:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1235571 | | |
| Bug Blocks: | 1253212 | | |
Description (krishnan parthasarathi, 2015-06-25 09:01:18 UTC)
REVIEW: http://review.gluster.org/11399 (core: check for xl->mem_acct only if memory accounting is enabled) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)

RCA
----
The stack overflow was seen when older snapshots were being deleted while new ones were being created concurrently. In the reported setup, the snapshot scheduler creates snapshots periodically and auto-delete of snapshots is enabled. When the number of snapshots of the volume exceeds the configured soft-limit, snapshots are auto-deleted. The crash happened when a scheduled snapshot-create coincided with an auto-delete-triggered snapshot-delete operation.

Implementation detail
----------------------
The snapshot daemon uses the gfapi interface to serve user-serviceable snapshots. The gfapi interface creates a new glfs object for every snapshot (volume) it services. This object is 'linked' with a global xlator object until the glfs object is fully initialized (i.e., the set-volume operation is complete). The global xlator object's ctx (glusterfs_ctx_t) is modified in a thread-unsafe manner and could end up referring to a destroyed ctx (one belonging to the glfs of a deleted snapshot).

Fix outline
------------
All initialisation management operations (e.g., RPCs like DUMP_VERSION, SET_VOLUME, etc.) must refer to the corresponding translator objects in the glfs' graph.

REVIEW: http://review.gluster.org/11436 (client: set THIS to client's xl in non-FOP RPCs) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)

REVIEW: http://review.gluster.org/11436 (rpc: add owner xlator argument to rpc_clnt_new) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)

REVIEW: http://review.gluster.org/11436 (rpc: add owner xlator argument to rpc_clnt_new) posted (#3) for review on master by Krishnan Parthasarathi (kparthas)

COMMIT: http://review.gluster.org/11436 committed in master by Raghavendra G (rgowdapp)
------
commit f7668938cd7745d024f3d2884e04cd744d0a69ab
Author: Krishnan Parthasarathi <kparthas>
Date: Sat Jun 27 11:04:25 2015 +0530

rpc: add owner xlator argument to rpc_clnt_new

The @owner argument tells the RPC layer which xlator owns the connection and to which xlator THIS needs to be set during network notifications such as CONNECT and DISCONNECT.

Code paths that originate from the head of a (volume) graph and use STACK_WIND ensure that the RPC local endpoint has the right xlator saved in the frame of the call (callback pair). This guarantees that the callback is executed in the right xlator context.

The client handshake process, which includes fetching brick ports from glusterd and setting the lk-version on the brick for the session, does not have the correct xlator set in its frames. The problem lies with RPC notifications: they have no provision to set THIS to the xlator that is registered with the corresponding RPC programs. For example, the RPC_CLNT_CONNECT event received by protocol/client does not have THIS set to its xlator, which implies that calls (and callbacks) originating from that thread do not have the right xlator set either.

The fix is to save the xlator registered with the RPC connection during rpc_clnt_new. For example, protocol/client's xlator is saved with the RPC connection that it 'owns'. RPC notifications such as CONNECT, DISCONNECT, etc. then inherit THIS from the RPC connection's xlator.
Change-Id: I9dea2c35378c511d800ef58f7fa2ea5552f2c409
BUG: 1235582
Signed-off-by: Krishnan Parthasarathi <kparthas>
Reviewed-on: http://review.gluster.org/11436
Tested-by: Gluster Build System <jenkins.com>
Tested-by: NetBSD Build System <jenkins.org>
Reviewed-by: Raghavendra G <rgowdapp>

The fix for this BZ is already present in a GlusterFS release. A clone of this BZ has been fixed in a GlusterFS release and closed, hence this mainline BZ is being closed as well.

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
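For readers unfamiliar with the xlator/THIS machinery, the sketch below models the ownership pattern the commit message describes. It is not the actual GlusterFS code: the names fake_xlator, fake_rpc_clnt, rpc_clnt_new_model, notify_model and client_notify are made up for illustration. It only shows the idea that, once the connection remembers its owning xlator, CONNECT/DISCONNECT style notifications can run with the per-thread context switched to that owner instead of a shared global xlator whose ctx may already have been destroyed.

```c
/*
 * Simplified, hypothetical model of the "owner xlator" pattern.
 * NOT the real GlusterFS code; all type and function names are made up.
 */
#include <stdio.h>

/* Stand-in for xlator_t: a translator with a name and a per-graph ctx. */
typedef struct fake_xlator {
        const char *name;
        void       *ctx;          /* per-graph context; must not be shared */
} fake_xlator;

/* Stand-in for the per-thread THIS pointer (a macro in GlusterFS). */
static __thread fake_xlator *this_xl;

typedef enum { EV_CONNECT, EV_DISCONNECT } rpc_event;

/* Stand-in for the RPC client: the connection remembers its owner. */
typedef struct fake_rpc_clnt {
        fake_xlator *owner;       /* xlator that created this connection */
        void (*notify)(fake_xlator *xl, rpc_event ev);
} fake_rpc_clnt;

/* Analogue of the patched rpc_clnt_new(): the caller passes itself in. */
static fake_rpc_clnt
rpc_clnt_new_model(fake_xlator *owner,
                   void (*notify)(fake_xlator *, rpc_event))
{
        fake_rpc_clnt rpc = { .owner = owner, .notify = notify };
        return rpc;
}

/*
 * Analogue of the RPC notification path: before the CONNECT/DISCONNECT
 * callback runs, the per-thread context is switched to the connection's
 * owner, so anything the callback originates runs in the right xlator.
 */
static void
notify_model(fake_rpc_clnt *rpc, rpc_event ev)
{
        fake_xlator *saved = this_xl;

        this_xl = rpc->owner;     /* inherit THIS from the owner */
        rpc->notify(this_xl, ev);
        this_xl = saved;          /* restore the previous context */
}

static void
client_notify(fake_xlator *xl, rpc_event ev)
{
        printf("%s: %s\n", xl->name,
               ev == EV_CONNECT ? "CONNECT" : "DISCONNECT");
}

int
main(void)
{
        /* Two independent graphs, e.g. two snapshot volumes in snapd. */
        fake_xlator snap_a = { .name = "snap-a-client", .ctx = &snap_a };
        fake_xlator snap_b = { .name = "snap-b-client", .ctx = &snap_b };

        fake_rpc_clnt rpc_a = rpc_clnt_new_model(&snap_a, client_notify);
        fake_rpc_clnt rpc_b = rpc_clnt_new_model(&snap_b, client_notify);

        /* Each notification runs with the context of its own xlator,
         * never a shared global one. */
        notify_model(&rpc_a, EV_CONNECT);
        notify_model(&rpc_b, EV_CONNECT);
        notify_model(&rpc_a, EV_DISCONNECT);
        return 0;
}
```

In the actual patch, per the commit message above, protocol/client passes its own xlator to rpc_clnt_new as the @owner argument, and the RPC notification path sets THIS from that owner before delivering events such as RPC_CLNT_CONNECT.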