1420606 – glusterd is crashed at the time of stop volume

Summary: glusterd is crashed at the time of stop volume

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	glusterd
Sub Component:
Version:	3.10
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Mohit Agrawal
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1422375
Depends On:	1420202
Blocks:	1420430 1437940 1437957

Reported:	2017-02-09 04:32 UTC by Mohit Agrawal
Modified:	2017-03-31 18:09 UTC (History)
CC List:	6 users (show)
Fixed In Version:	glusterfs-3.10.0
Clone Of:	1420202
Environment:
Last Closed:	2017-03-06 17:45:52 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description Mohit Agrawal 2017-02-09 04:32:26 UTC

+++ This bug was initially created as a clone of Bug #1420202 +++

Description of problem:
glusterd crashes when a volume is stopped.

Version-Release number of selected component (if applicable):
glusterfs-3.11dev-0.56.git3cbf732.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a 1*2 environment and start the volume.
2. Stop the volume.
3. glusterd crashes.

Actual results:
glusterd crashes.

Expected results:
glusterd should not crash.

Additional info:

--- Additional comment from Atin Mukherjee on 2017-02-08 01:20:51 EST ---

(In reply to Mohit Agrawal from comment #0)
> Description of problem:
> glusterd is crashed at the time of stop the volume
> 
> Version-Release number of selected component (if applicable):
> 
> glusterfs-3.11dev-0.56.git3cbf732.el7.x86_64

Are you using some private rpms?

> How reproducible:
> Always
> 
> Steps to Reproduce:
> 1.Setup 1*2 environment and start the volume
> 2.Stop the volume 
> 3.glusterd is crashed
> 
> Actual results:
> 
> glusterd is crashed
> Expected results:
> it should not crash
> 
> Additional info:

--- Additional comment from Mohit Agrawal on 2017-02-08 01:29:02 EST ---

Hi,

I built the RPMs from the latest upstream code in order to reproduce another issue.


Regards
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2017-02-08 01:34:42 EST ---

Hi,

Below are the compile-time warnings emitted during the source build:

>>>>>>>>>>>>>>>>>>

In function 'snprintf',
    inlined from 'glusterd_bricks_select_stop_volume' at glusterd-op-sm.c:6182:25:
/usr/include/bits/stdio2.h:64:3: warning: call to __builtin___snprintf_chk will always overflow destination buffer [enabled by default]
   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
   ^
In function 'snprintf',
    inlined from 'glusterd_bricks_select_remove_brick' at glusterd-op-sm.c:6291:25:
/usr/include/bits/stdio2.h:64:3: warning: call to __builtin___snprintf_chk will always overflow destination buffer [enabled by default]
   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,


>>>>>>>>>>>>>>>>>>
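
The warnings above are the compiler reporting, at build time, that the size passed to snprintf is larger than the destination buffer. As a minimal sketch, a mismatch of this kind could also be turned into a hard build error with a C11 static assertion. The macro name FORMAT_PIDFILE is hypothetical and is not part of glusterd, and the PATH_MAX fallback is an assumption for platforms whose limits.h does not define it.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* PATH_MAX on Linux */
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* assumption: fallback where limits.h lacks it */
#endif

/*
 * Hypothetical macro (not glusterd code). 'arr' is expected to be a real
 * array so that sizeof (arr) is the buffer size. The build fails if the
 * buffer is smaller than PATH_MAX, which also rejects a plain pointer.
 */
#define FORMAT_PIDFILE(arr, volpath, host, brick) do {                     \
                static_assert (sizeof (arr) >= PATH_MAX,                   \
                               "pidfile buffer is smaller than PATH_MAX"); \
                snprintf ((arr), sizeof (arr), "%s/run/%s-%s.pid",         \
                          (volpath), (host), (brick));                     \
        } while (0)
```

With this variant, declaring the buffer as `char pidfile[1024]` would stop the build instead of producing a warning that is easy to overlook.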

gluster v stop dist-repl
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
Connection failed. Please check if gluster daemon is operational.


Below is the backtrace for the crash:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

-5.1.2-12alpha.el7.x86_64 zlib-1.2.7-15.el7.x86_64
(gdb) bt
#0  0x00007f8d06e015f7 in raise () from /lib64/libc.so.6
#1  0x00007f8d06e02ce8 in abort () from /lib64/libc.so.6
#2  0x00007f8d06e41317 in __libc_message () from /lib64/libc.so.6
#3  0x00007f8d06ed9b37 in __fortify_fail () from /lib64/libc.so.6
#4  0x00007f8d06ed7cf0 in __chk_fail () from /lib64/libc.so.6
#5  0x00007f8d06ed740b in __vsnprintf_chk () from /lib64/libc.so.6
#6  0x00007f8d06ed7328 in __snprintf_chk () from /lib64/libc.so.6
#7  0x00007f8cfd249e04 in snprintf (__fmt=0x7f8cfd346818 "%s/run/%s-%s.pid", 
    __n=4096, __s=0x7f8cec407310 "`t@\354\214\177") at /usr/include/bits/stdio2.h:64
#8  glusterd_bricks_select_stop_volume (dict=dict@entry=0x7f8ce4000c50, 
    op_errstr=op_errstr@entry=0x7f8cec409910, selected=selected@entry=0x7f8cec409850)
    at glusterd-op-sm.c:6182
#9  0x00007f8cfd257916 in glusterd_op_bricks_select (op=op@entry=GD_OP_STOP_VOLUME, 
    dict=dict@entry=0x7f8ce4000c50, op_errstr=op_errstr@entry=0x7f8cec409910, 
    selected=selected@entry=0x7f8cec409850, rsp_dict=rsp_dict@entry=0x7f8ce40177c0)
    at glusterd-op-sm.c:7645
#10 0x00007f8cfd2f42af in gd_brick_op_phase (op=GD_OP_STOP_VOLUME, 
    op_ctx=op_ctx@entry=0x7f8ce4005a00, req_dict=0x7f8ce4000c50, 
    op_errstr=op_errstr@entry=0x7f8cec409910) at glusterd-syncop.c:1685
#11 0x00007f8cfd2f4d33 in gd_sync_task_begin (op_ctx=op_ctx@entry=0x7f8ce4005a00, 
    req=req@entry=0x7f8cec0018b0) at glusterd-syncop.c:1937
#12 0x00007f8cfd2f5030 in glusterd_op_begin_synctask (req=req@entry=0x7f8cec0018b0, 
    op=op@entry=GD_OP_STOP_VOLUME, dict=0x7f8ce4005a00) at glusterd-syncop.c:2006
#13 0x00007f8cfd2dc47f in __glusterd_handle_cli_stop_volume (
    req=req@entry=0x7f8cec0018b0) at glusterd-volume-ops.c:628
#14 0x00007f8cfd23bfde in glusterd_big_locked_handler (req=0x7f8cec0018b0, 
    actor_fn=0x7f8cfd2dc280 <__glusterd_handle_cli_stop_volume>)
    at glusterd-handler.c:81
#15 0x00007f8d087526d0 in synctask_wrap (old_task=<optimized out>) at syncop.c:375
#16 0x00007f8d06e13110 in ?? () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb) 
(gdb) f 8
#8  glusterd_bricks_select_stop_volume (dict=dict@entry=0x7f8ce4000c50, 
    op_errstr=op_errstr@entry=0x7f8cec409910, selected=selected@entry=0x7f8cec409850)
    at glusterd-op-sm.c:6182
6182	                        GLUSTERD_GET_BRICK_PIDFILE (pidfile, volinfo,
(gdb) p sizeof(pidfile)
$1 = 1024

#define GLUSTERD_GET_BRICK_PIDFILE(pidfile,volinfo,brickinfo, priv) do {      \
                char exp_path[PATH_MAX] = {0,};                               \
                char volpath[PATH_MAX]  = {0,};                               \
                GLUSTERD_GET_VOLUME_DIR (volpath, volinfo, priv);             \
                GLUSTERD_REMOVE_SLASH_FROM_PATH (brickinfo->path, exp_path);  \
                snprintf (pidfile, PATH_MAX, "%s/run/%s-%s.pid",              \
                          volpath, brickinfo->hostname, exp_path);            \
        } while (0)
>>>>>>>>>>>>>>>>>>>>>>>


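RCA: glusterd crashes because the pidfile array is only 1024 bytes, while the GLUSTERD_GET_BRICK_PIDFILE macro passes PATH_MAX (4096) to snprintf as the buffer size. The fortified snprintf (__snprintf_chk in the backtrace) detects that the stated size is larger than the destination array and aborts the process.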

Increasing the size of the array in the calling function (glusterd_bricks_select_stop_volume) resolves the issue.
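
The mismatch described in the RCA can be illustrated with a minimal sketch in which the buffer size travels with the buffer instead of being hard-coded in the helper. This is only an illustration of the failure mode, not the fix proposed here, which enlarges the array. The function build_brick_pidfile and its parameters are hypothetical and do not exist in glusterd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not glusterd code): format a brick pidfile path
 * into dst, using the size of the caller's buffer rather than a constant.
 * Returns 0 on success, -1 if formatting failed or the path was truncated.
 */
int
build_brick_pidfile (char *dst, size_t dst_size, const char *volpath,
                     const char *hostname, const char *brick_path)
{
        int len;

        assert (dst != NULL && dst_size > 0);

        len = snprintf (dst, dst_size, "%s/run/%s-%s.pid",
                        volpath, hostname, brick_path);

        /* snprintf returns the length it needed; >= dst_size means truncation. */
        if (len < 0 || (size_t) len >= dst_size)
                return -1;

        return 0;
}
```

A caller with a local array would pass `sizeof (pidfile)` as dst_size, so a 1024-byte buffer is never described to snprintf as 4096 bytes.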

Regards
Mohit Agrawal

--- Additional comment from Worker Ant on 2017-02-08 01:51:46 EST ---

REVIEW: https://review.gluster.org/16560 (glusterd: glusterd is crashed at the time of stop volume) posted (#1) for review on master by MOHIT AGRAWAL (moagrawa)

--- Additional comment from Worker Ant on 2017-02-08 03:48:21 EST ---

REVIEW: https://review.gluster.org/16560 (glusterd: glusterd is crashed at the time of stop volume) posted (#2) for review on master by MOHIT AGRAWAL (moagrawa)

--- Additional comment from Worker Ant on 2017-02-08 11:48:04 EST ---

COMMIT: https://review.gluster.org/16560 committed in master by Atin Mukherjee (amukherj) 
------
commit 9ac193a19b0ca6d6548aeafa5c915b26396f8697
Author: Mohit Agrawal <moagrawa>
Date:   Wed Feb 8 12:20:55 2017 +0530

    glusterd: glusterd is crashed at the time of stop volume
    
    Problem: glusterd is crashed at the time of stop volume due to
             overflow of pidfile array after build rpm with default options.
    
    Solution: To avoid the crash update the pidfile array size.
    
    Test:    To test the patch followed below procedure
             1) Setup 1*2 environment and start the volume
             2) Stop the volume
             Before apply the patch glusterd is crashed.
    
    Note:  The crash is happened only after build rpm with rpmbuild -ba
           <spec> because _FORTIFY_SOURCE is enabled. This option tries to
           figure out possible overflow scenarios like the bug here and
           crash the process.
    
    BUG: 1420202
    Change-Id: I58a006bc0727843a7ed02a10b4ebd5dca39eae67
    Signed-off-by: Mohit Agrawal <moagrawa>
    Reviewed-on: https://review.gluster.org/16560
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Atin Mukherjee <amukherj>
    CentOS-regression: Gluster Build System <jenkins.org>
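
The note about _FORTIFY_SOURCE can be illustrated with a simplified model of the check behind the __snprintf_chk and __chk_fail frames in the backtrace: the size claimed by the caller is compared with the known size of the destination object before anything is written. The sketch below is not glibc code. The name model_snprintf_chk and its object_size parameter are hypothetical, and it returns -1 where the real check aborts the process so that it can be exercised safely.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Simplified, hypothetical model of a fortified snprintf.
 * claimed_size: the size argument the caller passes to snprintf.
 * object_size:  the real size of the destination buffer.
 * Returns -1 without writing when claimed_size exceeds object_size
 * (the real check would abort here); otherwise behaves like vsnprintf.
 */
int
model_snprintf_chk (char *dst, size_t claimed_size, size_t object_size,
                    const char *fmt, ...)
{
        va_list ap;
        int     ret;

        assert (dst != NULL && fmt != NULL);

        if (claimed_size > object_size)
                return -1;

        va_start (ap, fmt);
        ret = vsnprintf (dst, claimed_size, fmt, ap);
        va_end (ap);

        return ret;
}
```

With the values from this bug, a claimed size of 4096 against a 1024-byte pidfile array is rejected even when the formatted path itself would fit, which is consistent with the crash occurring on every volume stop regardless of path length.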

--- Additional comment from Atin Mukherjee on 2017-02-08 12:03:01 EST ---



--- Additional comment from Atin Mukherjee on 2017-02-08 12:03:44 EST ---

Comment 1 Worker Ant 2017-02-09 04:33:42 UTC

REVIEW: https://review.gluster.org/16571 (glusterd: glusterd is crashed at the time of stop volume) posted (#1) for review on release-3.10 by MOHIT AGRAWAL (moagrawa)

Comment 2 Worker Ant 2017-02-09 12:32:48 UTC

COMMIT: https://review.gluster.org/16571 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 94374de6984da45e668d81a3794aa8137aabeb0f
Author: Mohit Agrawal <moagrawa>
Date:   Wed Feb 8 12:20:55 2017 +0530

    glusterd: glusterd is crashed at the time of stop volume
    
    Problem: glusterd is crashed at the time of stop volume due to
             overflow of pidfile array after build rpm with default options.
    
    Solution: To avoid the crash update the pidfile array size.
    
    Test:    To test the patch followed below procedure
             1) Setup 1*2 environment and start the volume
             2) Stop the volume
             Before apply the patch glusterd is crashed.
    
    Note:  The crash is happened only after build rpm with rpmbuild -ba
           <spec> because _FORTIFY_SOURCE is enabled. This option tries to
           figure out possible overflow scenarios like the bug here and
           crash the process.
    
    > BUG: 1420202
    > Change-Id: I58a006bc0727843a7ed02a10b4ebd5dca39eae67
    > Signed-off-by: Mohit Agrawal <moagrawa>
    > Reviewed-on: https://review.gluster.org/16560
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: N Balachandran <nbalacha>
    > Reviewed-by: Atin Mukherjee <amukherj>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > (cherry picked from commit 9ac193a19b0ca6d6548aeafa5c915b26396f8697)
    
    Change-Id: I4bcfe830e789a9b95fbcace4495bfda17dca0269
    BUG: 1420606
    Reviewed-on: https://review.gluster.org/16571
    Tested-by: MOHIT AGRAWAL <moagrawa>
    Reviewed-by: Atin Mukherjee <amukherj>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Atin Mukherjee 2017-02-15 08:53:26 UTC

*** Bug 1422375 has been marked as a duplicate of this bug. ***

Comment 4 Shyamsundar 2017-03-06 17:45:52 UTC

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/
