+++ This bug was initially created as a clone of Bug #1420202 +++

Description of problem:
glusterd crashes when a volume is stopped.

Version-Release number of selected component (if applicable):
glusterfs-3.11dev-0.56.git3cbf732.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a 1*2 environment and start the volume.
2. Stop the volume.
3. glusterd crashes.

Actual results:
glusterd crashes.

Expected results:
glusterd should not crash.

Additional info:

--- Additional comment from Atin Mukherjee on 2017-02-08 01:20:51 EST ---

(In reply to Mohit Agrawal from comment #0)
> Description of problem:
> glusterd crashes when a volume is stopped.
>
> Version-Release number of selected component (if applicable):
>
> glusterfs-3.11dev-0.56.git3cbf732.el7.x86_64

Are you using some private rpms?

> How reproducible:
> Always
>
> Steps to Reproduce:
> 1. Set up a 1*2 environment and start the volume.
> 2. Stop the volume.
> 3. glusterd crashes.
>
> Actual results:
>
> glusterd crashes.
> Expected results:
> glusterd should not crash.
>
> Additional info:

--- Additional comment from Mohit Agrawal on 2017-02-08 01:29:02 EST ---

Hi,

I built the rpms from the latest upstream code in order to reproduce another issue.

Regards
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2017-02-08 01:34:42 EST ---

Hi,

Below are the compile-time warnings emitted during the source build:

>>>>>>>>>>>>>>>>>>
In function 'snprintf',
    inlined from 'glusterd_bricks_select_stop_volume' at glusterd-op-sm.c:6182:25:
/usr/include/bits/stdio2.h:64:3: warning: call to __builtin___snprintf_chk will always overflow destination buffer [enabled by default]
   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
   ^
In function 'snprintf',
    inlined from 'glusterd_bricks_select_remove_brick' at glusterd-op-sm.c:6291:25:
/usr/include/bits/stdio2.h:64:3: warning: call to __builtin___snprintf_chk will always overflow destination buffer [enabled by default]
   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
>>>>>>>>>>>>>>>>>>

gluster v stop dist-repl
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
Connection failed. Please check if gluster daemon is operational.

Below is the backtrace for the same:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
-5.1.2-12alpha.el7.x86_64 zlib-1.2.7-15.el7.x86_64
(gdb) bt
#0  0x00007f8d06e015f7 in raise () from /lib64/libc.so.6
#1  0x00007f8d06e02ce8 in abort () from /lib64/libc.so.6
#2  0x00007f8d06e41317 in __libc_message () from /lib64/libc.so.6
#3  0x00007f8d06ed9b37 in __fortify_fail () from /lib64/libc.so.6
#4  0x00007f8d06ed7cf0 in __chk_fail () from /lib64/libc.so.6
#5  0x00007f8d06ed740b in __vsnprintf_chk () from /lib64/libc.so.6
#6  0x00007f8d06ed7328 in __snprintf_chk () from /lib64/libc.so.6
#7  0x00007f8cfd249e04 in snprintf (__fmt=0x7f8cfd346818 "%s/run/%s-%s.pid", __n=4096, __s=0x7f8cec407310 "`t@\354\214\177") at /usr/include/bits/stdio2.h:64
#8  glusterd_bricks_select_stop_volume (dict=dict@entry=0x7f8ce4000c50, op_errstr=op_errstr@entry=0x7f8cec409910, selected=selected@entry=0x7f8cec409850) at glusterd-op-sm.c:6182
#9  0x00007f8cfd257916 in glusterd_op_bricks_select (op=op@entry=GD_OP_STOP_VOLUME, dict=dict@entry=0x7f8ce4000c50, op_errstr=op_errstr@entry=0x7f8cec409910, selected=selected@entry=0x7f8cec409850, rsp_dict=rsp_dict@entry=0x7f8ce40177c0) at glusterd-op-sm.c:7645
#10 0x00007f8cfd2f42af in gd_brick_op_phase (op=GD_OP_STOP_VOLUME, op_ctx=op_ctx@entry=0x7f8ce4005a00, req_dict=0x7f8ce4000c50, op_errstr=op_errstr@entry=0x7f8cec409910) at glusterd-syncop.c:1685
#11 0x00007f8cfd2f4d33 in gd_sync_task_begin (op_ctx=op_ctx@entry=0x7f8ce4005a00, req=req@entry=0x7f8cec0018b0) at glusterd-syncop.c:1937
#12 0x00007f8cfd2f5030 in glusterd_op_begin_synctask (req=req@entry=0x7f8cec0018b0, op=op@entry=GD_OP_STOP_VOLUME, dict=0x7f8ce4005a00) at glusterd-syncop.c:2006
#13 0x00007f8cfd2dc47f in __glusterd_handle_cli_stop_volume (req=req@entry=0x7f8cec0018b0) at glusterd-volume-ops.c:628
#14 0x00007f8cfd23bfde in glusterd_big_locked_handler (req=0x7f8cec0018b0, actor_fn=0x7f8cfd2dc280 <__glusterd_handle_cli_stop_volume>) at glusterd-handler.c:81
#15 0x00007f8d087526d0 in synctask_wrap (old_task=<optimized out>) at syncop.c:375
#16 0x00007f8d06e13110 in ?? () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()
(gdb)
(gdb) f 8
#8  glusterd_bricks_select_stop_volume (dict=dict@entry=0x7f8ce4000c50, op_errstr=op_errstr@entry=0x7f8cec409910, selected=selected@entry=0x7f8cec409850) at glusterd-op-sm.c:6182
6182            GLUSTERD_GET_BRICK_PIDFILE (pidfile, volinfo,
(gdb) p sizeof(pidfile)
$1 = 1024

#define GLUSTERD_GET_BRICK_PIDFILE(pidfile,volinfo,brickinfo, priv) do {      \
        char exp_path[PATH_MAX] = {0,};                                       \
        char volpath[PATH_MAX]  = {0,};                                       \
        GLUSTERD_GET_VOLUME_DIR (volpath, volinfo, priv);                     \
        GLUSTERD_REMOVE_SLASH_FROM_PATH (brickinfo->path, exp_path);          \
        snprintf (pidfile, PATH_MAX, "%s/run/%s-%s.pid",                      \
                  volpath, brickinfo->hostname, exp_path);                    \
        }
>>>>>>>>>>>>>>>>>>>>>>>

RCA: The pidfile array in the caller is 1024 bytes, but GLUSTERD_GET_BRICK_PIDFILE passes PATH_MAX (4096) to snprintf as the destination size. Because the stated size is larger than the real buffer, the fortified snprintf check (__snprintf_chk, frames #4 to #6 above) fails and aborts glusterd.

Increasing the size of the array in the calling function (glusterd_bricks_select_stop_volume) resolves the issue.

Regards
Mohit Agrawal

--- Additional comment from Worker Ant on 2017-02-08 01:51:46 EST ---

REVIEW: https://review.gluster.org/16560 (glusterd: glusterd is crashed at the time of stop volume) posted (#1) for review on master by MOHIT AGRAWAL (moagrawa)

--- Additional comment from Worker Ant on 2017-02-08 03:48:21 EST ---

REVIEW: https://review.gluster.org/16560 (glusterd: glusterd is crashed at the time of stop volume) posted (#2) for review on master by MOHIT AGRAWAL (moagrawa)

--- Additional comment from Worker Ant on 2017-02-08 11:48:04 EST ---

COMMIT: https://review.gluster.org/16560 committed in master by Atin Mukherjee (amukherj)
------
commit 9ac193a19b0ca6d6548aeafa5c915b26396f8697
Author: Mohit Agrawal <moagrawa>
Date:   Wed Feb 8 12:20:55 2017 +0530

    glusterd: glusterd is crashed at the time of stop volume

    Problem: glusterd is crashed at the time of stop volume due to overflow of pidfile array after build rpm with default options.

    Solution: To avoid the crash update the pidfile array size.

    Test: To test the patch followed below procedure
          1) Setup 1*2 environment and start the volume
          2) Stop the volume

          Before apply the patch glusterd is crashed.

    Note: The crash is happened only after build rpm with rpmbuild -ba <spec> because _FORTIFY_SOURCE is enabled. This option tries to figure out possible overflow scenarios like the bug here and crash the process.

    BUG: 1420202
    Change-Id: I58a006bc0727843a7ed02a10b4ebd5dca39eae67
    Signed-off-by: Mohit Agrawal <moagrawa>
    Reviewed-on: https://review.gluster.org/16560
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Atin Mukherjee <amukherj>
    CentOS-regression: Gluster Build System <jenkins.org>

--- Additional comment from Atin Mukherjee on 2017-02-08 12:03:01 EST ---

--- Additional comment from Atin Mukherjee on 2017-02-08 12:03:44 EST ---

REVIEW: https://review.gluster.org/16571 (glusterd: glusterd is crashed at the time of stop volume) posted (#1) for review on release-3.10 by MOHIT AGRAWAL (moagrawa)
COMMIT: https://review.gluster.org/16571 committed in release-3.10 by Shyamsundar Ranganathan (srangana)
------
commit 94374de6984da45e668d81a3794aa8137aabeb0f
Author: Mohit Agrawal <moagrawa>
Date:   Wed Feb 8 12:20:55 2017 +0530

    glusterd: glusterd is crashed at the time of stop volume

    Problem: glusterd is crashed at the time of stop volume due to overflow of pidfile array after build rpm with default options.

    Solution: To avoid the crash update the pidfile array size.

    Test: To test the patch followed below procedure
          1) Setup 1*2 environment and start the volume
          2) Stop the volume

          Before apply the patch glusterd is crashed.

    Note: The crash is happened only after build rpm with rpmbuild -ba <spec> because _FORTIFY_SOURCE is enabled. This option tries to figure out possible overflow scenarios like the bug here and crash the process.

    > BUG: 1420202
    > Change-Id: I58a006bc0727843a7ed02a10b4ebd5dca39eae67
    > Signed-off-by: Mohit Agrawal <moagrawa>
    > Reviewed-on: https://review.gluster.org/16560
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: N Balachandran <nbalacha>
    > Reviewed-by: Atin Mukherjee <amukherj>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > (cherry picked from commit 9ac193a19b0ca6d6548aeafa5c915b26396f8697)

    Change-Id: I4bcfe830e789a9b95fbcace4495bfda17dca0269
    BUG: 1420606
    Reviewed-on: https://review.gluster.org/16571
    Tested-by: MOHIT AGRAWAL <moagrawa>
    Reviewed-by: Atin Mukherjee <amukherj>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
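
For illustration only: the crash came from a helper that assumed its destination was PATH_MAX bytes while the caller's array was smaller. The committed patch fixes this by enlarging the pidfile array. A different way to avoid the same class of mismatch, sketched below, is to have the caller pass the real buffer size. The function build_brick_pidfile and its parameters are hypothetical names standing in for GLUSTERD_GET_BRICK_PIDFILE; this is not code from the glusterfs tree.

```c
#include <stddef.h>   /* size_t */
#include <stdio.h>    /* snprintf */

/*
 * Hypothetical stand-in for GLUSTERD_GET_BRICK_PIDFILE (not glusterfs code).
 * The caller supplies the true size of its buffer, so the size given to
 * snprintf can never exceed the destination, which is the condition that
 * made the _FORTIFY_SOURCE check abort in this bug.
 *
 * Returns 0 on success, -1 if formatting failed or the path was truncated.
 */
int
build_brick_pidfile (char *pidfile, size_t len, const char *volpath,
                     const char *hostname, const char *exp_path)
{
        int ret = snprintf (pidfile, len, "%s/run/%s-%s.pid",
                            volpath, hostname, exp_path);

        /* snprintf returns the length the full string would need;
         * a value >= len means the output did not fit and was cut short. */
        if (ret < 0 || (size_t) ret >= len)
                return -1;

        return 0;
}
```

A caller would then write build_brick_pidfile (pidfile, sizeof (pidfile), ...), so that changing the array size in one place cannot silently disagree with the size used by snprintf.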
*** Bug 1422375 has been marked as a duplicate of this bug. ***
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/