Bug 1227677
Summary: | Glusterd crashes and cannot start after rebalance | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | lenz <lenz73> |
Component: | glusterd | Assignee: | Susant Kumar Palai <spalai> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | unspecified | | |
Version: | 3.7.0 | CC: | amukherj, bugs, chris, gluster-bugs, lenz73, nbalacha, nsathyan |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | glusterfs-3.7.2 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2015-06-20 09:49:40 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |

Description
lenz
2015-06-03 09:52:12 UTC
I can confirm that when removing all RPMs and manually installing the RPMs for 3.6.3, it works like a charm.

Could you attach the glusterd core file?

Was able to reproduce the issue, but not on the rebalance restart path. With the 3.7.1 RPMs, glusterd crashes upon starting rebalance. Here is the bt.

    #0 0x0000003455232625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
    #1 0x0000003455233e05 in abort () at abort.c:92
    #2 0x0000003455270537 in __libc_message (do_abort=2, fmt=0x34553575ef "*** %s ***: %s terminated\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
    #3 0x0000003455302527 in __fortify_fail (msg=0x3455357595 "buffer overflow detected") at fortify_fail.c:32
    #4 0x0000003455300410 in __chk_fail () at chk_fail.c:29
    #5 0x00000034552ffb0b in ___vsnprintf_chk (s=0x64c1 <Address 0x64c1 out of bounds>, maxlen=25797, flags=6, slen=18446744073709551615, format=0x0, args=0x3455291390) at vsnprintf_chk.c:39
    #6 0x00000034552ff9da in ___snprintf_chk (s=<value optimized out>, maxlen=<value optimized out>, flags=<value optimized out>, slen=<value optimized out>, format=<value optimized out>) at snprintf_chk.c:36
    #7 0x00007f836a20a061 in snprintf (volinfo=0x7f8364008b60, op_errstr=<value optimized out>, len=140202165164256, cmd=1, cbk=0, op=<value optimized out>) at /usr/include/bits/stdio2.h:65
    #8 glusterd_handle_defrag_start (volinfo=0x7f8364008b60, op_errstr=<value optimized out>, len=140202165164256, cmd=1, cbk=0, op=<value optimized out>) at glusterd-rebalance.c:234
    #9 0x00007f836a20ac97 in glusterd_op_rebalance (dict=0x7f8372d435e4, op_errstr=0x7f835c40d420, rsp_dict=<value optimized out>) at glusterd-rebalance.c:823
    #10 0x00007f836a1c4e79 in glusterd_op_commit_perform (op=GD_OP_REBALANCE, dict=0x7f8372d435e4, op_errstr=0x7f835c40d420, rsp_dict=0x0) at glusterd-op-sm.c:5126
    #11 0x00007f836a1c6202 in glusterd_op_ac_send_commit_op (event=0x7f83581bd5b0, ctx=<value optimized out>) at glusterd-op-sm.c:4301
    #12 0x00007f836a1c0679 in glusterd_op_sm () at glusterd-op-sm.c:6869
    #13 0x00007f836a20b01b in __glusterd_handle_defrag_volume (req=0x81e59c) at glusterd-rebalance.c:515
    #14 0x00007f836a19ce6f in glusterd_big_locked_handler (req=0x81e59c, actor_fn=0x7f836a20ad40 <__glusterd_handle_defrag_volume>) at glusterd-handler.c:83
    #15 0x00007f83744f45b2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:375
    #16 0x00000034552438f0 in ?? () from /lib64/libc-2.12.so
    #17 0x0000000000000000 in ?? ()

From the backtrace, the information for frames 7 and 8 looks wrong, as they both show the same arguments. The "parameter" also looks fishy. More investigation is needed to see whether this is an optimization issue or something else.

I missed mentioning the "len" parameter, which is 140202165164256 and does not make sense here. I am not sure whether gdb is reporting wrong data or the program actually received the wrong value. As we can see, snprintf is crashing here; the only reason I can think of is the "size" argument being greater than the size of the buffer allocated for the target.
Susant~

REVIEW: http://review.gluster.org/11090 (glusterd: Buffer overflow causing crash for glusterd) posted (#1) for review on master by Susant Palai (spalai)

REVIEW: http://review.gluster.org/11091 (glusterd: Buffer overflow causing crash for glusterd) posted (#1) for review on release-3.7 by Susant Palai (spalai)

REVIEW: http://review.gluster.org/11090 (glusterd: Buffer overflow causing crash for glusterd) posted (#2) for review on master by Susant Palai (spalai)

REVIEW: http://review.gluster.org/11091 (glusterd: Buffer overflow causing crash for glusterd) posted (#2) for review on release-3.7 by Susant Palai (spalai)

REVIEW: http://review.gluster.org/11091 (glusterd: Buffer overflow causing crash for glusterd) posted (#3) for review on release-3.7 by Atin Mukherjee (amukherj)

COMMIT: http://review.gluster.org/11091 committed in release-3.7 by Vijay Bellur (vbellur)

------

commit f56b94d85ae5063ba9eb97c6ed07fc869f0e4b53
Author: Susant Palai <spalai>
Date: Thu Jun 4 22:37:11 2015 +0530

    glusterd: Buffer overflow causing crash for glusterd

    Backport of http://review.gluster.org/11090

    Problem: In GLUSTERD_GET_DEFRAG_PROCESS we are using PATH_MAX (4096)
    as the max size of the input for target path, but we have allocated
    NAME_MAX (255) size of buffer for the target.

    Now this crash is not seen with source, but seen with RPMS.
    The reason is __fortify_fail. This check happens when _FORTIFY_SOURCE
    is enabled. This option tries to figure out possible overflow
    scenarios like the bug here and does crash the process.

    BUG: 1227677
    Change-Id: I50cf83cb60c640e46cc7a1a8d3a8321b9147fba9
    Signed-off-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/11091
    Reviewed-by: Atin Mukherjee <amukherj>
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.2, please reopen this bug report.

glusterfs-3.7.2 has been announced on the Gluster Packaging mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2015-June/000006.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

Info not needed anymore.
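For readers following the commit message above, here is a minimal sketch of the class of bug it describes, written in C. It is not the GlusterFS code and not the actual patch: `build_path`, `fits_in_small_buffer`, `DEMO_NAME_MAX`, and `DEMO_PATH_MAX` are hypothetical names introduced only for illustration, with the two constants standing in for the NAME_MAX (255) and PATH_MAX (4096) values quoted in the commit. The idea shown is that the size passed to snprintf should be the size of the destination buffer itself, and that the return value can be checked to detect truncation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the sizes quoted in the commit message (assumption:
 * used here instead of the system NAME_MAX / PATH_MAX macros so the
 * sketch compiles the same way everywhere). */
enum { DEMO_NAME_MAX = 255, DEMO_PATH_MAX = 4096 };

/* Hypothetical helper, not GlusterFS code.
 * Formats "<dir>/<name>" into dst, never writing more than dst_size bytes.
 * Returns 0 on success, -1 if the result was truncated or formatting failed. */
static int build_path(char *dst, size_t dst_size,
                      const char *dir, const char *name)
{
    assert(dst != NULL && dst_size > 0);

    /* snprintf returns the length the full string would have had. */
    int n = snprintf(dst, dst_size, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}

/* Returns 1 if "<dir>/<name>" fits in a DEMO_NAME_MAX-byte buffer, else 0.
 *
 * The pattern described in the commit message would correspond to:
 *     char small[DEMO_NAME_MAX];
 *     snprintf(small, DEMO_PATH_MAX, ...);   // size larger than the buffer
 * With _FORTIFY_SOURCE enabled, glibc can see that the stated size exceeds
 * the destination object and aborts with "buffer overflow detected".
 * Passing sizeof(small) keeps the stated size and the real size in sync. */
static int fits_in_small_buffer(const char *dir, const char *name)
{
    char small[DEMO_NAME_MAX];
    return build_path(small, sizeof(small), dir, name) == 0;
}
```

Under these assumptions, a call such as `fits_in_small_buffer("/var/lib/glusterd", "vol0")` reports that the path fits, while a name several hundred bytes long is reported as not fitting rather than being written past the end of the buffer. This is consistent with the commit message's remark that the crash appeared only in the RPM builds, where _FORTIFY_SOURCE was enabled.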