Bug 878004

Summary: glusterd segfaults in remove brick
Product: [Community] GlusterFS
Reporter: Pranith Kumar K <pkarampu>
Component: glusterd
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: medium
Version: mainline
CC: amarts, gluster-bugs, nsathyan
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Clone Of:
: 878873
Last Closed: 2013-07-24 17:30:24 UTC
Type: Bug
Bug Blocks: 878873

Description Pranith Kumar K 2012-11-19 12:34:53 UTC
Description of problem:
Program terminated with signal 11, Segmentation fault.
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
355		GF_VALIDATE_OR_GOTO("rpc_transport", this->ops, fail);
Missing separate debuginfos, use: debuginfo-install glibc-2.15-37.fc17.x86_64 keyutils-libs-1.5.5-2.fc17.x86_64 krb5-libs-1.10-5.fc17.x86_64 libcom_err-1.42-4.fc17.x86_64 libgcc-4.7.0-5.fc17.x86_64 libselinux-2.1.10-3.fc17.x86_64 libxml2-2.7.8-7.fc17.x86_64 openssl-1.0.0j-1.fc17.x86_64 zlib-1.2.5-6.fc17.x86_64
(gdb) bt
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
#1  0x00007f07d836b8ee in rpcsvc_callback_submit (rpc=0xfeea60, trans=0xbabeb81e, prog=0x7f07d55fed50, procnum=1, proghdr=0x0, proghdrcount=0) at rpcsvc.c:882
#2  0x00007f07d536463a in glusterd_fetchspec_notify (this=0xff2c00) at glusterd.c:130
#3  0x00007f07d53b48b9 in glusterd_create_volfiles_and_notify_services (volinfo=0x7f07c8001570) at glusterd-volgen.c:3323
#4  0x00007f07d53d202f in glusterd_op_remove_brick (dict=0x7f07d6fef0b8, op_errstr=0x13fea20) at glusterd-brick-ops.c:1551
#5  0x00007f07d5384fab in glusterd_op_commit_perform (op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef0b8, op_errstr=0x13fea20, rsp_dict=0x7f07d6ff0008) at glusterd-op-sm.c:3163
#6  0x00007f07d53d5a73 in gd_sync_task_begin (op_ctx=0x7f07d6fef518, req=0x7f07d52e402c) at glusterd-syncop.c:542
#7  0x00007f07d53d5cc3 in glusterd_op_begin_synctask (req=0x7f07d52e402c, op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef518) at glusterd-syncop.c:604
#8  0x00007f07d53d03c4 in glusterd_handle_remove_brick (req=0x7f07d52e402c) at glusterd-brick-ops.c:806
#9  0x00007f07d85d0a19 in synctask_wrap (old_task=0xffe930) at syncop.c:129
#10 0x00000035d7245f30 in ?? () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) p this
$1 = (rpc_transport_t *) 0xbabeb81e
(gdb) p this->name
Cannot access memory at address 0xbabeb88e



Comment 1 Amar Tumballi 2012-11-29 10:15:39 UTC
http://review.gluster.org/4241

Comment 2 Anand Avati 2013-04-24 19:18:30 UTC
REVIEW: http://review.gluster.org/4885 (tests: Modified test to use remove-brick instead of 'start' variant) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)

Comment 3 Anand Avati 2013-04-30 07:32:37 UTC
COMMIT: http://review.gluster.org/4885 committed in master by Vijay Bellur (vbellur) 
------
commit f75be775a9b191eb74f6cb4c161d9af36f2fdc97
Author: Krishnan Parthasarathi <kparthas>
Date:   Thu Apr 25 00:28:07 2013 +0530

    tests: Modified test to use remove-brick instead of 'start' variant
    
    remove-brick start doesn't remove the brick from the volume immediately.
    It would wait until migration of data to other bricks are complete. Even
    when there is no data to be migrated, one can expect a finite delay from
    the time of remove-brick start command's exit and removal of brick(s).
    This may cause subsequent checks on brick count to fail in a
    non-deterministic manner.
    
    Also, renamed the test file name to reflect bug-id corresponding to
    community release.
    
    Change-Id: Ic43f011e251640decb68e46f4a10e0824ade0ac9
    BUG: 878004
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/4885
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Gluster Build System <jenkins.com>
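
The commit message above describes a timing hazard: `remove-brick start` returns before the brick is actually removed, so a brick-count check made immediately afterwards can see the old count. The committed fix avoids the hazard by using plain `remove-brick`. As a minimal sketch of the hazard itself and of the alternative approach (polling with a timeout), the Python below may help. It is not taken from the GlusterFS test suite: `wait_until` and `FakeVolume` are hypothetical names, and the delayed removal is simulated rather than driven by a real volume.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeVolume:
    """Hypothetical stand-in for a volume whose brick removal is asynchronous."""

    def __init__(self, bricks, polls_before_removal):
        self.bricks = bricks
        self._polls_left = polls_before_removal

    def brick_count(self):
        # Simulate the delay: the removal takes effect only after a few polls.
        if self._polls_left > 0:
            self._polls_left -= 1
        elif self._polls_left == 0:
            self.bricks -= 1
            self._polls_left = -1  # removal done; the count stays stable
        return self.bricks


vol = FakeVolume(bricks=2, polls_before_removal=3)

# Checking right after the (simulated) asynchronous command still sees 2 bricks.
immediate = vol.brick_count()

# Polling with a timeout tolerates the delay instead of racing against it.
settled = wait_until(lambda: vol.brick_count() == 1, timeout=2.0, interval=0.01)
```

Polling keeps the asynchronous command under test but adds a timeout that has to be tuned. Switching to the synchronous command, as the commit does, removes the race altogether.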

Comment 4 Anand Avati 2013-05-21 07:30:42 UTC
REVIEW: http://review.gluster.org/5052 (tests: Modified test to use remove-brick instead of 'start' variant) posted (#1) for review on release-3.4 by Kaushal M (kaushal)

Comment 5 Anand Avati 2013-05-21 11:30:29 UTC
COMMIT: http://review.gluster.org/5052 committed in release-3.4 by Vijay Bellur (vbellur) 
------
commit 8abe8f794913daa81d9f1dc0ddab26c9c414abf8
Author: Kaushal M <kaushal>
Date:   Tue May 21 13:01:08 2013 +0530

    tests: Modified test to use remove-brick instead of 'start' variant
    
            Backport of change f75be77 from master
    
    remove-brick start doesn't remove the brick from the volume immediately.
    It would wait until migration of data to other bricks are complete. Even
    when there is no data to be migrated, one can expect a finite delay from
    the time of remove-brick start command's exit and removal of brick(s).
    This may cause subsequent checks on brick count to fail in a
    non-deterministic manner.
    
    Also, renamed the test file name to reflect bug-id corresponding to
    community release.
    
    BUG: 878004
    Change-Id: Ic6e1360ae5a5280d0d7efe8c3e9a0aa57dddb508
    Signed-off-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/5052
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>