+++ This bug was initially created as a clone of Bug #878004 +++

Description of problem:

Program terminated with signal 11, Segmentation fault.
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
355             GF_VALIDATE_OR_GOTO("rpc_transport", this->ops, fail);
Missing separate debuginfos, use: debuginfo-install glibc-2.15-37.fc17.x86_64 keyutils-libs-1.5.5-2.fc17.x86_64 krb5-libs-1.10-5.fc17.x86_64 libcom_err-1.42-4.fc17.x86_64 libgcc-4.7.0-5.fc17.x86_64 libselinux-2.1.10-3.fc17.x86_64 libxml2-2.7.8-7.fc17.x86_64 openssl-1.0.0j-1.fc17.x86_64 zlib-1.2.5-6.fc17.x86_64
(gdb) bt
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
#1  0x00007f07d836b8ee in rpcsvc_callback_submit (rpc=0xfeea60, trans=0xbabeb81e, prog=0x7f07d55fed50, procnum=1, proghdr=0x0, proghdrcount=0) at rpcsvc.c:882
#2  0x00007f07d536463a in glusterd_fetchspec_notify (this=0xff2c00) at glusterd.c:130
#3  0x00007f07d53b48b9 in glusterd_create_volfiles_and_notify_services (volinfo=0x7f07c8001570) at glusterd-volgen.c:3323
#4  0x00007f07d53d202f in glusterd_op_remove_brick (dict=0x7f07d6fef0b8, op_errstr=0x13fea20) at glusterd-brick-ops.c:1551
#5  0x00007f07d5384fab in glusterd_op_commit_perform (op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef0b8, op_errstr=0x13fea20, rsp_dict=0x7f07d6ff0008) at glusterd-op-sm.c:3163
#6  0x00007f07d53d5a73 in gd_sync_task_begin (op_ctx=0x7f07d6fef518, req=0x7f07d52e402c) at glusterd-syncop.c:542
#7  0x00007f07d53d5cc3 in glusterd_op_begin_synctask (req=0x7f07d52e402c, op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef518) at glusterd-syncop.c:604
#8  0x00007f07d53d03c4 in glusterd_handle_remove_brick (req=0x7f07d52e402c) at glusterd-brick-ops.c:806
#9  0x00007f07d85d0a19 in synctask_wrap (old_task=0xffe930) at syncop.c:129
#10 0x00000035d7245f30 in ?? () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) p this
$1 = (rpc_transport_t *) 0xbabeb81e
(gdb) p this->name
Cannot access memory at address 0xbabeb88e

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
CHANGE: http://review.gluster.org/4241 (glusterd: Protected conf->xprt_list racy access.) merged in master by Anand Avati (avati)
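For readers following along, here is a minimal sketch of the general technique the change title describes: serializing every access to a shared transport list with a mutex, so that a walker (as in glusterd_fetchspec_notify in the backtrace) cannot pick up an entry that another thread is freeing. This is not the actual patch. The types and function names below (xprt_t, xprt_registry_t, registry_*) are hypothetical stand-ins, and the assumption that the crash came from an unlocked list walk racing with a removal is inferred from the change title and the invalid `this` pointer, not confirmed by the report.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transport entry; not the real rpc_transport_t. */
typedef struct xprt {
    int id;
    int notified;          /* counts how many times this entry was notified */
    struct xprt *next;
} xprt_t;

/* Hypothetical stand-in for the part of the daemon config that owns the list. */
typedef struct {
    pthread_mutex_t lock;  /* guards head and every entry reachable from it */
    xprt_t *head;
} xprt_registry_t;

void registry_init(xprt_registry_t *reg)
{
    assert(reg != NULL);
    pthread_mutex_init(&reg->lock, NULL);
    reg->head = NULL;
}

/* Add an entry under the lock. Returns 0 on success, -1 on allocation failure. */
int registry_add(xprt_registry_t *reg, int id)
{
    xprt_t *x = calloc(1, sizeof(*x));
    if (x == NULL)
        return -1;
    x->id = id;

    pthread_mutex_lock(&reg->lock);
    x->next = reg->head;
    reg->head = x;
    pthread_mutex_unlock(&reg->lock);
    return 0;
}

/* Unlink and free an entry under the lock. Returns 1 if found, 0 otherwise. */
int registry_remove(xprt_registry_t *reg, int id)
{
    int removed = 0;

    pthread_mutex_lock(&reg->lock);
    for (xprt_t **pp = &reg->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            xprt_t *dead = *pp;
            *pp = dead->next;
            free(dead);
            removed = 1;
            break;
        }
    }
    pthread_mutex_unlock(&reg->lock);
    return removed;
}

/* Walk the list under the same lock, so no entry can be freed mid-walk.
 * Returns the number of entries visited. */
int registry_notify_all(xprt_registry_t *reg)
{
    int count = 0;

    pthread_mutex_lock(&reg->lock);
    for (xprt_t *x = reg->head; x != NULL; x = x->next) {
        x->notified++;
        count++;
    }
    pthread_mutex_unlock(&reg->lock);
    return count;
}
```

The point of the sketch is that the walker and the remover take the same lock. If only the add and remove paths were locked, the walk could still dereference a freed entry, which would be consistent with the unreadable `this` pointer in frame #0.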
This was caused by a commit that exists only in the rhs2.1 branch of this codepath, and it is now fixed. Not valid for rhs2.0.
Pranith, will performing remove-brick on a brick of any volume type cause glusterd to segfault? Can you please provide details on how to re-create the issue?
I have only seen the backtrace while running some scripts. Please check with kp for the reason why it happened and for valid steps to reproduce it.
As per comment #5, setting needinfo on assignee.
Steps to reproduce:
1) Create a volume with at least 2 bricks.
2) Start the volume.
3) Mount the volume using FUSE/NFS.
4) Remove one or more bricks from the volume.
Verified the fix on:
=======================
root@king [Jul-08-2013-18:53:21] >gluster --version
glusterfs 3.4.0.12rhs.beta3 built on Jul 6 2013 14:35:18

root@king [Jul-08-2013-18:53:46] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta3-1.el6rhs.x86_64

Verification steps:
==================
1. Created a distribute volume with 4 bricks.
2. Started the volume.
3. Created FUSE/NFS mounts and created files from the mount point.
4. Removed one brick.

Also verified the same steps with a distribute-replicate volume.

Result:
=======
glusterd does not segfault.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1262.html