Bug 878873 - glusterd segfaults in remove brick
Summary: glusterd segfaults in remove brick
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: krishnan parthasarathi
QA Contact: spandura
URL:
Whiteboard:
Depends On: 878004
Blocks:
 
Reported: 2012-11-21 12:57 UTC by Vidya Sakar
Modified: 2015-11-03 23:05 UTC (History)

Fixed In Version: glusterfs-3.4.0qa5-1
Doc Type: Bug Fix
Doc Text:
Clone Of: 878004
Environment:
Last Closed: 2013-09-23 22:39:21 UTC
Embargoed:



Description Vidya Sakar 2012-11-21 12:57:41 UTC
+++ This bug was initially created as a clone of Bug #878004 +++

Description of problem:
Program terminated with signal 11, Segmentation fault.
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
355		GF_VALIDATE_OR_GOTO("rpc_transport", this->ops, fail);
Missing separate debuginfos, use: debuginfo-install glibc-2.15-37.fc17.x86_64 keyutils-libs-1.5.5-2.fc17.x86_64 krb5-libs-1.10-5.fc17.x86_64 libcom_err-1.42-4.fc17.x86_64 libgcc-4.7.0-5.fc17.x86_64 libselinux-2.1.10-3.fc17.x86_64 libxml2-2.7.8-7.fc17.x86_64 openssl-1.0.0j-1.fc17.x86_64 zlib-1.2.5-6.fc17.x86_64
(gdb) bt
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
#1  0x00007f07d836b8ee in rpcsvc_callback_submit (rpc=0xfeea60, trans=0xbabeb81e, prog=0x7f07d55fed50, procnum=1, proghdr=0x0, proghdrcount=0) at rpcsvc.c:882
#2  0x00007f07d536463a in glusterd_fetchspec_notify (this=0xff2c00) at glusterd.c:130
#3  0x00007f07d53b48b9 in glusterd_create_volfiles_and_notify_services (volinfo=0x7f07c8001570) at glusterd-volgen.c:3323
#4  0x00007f07d53d202f in glusterd_op_remove_brick (dict=0x7f07d6fef0b8, op_errstr=0x13fea20) at glusterd-brick-ops.c:1551
#5  0x00007f07d5384fab in glusterd_op_commit_perform (op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef0b8, op_errstr=0x13fea20, rsp_dict=0x7f07d6ff0008) at glusterd-op-sm.c:3163
#6  0x00007f07d53d5a73 in gd_sync_task_begin (op_ctx=0x7f07d6fef518, req=0x7f07d52e402c) at glusterd-syncop.c:542
#7  0x00007f07d53d5cc3 in glusterd_op_begin_synctask (req=0x7f07d52e402c, op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef518) at glusterd-syncop.c:604
#8  0x00007f07d53d03c4 in glusterd_handle_remove_brick (req=0x7f07d52e402c) at glusterd-brick-ops.c:806
#9  0x00007f07d85d0a19 in synctask_wrap (old_task=0xffe930) at syncop.c:129
#10 0x00000035d7245f30 in ?? () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) p this
$1 = (rpc_transport_t *) 0xbabeb81e
(gdb) p this->name
Cannot access memory at address 0xbabeb88e


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 2 Vijay Bellur 2012-11-29 00:27:13 UTC
CHANGE: http://review.gluster.org/4241 (glusterd: Protected conf->xprt_list racy access.) merged in master by Anand Avati (avati)
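The change title suggests the fix serializes access to conf->xprt_list. The backtrace is consistent with a notifier walking that list while another thread frees an entry, which would leave a dangling transport pointer such as 0xbabeb81e. Below is a minimal sketch of that locking pattern, not the actual patch: xprt_t, conf_t, xprt_lock and the conf_* helpers are hypothetical stand-ins, and only the xprt_list name comes from the change description.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connected transport; not the real rpc_transport_t. */
typedef struct xprt {
    int id;
    struct xprt *next;
} xprt_t;

/* Hypothetical stand-in for the glusterd configuration object. */
typedef struct conf {
    pthread_mutex_t xprt_lock; /* assumed lock guarding xprt_list */
    xprt_t *xprt_list;
} conf_t;

void conf_init(conf_t *conf)
{
    assert(conf != NULL);
    pthread_mutex_init(&conf->xprt_lock, NULL);
    conf->xprt_list = NULL;
}

/* Add a transport (e.g. on client connect). Returns 0 on success, -1 on allocation failure. */
int conf_add_xprt(conf_t *conf, int id)
{
    xprt_t *x = malloc(sizeof(*x));
    if (x == NULL)
        return -1;
    x->id = id;
    pthread_mutex_lock(&conf->xprt_lock);
    x->next = conf->xprt_list;
    conf->xprt_list = x;
    pthread_mutex_unlock(&conf->xprt_lock);
    return 0;
}

/* Remove a transport (e.g. on disconnect). Returns 1 if removed, 0 if not found. */
int conf_del_xprt(conf_t *conf, int id)
{
    int removed = 0;
    pthread_mutex_lock(&conf->xprt_lock);
    for (xprt_t **pp = &conf->xprt_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            xprt_t *dead = *pp;
            *pp = dead->next; /* unlink before freeing */
            free(dead);
            removed = 1;
            break;
        }
    }
    pthread_mutex_unlock(&conf->xprt_lock);
    return removed;
}

/* Walk the list under the lock so no entry can be freed mid-traversal.
 * Returns the number of transports visited; notify may be NULL. */
int conf_notify_all(conf_t *conf, void (*notify)(int id))
{
    int visited = 0;
    pthread_mutex_lock(&conf->xprt_lock);
    for (xprt_t *x = conf->xprt_list; x != NULL; x = x->next) {
        if (notify != NULL)
            notify(x->id);
        visited++;
    }
    pthread_mutex_unlock(&conf->xprt_lock);
    return visited;
}
```

In this sketch the notifier holds the lock for the whole traversal, so a concurrent conf_del_xprt has to wait until the walk is finished. The callback must not call back into conf_add_xprt or conf_del_xprt, since the mutex is not recursive.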

Comment 3 Amar Tumballi 2012-11-29 10:17:03 UTC
This was caused by a commit that exists only in the rhs2.1 branch of the codepath; it is now fixed. Not valid for rhs2.0.

Comment 4 spandura 2013-01-04 06:11:29 UTC
Pranith, 

Will performing remove-brick on any volume type cause glusterd to segfault?

Can you please provide the steps to re-create the issue?

Comment 5 Pranith Kumar K 2013-01-07 03:54:24 UTC
I have only seen the backtrace while running some scripts. Please check with kp for the reason it happened and the valid steps to reproduce.

Comment 6 Gowrishankar Rajaiyan 2013-01-07 05:11:47 UTC
As per comment #5, setting needinfo on assignee.

Comment 7 krishnan parthasarathi 2013-01-07 06:22:07 UTC
Steps to reproduce:

1) Create a volume with at least 2 bricks.
2) Start the volume.
3) Mount the volume using FUSE/NFS.
4) Remove one or more bricks from the volume.

Comment 10 spandura 2013-07-09 07:13:40 UTC
Verified the fix on:
=======================

root@king [Jul-08-2013-18:53:21] >gluster --version
glusterfs 3.4.0.12rhs.beta3 built on Jul  6 2013 14:35:18

root@king [Jul-08-2013-18:53:46] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta3-1.el6rhs.x86_64

Verification steps:
==================
1. Created distribute volume with 4 bricks
2. Started the volume
3. Created fuse/nfs mount. Created files from mount point
4. Removed one brick.

Also verified the same steps with a distribute-replicate volume.

Result:
=======
glusterd doesn't segfault

Comment 11 Scott Haines 2013-09-23 22:39:21 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

