Bug 878873 - glusterd segfaults in remove brick
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Assigned To: krishnan parthasarathi
QA Contact: spandura
Depends On: 878004
Blocks:
Reported: 2012-11-21 07:57 EST by Vidya Sakar
Modified: 2015-11-03 18:05 EST
CC List: 10 users

See Also:
Fixed In Version: glusterfs-3.4.0qa5-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 878004
Environment:
Last Closed: 2013-09-23 18:39:21 EDT
Type: Bug


Description Vidya Sakar 2012-11-21 07:57:41 EST
+++ This bug was initially created as a clone of Bug #878004 +++

Description of problem:
Program terminated with signal 11, Segmentation fault.
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
355		GF_VALIDATE_OR_GOTO("rpc_transport", this->ops, fail);
Missing separate debuginfos, use: debuginfo-install glibc-2.15-37.fc17.x86_64 keyutils-libs-1.5.5-2.fc17.x86_64 krb5-libs-1.10-5.fc17.x86_64 libcom_err-1.42-4.fc17.x86_64 libgcc-4.7.0-5.fc17.x86_64 libselinux-2.1.10-3.fc17.x86_64 libxml2-2.7.8-7.fc17.x86_64 openssl-1.0.0j-1.fc17.x86_64 zlib-1.2.5-6.fc17.x86_64
(gdb) bt
#0  0x00007f07d836f923 in rpc_transport_submit_request (this=0xbabeb81e, req=0x13fd6c0) at rpc-transport.c:355
#1  0x00007f07d836b8ee in rpcsvc_callback_submit (rpc=0xfeea60, trans=0xbabeb81e, prog=0x7f07d55fed50, procnum=1, proghdr=0x0, proghdrcount=0) at rpcsvc.c:882
#2  0x00007f07d536463a in glusterd_fetchspec_notify (this=0xff2c00) at glusterd.c:130
#3  0x00007f07d53b48b9 in glusterd_create_volfiles_and_notify_services (volinfo=0x7f07c8001570) at glusterd-volgen.c:3323
#4  0x00007f07d53d202f in glusterd_op_remove_brick (dict=0x7f07d6fef0b8, op_errstr=0x13fea20) at glusterd-brick-ops.c:1551
#5  0x00007f07d5384fab in glusterd_op_commit_perform (op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef0b8, op_errstr=0x13fea20, rsp_dict=0x7f07d6ff0008) at glusterd-op-sm.c:3163
#6  0x00007f07d53d5a73 in gd_sync_task_begin (op_ctx=0x7f07d6fef518, req=0x7f07d52e402c) at glusterd-syncop.c:542
#7  0x00007f07d53d5cc3 in glusterd_op_begin_synctask (req=0x7f07d52e402c, op=GD_OP_REMOVE_BRICK, dict=0x7f07d6fef518) at glusterd-syncop.c:604
#8  0x00007f07d53d03c4 in glusterd_handle_remove_brick (req=0x7f07d52e402c) at glusterd-brick-ops.c:806
#9  0x00007f07d85d0a19 in synctask_wrap (old_task=0xffe930) at syncop.c:129
#10 0x00000035d7245f30 in ?? () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) p this
$1 = (rpc_transport_t *) 0xbabeb81e
(gdb) p this->name
Cannot access memory at address 0xbabeb88e


Comment 2 Vijay Bellur 2012-11-28 19:27:13 EST
CHANGE: http://review.gluster.org/4241 (glusterd: Protected conf->xprt_list racy access.) merged in master by Anand Avati (avati@redhat.com)
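The change title points at unsynchronized access to conf->xprt_list, and the backtrace shows glusterd_fetchspec_notify submitting a callback on a transport pointer (0xbabeb81e) that gdb can no longer dereference. A minimal sketch of the general technique follows: a shared transport list is guarded with a mutex so that a notifier never walks the list while a disconnect handler unlinks and frees an entry. This is not the merged patch. The names conf_t, transport_t, xprt_lock and the conf_* helpers are hypothetical and exist only for illustration, and a plain singly linked list stands in for whatever list type glusterd actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one connected client transport. */
typedef struct transport {
    int id;
    struct transport *next;
} transport_t;

/* Hypothetical stand-in for the daemon's private configuration. */
typedef struct {
    pthread_mutex_t xprt_lock;  /* guards every access to xprt_list */
    transport_t *xprt_list;     /* head of the shared transport list */
} conf_t;

void conf_init(conf_t *conf) {
    assert(conf != NULL);
    pthread_mutex_init(&conf->xprt_lock, NULL);
    conf->xprt_list = NULL;
}

/* Connect path: link a new transport in while holding the lock. */
int conf_add_transport(conf_t *conf, int id) {
    transport_t *t = malloc(sizeof(*t));
    if (t == NULL)
        return -1;
    t->id = id;
    pthread_mutex_lock(&conf->xprt_lock);
    t->next = conf->xprt_list;
    conf->xprt_list = t;
    pthread_mutex_unlock(&conf->xprt_lock);
    return 0;
}

/* Disconnect path: unlink under the lock, free only after unlinking,
 * so no list walker can still reach the freed entry. */
int conf_remove_transport(conf_t *conf, int id) {
    transport_t *victim = NULL;
    pthread_mutex_lock(&conf->xprt_lock);
    for (transport_t **pp = &conf->xprt_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            victim = *pp;
            *pp = victim->next;
            break;
        }
    }
    pthread_mutex_unlock(&conf->xprt_lock);
    if (victim == NULL)
        return -1;
    free(victim);
    return 0;
}

/* Notify path: walk the list under the same lock. A real daemon would
 * submit an RPC callback per transport; this sketch only counts them. */
int conf_notify_all(conf_t *conf) {
    int notified = 0;
    pthread_mutex_lock(&conf->xprt_lock);
    for (transport_t *t = conf->xprt_list; t != NULL; t = t->next)
        notified++;
    pthread_mutex_unlock(&conf->xprt_lock);
    return notified;
}
```

Holding the lock across the whole walk is the simplest choice. If the per-transport work can block for long, a real implementation might instead take a reference on each transport under the lock and do the work after releasing it.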
Comment 3 Amar Tumballi 2012-11-29 05:17:03 EST
The crash came from a commit that exists only in the rhs2.1 branch code path; it is now fixed. Not valid for rhs2.0.
Comment 4 spandura 2013-01-04 01:11:29 EST
Pranith, 

Does performing remove-brick on a brick of any volume type cause glusterd to segfault?

Can you please provide details on how to re-create the issue?
Comment 5 Pranith Kumar K 2013-01-06 22:54:24 EST
I have only seen the backtrace while running some scripts. Please check with kp for the reason it happened and the valid steps to reproduce.
Comment 6 Gowrishankar Rajaiyan 2013-01-07 00:11:47 EST
As per comment #5, setting needinfo on assignee.
Comment 7 krishnan parthasarathi 2013-01-07 01:22:07 EST
Steps to reproduce:

1) Create a volume with at least 2 bricks.
2) Start the volume.
3) Mount the volume using FUSE/NFS.
4) Remove one or more bricks from the volume.
Comment 10 spandura 2013-07-09 03:13:40 EDT
Verified the fix on:
=======================

root@king [Jul-08-2013-18:53:21] >gluster --version
glusterfs 3.4.0.12rhs.beta3 built on Jul  6 2013 14:35:18

root@king [Jul-08-2013-18:53:46] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta3-1.el6rhs.x86_64

Verification steps:
==================
1. Created a distribute volume with 4 bricks.
2. Started the volume.
3. Created a FUSE/NFS mount and created files from the mount point.
4. Removed one brick.

Also verified the same steps with a distribute-replicate volume.

Result:
=======
glusterd does not segfault.
Comment 11 Scott Haines 2013-09-23 18:39:21 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
