Bug 1260930

Summary: Glusterd crashed while enabling/disabling heal on an ec volume
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Bhaskarakiran <byarlaga>
Component: glusterd
Assignee: Satish Mohan <smohan>
Status: CLOSED WONTFIX
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: amukherj, mzywusko, nlevinki, rcyriac, sankarshan, sasundar, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-08 13:26:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1259992
Bug Blocks:
Attachments:
core file (flags: none)

Description Bhaskarakiran 2015-09-08 09:04:52 UTC
Description of problem:
=======================

glusterd crashes when heal is enabled or disabled while I/O is in progress on the FUSE mount.

Backtrace :
===========

(gdb) bt
#0  0x00007f6016a5ff8b in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00007f600cd691e7 in glusterd_check_client_op_version_support (
    volname=0x7f5fed089860 "gfsa", op_version=op_version@entry=30703, 
    op_errstr=op_errstr@entry=0x7f5ff441e710) at glusterd-utils.c:9930
#2  0x00007f600cd3e7f7 in glusterd_op_stage_set_volume (
    dict=dict@entry=0x7f5fec69d27c, 
    op_errstr=op_errstr@entry=0x7f5ff441e710) at glusterd-op-sm.c:1306
#3  0x00007f600cd412fb in glusterd_op_stage_validate (
    op=GD_OP_SET_VOLUME, dict=dict@entry=0x7f5fec69d27c, 
    op_errstr=op_errstr@entry=0x7f5ff441e710, 
    rsp_dict=rsp_dict@entry=0x7f5fecf87f5c) at glusterd-op-sm.c:5406
#4  0x00007f600cd4147f in glusterd_op_ac_stage_op (
    event=0x7f5fed0d8af0, ctx=0x7f5fed097da0) at glusterd-op-sm.c:5164
#5  0x00007f600cd47a4f in glusterd_op_sm () at glusterd-op-sm.c:7371
#6  0x00007f600cd2e9ab in __glusterd_handle_stage_op (
    req=req@entry=0x7f601859106c) at glusterd-handler.c:1022
#7  0x00007f600cd2cc00 in glusterd_big_locked_handler (
    req=0x7f601859106c, 
    actor_fn=0x7f600cd2e6c0 <__glusterd_handle_stage_op>)
    at glusterd-handler.c:83
#8  0x00007f60182b7102 in synctask_wrap (old_task=<optimized out>)
    at syncop.c:381
#9  0x00007f60169750f0 in ?? () from /lib64/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb) 


Version-Release number of selected component (if applicable):
=============================================================
3.7.1-14

[root@transformers ~]# gluster --version
glusterfs 3.7.1 built on Aug 31 2015 23:59:02
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.


How reproducible:
=================
Seen twice.

Steps to Reproduce:
1. Create an EC volume and enable USS, snapshot, and heal.
2. FUSE-mount the volume on the client and start I/O.
3. While the I/O is in progress, run the heal info command and try to disable heal.
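
Step 3 can be scripted so that heal is toggled repeatedly while the client I/O runs. The following is a minimal sketch, not the script used for this report: it assumes the volume already exists, is mounted, and has I/O running against it. The volume name "gfsa" is taken from the backtrace above; the helper names and the cycle count are hypothetical. It defaults to a dry run that only prints the commands.

```python
import subprocess


def heal_toggle_commands(volname, cycles):
    """Build the gluster CLI calls for step 3: heal info, then disable/enable heal."""
    cmds = []
    for _ in range(cycles):
        cmds.append(["gluster", "volume", "heal", volname, "info"])
        cmds.append(["gluster", "volume", "heal", volname, "disable"])
        cmds.append(["gluster", "volume", "heal", volname, "enable"])
    return cmds


def run_all(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            # check=False: keep going even if a command fails (e.g. glusterd died)
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # "gfsa" comes from the backtrace; cycles=2 is an arbitrary assumption.
    run_all(heal_toggle_commands("gfsa", cycles=2))
```

Passing dry_run=False runs the commands on a node of the trusted pool; since the crash was only seen twice, several cycles may be needed.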

Actual results:
==============
Glusterd crash


Expected results:
=================
no crashes


Additional info:
================
Attaching the core file.

Comment 2 Bhaskarakiran 2015-09-08 09:06:05 UTC
Created attachment 1071253 [details]
core file

Comment 3 Bhaskarakiran 2015-09-09 05:53:18 UTC
There is one more crash, seen while enabling or disabling heal on an EC volume with no load from the client. A heal was in progress at the time.

Backtrace:

(gdb) bt
#0  rpc_transport_submit_request (this=0x7f8a144b4fc0, 
    req=0x7f8a18833d90) at rpc-transport.c:399
#1  0x00007f8a30d709e9 in rpcsvc_callback_submit (rpc=<optimized out>, 
    trans=trans@entry=0x7f8a144b4fc0, 
    prog=prog@entry=0x7f8a25d95410 <glusterd_cbk_prog>, 
    procnum=procnum@entry=1, proghdr=proghdr@entry=0x0, 
    proghdrcount=proghdrcount@entry=0) at rpcsvc.c:1080
#2  0x00007f8a25a47f8d in glusterd_fetchspec_notify (
    this=<optimized out>) at glusterd.c:247
#3  0x00007f8a25ac6fde in glusterd_create_volfiles_and_notify_services (
    volinfo=<optimized out>) at glusterd-volgen.c:5464
#4  0x00007f8a25a79cec in glusterd_op_set_volume (
    errstr=0x7f8a18834880, dict=0x7f8a0b8e80ac) at glusterd-op-sm.c:2598
#5  glusterd_op_commit_perform (op=op@entry=GD_OP_SET_VOLUME, 
    dict=dict@entry=0x7f8a0b8e80ac, 
    op_errstr=op_errstr@entry=0x7f8a18834880, 
    rsp_dict=rsp_dict@entry=0x7f8a0b4d229c) at glusterd-op-sm.c:5530
#6  0x00007f8a25b00a09 in gd_commit_op_phase (op=GD_OP_SET_VOLUME, 
    op_ctx=op_ctx@entry=0x7f8a0b8e80ac, req_dict=0x7f8a0b8e80ac, 
    op_errstr=op_errstr@entry=0x7f8a18834880, 
    txn_opinfo=txn_opinfo@entry=0x7f8a188348a0)
    at glusterd-syncop.c:1365
#7  0x00007f8a25b02139 in gd_sync_task_begin (
    op_ctx=op_ctx@entry=0x7f8a0b8e80ac, req=req@entry=0x7f8a32b6501c)
    at glusterd-syncop.c:1882
#8  0x00007f8a25b022b0 in glusterd_op_begin_synctask (
    req=req@entry=0x7f8a32b6501c, op=op@entry=GD_OP_SET_VOLUME, 
    dict=dict@entry=0x7f8a0b8e80ac) at glusterd-syncop.c:1945
#9  0x00007f8a25aec20e in glusterd_handle_heal_enable_disable (
    volinfo=<optimized out>, dict=0x7f8a0b8e80ac, req=0x7f8a32b6501c)
    at glusterd-volume-ops.c:732
#10 __glusterd_handle_cli_heal_volume (req=req@entry=0x7f8a32b6501c)
    at glusterd-volume-ops.c:802
#11 0x00007f8a25a62c00 in glusterd_big_locked_handler (
    req=0x7f8a32b6501c, 
    actor_fn=0x7f8a25aebd60 <__glusterd_handle_cli_heal_volume>)
    at glusterd-handler.c:83
#12 0x00007f8a30fed102 in synctask_wrap (old_task=<optimized out>)
    at syncop.c:381
#13 0x00007f8a2f6ab0f0 in ?? () from /lib64/libc.so.6
#14 0x0000000000000000 in ?? ()
(gdb) q

Comment 7 Atin Mukherjee 2017-02-08 13:26:32 UTC
This crash was observed when ping timeout was enabled for GlusterD-to-GlusterD communication. We have no plans to re-enable this option, so closing this bug.
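
To check whether a given node is exposed, one can look at the ping timeout configured for glusterd. The sketch below makes assumptions not stated in this report: that the setting is the `option ping-timeout` line in the glusterd volfile (typically /etc/glusterfs/glusterd.vol) and that a value of 0 means the timeout is disabled. The function name and the sample text are hypothetical.

```python
import re


def glusterd_ping_timeout(volfile_text):
    """Return the ping-timeout value found in glusterd.vol text, or None if unset."""
    for line in volfile_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out options
        match = re.match(r"option\s+ping-timeout\s+(\d+)\s*$", stripped)
        if match:
            return int(match.group(1))
    return None


# Hypothetical volfile contents, for illustration only.
sample = """volume management
    type mgmt/glusterd
    option ping-timeout 0
end-volume
"""

timeout = glusterd_ping_timeout(sample)
# Assumption: 0 (or an absent option) means the ping timer is not active.
print("ping timeout enabled" if timeout else "ping timeout disabled or unset")
```

On a real node, the text would be read from the volfile instead of the inline sample.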