Bug 1471687 - [Ganesha]: Ganesha crashed in gsh_free() within seconds of failover/failback; possible memory corruption.
Status: NEW
Product: nfs-ganesha
Classification: Community
Component: NFS
Version: 2.4
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Frank Filz
Reported: 2017-07-17 05:15 EDT by Ambarish
Modified: 2017-07-17 05:17 EDT

Type: Bug

Description Ambarish 2017-07-17 05:15:58 EDT
Setup: a 4-node cluster. 4 clients mounted the volume via NFSv4, and each was running a kernel untar in a separate directory.

I was simulating failovers/failbacks by killing and restarting the nfs-ganesha service on random nodes.
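
The failover/failback loop described above could be scripted roughly as follows. This is only a sketch: the node names are hypothetical (the report does not name the cluster nodes), and the kill and restart commands are assumptions about how the service was stopped and started, not the reporter's actual procedure. The function only builds a plan and does not run anything.

```python
import random

# Hypothetical node names; the report does not name the four cluster nodes.
NODES = ["node1", "node2", "node3", "node4"]

# Assumed commands, not taken from the report.
KILL_CMD = "pkill -9 ganesha.nfsd"          # kill the daemon to force a failover
START_CMD = "systemctl start nfs-ganesha"   # restart the service so the node fails back


def failover_plan(rounds, seed=None):
    """Return a list of (node, command) pairs: kill, then restart, on a random node."""
    rng = random.Random(seed)
    plan = []
    for _ in range(rounds):
        node = rng.choice(NODES)
        plan.append((node, KILL_CMD))
        plan.append((node, START_CMD))
    return plan


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for node, cmd in failover_plan(rounds=3, seed=0):
        print(f"ssh {node} '{cmd}'")
```

In practice a pause between the kill and the restart would be needed so that the failover completes and client I/O resumes before the node fails back.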

When I/O resumed post failover, I saw that Ganesha had crashed on one of my nodes with the following backtrace:

(gdb) bt
#0  __GI___libc_free (mem=0x6000000000000) at malloc.c:2933
#1  0x00007f79e7c79d76 in gsh_free (p=<optimized out>)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/include/abstract_mem.h:271
#2  glusterfs_close_my_fd (my_fd=0x7f7698090002)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/FSAL_GLUSTER/handle.c:1088
#3  0x00007f79e7c7b1ba in glusterfs_open2 (obj_hdl=0x7f76980d8ec0, state=0x7f7698090ee0, openflags=<optimized out>, 
    createmode=FSAL_EXCLUSIVE, name=<optimized out>, attrib_set=<optimized out>, 
    verifier=0x7f796df616c0 "TK)\001\070o", new_obj=0x7f796df61340, attrs_out=0x7f796df61350, 
    caller_perm_check=0x7f796df614bf) at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/FSAL_GLUSTER/handle.c:1443
#4  0x00005640a643a1ef in mdcache_open2 (obj_hdl=0x7f7710139728, state=0x7f7698090ee0, openflags=<optimized out>, 
    createmode=FSAL_EXCLUSIVE, name=0x0, attrs_in=0x7f796df615e0, verifier=0x7f796df616c0 "TK)\001\070o", 
    new_obj=0x7f796df61580, attrs_out=0x0, caller_perm_check=0x7f796df614bf)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_file.c:657
#5  0x00005640a636fcbb in fsal_open2 (in_obj=0x7f7710139728, state=0x7f7698090ee0, openflags=openflags@entry=2, 
    createmode=createmode@entry=FSAL_EXCLUSIVE, name=<optimized out>, attr=attr@entry=0x7f796df615e0, 
    verifier=verifier@entry=0x7f796df616c0 "TK)\001\070o", obj=obj@entry=0x7f796df61580, 
    attrs_out=attrs_out@entry=0x0) at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/fsal_helper.c:1846
#6  0x00005640a635b350 in open4_ex (arg=arg@entry=0x7f7924182008, data=data@entry=0x7f796df62180, 
    res_OPEN4=res_OPEN4@entry=0x7f76980f5e38, clientid=<optimized out>, owner=0x7f770007a440, 
    file_state=file_state@entry=0x7f796df61fa0, new_state=new_state@entry=0x7f796df61f8f)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_open.c:1441
#7  0x00005640a63a3469 in nfs4_op_open (op=0x7f7924182000, data=0x7f796df62180, resp=0x7f76980f5e30)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_open.c:1845
#8  0x00005640a639597d in nfs4_Compound (arg=<optimized out>, req=<optimized out>, res=0x7f7698035350)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_Compound.c:734
#9  0x00005640a6386b1c in nfs_rpc_execute (reqdata=reqdata@entry=0x7f79240008c0)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1281
#10 0x00005640a638818a in worker_run (ctx=0x5640a6e38e70)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1548
#11 0x00005640a6411889 in fridgethr_start_routine (arg=0x5640a6e38e70)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/support/fridgethr.c:550
#12 0x00007f79eabdde25 in start_thread (arg=0x7f796df63700) at pthread_create.c:308
#13 0x00007f79ea2ab34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)
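
A quick sanity check on two of the addresses in the backtrace, as a hint toward the "possible memory corruption" in the summary. This is a sketch under assumptions: it assumes the usual x86_64 Linux layout with 48-bit virtual addresses (user space below 2^47) and that a struct holding pointers is at least 8-byte aligned. The helper names are illustrative, and values gdb prints in an optimized build can be stale, so this is not a diagnosis.

```python
# Values copied from frames #0 and #2 of the backtrace above.
FREE_ARG = 0x6000000000000   # mem passed to __GI___libc_free
MY_FD = 0x7f7698090002       # my_fd passed to glusterfs_close_my_fd

# Assumption: x86_64 Linux with 48-bit virtual addresses, user space below 2**47.
USER_SPACE_LIMIT = 1 << 47


def is_user_address(addr):
    """True if addr falls in the assumed user-space range."""
    return 0 < addr < USER_SPACE_LIMIT


def is_aligned(addr, alignment):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


if __name__ == "__main__":
    for name, addr in (("free() argument", FREE_ARG), ("my_fd", MY_FD)):
        print(f"{name}: {addr:#x} "
              f"user-space={is_user_address(addr)} "
              f"8-byte-aligned={is_aligned(addr, 8)}")
```

Under these assumptions the pointer handed to free() lies outside the user address range, so it cannot be a heap pointer, and the printed my_fd value is not 8-byte aligned. Both observations point to a garbage or overwritten pointer rather than an ordinary double free.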

Version-Release number of selected component (if applicable):
-------------------------------------------------------------

glusterfs-ganesha-3.8.4-33.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.4-14.el7rhgs.x86_64


How reproducible:
-----------------

This was the first occurrence.
