Bug 1466704 - [Ganesha] : Ganesha crashed while running dbench
Status: CLOSED WONTFIX
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: nfs-ganesha
Version: 3.3
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Assigned To: Kaleb KEITHLEY
QA Contact: Ambarish
Depends On:
Blocks:
Reported: 2017-06-30 05:46 EDT by Ambarish
Modified: 2017-08-10 03:08 EDT

Last Closed: 2017-08-10 03:08:55 EDT
Type: Bug


Description Ambarish 2017-06-30 05:46:13 EDT
Description of problem:
-----------------------


2-node Ganesha HA cluster. 4 clients mounted a Gluster volume via NFSv4 and ran dbench in a loop.

Ganesha crashed on one of my nodes.
The BT looks different from the one in the bug I opened at https://bugzilla.redhat.com/show_bug.cgi?id=1466700, though the use case is the same.

<BT>

(gdb) bt
#0  0x00007ff8553211f7 in raise () from /lib64/libc.so.6
#1  0x00007ff8553228e8 in abort () from /lib64/libc.so.6
#2  0x00007ff855360f47 in __libc_message () from /lib64/libc.so.6
#3  0x00007ff855366b54 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ff855369df7 in _int_malloc () from /lib64/libc.so.6
#5  0x00007ff85536c10c in malloc () from /lib64/libc.so.6
#6  0x0000563be257b10d in gsh_malloc__ (
    file=0x563be262ad60 "/builddir/build/BUILD/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_readdir.c", line=306, 
    function=<synthetic pointer>, n=12) at /usr/src/debug/nfs-ganesha-2.4.4/src/include/abstract_mem.h:78
#7  nfs4_readdir_callback (opaque=0x7ff7ca0e9b90, obj=0x7ff54806d928, attr=0x7ff7ca0e9d40, 
    mounted_on_fileid=12979408687758067389, cookie=<optimized out>, cb_state=<optimized out>)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_readdir.c:306
#8  0x0000563be253f829 in populate_dirent (name=<optimized out>, obj=0x7ff54806d928, 
    attrs=attrs@entry=0x7ff7ca0e9d40, dir_state=dir_state@entry=0x7ff7ca0e9e90, cookie=914646794536317759)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/fsal_helper.c:1321
#9  0x0000563be2607fef in mdcache_readdir (dir_hdl=0x7ff5a807f8a8, whence=<optimized out>, 
    dir_state=0x7ff7ca0e9e90, cb=0x563be253f7d0 <populate_dirent>, attrmask=122830, eod_met=0x7ff7ca0e9f5b)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:707
#10 0x0000563be25415dd in fsal_readdir (directory=directory@entry=0x7ff5a807f8a8, cookie=cookie@entry=0, 
    nbfound=nbfound@entry=0x7ff7ca0e9f5c, eod_met=eod_met@entry=0x7ff7ca0e9f5b, attrmask=122830, 
    cb=cb@entry=0x563be257af40 <nfs4_readdir_callback>, opaque=opaque@entry=0x7ff7ca0e9f60)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/fsal_helper.c:1505
#11 0x0000563be257bf0b in nfs4_op_readdir (op=0x7ff65c019690, data=0x7ff7ca0ea180, resp=0x7ff4d80181e0)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_readdir.c:631
#12 0x0000563be256897d in nfs4_Compound (arg=<optimized out>, req=<optimized out>, res=0x7ff4d80aae20)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_Compound.c:734
#13 0x0000563be2559b1c in nfs_rpc_execute (reqdata=reqdata@entry=0x7ff65c075880)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1281
#14 0x0000563be255b18a in worker_run (ctx=0x563be3e9dfc0)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1548
#15 0x0000563be25e4889 in fridgethr_start_routine (arg=0x563be3e9dfc0)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/support/fridgethr.c:550
#16 0x00007ff855d16e25 in start_thread () from /lib64/libpthread.so.0
#17 0x00007ff8553e434d in clone () from /lib64/libc.so.6
(gdb) 



</BT>


Version-Release number of selected component (if applicable):
------------------------------------------------------------

nfs-ganesha-2.4.4-10.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-31.el7rhgs.x86_64

How reproducible:
------------------

Fairly reproducible


Additional info:
----------------

[root@gqas014 tmp]# gluster v info
 
Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 22c652d8-0754-438a-8131-373bad7c12ab
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (4 + 2) = 24
Transport-type: tcp
Bricks:
Brick1: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas007.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick4: gqas007.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick6: gqas007.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick7: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick8: gqas007.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick9: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick10: gqas007.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick12: gqas007.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick13: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick14: gqas007.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick15: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick16: gqas007.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick17: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick18: gqas007.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick19: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick20: gqas007.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick21: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick22: gqas007.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick23: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Brick24: gqas007.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Options Reconfigured:
ganesha.enable: on
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
Comment 3 Daniel Gryniewicz 2017-06-30 11:18:02 EDT
This one is memory corruption.  Reproducing with ASAN or valgrind would be extremely helpful.
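
The reasoning behind that diagnosis can be sketched in a few lines of Python. This is only an illustrative triage helper, not part of Ganesha or gdb: the names `parse_frames`, `allocator_detected_corruption` and `first_frame_outside_libc`, and the list of allocator functions, are assumptions made for the example. The idea is that when `malloc_printerr` is called directly from an allocator routine such as `_int_malloc`, glibc has found its own heap bookkeeping damaged, so the frame that called `malloc()` is most likely the victim of an earlier bad write rather than its cause.

```python
import re

# First line of a gdb "bt" frame: "#N  [0xADDR in ]function (..."
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)\s*\(")
# Optional trailing "from /path/to/lib.so" printed for frames without debuginfo.
LIB_RE = re.compile(r"\bfrom\s+(\S+)\s*$")

# glibc allocator entry points (assumed list for this sketch, not exhaustive).
ALLOCATOR_FUNCS = {"malloc", "_int_malloc", "free", "_int_free", "calloc", "realloc"}


def parse_frames(bt_text):
    """Return [(frame_number, function_name, library_or_None), ...]."""
    frames = []
    for raw in bt_text.splitlines():
        line = raw.strip()
        m = FRAME_RE.match(line)
        if not m:
            continue  # continuation line of a multi-line frame, or the gdb prompt
        lib = LIB_RE.search(line)
        frames.append((int(m.group(1)), m.group(2), lib.group(1) if lib else None))
    return frames


def allocator_detected_corruption(frames):
    """True if malloc_printerr was called directly by an allocator function."""
    names = [name for _, name, _ in frames]
    for i in range(len(names) - 1):
        if names[i] == "malloc_printerr" and names[i + 1] in ALLOCATOR_FUNCS:
            return True
    return False


def first_frame_outside_libc(frames):
    """Innermost frame that is not in libc: where the failing request came from."""
    for frame in frames:
        lib = frame[2]
        if lib is None or "libc.so" not in lib:
            return frame
    return None


# Excerpt of the backtrace from the description (frames #0 to #7).
SAMPLE_BT = """\
#0  0x00007ff8553211f7 in raise () from /lib64/libc.so.6
#1  0x00007ff8553228e8 in abort () from /lib64/libc.so.6
#2  0x00007ff855360f47 in __libc_message () from /lib64/libc.so.6
#3  0x00007ff855366b54 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ff855369df7 in _int_malloc () from /lib64/libc.so.6
#5  0x00007ff85536c10c in malloc () from /lib64/libc.so.6
#6  0x0000563be257b10d in gsh_malloc__ (
    function=<synthetic pointer>, n=12) at /usr/src/debug/nfs-ganesha-2.4.4/src/include/abstract_mem.h:78
#7  nfs4_readdir_callback (opaque=0x7ff7ca0e9b90, obj=0x7ff54806d928, attr=0x7ff7ca0e9d40,
"""

frames = parse_frames(SAMPLE_BT)
print(allocator_detected_corruption(frames))
print(first_frame_outside_libc(frames))
```

On the excerpt above, the helper flags the trace as allocator-detected corruption and reports `gsh_malloc__` (frame #6) as the first frame outside libc. That frame only requested 12 bytes, so it tells us where the damage was noticed, not where it was done. ASAN or valgrind would instead report the offending write at the moment it happens.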
