Bug 1467074 - [Ganesha]: Ganesha crashed during I/O from multiple v4 mounts (getattr path), possible memory corruption
Status: CLOSED WONTFIX
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: nfs-ganesha
Version: 3.3
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Kaleb KEITHLEY
QA Contact: Ambarish
Depends On:
Blocks:
Reported: 2017-07-02 08:44 EDT by Ambarish
Modified: 2017-08-10 03:10 EDT (History)
CC List: 10 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-10 03:10:36 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Ambarish 2017-07-02 08:44:06 EDT
Description of problem:
------------------------

4-node cluster, 4 clients (1:1 mounts via NFSv4)

Workload: dbench, kernel untar

Ganesha crashed on one of my nodes and dumped a core. The backtrace below appears to go through the getattr path (nfs4_op_getattr):

<BT>

(gdb) bt
#0  0x00007f6bcd09c1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f6bcd09d8e8 in __GI_abort () at abort.c:90
#2  0x00007f6bcd0dbf47 in __libc_message (do_abort=2, fmt=fmt@entry=0x7f6bcd1e8608 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007f6bcd0e29e3 in malloc_printerr (ar_ptr=0x7f6860000020, ptr=0x7f686003cc20, str=0x7f6bcd1e5cb7 "corrupted double-linked list", action=<optimized out>) at malloc.c:5023
#4  malloc_consolidate (av=av@entry=0x7f6860000020) at malloc.c:4164
#5  0x00007f6bcd0e43a5 in _int_malloc (av=av@entry=0x7f6860000020, bytes=bytes@entry=1024) at malloc.c:3446
#6  0x00007f6bcd0e710c in __GI___libc_malloc (bytes=bytes@entry=1024) at malloc.c:2897
#7  0x000055e7c5fea2f7 in gsh_malloc__ (function=0x55e7c6086630 <__func__.23360> "nfs4_FSALattr_To_Fattr", line=3416, 
    file=0x55e7c6086ee0 "/builddir/build/BUILD/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs_proto_tools.c", n=1024) at /usr/src/debug/nfs-ganesha-2.4.4/src/include/abstract_mem.h:78
#8  nfs4_FSALattr_To_Fattr (args=args@entry=0x7f6b19740f00, Bitmap=Bitmap@entry=0x7f696416e068, Fattr=Fattr@entry=0x7f6860098d60) at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs_proto_tools.c:3416
#9  0x000055e7c5fea88e in file_To_Fattr (data=data@entry=0x7f6b19741180, request_mask=1433550, attr=attr@entry=0x7f6b19740fd0, Fattr=Fattr@entry=0x7f6860098d60, Bitmap=Bitmap@entry=0x7f696416e068)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs_proto_tools.c:3319
#10 0x000055e7c5fc7b02 in nfs4_op_getattr (op=0x7f696416e060, data=0x7f6b19741180, resp=0x7f6860098d50) at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_getattr.c:140
#11 0x000055e7c5fc297d in nfs4_Compound (arg=<optimized out>, req=<optimized out>, res=0x7f6860093b10) at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_Compound.c:734
#12 0x000055e7c5fb3b1c in nfs_rpc_execute (reqdata=reqdata@entry=0x7f696410ee20) at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1281
#13 0x000055e7c5fb518a in worker_run (ctx=0x55e7ca2b4790) at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1548
#14 0x000055e7c603e889 in fridgethr_start_routine (arg=0x55e7ca2b4790) at /usr/src/debug/nfs-ganesha-2.4.4/src/support/fridgethr.c:550
#15 0x00007f6bcda91e25 in start_thread (arg=0x7f6b19742700) at pthread_create.c:308
#16 0x00007f6bcd15f34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) 


</BT>
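
When reading a trace like the one above, it helps to separate the glibc allocator frames from the nfs-ganesha frames. glibc reports "corrupted double-linked list" only when a later allocation walks the damaged free list, so the first non-libc caller marks where the corruption was detected, not necessarily where it was caused. The following is a minimal sketch of that triage in Python. The helper names and the `LIBC_FUNCS` set are illustrative assumptions, not part of gdb or nfs-ganesha, and the sample lines are copied from the trace above.

```python
import re

# A gdb frame line looks like "#7  0x... in func (args) ..." or "#4  func (args) ...".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(")

# Assumption: an illustrative (not exhaustive) set of glibc abort/allocator functions.
LIBC_FUNCS = {
    "__GI_raise", "__GI_abort", "__libc_message", "malloc_printerr",
    "malloc_consolidate", "_int_malloc", "__GI___libc_malloc",
}

def parse_frames(backtrace):
    """Return (frame_number, function_name) pairs from gdb `bt` output."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:  # continuation lines of a wrapped frame do not start with '#'
            frames.append((int(match.group(1)), match.group(2)))
    return frames

def first_caller_outside_libc(frames):
    """Return the innermost frame whose function is not in LIBC_FUNCS."""
    for number, func in frames:
        if func not in LIBC_FUNCS:
            return number, func
    return None

# A few frames copied from the backtrace in this report.
SAMPLE = """
#0  0x00007f6bcd09c1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f6bcd09d8e8 in __GI_abort () at abort.c:90
#4  malloc_consolidate (av=av@entry=0x7f6860000020) at malloc.c:4164
#5  0x00007f6bcd0e43a5 in _int_malloc (av=av@entry=0x7f6860000020, bytes=bytes@entry=1024) at malloc.c:3446
#6  0x00007f6bcd0e710c in __GI___libc_malloc (bytes=bytes@entry=1024) at malloc.c:2897
#7  0x000055e7c5fea2f7 in gsh_malloc__ (function=0x55e7c6086630 <__func__.23360> "nfs4_FSALattr_To_Fattr", line=3416,
#10 0x000055e7c5fc7b02 in nfs4_op_getattr (op=0x7f696416e060, data=0x7f6b19741180, resp=0x7f6860098d50) at /usr/src/debug/nfs-ganesha-2.4.4/src/Protocols/NFS/nfs4_op_getattr.c:140
"""

if __name__ == "__main__":
    print(first_caller_outside_libc(parse_frames(SAMPLE)))
```

On these sample frames the helper picks out frame #7 (`gsh_malloc__`, called from `nfs4_FSALattr_To_Fattr`), which is the allocation that tripped over the already corrupted heap.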

Version-Release number of selected component (if applicable):
-------------------------------------------------------------

nfs-ganesha-debuginfo-2.4.4-10.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-32.el7rhgs.x86_64


How reproducible:
-----------------

1/1



Additional info:
----------------

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 6ade5657-45e2-43c7-8098-774417789a5e
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
client.event-threads: 4
server.event-threads: 4
cluster.lookup-optimize: on
ganesha.enable: on
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
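
To compare this setup with other reports, the volume info above can be loaded into dictionaries so that individual options are easy to check or diff. This is only a sketch in Python: `parse_volume_info` is a hypothetical helper, and it assumes the "key: value" layout shown above, with everything after "Options Reconfigured:" treated as a volume option.

```python
def parse_volume_info(text):
    """Split `gluster volume info` style text into (info, options) dicts."""
    info, options = {}, {}
    in_options = False
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so brick paths like host:/path stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Options Reconfigured":
            in_options = True
            continue
        (options if in_options else info)[key] = value
    return info, options

# A subset of the lines from this report.
SAMPLE_INFO = """
Volume Name: testvol
Type: Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
performance.stat-prefetch: off
"""

if __name__ == "__main__":
    info, options = parse_volume_info(SAMPLE_INFO)
    print(info["Type"], options["features.cache-invalidation"])
```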
Comment 4 Daniel Gryniewicz 2017-07-05 08:57:35 EDT
Definitely memory corruption.  This one is not on gqas007.  Can't look more, as no binaries installed.
