Bug 1461521 - [Ganesha] : Ganesha crashed while running Bonnie from multiple clients.
Status: ON_QA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: nfs-ganesha
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Release: RHGS 3.4.0
Assigned To: Kaleb KEITHLEY
Keywords: ZStream
Blocks: 1503134
Reported: 2017-06-14 12:39 EDT by Ambarish
Modified: 2018-01-30 15:29 EST
Fixed In Version: nfs-ganesha-2.5.4-1
Type: Bug

Description Ambarish 2017-06-14 12:39:10 EDT
Description of problem:

2 Node cluster.

3 clients mounted a 2x2 volume via NFSv4, and each was running Bonnie++ in a separate working directory.

Not sure if this is related, but I was also collecting sosreports (which may run heal info and other gluster commands on the backend).

Other than this, nothing was done on the nodes.

Ganesha crashed on one of my nodes and dumped a core:

(gdb) bt
#0  0x00007fccfa7661f7 in raise () from /lib64/libc.so.6
#1  0x00007fccfa7678e8 in abort () from /lib64/libc.so.6
#2  0x0000562bd0b1ff1a in nfs_dupreq_rele (req=0x7fc8de517818, func=<optimized out>)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/RPCAL/nfs_dupreq.c:1256
#3  0x0000562bd0aa48e1 in nfs_rpc_execute (reqdata=reqdata@entry=0x7fc8de5177f0)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1405
#4  0x0000562bd0aa618a in worker_run (ctx=0x562bd24b3e90) at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1548
#5  0x0000562bd0b2f889 in fridgethr_start_routine (arg=0x562bd24b3e90)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/support/fridgethr.c:550
#6  0x00007fccfb15be25 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fccfa82934d in clone () from /lib64/libc.so.6

Version-Release number of selected component (if applicable):

[root@gqas009 tmp]# rpm -qa|grep ganesha

How reproducible:

Additional info:

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 3b04b36a-1837-48e8-b437-fbc091b2f992
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Brick1: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas009.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas009.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
Comment 6 Daniel Gryniewicz 2017-06-15 09:29:29 EDT
Okay, so there are 9 DRC fixes in 2.5, which fix things like use-after-free, refcounting, and null deref.  So this is very likely fixed in 2.5.  Can we put this off until the 2.5 rebase?  Or do all these patches need to be backported?
Comment 9 Kaleb KEITHLEY 2017-08-16 08:53:14 EDT
fix is in nfs-ganesha-2.5.x
Comment 10 Kaleb KEITHLEY 2017-10-05 07:28:23 EDT
POST with rebase to nfs-ganesha-2.5.x
