Description of problem:
-----------------------
2-node cluster. 3 clients mounted a 2x2 volume via NFSv4 and were running Bonnie++ in a separate working directory each. Not sure if this is related, but I was also collecting sosreports (which may run heal info and other gluster commands on the backend). Other than this, nothing was done on the nodes.

Ganesha crashed on one of my nodes and dumped a core:

(gdb) bt
#0  0x00007fccfa7661f7 in raise () from /lib64/libc.so.6
#1  0x00007fccfa7678e8 in abort () from /lib64/libc.so.6
#2  0x0000562bd0b1ff1a in nfs_dupreq_rele (req=0x7fc8de517818, func=<optimized out>)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/RPCAL/nfs_dupreq.c:1256
#3  0x0000562bd0aa48e1 in nfs_rpc_execute (reqdata=reqdata@entry=0x7fc8de5177f0)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1405
#4  0x0000562bd0aa618a in worker_run (ctx=0x562bd24b3e90)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_worker_thread.c:1548
#5  0x0000562bd0b2f889 in fridgethr_start_routine (arg=0x562bd24b3e90)
    at /usr/src/debug/nfs-ganesha-2.4.4/src/support/fridgethr.c:550
#6  0x00007fccfb15be25 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fccfa82934d in clone () from /lib64/libc.so.6
(gdb)

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
[root@gqas009 tmp]# rpm -qa|grep ganesha
nfs-ganesha-2.4.4-8.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-25.el7rhgs.x86_64

How reproducible:
----------------
1/1

Additional info:
----------------
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 3b04b36a-1837-48e8-b437-fbc091b2f992
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas009.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas009.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
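To make the backtrace easier to read: the abort() comes out of nfs_dupreq_rele(), the routine that drops the duplicate request cache (DRC) entry once a worker thread is done with a request. Below is a minimal, hypothetical sketch of the kind of refcount sanity check in a release path that ends in abort(); the names and layout are illustrative only, not the nfs-ganesha 2.4.4 source.

    /*
     * Hypothetical sketch (not the nfs-ganesha 2.4.4 source): a DRC entry
     * whose reference count is dropped once too often, or after the entry
     * was already freed, trips a sanity check in the release path. The
     * check calls abort(), matching frames #0-#2 of the backtrace above.
     */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdlib.h>

    struct drc_entry {
            pthread_mutex_t lock;
            int32_t refcnt;       /* references held by in-flight requests */
    };

    static void drc_entry_rele(struct drc_entry *dv)
    {
            pthread_mutex_lock(&dv->lock);
            if (--dv->refcnt < 0)
                    abort();      /* double release: die loudly rather than
                                     keep running with corrupt cache state */
            if (dv->refcnt == 0) {
                    pthread_mutex_unlock(&dv->lock);
                    pthread_mutex_destroy(&dv->lock);
                    free(dv);
                    return;
            }
            pthread_mutex_unlock(&dv->lock);
    }

    int main(void)
    {
            struct drc_entry *dv = calloc(1, sizeof(*dv));

            pthread_mutex_init(&dv->lock, NULL);
            dv->refcnt = 1;       /* ref taken when the request was admitted */
            drc_entry_rele(dv);   /* correct single release frees the entry */
            /* a second rele here would be the use-after-free/double-release
               class of bug that produces an abort like the one above */
            return 0;
    }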
Okay, so there are nine DRC fixes in 2.5, which address issues like use-after-free, refcounting errors, and NULL dereferences. So this is very likely fixed in 2.5. Can we defer this until the 2.5 rebase, or do all of these patches need to be backported?
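For reference, the usual shape of fixes in that class is to take the reference while the cache-wide lock is still held, so a racing release cannot free the entry between lookup and use. The sketch below shows that generic pattern under assumed names (drc_cache_lock, drc_cache_lookup_locked(), drc_entry_ref() are hypothetical); it is not the actual content of the 2.5 patches.

    /*
     * Generic fix pattern, not the actual nfs-ganesha 2.5 patches: the
     * lookup side takes its reference before the cache lock is dropped,
     * so a concurrent drc_entry_rele() cannot free the entry in the
     * window between lookup and use.
     */
    #include <pthread.h>
    #include <stdint.h>
    #include <stddef.h>

    struct drc_entry;                          /* as in the sketch above */

    extern pthread_mutex_t drc_cache_lock;     /* hypothetical cache-wide lock */
    struct drc_entry *drc_cache_lookup_locked(uint64_t xid);  /* hypothetical */
    void drc_entry_ref(struct drc_entry *dv);                 /* hypothetical */

    struct drc_entry *drc_lookup_and_ref(uint64_t xid)
    {
            struct drc_entry *dv;

            pthread_mutex_lock(&drc_cache_lock);
            dv = drc_cache_lookup_locked(xid);
            if (dv != NULL)
                    drc_entry_ref(dv);         /* ref taken under the lock */
            pthread_mutex_unlock(&drc_cache_lock);
            return dv;                         /* caller owns one reference */
    }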
The fix is in nfs-ganesha-2.5.x.
POST with rebase to nfs-ganesha-2.5.x
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2610