Bug 1403714 - [Ganesha + Multi-Volume/Single-Mount] - Ganesha crashes during inode_destroy
Summary: [Ganesha + Multi-Volume/Single-Mount] - Ganesha crashes during inode_destroy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: nfs-ganesha
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Jiffin
QA Contact: Ambarish
URL:
Whiteboard:
Depends On:
Blocks: 1351528 1400780 1401160
Reported: 2016-12-12 09:17 UTC by Ambarish
Modified: 2017-03-28 06:52 UTC

Fixed In Version: nfs-ganesha-2.4.1-4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1400780
Environment:
Last Closed: 2017-03-23 06:27:19 UTC
Target Upstream Version:


Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla 1353561 | 0 | unspecified | CLOSED | Multiple bricks could crash after TCP port probing | 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2017:0493 | 0 | normal | SHIPPED_LIVE | Red Hat Gluster Storage 3.2.0 nfs-ganesha bug fix and enhancement update | 2017-03-23 09:19:13 UTC

Internal Links: 1353561

Comment 2 Jiffin 2016-12-12 09:51:27 UTC
copying from https://bugzilla.redhat.com/show_bug.cgi?id=1400780#c9

From the backtraces of core 1 in bz#1400780 and core 2 in bz#1401160, it is clear that the issue is hit only when ganesha is trying to remove an entry from its LRU list. By default the LRU limit for ganesha's MD_CACHE is 25000, and in the gfapi layer it is 131072. We suspect the crash occurred because of a race between the removal of an entry in the ganesha layer and in the gluster layer.
I tried to reproduce a similar issue with 3 volumes (two 1x2 and one 1x1) and the number of clients varying from 4 to 7. I also tried lowering the LRU limit to 20 for ganesha and 100 for gluster, but never hit the crash with ongoing I/O (dd and a Linux untar run from different clients). In my setup the I/O ran continuously for at least 4 hours and then errored out with "no space left on the device".

But during cleanup (rm -rf on the same directories from different mounts) I consistently got a crash with a similar backtrace during LRU cleanup. The crashes are more easily reproduced with a lower LRU limit. When I increased the LRU limit to 150000 in ganesha, the crash was not seen (maybe it will crash eventually).
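
For anyone who wants to script the workload described above, the following is a rough sketch only, not the script used in this comment. It assumes the same export is reachable through several mount points. The function names (`populate`, `concurrent_cleanup`, `run_workload`), the `lru-test` directory name, and the file counts and sizes are all hypothetical. The write phase stands in for the dd/untar load, and the cleanup phase approximates running rm -rf on the same directories from different mounts at once.

```python
import os
import shutil
import threading


def populate(root, clients=4, files_per_client=8, size=4096):
    """Write phase: each simulated client fills its own directory (stand-in for dd/untar)."""
    for c in range(clients):
        client_dir = os.path.join(root, f"client{c}")
        os.makedirs(client_dir, exist_ok=True)
        for i in range(files_per_client):
            with open(os.path.join(client_dir, f"file{i}"), "wb") as fh:
                fh.write(b"\0" * size)


def concurrent_cleanup(paths):
    """Cleanup phase: remove the same tree through every path at once.

    ignore_errors=True lets each worker tolerate files that another
    worker has already removed, much like parallel `rm -rf` runs.
    """
    workers = [
        threading.Thread(target=shutil.rmtree, args=(p,),
                         kwargs={"ignore_errors": True})
        for p in paths
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def run_workload(mount_points, subdir="lru-test"):
    """Populate via the first mount, then delete via all mounts concurrently.

    Returns, per mount point, whether the test directory still exists.
    """
    populate(os.path.join(mount_points[0], subdir))
    concurrent_cleanup([os.path.join(m, subdir) for m in mount_points])
    return [os.path.exists(os.path.join(m, subdir)) for m in mount_points]
```

A call such as `run_workload(["/mnt/a", "/mnt/b", "/mnt/c"])` (placeholder paths) would exercise the cleanup path from three mounts. Threads in one process are only an approximation of separate NFS clients, so this may not trigger the crash as reliably as the multi-client setup described above.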

Comment 4 Atin Mukherjee 2016-12-14 12:56:28 UTC
Devel ack is provided as the crash is consistently reproducible.

Comment 10 Ambarish 2017-01-20 07:52:52 UTC
The reported issue was not reproducible on Ganesha 2.4.1-6, Gluster 3.8.4-12 in two tries.

Will reopen if hit again during regressions.

Comment 12 errata-xmlrpc 2017-03-23 06:27:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0493.html

