Bug 1396968

Summary: NFS-Ganesha: Possible ref leak in case of volume export failure
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Soumya Koduri <skoduri>
Component: nfs-ganesha Assignee: Soumya Koduri <skoduri>
Status: CLOSED ERRATA QA Contact: Ambarish <asoman>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.2 CC: amukherj, asoman, ffilz, jthottan, kkeithle, rhinduja, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.2.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: nfs-ganesha-2.4.1-2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-23 06:25:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1351528    

Description Soumya Koduri 2016-11-21 10:02:10 UTC
Description of problem:

While working on bug 1393526, we found that the NFS-Ganesha server crashes intermittently. Further root-cause analysis showed that if a volume is stopped while it is being exported, the NFS-Ganesha server may crash with an assert. The cause is a reference leak in the cleanup path for volume export failures. Whenever that code path is taken, the server crashes with the following assert:
 - assert(export->refcnt == 1);
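
To illustrate the kind of leak described above, here is a minimal sketch in C. It is not the actual NFS-Ganesha code: the struct, the helper names (export_get, export_put, backend_init, try_export) and the "fixed" flag are all hypothetical, and the sketch assumes the leaked reference is a temporary one taken during export setup. It returns the refcount at the point where the server would check assert(export->refcnt == 1), rather than aborting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an export object; not the real nfs-ganesha struct. */
struct export {
    int refcnt;
};

static void export_get(struct export *e)
{
    e->refcnt++;
}

static void export_put(struct export *e)
{
    assert(e->refcnt > 0);  /* never drop a reference we do not hold */
    e->refcnt--;
}

/* Stand-in for backend setup; fails when the volume has been stopped. */
static bool backend_init(bool volume_running)
{
    return volume_running;
}

/*
 * Try to export a volume and return the refcount seen by the final
 * release step, which expects exactly 1 (the creator's reference).
 * fixed == false models the leak: the failure path skips export_put().
 */
int try_export(bool volume_running, bool fixed)
{
    struct export e = { .refcnt = 1 };  /* creator's reference */

    export_get(&e);                     /* temporary ref held during setup */

    if (!backend_init(volume_running)) {
        if (fixed)
            export_put(&e);             /* drop the temporary ref on failure */
        /* the server would evaluate assert(export->refcnt == 1) here */
        return e.refcnt;
    }

    export_put(&e);                     /* success path drops the temporary ref */
    return e.refcnt;
}
```

In this model, try_export(false, false) corresponds to the volume being stopped mid-export with the leak present and returns 2, the value that would trip the assert. try_export(false, true) returns 1, so the failure is handled without a crash.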

Thanks to Pranith for the reproducer.

Version-Release number of selected component (if applicable):
nfs-ganesha-2.4-1

How reproducible:
Often

Steps to Reproduce:

The following steps are quoted from bug 1393526:
>>> 
Here are the steps which exposed this issue:
In terminal 1:
while true; do gluster --mode=script volume stop r2 ; sleep 30 ; gluster --mode=script volume start r2; sleep 30; done

In terminal 2:
while true; do /usr/libexec/ganesha/dbus-send.sh /etc/ganesha on r2; sleep 10 && showmount -e localhost; /usr/libexec/ganesha/dbus-send.sh /etc/ganesha off r2; showmount -e localhost; done

In terminal 3:
watch systemctl status nfs-ganesha
<<<

Actual results:
The server sometimes crashes with an assertion failure.

Expected results:
The server should not crash; it should handle export failures gracefully.

Additional info:

Comment 6 Ambarish 2016-12-27 04:08:53 UTC
I could not reproduce the assert failure while exporting a stopped volume.

gluster : glusterfs-3.8.4-10
ganesha : 2.4.1-3

Verified.

Comment 8 errata-xmlrpc 2017-03-23 06:25:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0493.html