Bug 1262191

Summary: nfs-ganesha: coredump while creating data with acls and quota enabled for the volume
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Saurabh <saujain>
Component: nfs-ganesha
Assignee: Jiffin <jthottan>
Status: CLOSED ERRATA
QA Contact: Matt Zywusko <mzywusko>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: byarlaga, jthottan, kkeithle, mzywusko, ndevos, nlevinki, rcyriac, sankarshan, skoduri
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.2
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: nfs-ganesha-2.2.0-10
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-01 05:35:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1242148, 1251471
Bug Blocks: 1260783

Description Saurabh 2015-09-11 06:32:28 UTC
Description of problem:
With acls and quota enabled for the volume, data creation hangs because nfs-ganesha has coredumped.
While this I/O was in progress, add-brick and rebalance were also executed, and both completed successfully.

Version-Release number of selected component (if applicable):
glusterfs-3.7.1-14.el7rhgs.x86_64
nfs-ganesha-2.2.0-7.el7rhgs.x86_64

How reproducible:
Test was executed once only

Steps to Reproduce:
1. create a volume of 6x2 type, start it
2. configure nfs-ganesha
3. enable acls for the volume 
4. enable quota on the volume and set a limit of 100GB on "/"
5. mount the volume, start creating data
6. while data creation is going on, execute add-brick and rebalance
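
For anyone scripting the steps above, here is a minimal dry-run sketch in Python. It only prints the commands; it does not run them. The volume name, host names, brick paths, and mount point are hypothetical placeholders and are not taken from the reporter's setup. Enabling ACLs and creating data are left as comments because the report does not say exactly how they were done. The sketch assumes a replica 2 volume with 6 replica pairs for "6x2", one extra replica pair for add-brick, and an NFSv4 mount (the backtrace goes through nfs4_op_open).

```python
# Dry-run sketch: builds and prints the reproduction commands, runs nothing.
# Volume name, hosts, brick paths, and mount point are hypothetical placeholders.


def brick_list(hosts, pairs, prefix):
    """Return brick specs for `pairs` replica-2 pairs, spread across hosts."""
    bricks = []
    for i in range(pairs):
        for r in range(2):
            # Consecutive bricks form a replica pair, so place them on
            # different hosts (needs at least two hosts).
            host = hosts[(2 * i + r) % len(hosts)]
            bricks.append(f"{host}:{prefix}/brick{i}_{r}")
    return bricks


def repro_commands(vol="testvol",
                   hosts=("server1", "server2", "server3", "server4"),
                   mount_point="/mnt/nfs"):
    """Return the command lines for steps 1-6, in order."""
    initial = brick_list(hosts, 6, "/bricks/initial")  # 6x2 volume
    extra = brick_list(hosts, 1, "/bricks/extra")      # one more replica pair
    return [
        # step 1
        f"gluster volume create {vol} replica 2 {' '.join(initial)}",
        f"gluster volume start {vol}",
        # step 2
        "gluster nfs-ganesha enable",
        f"gluster volume set {vol} ganesha.enable on",
        # step 3 (method not stated in the report)
        "# enable ACLs for the volume's nfs-ganesha export (manual step)",
        # step 4
        f"gluster volume quota {vol} enable",
        f"gluster volume quota {vol} limit-usage / 100GB",
        # step 5
        f"mount -t nfs -o vers=4 {hosts[0]}:/{vol} {mount_point}",
        "# start creating data under the mount point",
        # step 6, while the data creation is still running
        f"gluster volume add-brick {vol} {' '.join(extra)}",
        f"gluster volume rebalance {vol} start",
    ]


if __name__ == "__main__":
    for cmd in repro_commands():
        print(cmd)
```

The list is kept as plain strings so it can be reviewed and adjusted for the actual cluster before anything is executed.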

Actual results:
Data creation hangs because nfs-ganesha has coredumped:

(gdb) bt
#0  0x00007fef17b045d7 in raise () from /lib64/libc.so.6
#1  0x00007fef17b05cc8 in abort () from /lib64/libc.so.6
#2  0x00007fef17b44e07 in __libc_message () from /lib64/libc.so.6
#3  0x00007fef17b4c1fd in _int_free () from /lib64/libc.so.6
#4  0x00000000004f9ae5 in gsh_free ()
#5  0x00000000004f9eb3 in nfs4_ace_free ()
#6  0x00000000004f9ee7 in nfs4_acl_free ()
#7  0x00000000004faca5 in nfs4_acl_release_entry ()
#8  0x00000000004d9a0e in cache_inode_refresh_attrs ()
#9  0x00000000004dcc69 in cache_inode_lock_trust_attrs ()
#10 0x000000000047015a in cache_inode_get_changeid4 ()
#11 0x00000000004735b8 in nfs4_op_open ()
#12 0x000000000045eab5 in nfs4_Compound ()
#13 0x0000000000453a01 in nfs_rpc_execute ()
#14 0x00000000004545ad in worker_run ()
#15 0x000000000050afeb in fridgethr_start_routine ()
#16 0x00007fef1809fdf5 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fef17bc51ad in clone () from /lib64/libc.so.6


Expected results:
NFS-ganesha should not crash during the operations mentioned above.

Additional info:

Comment 3 Jiffin 2015-09-14 06:16:07 UTC
I cannot reproduce the issue in my setup using the steps mentioned above. However, I get the same bt and a similar problem only when the quota limit is exceeded. I ran it twice and it happened once.

Comment 5 Saurabh 2015-09-14 10:58:12 UTC
I was able to see the same coredump again with steps similar to those mentioned in the description section.

Comment 6 Jiffin 2015-09-16 08:39:50 UTC
I have not done the RCA for this bug yet. However, if I include the latest acl-related changes (already merged upstream), it is no longer reproducible. So we may need to backport two more ganesha patches to downstream:

https://review.gerrithub.io/#/c/236924/ (BZ1251471)
https://review.gerrithub.io/#/c/240757/ (BZ1242148)

These bugs were deferred to 3.1.2.

Comment 9 Saurabh 2015-10-26 07:11:47 UTC
Tested on nfs-ganesha-2.2.0-10.el7rhgs.x86_64 and glusterfs-3.7.5-0.3.el7rhgs.x86_64 with similar steps.

Comment 13 errata-xmlrpc 2016-03-01 05:35:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html