Description of problem:
I executed the cthon lock test with vers=3 and the nfs-ganesha process crashed. The volume also has ACLs enabled.

[root@nfs11 ~]# gluster volume status vol4
Status of volume: vol4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.8:/rhs/brick1/d1r14          49156     0          Y       16720
Brick 10.70.46.27:/rhs/brick1/d1r24         49155     0          Y       31744
Brick 10.70.46.25:/rhs/brick1/d2r14         49157     0          Y       30081
Brick 10.70.46.29:/rhs/brick1/d2r24         49156     0          Y       22951
Brick 10.70.46.8:/rhs/brick1/d3r14          49157     0          Y       16738
Brick 10.70.46.27:/rhs/brick1/d3r24         49156     0          Y       31762
Brick 10.70.46.25:/rhs/brick1/d4r14         49158     0          Y       30099
Brick 10.70.46.29:/rhs/brick1/d4r24         49157     0          Y       22969
Brick 10.70.46.8:/rhs/brick1/d5r14          49158     0          Y       16756
Brick 10.70.46.27:/rhs/brick1/d5r24         49157     0          Y       31780
Brick 10.70.46.25:/rhs/brick1/d6r14         49159     0          Y       30117
Brick 10.70.46.29:/rhs/brick1/d6r24         49158     0          Y       22987
Self-heal Daemon on localhost               N/A       N/A        Y       10581
Quota Daemon on localhost                   N/A       N/A        Y       22205
Self-heal Daemon on 10.70.46.25             N/A       N/A        Y       21878
Quota Daemon on 10.70.46.25                 N/A       N/A        Y       31886
Self-heal Daemon on 10.70.46.27             N/A       N/A        Y       24236
Quota Daemon on 10.70.46.27                 N/A       N/A        Y       2719
Self-heal Daemon on 10.70.46.29             N/A       N/A        Y       14763
Quota Daemon on 10.70.46.29                 N/A       N/A        Y       26234
Self-heal Daemon on 10.70.46.22             N/A       N/A        Y       1465
Quota Daemon on 10.70.46.22                 N/A       N/A        Y       15541
Self-heal Daemon on 10.70.46.39             N/A       N/A        Y       20442
Quota Daemon on 10.70.46.39                 N/A       N/A        Y       1841

Task Status of Volume vol4
------------------------------------------------------------------------------
There are no active volume tasks

Version-Release number of selected component (if applicable):
glusterfs-3.7.1-9.el6rhs.x86_64
nfs-ganesha-2.2.0-5.el6rhs.x86_64

How reproducible:
Happened for the first time.

Steps to Reproduce:
1. Create a volume of type 6x2 and start it.
2. Configure nfs-ganesha and enable ACLs for the volume.
3. Execute the cthon lock test with vers=3 using the command below.
time ./server -l -o vers=3 -p /vol4 -m /mnt -N 3 <host-IP>

Actual results:
The -N 3 option runs the lock test three times in a loop. The first two passes succeeded, but on the third pass an UNLOCK operation failed because the nfs-ganesha process crashed:

(gdb) bt
#0  0x00000000004939fa in nlm_send_async ()
#1  0x00000000004950dc in nlm4_send_grant_msg ()
#2  0x000000000049f1d8 in state_async_func_caller ()
#3  0x000000000050d836 in fridgethr_start_routine ()
#4  0x0000003e96a07a51 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003e966e896d in clone () from /lib64/libc.so.6

PS: no failover was triggered.

Expected results:
The cthon lock test with vers=3 should pass.

Additional info:
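For reference, step 2 of the reproduction (enabling ACLs for the exported volume) is done in the EXPORT block of the ganesha configuration in nfs-ganesha 2.2. A minimal sketch follows; the Export_Id, Pseudo path, and hostname here are illustrative assumptions, not values taken from this report:

```
EXPORT {
    Export_Id = 4;                  # assumed ID; must be unique per export
    Path = "/vol4";
    Pseudo = "/vol4";
    Access_Type = RW;
    Disable_ACL = false;            # enable ACL support for this export
    FSAL {
        Name = GLUSTER;
        hostname = "10.70.46.8";    # any server node of the trusted pool
        volume = "vol4";
    }
}
```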
The issue is seen even if ACLs are disabled.
Created attachment 1051659 [details] nfs11 nfs-ganesha coredump
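The attached coredump can be inspected with gdb once the matching debuginfo packages are installed; without them the frames resolve only to bare addresses, as in the backtrace above. A rough sketch (the binary path and core file name are assumptions, and exact debuginfo package names may vary):

```
# debuginfo-install nfs-ganesha glusterfs      <- from yum-utils
# gdb /usr/bin/ganesha.nfsd /path/to/coredump
(gdb) bt full
(gdb) thread apply all bt
```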
I ran the cthon lock tests on two machines about 10 times in a loop. Haven't seen the crash.

[root@clus1 ~]# showmount -e localhost
Export list for localhost:
/vol1 (everyone)
/vol2 (everyone)
[root@clus1 ~]#
[root@clus1 ~]# getenforce
Enforcing
[root@clus1 ~]#

[root@dhcp42-219 cthon04]# ./server -l -p /vol1 -m /tmp/mnt 10.70.42.141 -N 10
...........
.........
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Wrote and read 256 KB file 10 times; [7420.29 +/- 51.49 KB/s].
Parent: 14.1 - F_ULOCK [ 0, ENDING] PASSED.
Test #15 - Test 2nd open and I/O after lock and close.
Parent: Second open succeeded.
Parent: 15.0 - F_LOCK  [ 0, ENDING] PASSED.
Parent: 15.1 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Closed testfile.
Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
Parent: 15.2 - COMPARE [ 0, b] PASSED.
** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).
Congratulations, you passed the locking tests!
All tests completed

Requested a setup from QE to reproduce and further analyse the bug. Meanwhile, trying to get all the required debuginfo packages to analyse the core.
Tried the following on the QE setup:
* ran the cthon tests about 6 times on 2 different volumes, using two different clients, multiple times
* restarted nfs-ganesha, statd, and other services in between
But was unable to reproduce the issue.
Closing this bug as the issue doesn't seem reproducible. We shall check whether the nfs-ganesha debuginfo package is appropriate in a separate bug.
Bug 1244792 is used for tracking and fixing the debuginfo problem. *** This bug has been marked as a duplicate of bug 1257957 ***