Bug 1379962 - Ganesha crashes with segfault while accessing files from Windows client.
Summary: Ganesha crashes with segfault while accessing files from Windows client.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nfs-ganesha
Version: rhgs-3.2
Hardware: x86_64
OS: Windows
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Soumya Koduri
QA Contact: surabhi
URL:
Whiteboard:
Depends On: 1378089
Blocks: 1351528
Reported: 2016-09-28 09:20 UTC by Shashank Raj
Modified: 2017-03-23 06:23 UTC
CC List: 11 users

Fixed In Version: nfs-ganesha-2.4.1-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1378089
Environment:
Last Closed: 2017-03-23 06:23:49 UTC
Embargoed:


Links
System ID: Red Hat Product Errata RHEA-2017:0493
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Gluster Storage 3.2.0 nfs-ganesha bug fix and enhancement update
Last Updated: 2017-03-23 09:19:13 UTC

Description Shashank Raj 2016-09-28 09:20:36 UTC
+++ This bug was initially created as a clone of Bug #1378089 +++

Description of problem:

Ganesha crashes with segfault while accessing files from Windows client.

Version-Release number of selected component (if applicable):

[root@dhcp43-116 /]# rpm -qa|grep ganesha
glusterfs-ganesha-3.8.3-0.6.git7956718.el7.centos.x86_64
nfs-ganesha-gluster-2.4-0.rc4.el7.centos.x86_64
nfs-ganesha-debuginfo-2.4-0.rc4.el7.centos.x86_64
nfs-ganesha-2.4-0.rc4.el7.centos.x86_64

How reproducible:

Consistent

Steps to Reproduce:
1. Create an nfs-ganesha cluster.
2. Create a volume and enable ganesha on it.
3. Mount the volume on a Linux client and create 10000 files of 100 KB each.
4. Mount the volume on a Windows client and try accessing the files inside the NFS share.
5. Observe that ganesha crashes with a segmentation fault and the backtrace below:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f3022ec4700 (LWP 10382)]
0x00007f309d6b08fd in glusterfs_reopen2 (obj_hdl=0x7f2fd0007498, state=0x0, 
    openflags=1)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:1953
1953		old_openflags = my_share_fd->openflags;
(gdb) bt
#0  0x00007f309d6b08fd in glusterfs_reopen2 (obj_hdl=0x7f2fd0007498, state=0x0, 
    openflags=1)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:1953
#1  0x000000000053217e in mdcache_reopen2 (obj_hdl=0x7f2fd0002ae8, state=0x0, 
    openflags=1)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_file.c:779
#2  0x000000000043329c in fsal_reopen2 (obj=0x7f2fd0002ae8, state=0x0, 
    openflags=1, check_permission=true)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/fsal_helper.c:1842
#3  0x00000000004b9e5e in state_nlm_share2 (obj=0x7f2fd0002ae8, share_access=1, 
    share_deny=0, owner=0x7f2f98009560, state=0x0, reclaim=false, unshare=false)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/SAL/state_share.c:805
#4  0x00000000004ba6eb in state_nlm_share (obj=0x7f2fd0002ae8, share_access=1, 
    share_deny=0, owner=0x7f2f98009560, state=0x0, reclaim=false)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/SAL/state_share.c:894
#5  0x000000000049547d in nlm4_Share (args=0x7f2f7c006128, req=0x7f2f7c005f68, 
    res=0x7f2f98007dd0)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/Protocols/NLM/nlm_Share.c:122
#6  0x000000000044ad6b in nfs_rpc_execute (reqdata=0x7f2f7c005f40)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1281
#7  0x000000000044b625 in worker_run (ctx=0x217b9e0)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1548
#8  0x000000000050079f in fridgethr_start_routine (arg=0x217b9e0)
    at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/support/fridgethr.c:550
#9  0x00007f30a002cdc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f309f6ec1cd in clone () from /lib64/libc.so.6

Actual results:

Ganesha crashes with segfault while accessing files from Windows client.

Expected results:

Ganesha should not crash.

Additional info:

Core file will be attached.

--- Additional comment from Shashank Raj on 2016-09-21 09:14:52 EDT ---

Core file can be accessed at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1378089/

--- Additional comment from Soumya Koduri on 2016-09-21 09:27:51 EDT ---

(gdb) bt
#0  0x00007f1a18db28fd in glusterfs_reopen2 (obj_hdl=0x7f1929e13ec8, state=0x0, openflags=1) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:1953
#1  0x000000000053217e in mdcache_reopen2 (obj_hdl=0x7f1929e14288, state=0x0, openflags=1) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_file.c:779
#2  0x000000000043329c in fsal_reopen2 (obj=0x7f1929e14288, state=0x0, openflags=1, check_permission=true) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/FSAL/fsal_helper.c:1842
#3  0x00000000004b9e5e in state_nlm_share2 (obj=0x7f1929e14288, share_access=1, share_deny=0, owner=0x7f1910006960, state=0x0, reclaim=false, unshare=false) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/SAL/state_share.c:805
#4  0x00000000004ba6eb in state_nlm_share (obj=0x7f1929e14288, share_access=1, share_deny=0, owner=0x7f1910006960, state=0x0, reclaim=false) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/SAL/state_share.c:894
#5  0x000000000049547d in nlm4_Share (args=0x7f195c186218, req=0x7f195c186058, res=0x7f1910006030) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/Protocols/NLM/nlm_Share.c:122
#6  0x000000000044ad6b in nfs_rpc_execute (reqdata=0x7f195c186030) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1281
#7  0x000000000044b625 in worker_run (ctx=0x139d0d0) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1548
#8  0x000000000050079f in fridgethr_start_routine (arg=0x139d0d0) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/support/fridgethr.c:550
#9  0x00007f1a1b72edc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f1a1adee1cd in clone () from /lib64/libc.so.6
(gdb) p my_share_fd
$1 = (struct glusterfs_fd *) 0x110
(gdb) l
1948	#endif
1949	
1950		/* This can block over an I/O operation. */
1951		PTHREAD_RWLOCK_wrlock(&obj_hdl->lock);
1952	
1953		old_openflags = my_share_fd->openflags;
1954	
1955		/* We can conflict with old share, so go ahead and check now. */
1956		status = check_share_conflict(&myself->share, openflags, false);
1957	
(gdb) 
1958		if (FSAL_IS_ERROR(status)) {
1959			PTHREAD_RWLOCK_unlock(&obj_hdl->lock);
1960			return status;
1961		}
1962	
1963		/* Set up the new share so we can drop the lock and not have a
1964		 * conflicting share be asserted, updating the share counters.
1965		 */
1966		update_share_counters(&myself->share, old_openflags, openflags);
1967	
(gdb) f 3
#3  0x00000000004b9e5e in state_nlm_share2 (obj=0x7f1929e14288, share_access=1, share_deny=0, owner=0x7f1910006960, state=0x0, reclaim=false, unshare=false) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/SAL/state_share.c:805
805		fsal_status = fsal_reopen2(obj, state, openflags, true);
(gdb) l
800			openflags |= FSAL_O_RECLAIM;
801	
802		/* Use reopen2 to open or re-open the file and check for share
803		 * conflict.
804		 */
805		fsal_status = fsal_reopen2(obj, state, openflags, true);
806	
807		if (FSAL_IS_ERROR(fsal_status)) {
808			LogDebug(COMPONENT_STATE,
809				 "fsal_reopen2 failed with %s",
(gdb) f 5
#5  0x000000000049547d in nlm4_Share (args=0x7f195c186218, req=0x7f195c186058, res=0x7f1910006030) at /usr/src/debug/nfs-ganesha-2.4-rc4-0.1.1-Source/Protocols/NLM/nlm_Share.c:122
122		state_status = state_nlm_share(obj,
(gdb) l
117				 "REQUEST RESULT: nlm4_Share %s",
118				 lock_result_str(res->res_nlm4share.stat));
119			return NFS_REQ_OK;
120		}
121	
122		state_status = state_nlm_share(obj,
123					       arg->share.access,
124					       arg->share.mode,
125					       nlm_owner,
126					       nlm_state,
(gdb) p nlm_state
$2 = (state_t *) 0x0
(gdb) p rc
$3 = -1
(gdb) 



I guess the fix is to bail out in nlm4_Share when there is an error (rc != 0), so that the NULL nlm_state is never passed down to state_nlm_share.
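
The guard proposed above can be sketched in isolation as follows. This is only a minimal model, not the patch that was posted: every name in it (share_state, process_share_params, apply_share, handle_share) is hypothetical and none of them are nfs-ganesha APIs. It also takes the comment's reading at face value, namely that a non-zero rc means the parameter-processing step failed and left the state pointer NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not nfs-ganesha types. */
struct share_state {
	int openflags;
};

enum req_result { REQ_OK = 0, REQ_DENIED = 1 };

/* Models the parameter-processing step. Assumption: on failure it
 * returns a non-zero rc and leaves *state_out set to NULL.
 */
int process_share_params(int simulate_failure,
			 struct share_state *storage,
			 struct share_state **state_out)
{
	if (simulate_failure) {
		*state_out = NULL;
		return -1;
	}
	storage->openflags = 0;
	*state_out = storage;
	return 0;
}

/* Stands in for the callee that reads through the state pointer,
 * like the read at handle.c:1953 in the backtrace. The assert marks
 * the spot where a NULL state would otherwise cause a crash.
 */
int apply_share(struct share_state *state, int openflags)
{
	int old_openflags;

	assert(state != NULL);
	old_openflags = state->openflags;
	state->openflags = openflags;
	return old_openflags;
}

/* Request handler with the proposed early return. */
enum req_result handle_share(int simulate_failure, int openflags)
{
	struct share_state storage;
	struct share_state *state = NULL;
	int rc;

	rc = process_share_params(simulate_failure, &storage, &state);
	if (rc != 0)
		return REQ_DENIED; /* bail out before state is used */

	(void)apply_share(state, openflags);
	return REQ_OK;
}
```

In this model, handle_share(1, 1) returns REQ_DENIED without ever reaching apply_share, while handle_share(0, 1) takes the normal path and returns REQ_OK. Whether rc alone is the right condition to test in the real handler is for the upstream review to decide.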

Comment 2 Soumya Koduri 2016-09-29 11:13:52 UTC
Patch posted upstream for review - https://review.gerrithub.io/296399

Comment 6 surabhi 2016-11-25 07:28:40 UTC
Executed the following on the latest ganesha setup:

1. Create an nfs-ganesha cluster.
2. Create a volume and enable ganesha on it.
3. Mount the volume on a Linux client and create 10000 files of 100 KB each.
4. Mount the volume on a Windows client and try accessing the files inside the NFS share.

No crash is seen and the operations succeed.
Also tried with multiple Windows clients executing multiple operations such as file creates, directory creates, and rcopy. No issues seen.

Moving the BZ to verified.

nfs-ganesha-2.4.1-1.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-1.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64

Comment 8 errata-xmlrpc 2017-03-23 06:23:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0493.html

