Bug 981612

Summary: smbd generates continuous core files when FSCT is run
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Ujjwala <ujjwala>
Component: samba Assignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA QA Contact: Sudhir D <sdharane>
Severity: high Docs Contact:
Priority: high    
Version: 2.1 CC: crh, sdharane, vagarwal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0.15rhs Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-09-23 22:32:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 956495    

Description Ujjwala 2013-07-05 09:28:11 UTC
Description of problem:
When the FSCT tool is run against the Samba-VFS share, smbd generates core files continuously until the root filesystem (/) is full.

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.12rhs.beta1 built on Jun 28 2013 06:41:37


How reproducible:
3/3

Steps to Reproduce:
1. On a 4-node cluster, create a distribute volume.
2. Create a Samba-VFS share in /etc/samba/smb.conf.
3. Do the FSCT test setup on one of the nodes for 100 users.
4. Run the FSCT test from a Windows client.
5. Within a few minutes, smbd starts generating core files in /var/log/samba/cores/smbd until / is full.
6. The FSCT test exits with a failure.

Actual results:


Expected results:
Core files should not be created and the FSCT tests should run fine.

Additional info:

Comment 4 Ujjwala 2013-08-01 09:02:09 UTC
I was able to reproduce the issue on the latest samba-glusterfs-3.6.9-156.1 build.

# glusterfs -V
glusterfs 3.4.0.13rhs built on Jul 28 2013 15:22:54
# smbd -V
Version 3.6.9-156.1.el6rhs

Comment 7 Raghavendra Talur 2013-08-01 13:35:45 UTC
My mistake, 

I had told Ujjwala that glusterd restarts automatically whenever a change is made in glusterd.vol. I was wrong.


So using the same workaround that Avati had given for allowing clients to connect using non-privileged ports should work. We have to remember to restart glusterd after the change is made in glusterd.vol.

I am removing the test blocker keyword. We will run the fsct test again tomorrow.

Comment 8 Raghavendra Talur 2013-08-01 21:41:12 UTC
Posted patch to handle this gracefully even if workaround is not used.

https://code.engineering.redhat.com/gerrit/#/c/11037/

Comment 9 Christopher R. Hertel 2013-08-06 20:19:46 UTC
The underlying problem is that, by default, the GlusterFS server daemon only accepts connections from Gluster clients that use trusted port numbers (those below 1024). Samba, however, generates a lot of processes, so a lot of ports are needed in order to connect via libgfapi to the Gluster server. The system quickly runs out of trusted ports, and the Gluster server will not allow additional clients to connect.
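
To make the exhaustion concrete, here is a rough model of the situation described above. It is not GlusterFS code: the function names, the client count, and the set of already-busy ports are all made up for illustration. The only fact taken from this report is that trusted ports are those below 1024.

```python
# Toy model only: names and numbers here are hypothetical, not GlusterFS internals.
TRUSTED_PORT_LIMIT = 1024  # ports below this value count as "trusted"


def is_trusted(port):
    """Return True if the port lies in the trusted range 1..1023."""
    return 0 < port < TRUSTED_PORT_LIMIT


def connect_clients(n_clients, ports_in_use=()):
    """Give each client a free trusted port; reject clients once none are left."""
    busy = set(ports_in_use)
    # Free trusted ports, highest first.
    free = [p for p in range(TRUSTED_PORT_LIMIT - 1, 0, -1) if p not in busy]
    accepted = {}
    rejected = []
    for client in range(n_clients):
        if free:
            accepted[client] = free.pop(0)
        else:
            # No trusted port left: a server that insists on trusted
            # ports would refuse this client.
            rejected.append(client)
    return accepted, rejected


# Assumed example: 1100 client connections, three ports already taken.
accepted, rejected = connect_clients(1100, ports_in_use={22, 139, 445})
print(len(accepted), "accepted,", len(rejected), "rejected")
```

However the ports are handed out, the ceiling is fixed: at most 1023 connections from one host can come from trusted ports, fewer once other services hold some of them.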

The core dump is due to libgfapi not checking for a NULL return value when the server fails to accept the connection because the calling port number is not within the trusted range.  The fix referenced in comment #8 handles the NULL return value, but does not provide a mechanism for ensuring that connections are made.  The connection will still fail, but smbd will not crash.
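
The difference between the crashing and the fixed behaviour can be sketched as follows. This is an analogy in Python, not the actual libgfapi patch: `open_volume`, `Volume`, and both `stat_*` helpers are hypothetical names, and Python's `None` stands in for the NULL return value.

```python
# Analogy only: these names are hypothetical and are not libgfapi's API.
class Volume:
    """Stand-in for a handle to a successfully connected volume."""

    def __init__(self, name):
        self.name = name

    def stat(self, path):
        return f"{self.name}:{path}"


def open_volume(name, client_port):
    """Return a Volume, or None when the server rejects the client port."""
    if client_port >= 1024:  # not a trusted port, so the connection is refused
        return None
    return Volume(name)


def stat_unchecked(name, client_port, path):
    """Buggy pattern: uses the handle without checking it."""
    vol = open_volume(name, client_port)
    return vol.stat(path)  # raises AttributeError when vol is None


def stat_checked(name, client_port, path):
    """Fixed pattern: the failure is reported to the caller instead."""
    vol = open_volume(name, client_port)
    if vol is None:
        return None  # connection still failed, but nothing crashes
    return vol.stat(path)


print(stat_checked("testvol", 50000, "/file"))  # rejected connection, handled
print(stat_checked("testvol", 1000, "/file"))   # trusted port, succeeds
```

As noted above, the checked version only turns the crash into an error the caller can handle; the rejected connection is still rejected.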

The work-around allowing clients to connect using non-privileged ports should work and should be applied for all connections via localhost.  We should also verify that the SO_REUSEADDR option is set on the client ports (though I believe that this is already the case).
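
For reference, checking SO_REUSEADDR on a socket looks roughly like this. The sketch uses Python's standard socket module on a freshly created socket; it does not inspect the sockets that libgfapi opens, and the helper name `reuseaddr_enabled` is made up here.

```python
import socket


def reuseaddr_enabled(sock):
    """Return True if SO_REUSEADDR is set on the given socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    before = reuseaddr_enabled(s)
    # Enable the option, as a client would do before bind().
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    after = reuseaddr_enabled(s)

print("before:", before, "after:", after)
```

Verifying the real client ports would need the same getsockopt() query made from inside the client process, or a read of the code that creates those sockets.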

Comment 10 Ujjwala 2013-08-07 07:05:19 UTC
Tested it on the following build -
glusterfs 3.4.0.15rhs built on Aug  4 2013 22:34:15
samba-glusterfs-3.6.9-156.2.el6rhs.x86_64
I don't see any smbd core files.

Comment 13 Scott Haines 2013-09-23 22:32:16 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html