Description of problem:
When the FSCT tool is run on the Samba-VFS share, smbd generates core files continuously until / is filled.

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.12rhs.beta1 built on Jun 28 2013 06:41:37

How reproducible:
3/3

Steps to Reproduce:
1. On a 4 node cluster, create a distribute volume.
2. Create a Samba-VFS share in /etc/samba/smb.conf.
3. Do the FSCT test setup on one of the nodes for 100 users.
4. Run the FSCT test from the Windows client.
5. Within a few minutes, smbd starts generating core files in /var/log/samba/cores/smbd until / is filled.
6. The FSCT test exits as failed.

Actual results:

Expected results:
Core files should not be created and the FSCT tests should run fine.

Additional info:
I was able to recreate the issue on the latest samba-glusterfs-3.6.9-156.1 build.

# glusterfs -V
glusterfs 3.4.0.13rhs built on Jul 28 2013 15:22:54
# smbd -V
Version 3.6.9-156.1.el6rhs
My mistake: I had told Ujjwala that glusterd restarts automatically whenever a change is made in glusterd.vol. I was wrong. The workaround that Avati gave for allowing clients to connect using non-privileged ports should therefore work, as long as we remember to restart glusterd after making the change in glusterd.vol. I am removing the test blocker keyword. We will run the FSCT test again tomorrow.
Posted a patch to handle this gracefully even if the workaround is not used. https://code.engineering.redhat.com/gerrit/#/c/11037/
The underlying problem is that, by default, the GlusterFS server daemon only accepts client connections that originate from trusted port numbers (those below 1024). Samba, however, spawns many smbd processes, and each of them needs its own ports in order to connect to the Gluster server via libgfapi. The system quickly runs out of trusted ports, and the Gluster server will not allow additional clients to connect.

The core dump is due to libgfapi not checking for a NULL return value when the server fails to accept the connection because the calling port number is not within the trusted range. The fix referenced in comment #8 handles the NULL return value, but does not provide a mechanism for ensuring that connections are made. The connection will still fail, but smbd will not crash.

The workaround of allowing clients to connect using non-privileged ports should work and should be applied for all connections via localhost. We should also verify that the SO_REUSEADDR option is set on the client ports (though I believe that this is already the case).
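
To illustrate the failure mode described above, here is a minimal sketch in Python. It is a simplified model, not the libgfapi, GlusterFS, or Samba code: the function names, the downward scan from port 1023, and the use of None in place of a NULL return are all assumptions made for illustration. It shows how a pool of ports below 1024 is exhausted once enough connections are requested, and how checking the failed result lets the caller fail the connection instead of crashing.

```python
# Simplified model only: every name here is hypothetical, not a GlusterFS API.
PRIVILEGED_CEILING = 1024  # ports below this value are treated as "trusted"


def pick_privileged_port(in_use, ceiling=PRIVILEGED_CEILING):
    """Return a free port below `ceiling`, scanning downward, or None."""
    for port in range(ceiling - 1, 0, -1):
        if port not in in_use:
            return port
    return None  # stands in for the NULL return discussed above


def connect_client(in_use):
    """Model one client connection; return its source port or None."""
    port = pick_privileged_port(in_use)
    if port is None:
        # Graceful path: report the failure rather than using a missing handle.
        return None
    in_use.add(port)
    return port


def simulate(connections):
    """Request `connections` connections; return (succeeded, failed) counts."""
    in_use = set()
    succeeded = failed = 0
    for _ in range(connections):
        if connect_client(in_use) is None:
            failed += 1
        else:
            succeeded += 1
    return succeeded, failed
```

In this model at most 1023 connections can succeed, so simulate(1500) reports 1023 successes and 477 failures. On a real system the limit is lower, since other services already occupy some of the ports below 1024.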
Tested it on the following build:

glusterfs 3.4.0.15rhs built on Aug 4 2013 22:34:15
samba-glusterfs-3.6.9-156.2.el6rhs.x86_64

I don't see any smbd core files.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1262.html