Bug 207289 - GFS and samba problem when accessing one file simultaneously
Product: Fedora
Classification: Fedora
Component: GFS
Hardware: x86_64 Linux
Priority: medium, Severity: high
Assigned To: Abhijith Das
Reported: 2006-09-20 10:55 EDT by sandra
Modified: 2008-05-06 12:22 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-05-06 12:22:37 EDT
Description sandra 2006-09-20 10:55:58 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20060426 Firefox/

Description of problem:

We have two Fedora 5 servers clustered with GFS. We installed Samba and exported the same shares from both of them.
All went fine at first, with people accessing their own files, but for some programs (Minitab, MATLAB, ...) several people need to access the same file at once. Then Samba begins to fail and the clients hang. To recover, we have to restart the Samba service. We tried putting the shares on a filesystem without GFS and everything works: people can access the same file simultaneously without problems.

This is weird behaviour because the shares are exported from both servers, but the simultaneous accesses only go through the first server; the other server exports the shares too but isn't used by those clients.

I don't know how to debug this problem to see what is happening. It seems to be something related to GFS and Samba.
I have seen mails from people with Samba+GFS problems, but we aren't using the same configuration, and the GFS RPMs are up to date:
Any help will be greatly appreciated.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Log in from four or five Windows XP clients.
2. Try to run the Minitab program on all of them simultaneously.
3. Samba and the PCs hang.

The Minitab program is on a GFS share; if we put it on a share without GFS, all goes well.

Actual Results:
Samba hangs.
Minitab hangs as well.

Expected Results:
The file is accessed without problems

Additional info:
The same happens with other programs (MATLAB).
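
One way to narrow this down without Samba or Windows clients in the picture is to imitate the access pattern directly on the GFS mount. The following is only a minimal sketch under stated assumptions: the function names are made up for illustration, it uses threads inside one process rather than separate smbd processes, and it takes a plain shared advisory lock (LOCK_SH) as a stand-in. It does not replicate the exact flock flags (0x60) that smbd uses in the strace from Comment 1.

```python
import fcntl
import os
import threading
import time


def open_and_lock(path, hold_seconds, results, index):
    """Open path read-only, take a shared flock, hold it briefly, release it."""
    started = time.monotonic()
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH)  # blocks until the lock is granted
        time.sleep(hold_seconds)
        fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    # Only reached if open, lock and unlock all returned.
    results[index] = time.monotonic() - started


def run_clients(path, clients=5, hold_seconds=0.1, timeout=10.0):
    """Return per-client elapsed seconds; None marks a client that did not finish."""
    results = [None] * clients
    threads = [
        threading.Thread(
            target=open_and_lock,
            args=(path, hold_seconds, results, i),
            daemon=True,
        )
        for i in range(clients)
    ]
    for t in threads:
        t.start()
    deadline = time.monotonic() + timeout
    for t in threads:
        t.join(max(0.0, deadline - time.monotonic()))
    return results
```

Calling run_clients() on a file inside the GFS share and then on a copy of it on a non-GFS filesystem gives a rough comparison: a None entry means that client was still stuck in open or flock when the timeout expired. If this sketch never hangs on GFS, that does not clear GFS, since smbd's real locking pattern differs.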
Comment 1 sandra 2006-10-09 10:52:35 EDT
I attach more information (strace) about the Samba hangs.
It seems to be exclusively a GFS (not Samba) problem, and it happens when 4 or
more users access the same information simultaneously.

I tried "strace -f -ttT -o /tmp/smbd.out -p <smbd-pid>" to find out what is
happening, and it seems that system calls like write, open and flock never finish
until Samba is restarted.

4665  11:09:31.068381 kill(4666, SIG_0 <unfinished ...>
4665  11:09:31.068750 <... kill resumed> ) = -1 EPERM (Operation not permitted)
4665  11:09:31.068996 kill(4665, SIG_0 <unfinished ...>
4665  11:09:31.069260 <... kill resumed> ) = 0 <0.000205>
4665  11:09:31.069458 kill(4667, SIG_0 <unfinished ...>
4665  11:09:31.069617 <... kill resumed> ) = 0 <0.000099>
4665  11:09:31.069781 open("cint95-intel.mtw", O_RDONLY|O_LARGEFILE <unfinished ...>
4665  11:09:31.070150 <... open resumed> ) = 22 <0.000293>
4665  11:09:31.070396 geteuid32( <unfinished ...>
4665  11:09:31.070649 <... geteuid32 resumed> ) = 503 <0.000195>
4665  11:09:31.070937 write(19, "prova03 opened file cint95-intel"..., 67 <unfinished ...>
4665  11:09:31.071282 <... write resumed> ) = 67 <0.000261>
4665  11:09:31.071511 flock(22, 0x60 /* LOCK_??? */ <unfinished ...>
4665  11:09:31.071770 <... flock resumed> ) = 0 <0.000197>
4665  11:09:31.072127 write(5, "\0\0\0g\377SMB\242\0\0\0\0\210\1\310\0\0\0\0\0\0\0\0\0"..., 107 <unfinished ...>
4665  11:09:31.072447 <... write resumed> ) = 107 <0.000212>
4665  11:09:31.242316 <... geteuid32 resumed> ) = 503 <0.000118>
4665  11:09:31.242405 write(19, "close fd=22 fnum=6371 (numopen=2"..., 34) = 34
4665  11:09:31.242572 nanosleep({0, 2000001},  <unfinished ...>
4667  11:09:31.245063 kill(4665, SIG_0) = 0 <0.000018>
4665  11:09:31.248047 <... nanosleep resumed> NULL) = 0 <0.005406>
4665  11:09:31.249355 nanosleep({0, 2000001}, NULL) = 0 <0.002621>
4665  11:09:31.252091 nanosleep({0, 2000001}, NULL) = 0 <0.003853>
4665  11:09:31.256088 nanosleep({0, 2000001}, NULL) = 0 <0.003906>
.................. a lot of nanosleeps ..............................
4665  11:10:04.887037 nanosleep({0, 2000001},  <unfinished ...>
4665  11:10:04.887219 <... nanosleep resumed> 0) = ? ERESTART_RESTARTBLOCK (To be restarted) <0.000111>
4665  11:10:04.888197 +++ killed by SIGKILL +++
4667  11:10:04.890712 kill(4665, SIG_0 <unfinished ...>
4666  11:10:04.920965 kill(4665, SIG_0) = -1 ESRCH (No such process) <0.000017>
4667  11:10:04.934486 kill(4665, SIG_0 <unfinished ...> 
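
In a long "strace -f -ttT" log it is hard to spot by eye which calls were started and never came back. A small script can pair each "<unfinished ...>" record with its "<... name resumed>" record per PID and report what is left over. This is a rough sketch, not a general strace parser: pending_calls is a hypothetical helper name, it assumes one record per line (records wrapped across two lines, as in a pasted excerpt, would be missed), and it only understands the three record shapes visible in the trace above.

```python
import re

# "PID  HH:MM:SS.usec name(args <unfinished ...>"
UNFINISHED = re.compile(
    r"^(\d+)\s+(\d{2}:\d{2}:\d{2}\.\d+)\s+(\w+)\(.*<unfinished \.\.\.>\s*$"
)
# "PID  HH:MM:SS.usec <... name resumed> ..."
RESUMED = re.compile(r"^(\d+)\s+\d{2}:\d{2}:\d{2}\.\d+\s+<\.\.\. (\w+) resumed>")
# "PID  HH:MM:SS.usec +++ killed by SIGKILL +++"
EXITED = re.compile(r"^(\d+)\s+\S+\s+\+\+\+ .* \+\+\+\s*$")


def pending_calls(lines):
    """Map pid -> (timestamp, syscall) for calls that never resumed."""
    pending = {}
    for line in lines:
        m = UNFINISHED.match(line)
        if m:
            pid, stamp, name = m.groups()
            pending[pid] = (stamp, name)
            continue
        m = RESUMED.match(line)
        if m:
            pid, name = m.groups()
            # Clear the entry only if it is the call we saw start.
            if pending.get(pid, (None, None))[1] == name:
                del pending[pid]
            continue
        m = EXITED.match(line)
        if m:
            # A process that has exited is no longer waiting on anything.
            pending.pop(m.group(1), None)
    return pending
```

With the output file from the strace command above, usage could look like: open "/tmp/smbd.out", pass the file object to pending_calls(), and print the result. Any PID still listed at the end was inside that system call when the trace stopped, which is where to look first.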
Comment 2 Abhijith Das 2006-11-27 18:00:11 EST
Hi Sandra,
I believe you were able to get past this issue with the CVS RHEL4 codebase on
RHES. (http://www.redhat.com/archives/linux-cluster/2006-October/msg00291.html)
Can you please verify that it works for you on Fedora as well, so I can close
this bugzilla? If not, we need to find a solution to this.

Comment 3 sandra 2006-11-28 07:40:10 EST
Hi Abhi,

I finally compiled CVS RHEL4 for Fedora 5, but it was impossible to make ccsd
work. Perhaps this is because the Fedora 5 installation that we have for testing
had GFS versions previously installed, and they were interfering with the new
installation. I was unable to make it work even with ccs in debug mode, so I
gave up on it. In fact, we are asking spain@redhat.com about RHEL academic
license + GFS offers.
Best Regards,

Sandra Hernández
Comment 4 Bug Zapper 2008-04-03 23:47:58 EDT
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers
Comment 5 Bug Zapper 2008-05-06 12:22:35 EDT
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
