147017 – NFS Mounting from SCO exports hang

Summary: NFS Mounting from SCO exports hang

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	nfs-utils
Sub Component:
Version:	3
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Steve Dickson
QA Contact:	Ben Levenson
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-02-03 18:23 UTC by bart Mcfarling
Modified:	2007-11-30 22:10 UTC
CC List:	4 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-07-20 13:46:30 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
tethereal log (120.87 KB, application/x-bzip) 2005-02-03 20:41 UTC, bart Mcfarling
Beginning to end hang (4.50 KB, application/x-bzip) 2005-02-03 20:57 UTC, bart Mcfarling
tethereal -w hang (68.06 KB, application/x-bzip) 2005-02-04 15:45 UTC, bart Mcfarling
/var/log/messages (695 bytes, text/plain) 2005-02-04 15:45 UTC, bart Mcfarling
netstat -c (7.32 KB, text/plain) 2005-02-04 15:47 UTC, bart Mcfarling
netstat -s (3.30 KB, text/plain) 2005-02-04 15:47 UTC, bart Mcfarling

Description bart Mcfarling 2005-02-03 18:23:46 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:
Reading a large file (apparently around 64k or larger) from a SCO NFS
export mounted on an FC3 machine hangs. Most of the time the file
appears after 2 or 3 minutes; sometimes it never does.
This just started happening at the same time I had to start putting
the nfsvers=2 option in /etc/fstab. It seems the more you use it, the
less likely it is to hang.


Version-Release number of selected component (if applicable):
kernel-2.6.10-1.741_FC3

How reproducible:
Sometimes

Steps to Reproduce:
1. vim/cat a file over 64k

Actual Results:  terminal hangs, <CTRL C> won't kill it

Expected Results:  file should have opened/displayed

Additional info:

NFS export is on a SCO Open Server 5.0.7 (but will do the same thing
on Open Server 5.0.4)
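
The reproduction step above could be scripted roughly as follows. This is only a sketch: `probe_read` and its 180-second default are illustrative names and values, not something taken from this report, and the example runs against a local stand-in file rather than a real NFS mount. Since the report says <CTRL C> does not interrupt the hang, `timeout` may not be able to end a truly stuck read either.

```shell
#!/bin/sh
# Sketch only: time-limited read of one file, to tell "slow" from "hung".
# probe_read and its default 180-second limit are illustrative choices.
probe_read() {
    file=$1
    limit=${2:-180}
    # timeout(1) is from GNU coreutils; it exits with status 124 when the
    # time limit expires before the command finishes.
    if timeout "$limit" cat -- "$file" > /dev/null; then
        echo "ok: $file"
    else
        echo "failed or timed out: $file"
        return 1
    fi
}

# Example with a local stand-in file of 128 KiB (over the ~64k mentioned above).
tmpfile=$(mktemp)
dd if=/dev/zero of="$tmpfile" bs=1024 count=128 2>/dev/null
probe_read "$tmpfile" 10
rm -f "$tmpfile"
```

On an affected client the argument would be a file on the SCO export instead of the temporary file.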

Comment 1 Steve Dickson 2005-02-03 19:31:58 UTC

Would it be possible to post a bzip2'd ethereal trace
of the hang?

Comment 2 bart Mcfarling 2005-02-03 20:41:14 UTC

Created attachment 110617 [details]
tethereal log

Comment 3 bart Mcfarling 2005-02-03 20:43:06 UTC

I'm not that familiar with ethereal. This is a tethereal log of about
45 seconds of the hang. 192.9.23.74 is the SCO server, 192.9.23.12 is
my FC3 machine. Hope this helps.

Comment 4 bart Mcfarling 2005-02-03 20:57:57 UTC

Created attachment 110622 [details]
Beginning to end hang

Here is a log from the start to the finish of a hang, about 2 mins.
I grepped for the SCO server IP address.

Comment 5 Steve Dickson 2005-02-04 14:07:55 UTC

Hmm... there definitely appears to be a large number of 
retransmissions....

To get a better look would you mind creating a binary capture
file by using the -w flag to tethereal similar to:
    tethereal -w /tmp/trace.pcap host server and host client
I should have been more explicit in my original request... 
my bad... :)

Also would it be possible to post the netstat -c stats from the 
client side and the netstat -s from the server side....

Finally, when the process hangs on the client, could you get a 
system wide trace by "echo t > /proc/sysrq-trigger", then posting 
the relevant parts from /var/log/messages....
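
As a checklist, the requests in this comment could be printed with the host addresses filled in. This is only a sketch: `diag_commands` is a hypothetical helper, the `client#`/`server#` prefixes are just labels for where each command is meant to run, and the addresses in the example are the ones given in comment 3.

```shell
#!/bin/sh
# Sketch only: print the commands requested in this comment, with the
# server and client addresses substituted. Nothing is executed.
diag_commands() {
    server=$1
    client=$2
    # Binary packet capture limited to traffic between the two hosts
    echo "client# tethereal -w /tmp/trace.pcap host $server and host $client"
    # Continuously refreshed connection listing on the client
    echo "client# netstat -c"
    # Protocol statistics on the server
    echo "server# netstat -s"
    # System-wide task trace, written to the kernel log (needs root)
    echo "client# echo t > /proc/sysrq-trigger"
}

diag_commands 192.9.23.74 192.9.23.12
```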

Comment 6 bart Mcfarling 2005-02-04 15:45:19 UTC

Created attachment 110647 [details]
tethereal -w hang

Comment 7 bart Mcfarling 2005-02-04 15:45:57 UTC

Created attachment 110648 [details]
/var/log/messages

This is all that came out in /var/log/messages

Comment 8 bart Mcfarling 2005-02-04 15:47:06 UTC

Created attachment 110650 [details]
netstat -c

Comment 9 bart Mcfarling 2005-02-04 15:47:33 UTC

Created attachment 110652 [details]
netstat -s

Comment 10 Steve Dickson 2005-02-04 20:36:16 UTC

Thanks for the info.... 

So is it valid to say this hang only happens on the v2 filesystems
and not on v3 mounted filesystems?

Comment 11 bart Mcfarling 2005-02-04 20:47:05 UTC

Yes, other NFS mounts that are exported from newer Linux boxes do not
suffer from this problem. Everyone in the company who uses FC suffers
from this with our SCO machines. I also saw a Usenet post saying someone
else was having the same problem with SCO exports (I don't know if I can
find it again). I can't say whether it's a SCO-specific problem or not,
but I'm a little suspicious. Older Red Hat Linux versions don't seem to
have this problem, nor do our Enterprise servers that mount from SCO.

Bart

Comment 12 Tom Mitchell 2005-04-01 01:56:55 UTC

There seem to be a number of NFS mount bugs like this that
need info about the server and whether it provides NFS via udp, tcp, or both.
For me, the older servers do not provide tcp, and mount and autofs
both appear to hang. Apparently there is a problem checking for services:
the 'wrong' type of connection is tried and hangs.
For me the critical info is 'rpcinfo'.....

  $ rpcinfo -p troubleserver | grep nfs
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
  $ rpcinfo -p happyserver | grep nfs
    100003    4   tcp   2049  nfs

Older Linux and older appliance software would be involved.
    100003    3   tcp   2049  nfs
    100003    2   tcp   2049  nfs
    100003    3   udp   2049  nfs
    100003    2   udp   2049  nfs

Our current workaround is to specify tcp/udp to match the server in the 
mount line and mount maps.

In my case the server is not SCO but an older RH release.. 
(Kernel... 2.4.20-28.8bigmem)
(the other bugzillas that might match what I see are 140631, 144556, 144758)
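
The rpcinfo check above can be reduced to a one-line answer. This is only a sketch: `nfs_transports` is a hypothetical helper, not part of nfs-utils, and it assumes the usual `rpcinfo -p` column order (program, version, protocol, port, service), where 100003 is the RPC program number for NFS. The sample input is the troubleserver listing from this comment.

```shell
#!/bin/sh
# Sketch only: read `rpcinfo -p` output on stdin and print the transports
# on which the NFS program (RPC program number 100003) is registered.
nfs_transports() {
    awk '$1 == 100003 { print $3 }' | sort -u | paste -s -d ' ' -
}

# Sample input copied from the troubleserver listing above.
nfs_transports <<'EOF'
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
EOF
```

Against a live server the call would be `rpcinfo -p troubleserver | nfs_transports`; if only udp is listed, the mount line or map entry would carry the matching udp option, as described in the workaround above.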

Comment 13 Steve Dickson 2005-04-01 13:07:39 UTC

Could you please try the util-linux in 

http://people.redhat.com/steved/bz152956

It fixes a problem that appears to be similar to this one...

Comment 14 Tom Mitchell 2005-04-01 18:28:13 UTC

It does help here.....

I installed the new rpm in a safe place so I could compare and contrast....

testing with old code: $ rpm -q --file `which mount`
util-linux-2.12a-21

# mount  bxtop:/export/files /tmp/MMM
mount to NFS server 'bxtop' failed: server is down.  # yep my problem...

Now with mount from the new rpm as per steved above... I installed it in
/tmp/MMMroot and testing it looks good....
[root@box tmp]# /tmp/MMMroot/bin/mount  bxtop:/export/files /tmp/MMM
[root@box tmp]#                <--- Most excellent, silence is golden
[root@box tmp]#
[root@box tmp]# df  # Now check...
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3             22928148  10042224  11721212  47% /
none                    513108         0    513108   0% /dev/shm
bxtop:/export/files 222029360  90235928 131793432  41% /tmp/MMM
# and Yes it mounted...

So yes this does help my observed problem.

Comment 15 Steve Dickson 2005-04-01 18:33:50 UTC


*** This bug has been marked as a duplicate of 152956 ***

Comment 16 Steve Dickson 2005-04-01 18:34:56 UTC

Thank you for your feedback!

Comment 17 bart Mcfarling 2005-04-04 13:20:44 UTC

Didn't fix our SCO problem.

Comment 18 Steve Dickson 2005-04-04 16:49:33 UTC

You said:

"This just started happening at the same time i had to start putting
the nfsvers=2 option in /etc/fstab. It seems the more you use it, the
less likely it is to hang."

and I'm not sure I understand what you mean. Are you saying that 
using v2 causes the hang or does not cause the hang?

Comment 19 bart Mcfarling 2005-04-04 17:10:39 UTC

We used RH 9 for a while, then upgraded our workstations to FC3, and we couldn't
mount anything via NFS. Then we figured out we had to add nfsvers=2 to
/etc/fstab in order to get our machines to mount the exported NFS directory. I
guess what I was saying in a roundabout way was that it was not broken in
previous releases of Red Hat, and I figured that a big update to the NFS code
caused the breakage, because I now had to specify the NFS version, in addition
to the hang. I have moved my workstation back to RHEL3WS and NFS works
flawlessly. We do still have a couple of FC3 machines in place if you need any
output from one of them. I will be spending the rest of the day upgrading my
machine to RHEL4WS. I'll let you know if it's in there too.

Bart
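
For reference, an /etc/fstab entry of the kind described above might look like the line printed below. This is only a sketch: `fstab_nfs_v2_line` is a hypothetical helper, and the host name, export path, and mount point are placeholders, not values from this report. Only the nfsvers=2 option comes from the comment.

```shell
#!/bin/sh
# Sketch only: print an /etc/fstab line that forces NFS version 2.
# Fields: device, mount point, filesystem type, options, dump, fsck pass.
fstab_nfs_v2_line() {
    server=$1
    export_path=$2
    mount_point=$3
    printf '%s:%s %s nfs nfsvers=2 0 0\n' "$server" "$export_path" "$mount_point"
}

# Placeholder names for illustration only.
fstab_nfs_v2_line scoserver /export /mnt/sco
```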

Comment 20 Ehud Karni 2005-07-20 16:29:46 UTC

I have the same problem - can't mount a SCO 
OpenServer(TM) Release 5 (scosysv 3.2 5.0.5 i386) 
export (exported with "/   -access=access=bu-fs-n:rmt-fs,root=bu-fs-n:rmt-fs").

If the /etc/fstab line does NOT contain "nfsver=2", I get the message:
mount to NFS server 'bu-sdr' failed: server is down.
When it DOES contain "nfsver=2" the system crashes.

Comment 21 Matthew Miller 2006-07-10 22:57:02 UTC

Fedora Core 3 is now maintained by the Fedora Legacy project for security
updates only. If this problem is a security issue, please reopen and
reassign to the Fedora Legacy product. If it is not a security issue and
hasn't been resolved in the current FC5 updates or in the FC6 test
release, reopen and change the version to match.

Thank you!

Comment 22 Ehud Karni 2006-07-11 20:21:40 UTC

Works OK in kernel 2.6.17-1.2145_FC5.
Previous kernels (2.6.15) had problems when trying to mount more than 1 NFS disk.
