29374 – file locking over NFS not working

Summary: file locking over NFS not working

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	7.1
Hardware:	i386
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Pete Zaitcev
QA Contact:	David Lawrence
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2001-02-25 12:07 UTC by Thilo Mezger
Modified:	2005-10-31 22:00 UTC (History)
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2001-03-08 21:57:46 UTC
Embargoed:

Attachments
excerpt from log file (550 bytes, text/plain) 2001-02-25 12:08 UTC, Thilo Mezger
detailed scenario again (840 bytes, text/plain) 2001-02-27 18:47 UTC, Thilo Mezger
strace output of running pine (1.01 KB, text/plain) 2001-02-27 18:49 UTC, Thilo Mezger

Description Thilo Mezger 2001-02-25 12:07:49 UTC

My test environment looks like this:

The NFS server ("argos") is running Red Hat Linux 7.0 with all updates 
applied, and it has done a very good job serving RHL 7.0 NFS clients.

My Wolverine test machine ("silicad") mounts /home from the server via NFS 
and automount.  As soon as a process tries to lock a file (e.g. pine locking 
the mailbox or Konqueror locking its bookmarks file), the client hangs and 
behaves as if the server were unreachable, although the server is still up.

Everything works fine when there are only 7.0 machines in the network.

The client sends its syslog output to a loghost.  An excerpt from the log is 
attached.
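
For anyone trying to reproduce this without pine or Konqueror, a minimal lock probe along the following lines may help. It is only a sketch: the function name probe_lock, the timeout, and the choice of an exclusive fcntl-style lock are assumptions, not what pine is confirmed to do. On a hard NFS mount the process may block uninterruptibly, in which case the alarm will not fire and the probe hangs just like the original programs.

```python
import fcntl
import os
import signal


class LockTimeout(Exception):
    """Raised by the SIGALRM handler when the lock attempt takes too long."""


def _on_alarm(signum, frame):
    raise LockTimeout()


def probe_lock(path, timeout_s=10):
    """Try to take and release an exclusive lock on `path`.

    Returns "ok", "timeout", or "error: <reason>". Hypothetical helper for
    illustration only; it must be called from the main thread because it
    relies on SIGALRM.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_s)
    try:
        # Blocking request; over NFS this is what involves the lock daemon.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return "ok"
    except LockTimeout:
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
    finally:
        # Always cancel the alarm and restore the previous handler.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(fd)
```

Calling probe_lock on a scratch file inside the NFS-mounted directory (for example a file under /home/joe) and comparing the result with a run on a local directory should show whether locking itself is what hangs.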

Comment 1 Thilo Mezger 2001-02-25 12:08:25 UTC

Created attachment 11020
excerpt from log file

Comment 2 Bob Matthews 2001-02-26 16:27:51 UTC

Can you strace the process which is doing the locking and attach the output? 
I'd like to know where it is getting stuck or spinning.

Comment 3 Glen Foster 2001-02-26 23:57:03 UTC

We (Red Hat) should really try to resolve this before next release.

Comment 4 Thilo Mezger 2001-02-27 18:47:40 UTC

Created attachment 11220
detailed scenario again

Comment 5 Thilo Mezger 2001-02-27 18:49:02 UTC

Created attachment 11221
strace output of running pine

Comment 6 Thilo Mezger 2001-02-27 18:50:22 UTC

Running "/usr/sbin/nhfsstone /home/joe" on a directory which is mounted via NFS 
turned out to be a good test case:

- it works with a Guinness client
- it fails with a Wolverine client

Comment 7 Jeff Johnson 2001-03-07 20:26:17 UTC

I failed to reproduce the problem using nhfsstone with
	client	kernel-2.4.0-0.43.12smp	nfs-utils-0.2.1-10
	server	kernel-2.2.16-7		nfs-utils-0.2-2
all using NFSv2 over UDP.

I also tried nfs-utils-0.3.1 on both sides, no problem, and tried
kernel-2.4.2-0.1.20smp on the client as well, still no problem.

I also tried using pine/mutt with /var/spool/mail mounted on the client, while
watching the protocol using tcpdump, no problem.

So, can you supply the exact packages for kernel/nfs-utils/pine installed on
both client and server?

Could you also try your nhfsstone failure on a manually mounted (no autofs)
directory?

Thanks.

Comment 8 Thilo Mezger 2001-03-08 21:39:12 UTC

Server ("Guinness" with all official patches applied and nothing else):
kernel-2.2.17-14
nfs-utils-0.1.9.1-7
 
Client ("Wolverine" as is):
kernel-2.4.1-0.1.9
nfs-utils-0.2.1-10

OK, I have now tried both auto-mounted and manually mounted filesystems.  Both 
worked, and I wasn't able to reproduce my bug anymore.

Unfortunately, in the meantime I changed my network card from an NE2000 
compatible 10 Mbit/s card to a 3Com 100 Mbit/s card; now everything works for 
me.

This is really strange, as I was able to reproduce this bug about three times 
(i.e. after re-installing three times).  And the old network card with RHL 7.0 
on both client and server had always worked perfectly, which made me think 
that it was Wolverine...

Still, I don't really understand what has happened here.  I suppose it might 
have something to do with the ne2000 driver...?!  God knows...

Comment 9 Bob Matthews 2001-03-08 21:56:18 UTC

We have had one other report of a problem with the NE2000, although in that
case it was causing a kernel hang.

Comment 10 Arjan van de Ven 2001-03-19 15:46:25 UTC

I will close this as fixed, since it can no longer be reproduced and it works
now; I have a similar setup (also without an NE2000) that works just fine.
