Bug 15121

Summary: NFS Locking fails with large amounts of data
Product: [Retired] Red Hat Linux
Component: nfs-utils
Version: 7.0
Hardware: i386
OS: Linux
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Assignee: Michael K. Johnson <johnsonm>
Reporter: Need Real Name <lawrence>
CC: nahay
Doc Type: Bug Fix
Last Closed: 2002-12-15 00:36:40 UTC

Attachments:
tar file of test case, please read README file, modify Makefile macro and type make

Description Need Real Name 2000-08-02 14:10:41 UTC
Our application creates large binary files. To create these files we first create a zero-length file
and lock it (fcntl(..., F_SETLK)). Then we write a second data file. When the write is complete
we fsync the data file and then close both file descriptors. This should release the lock.
However, when a large amount of data is written quickly, the lock is not released (about 100 MB is enough).
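
For illustration, a minimal sketch of that sequence (the file names, buffer size and the 100 MB
target here are illustrative only; the attached tar file contains the actual test case):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        /* Create a zero-length lock file on the NFS mount and take a write lock on it. */
        int lockfd = open("lockfile", O_RDWR | O_CREAT, 0644);
        if (lockfd < 0) { perror("open lockfile"); return 1; }

        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;      /* whole-file write lock */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;
        if (fcntl(lockfd, F_SETLK, &fl) < 0) { perror("fcntl F_SETLK"); return 1; }

        /* Write a large data file (about 100 MB) as quickly as possible. */
        int datafd = open("datafile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (datafd < 0) { perror("open datafile"); return 1; }

        char buf[65536];
        memset(buf, 0xAB, sizeof(buf));
        for (size_t total = 0; total < 100UL * 1024 * 1024; total += sizeof(buf)) {
            if (write(datafd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                perror("write");
                return 1;
            }
        }

        /* fsync the data file, then close both descriptors; closing lockfd
           should release the fcntl lock. */
        if (fsync(datafd) < 0) { perror("fsync"); return 1; }
        close(datafd);
        close(lockfd);            /* the lock should be dropped here */
        return 0;
    }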

I've reproduced this bug on Linux boxes running kernels 2.2.12-20 and 2.2.12-32. I've used data servers
running Solaris 2.5.1, Solaris 2.7 and HP-UX 11.00.

By the way, our Solaris boxes do have patch 105299-02 installed (SunSolve ID 4071076), so this is
not a repeat of that problem (thank you for referencing it in the other bug reports; it did solve
another problem I was having).

Please contact me at lawrence and I will provide a full test case via FTP.

Jay Lawrence

Comment 1 Need Real Name 2000-08-02 14:13:46 UTC
Created attachment 1814
tar file of test case, please read README file, modify Makefile macro and type make

Comment 2 Need Real Name 2000-08-02 14:52:22 UTC
Further testing has been done. If the remote directory is on HP-UX 10.20, it also fails.

If the remote directory is on another Linux 2.2.12 machine or a Network Appliance
file server, the problem does not occur. The NetApp might be an anomaly because it
is on a gigabit network and may be consuming the data very quickly coming off my
100TX Linux machines.


Comment 3 Cristian Gafton 2000-08-09 02:35:20 UTC
assigned to johnsonm

Comment 4 Need Real Name 2000-08-09 11:35:04 UTC
I tried to append this previously but it somehow disappeared....

A customer of mine had his remote file systems mounted with

         mount -o rw,nolock ....

In this configuration the problem did NOT occur.

In my opinion this is a crazy way to mount a file system, since it bypasses NFS locking altogether
and could lead to corruption problems, but it did serve to isolate the bug to NFS locking. With 'nolock'
the lock is maintained locally on the client, and the problem did not occur.
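
To check whether the server still considers the file locked after the writer exits, a second
process can probe with F_GETLK. This is a hypothetical checker sketch (not part of the attached
test case), assuming the same 'lockfile' name used in the sketch above:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        int fd = open("lockfile", O_RDWR);
        if (fd < 0) { perror("open lockfile"); return 1; }

        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;      /* ask: would a write lock on the whole file conflict? */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;

        if (fcntl(fd, F_GETLK, &fl) < 0) { perror("fcntl F_GETLK"); return 1; }

        if (fl.l_type == F_UNLCK)
            printf("no conflicting lock: the original lock was released\n");
        else
            printf("still locked by pid %d\n", (int)fl.l_pid);
        return 0;
    }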

Jay

Comment 5 Michael Nahay 2001-05-31 22:01:14 UTC
Hello,
Has there been any progress on this issue?
Thanks,
Michael


Comment 6 Arjan van de Ven 2001-05-31 22:04:43 UTC
Could you try our 2.2.19 errata kernel? It has heavily revamped NFS code,
including NFSv3 support.