184223 – File locking via fcntl() broken for NFS mounts in RHEL4u2

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	4.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Steve Dickson
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2006-03-07 14:52 UTC by Steve Dickson
Modified:	2012-06-20 16:03 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-06-20 16:03:25 UTC
Target Upstream Version:
Embargoed:

Description Steve Dickson 2006-03-07 14:52:40 UTC

Description of problem:

Cooperative (advisory) file locks taken with the fcntl() system call
do not seem to work on NFS-mounted filesystems in RHEL4u2.  The
Python script included below demonstrates the problem.

The script forks 70 child processes.  Each child attempts to obtain
an exclusive lock on the same file.  Once the lock is granted, the
child writes a message to the file, releases the lock, and exits.
On a working system this all happens very quickly.  On RHEL4, child
processes are left hanging and locks are not granted for a long time:
it takes between 4 and 20 minutes for all of the children to access
the file and exit.

How reproducible:
Use the following script:

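#!/usr/bin/python

import os, sys, fcntl

FNAME = "test.txt"   # relative path: run from a directory on the NFS mount

NFORKS = 70          # number of competing child processes

def dochild():
    fp = open(FNAME, 'a+b')
    # Block until this process holds an exclusive lock on the file.
    fcntl.lockf(fp.fileno(), fcntl.LOCK_EX)
    fp.write("Hello from: " + str(os.getpid()) + '\n')
    # Flush while the lock is still held; otherwise the buffered data
    # is only written by close(), after the lock has been released.
    fp.flush()
    fcntl.lockf(fp.fileno(), fcntl.LOCK_UN)
    fp.close()

for x in xrange(NFORKS):
    pid = os.fork()
    if pid == 0: # Child
        dochild()
        sys.exit(0)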



Steps to Reproduce:
1. Mount an NFS filesystem.
2. Run the Python script from a directory on that mount.

Actual results:
The script takes several minutes (between 4 and 20) to complete.

Expected results:
The script should complete within seconds.
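
To put a number on "minutes" versus "seconds", the reproducer can be wrapped in a small timing harness. The following is only a sketch, written for Python 3 rather than the Python 2 of the original script. The names `locked_append` and `run_contention_test` are made up for illustration, and unlike the original, the parent here waits for every child so the elapsed time can be measured.

```python
import fcntl
import os
import time

def locked_append(path):
    """Take an exclusive fcntl lock, append one line, flush, then unlock."""
    with open(path, "ab") as fp:
        fcntl.lockf(fp.fileno(), fcntl.LOCK_EX)      # blocks until granted
        try:
            fp.write(("Hello from: %d\n" % os.getpid()).encode("ascii"))
            fp.flush()                               # write out before unlocking
        finally:
            fcntl.lockf(fp.fileno(), fcntl.LOCK_UN)

def run_contention_test(path, nchildren=70):
    """Fork nchildren lockers, wait for all of them, and return
    (elapsed_seconds, number_of_failed_children)."""
    start = time.time()
    pids = []
    for _ in range(nchildren):
        pid = os.fork()
        if pid == 0:                                 # child
            try:
                locked_append(path)
                os._exit(0)
            except Exception:
                os._exit(1)
        pids.append(pid)                             # parent
    failures = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if status != 0:
            failures += 1
    return time.time() - start, failures
```

Calling `run_contention_test("test.txt")` from a directory on the NFS mount and again from a local directory gives two elapsed times that can be compared directly.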

Additional info:

Comment 1 Matthew Miller 2008-05-01 18:01:10 UTC

So, this works fine in RHEL 5 but still seems to be a problem on 4. Will a fix
be forthcoming? Thanks.

Comment 2 Maarten Broekman 2008-07-22 15:55:13 UTC

I'm not sure if I'm seeing a similar problem on my RHEL4.6 system.

Kernel:
2.6.9-67.0.4.ELhugemem #1 SMP Fri Jan 18 05:11:24 EST 2008 i686 athlon i386 GNU/Linux

strace:
clone(Process 23784 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf6fda708) = 23784
[pid 23783] close(1)                    = 0
[pid 23783] close(2)                    = 0
[pid 23783] time(NULL)                  = 1216299609
[pid 23783] wait4(-1, Process 23783 suspended
 <unfinished ...>
[pid 23784] open("/opt/applogs/rad/ac_data/log/ac.lock", O_WRONLY) = 3
[pid 23784] fcntl64(3, F_SETLKW, {type=F_WRLCK, whence=SEEK_CUR, start=0, len=0}

The process hangs at this point.
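
The trace shows a blocking request (F_SETLKW), which sleeps until the lock is granted. To tell "slow" apart from "never granted" without leaving a process stuck, one option is to poll with a non-blocking request and give up after a deadline. The sketch below is written for Python 3, and the helper name `lock_with_timeout` and its parameters are hypothetical. It does not fix the underlying NFS locking problem; it only keeps the caller from hanging indefinitely.

```python
import errno
import fcntl
import time

def lock_with_timeout(fileobj, timeout=10.0, poll_interval=0.1):
    """Try to take an exclusive fcntl lock without blocking forever.

    Returns True if the lock was acquired, False if the timeout expired.
    """
    deadline = time.time() + timeout
    while True:
        try:
            # LOCK_NB makes lockf() fail immediately instead of sleeping
            # when another process holds a conflicting lock.
            fcntl.lockf(fileobj.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as exc:
            # EACCES or EAGAIN means "held by someone else"; anything
            # else is a real error and is passed on to the caller.
            if exc.errno not in (errno.EACCES, errno.EAGAIN):
                raise
        if time.time() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller would open the lock file, call `lock_with_timeout(fp, timeout=30)`, and log a failure when it returns False rather than waiting indefinitely.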

Comment 3 Jiri Pallich 2012-06-20 16:03:25 UTC

Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review has now reached End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to reconsider your request for an active release, please re-open the request via the appropriate support channels and provide additional supporting details about the importance of this issue.
