Bug 187210

Summary: flock(2) LOCK_EX doesn't work as documented
Product: [Fedora] Fedora
Reporter: JW <ohtmvyyn>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED UPSTREAM
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 5
CC: pfrields, wtogami
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-10-17 05:20:23 UTC

Description JW 2006-03-29 08:35:56 UTC
Description of problem:
If a single process takes a flock(LOCK_EX) lock on a file, then opens another
file descriptor for the same file and attempts a second flock(LOCK_EX), the
second call blocks indefinitely.


Version-Release number of selected component (if applicable):
kernel-2.6.14-1.1644_FC4

How reproducible:
Always

Steps to Reproduce:
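1. Compile this:
    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/file.h>
    int main(void) {
        const char *file = "lockfile";    /* placeholder name; any writable path */
        int fd1 = open(file, O_RDWR|O_CREAT, 0666);
        int err = flock(fd1, LOCK_EX);    /* first lock is granted */
        int fd2 = open(file, O_RDWR|O_CREAT, 0666);
        err = flock(fd2, LOCK_EX);        /* second call blocks here */
        // not reached
        exit(0);
    }
2. Run the above.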
  
Actual results:
Program doesn't exit

Expected results:
Program should exit

Additional info:
According to the flock(2) man page, it would seem that the same process should
be able to create multiple locks of the same type on the same file.
This is the (desirable) way that locking behaves with fcntl(2) and on Microsoft
Windows.

So why would a process want to hold multiple locks on the same file?
It is something that will happen when writing structured code.
As in:
     f1(){ flock(LOCK_EX); do_something(); flock(LOCK_UN); }

     f2() { flock(LOCK_EX); do_extra_things(); f1(); etc; flock(LOCK_UN); }

flock is broken, fcntl is broken, .... surely the simple things must be easy
enough to get right?

Comment 1 Dave Jones 2006-09-17 03:15:12 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.


Comment 2 Dave Jones 2006-10-17 00:28:09 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 3 Dave Jones 2006-10-17 04:44:33 UTC
I asked the upstream flock() maintainer to take a look at this.
Here's his response:

Sorry, you've misunderstood how file locks work.  Unfortunately, flock()
isn't specified by POSIX or Single Unix (it came from BSD).  But it is
documented to work this way in the Linux flock(2) manpage:

       If a process uses open(2) (or similar) to obtain more than one
       descriptor for the same file, these descriptors are treated
       independently by flock().  An attempt to lock the file using one of
       these file descriptors may be denied by a lock that the calling
       process has already placed via another descriptor.
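
The behaviour described in that passage can be observed without hanging the
process by adding LOCK_NB to the second request. The sketch below assumes
Linux and a local filesystem; second_open_is_denied() is a hypothetical helper
name, not an existing API.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper, not part of any library.
 * Returns 1 if a descriptor from a second open() is refused the exclusive
 * lock, 0 if it is granted, -1 on an unrelated error. */
int second_open_is_denied(const char *path)
{
    assert(path != NULL);

    int fd1 = open(path, O_RDWR | O_CREAT, 0666);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDWR | O_CREAT, 0666);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int result = -1;
    if (flock(fd1, LOCK_EX) == 0) {
        /* LOCK_NB makes the second request fail instead of blocking. */
        if (flock(fd2, LOCK_EX | LOCK_NB) == 0)
            result = 0;
        else if (errno == EWOULDBLOCK)
            result = 1;
    }

    close(fd2);
    close(fd1);   /* closing the descriptor releases the lock held through it */
    return result;
}
```

Under the semantics quoted above, the helper would be expected to return 1,
which corresponds to the blocking call in the original reproducer.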

Your 'structured code' example is just using the wrong tool for
the job -- a file lock is inappropriate; you probably wanted to use
pthread_mutex_lock()

Change state to NOTABUG ... and title to 'working exactly as documented' ;-)

Comment 4 JW 2006-10-17 04:55:42 UTC
>Your 'structured code' example is just using the wrong tool for
>the job -- a file lock is inappropriate; you probably wanted to use
>pthread_mutex_lock()

Wow! What an insightful presumption.  You have instantly denigrated a bug based
on some totally insane assumption.

The file lock is being used to lock a real file, not to serialize access to a
thread-shared resource.

Please, given that open() shares the underlying file structure, it would be
correct, regardless of what the flock() maintainer says, to also share locks.

Sorry, but the maintainer is quite wrong.

Also, your contention that the function works as documented flies in the face of
the actual documentation.  To quote:

   "A process may only hold one type of lock (shared or exclusive) on a
   file.  Subsequent flock() calls on an already locked file will convert
   an existing lock to the new lock mode."

So a subsequent lock on an already locked file is supposed to work.  Because
that is exactly what the documentation says.  Read it first, before making the
outlandish claim that the function works "as documented".
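
One reading that would reconcile the two quoted man page passages is that the
"convert an existing lock" rule applies to repeated flock() calls made through
the same open file description, that is, the same descriptor or a dup(2) copy
of it, while a separate open() creates an independent description. The sketch
below probes the first half of that reading. It assumes Linux and a local
filesystem, and same_description_relock_ok() is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper, not part of any library.
 * Returns 1 if re-locking through the same descriptor and through a dup(2)
 * copy both succeed, 0 if either request is refused, -1 on setup failure. */
int same_description_relock_ok(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd < 0)
        return -1;
    int copy = dup(fd);     /* refers to the same open file description */
    if (copy < 0) {
        close(fd);
        return -1;
    }

    int result = -1;
    if (flock(fd, LOCK_EX) == 0) {
        /* Same descriptor: request a shared lock where an exclusive one is held. */
        int converted = flock(fd, LOCK_SH | LOCK_NB);
        /* Duplicate descriptor: request the exclusive lock again. */
        int via_copy = flock(copy, LOCK_EX | LOCK_NB);
        result = (converted == 0 && via_copy == 0) ? 1 : 0;
    }

    close(copy);
    close(fd);
    return result;
}
```

If the helper returns 1, subsequent locks do work as the quoted sentence says,
provided they go through the descriptor that already holds the lock rather
than through a second open().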



Comment 5 Dave Jones 2006-10-17 05:20:23 UTC
Feel free to argue your case on linux-kernel.org

Any change in behaviour in this function would have to happen upstream regardless.