Bug 160844

Summary: dangling POSIX locks after close
Product: Red Hat Enterprise Linux 4 Reporter: Peter Staubach <staubach>
Component: kernel Assignee: Peter Staubach <staubach>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0 CC: netllama
Target Milestone: --- Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2006-0132 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-03-07 19:09:20 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 168429    
Attachments:
Description Flags
Test program to reproduce the situation. none
Proposed patch none
Proposed patch none

Description Peter Staubach 2005-06-17 19:01:54 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.7) Gecko/20050416 Red Hat/1.0.3-1.4.1 Firefox/1.0.3

Description of problem:
There is a race between fcntl and close which can result in orphaned POSIX
locks.  In general, these races are handled by removing any remaining locks
when the last reference to an open file is released.  Sometimes this does
not happen in a timely fashion, which can cause problems.  If the application
is still holding another reference to the open file, the release of the last
reference, and with it the removal of the locks, is delayed.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Compile and run the attached test program.


Actual Results:  The test program attempts to lock what should be a completely unlocked file.
The lock attempt fails because there is a lock outstanding on the file, long
after it should have been released.

Expected Results:  The lock attempt made in the test program should succeed.

Additional info:

This bug has already been fixed in RHEL 3 U6.  The problem continues to exist
in RHEL 4, although not with the same severity.

Comment 1 Peter Staubach 2005-06-17 19:03:52 UTC
Created attachment 115635 [details]
Test program to reproduce the situation.

Comment 2 Peter Staubach 2005-06-17 19:05:42 UTC
The test program should be run with a "delay" argument of at least 30 seconds.
This is due to some issue with the multithreading on RHEL 4 which I've not
looked into yet.

Comment 3 Peter Staubach 2005-06-29 13:40:58 UTC
Created attachment 116125 [details]
Proposed patch

Comment 4 Peter Staubach 2005-06-29 13:41:53 UTC
A patch containing these changes has been accepted upstream into the '-mm'
release.

Comment 6 Peter Staubach 2005-07-27 19:36:22 UTC
Created attachment 117209 [details]
Proposed patch

Comment 7 Peter Staubach 2005-07-27 19:38:23 UTC
The proposed patch was sent upstream.  After review, it was simplified very
slightly and has been sent to Linus for inclusion into his kernel.

Comment 12 Red Hat Bugzilla 2006-03-07 19:09:20 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html