Bug 177934 - dlm_release_lockspace from app A can cause app B to break
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: dlm
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Keywords: FutureFeature
Reported: 2006-01-16 11:22 EST by Lon Hohberger
Modified: 2009-04-16 16:00 EDT
CC: 2 users

Fixed In Version: RHBA-2006-0558
Doc Type: Enhancement
Last Closed: 2006-08-10 17:26:59 EDT


Attachments
Untested patch (1.76 KB, patch), 2006-01-17 10:43 EST, Christine Caulfield
Tested Patch (1.51 KB, patch), 2006-01-23 06:54 EST, Christine Caulfield

Description Lon Hohberger 2006-01-16 11:22:43 EST
Description of problem:

If application A calls dlm_release_lockspace() in libdlm from userland,
application B's reference to the open lockspace becomes invalid if
application B holds no granted locks at the time of the release.

e.g.

Program A        Program B
Create LS 'Foo'
                 Open LS 'Foo'
Acquire lock
Release lock
Release LS
                 Acquire lock   <-- returns -1 / ENOENT
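
The timeline above can be reproduced as a self-contained simulation (no real libdlm calls; `ls_exists` stands in for the kernel-side lockspace, and the stub functions stand in for dlm_create_lockspace(), dlm_open_lockspace(), dlm_release_lockspace() and a lock request):

```c
#include <errno.h>

/* Simulated kernel-side lockspace state: release frees it
 * unconditionally, regardless of other local users. */
static int ls_exists;

static void create_ls(void)  { ls_exists = 1; }                 /* A: create LS 'Foo' */
static int  open_ls(void)    { return ls_exists ? 0 : -ENOENT; }/* B: open LS 'Foo' */
static void release_ls(void) { ls_exists = 0; }                 /* A: release LS */

/* Once the LS is gone, a lock request through ANY handle fails,
 * even a handle held by a different process. */
static int acquire_lock(void)
{
    return ls_exists ? 0 : -ENOENT;
}
```

Because the release path does not check for other local users, B's otherwise-valid handle starts returning ENOENT as soon as A releases the lockspace.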

Version-Release number of selected component (if applicable): Current 1/13/2006
CVS - STABLE/RHEL4

How reproducible: 100% 

Expected results: Application B should not have its handle invalidated.

Additional info: A simple, effective workaround is to have app. B detect the
ENOENT failure, close/reopen/recreate the lockspace, and retry the lock request.
This takes a few seconds, but works in testing.
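
A sketch of that workaround, with the lockspace stubbed out so the retry path is visible (try_lock() and reopen_lockspace() are hypothetical stand-ins for the real libdlm calls such as dlm_ls_lock() and dlm_open_lockspace()/dlm_create_lockspace(), not actual API):

```c
#include <errno.h>

static int ls_valid;   /* simulated validity of B's lockspace handle */

static int try_lock(void)
{
    return ls_valid ? 0 : -ENOENT;   /* stub for the real lock request */
}

static void reopen_lockspace(void)
{
    ls_valid = 1;                    /* stub: close/reopen/recreate the LS */
}

/* The workaround: on ENOENT, reopen the lockspace and retry once. */
static int lock_with_retry(void)
{
    int rv = try_lock();
    if (rv == -ENOENT) {             /* handle invalidated by another app */
        reopen_lockspace();
        rv = try_lock();
    }
    return rv;
}
```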

Other possible solutions:
- Use AUTOFREE or be able to set this flag from libdlm?
- Have libdlm return EBUSY if there are other lockspace users when
dlm_release_lockspace is called.
- Use a reference count on create/open for the number of local users of a LS,
and decrement the count when it reaches 0.
Comment 1 Lon Hohberger 2006-01-16 12:10:48 EST
"Use a reference count on create/open for the number of local users of a LS,
and decrement the count when it reaches 0."

should be:

"Use a reference count on create/open for the number of local users of a LS,
and decrement the count on release.  Only actually fully release the LS when the
refcnt reaches 0."
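
A minimal sketch of that reference-counting scheme (illustrative names only, not the actual device.c symbols): each open/create bumps a per-lockspace count of local users, each release decrements it, and the lockspace is only really torn down when the count hits zero.

```c
#include <string.h>

struct ls_entry {
    char name[64];
    int  refcount;   /* number of local users of this lockspace */
    int  live;       /* nonzero while the kernel-side LS exists */
};

/* Open (or create) bumps the count; the underlying lockspace is
 * only created for the first user. */
static void ls_open(struct ls_entry *ls, const char *name)
{
    if (!ls->live) {
        strncpy(ls->name, name, sizeof(ls->name) - 1);
        ls->live = 1;            /* first user: create the real LS */
    }
    ls->refcount++;
}

/* Release drops the count; the LS is torn down only when the last
 * local user releases it, so other users' handles stay valid.
 * Returns 1 if the LS was actually freed, 0 otherwise. */
static int ls_release(struct ls_entry *ls)
{
    if (--ls->refcount > 0)
        return 0;                /* other local users remain */
    ls->live = 0;                /* last user: really free the LS */
    return 1;
}
```

With this in place, program A's release in the original scenario merely drops the count from 2 to 1 and program B's handle stays usable.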
Comment 2 Christine Caulfield 2006-01-17 10:43:55 EST
Created attachment 123300 [details]
Untested patch

Here's an (untested) patch that might do the job. It needs testing for all the
open/close/delete, open/delete/close, etc. conditions of course.
Comment 3 Christine Caulfield 2006-01-23 06:54:46 EST
Created attachment 123572 [details]
Tested Patch

I've tested this patch and it "works for me" (tm)
Comment 4 Lon Hohberger 2006-02-21 10:01:54 EST
Patch works for me too.
Comment 5 Christine Caulfield 2006-02-22 04:04:50 EST
Fix in -rSTABLE:

Checking in device.c;
/cvs/cluster/cluster/dlm-kernel/src/device.c,v  <--  device.c
new revision: 1.24.2.1.4.1.2.7; previous revision: 1.24.2.1.4.1.2.6
done

fix in -rRHEL4 (for U4)

Checking in device.c;
/cvs/cluster/cluster/dlm-kernel/src/device.c,v  <--  device.c
new revision: 1.24.2.7; previous revision: 1.24.2.6
done
Comment 8 Red Hat Bugzilla 2006-08-10 17:26:59 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0558.html
