Bug 177934
| Summary: | dlm_release_lockspace from app A can cause app B to break | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Lon Hohberger <lhh> |
| Component: | dlm | Assignee: | Christine Caulfield <ccaulfie> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Priority: | medium |
| Version: | 4 | CC: | ccaulfie, cluster-maint |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | --- | Hardware: | All |
| OS: | Linux | Doc Type: | Enhancement |
| Fixed In Version: | RHBA-2006-0558 | Last Closed: | 2006-08-10 21:26:59 UTC |
"Use a reference count on create/open for the number of local users of a LS, and decrement the count when it reaches 0."

should read:

"Use a reference count on create/open for the number of local users of a LS, and decrement the count on release. Only fully release the LS when the refcount reaches 0."

Created attachment 123300 [details]
Untested patch

Here's an (untested) patch that might do the job. It needs testing for all the open/close/delete, open/delete/close, etc. orderings, of course.
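A minimal sketch of the reference-count semantics proposed above, with hypothetical structure and function names (the actual fix is a kernel-side change in dlm-kernel's device.c, not this code):

```c
#include <assert.h>

/* Hypothetical stand-in for a lockspace.  Every local create/open
 * takes a reference, every release drops one, and the lockspace is
 * only torn down when the count reaches zero. */
struct ls {
    int refcount;   /* number of local users of this lockspace */
    int alive;      /* 1 until the last user releases it */
};

/* create or open: take a reference */
static void ls_get(struct ls *lsp)
{
    lsp->refcount++;
}

/* release: drop a reference; tear down only on the last one.
 * Returns 1 if the lockspace was fully released, 0 otherwise. */
static int ls_put(struct ls *lsp)
{
    assert(lsp->refcount > 0);
    if (--lsp->refcount == 0) {
        lsp->alive = 0;     /* real code would free kernel state here */
        return 1;
    }
    return 0;
}
```

Under this scheme, app A's release in the original report would drop the count from 2 to 1 and leave app B's handle valid.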
Created attachment 123572 [details]
Tested Patch

I've tested this patch and it "works for me" (tm).
Patch works for me too.

Fix in -rSTABLE:

    Checking in device.c;
    /cvs/cluster/cluster/dlm-kernel/src/device.c,v  <--  device.c
    new revision: 1.24.2.1.4.1.2.7; previous revision: 1.24.2.1.4.1.2.6
    done

Fix in -rRHEL4 (for U4):

    Checking in device.c;
    /cvs/cluster/cluster/dlm-kernel/src/device.c,v  <--  device.c
    new revision: 1.24.2.7; previous revision: 1.24.2.6
    done

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0558.html
Description of problem:

If application A calls dlm_release_lockspace() in libdlm from userland, application B's reference to the open lockspace becomes invalid if application B has no locks granted at the time of the release. For example:

    Program A          Program B
    Create LS 'Foo'
                       Open LS 'Foo'
                       Acquire lock
                       Release lock
    Release LS
                       Acquire lock  <-- returns -1 / ENOENT

Version-Release number of selected component (if applicable):
Current 1/13/2006 CVS - STABLE/RHEL4

How reproducible:
100%

Expected results:
Application B should not have its handle invalidated.

Additional info:

A simple, effective workaround is to have application B detect the ENOENT failure, close/reopen/recreate the lockspace, and retry the lock request. This takes a few seconds, but works in testing.

Other possible solutions:
- Use AUTOFREE, or allow this flag to be set from libdlm.
- Have libdlm return EBUSY if there are other lockspace users when dlm_release_lockspace() is called.
- Use a reference count on create/open for the number of local users of a LS, and decrement the count when it reaches 0.
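The detect-and-retry workaround described above can be sketched as follows. This is a hedged illustration, not the tested code: lock_with_retry, try_lock, and reopen are hypothetical names, and the function pointers stand in for the real libdlm calls (dlm_ls_lock and dlm_open_lockspace) so the logic can be exercised without a running cluster.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the real libdlm operations, injected for testability. */
typedef int (*lock_fn)(void *ls);
typedef void *(*reopen_fn)(const char *name);

/* Try a lock request; if it fails with ENOENT because another
 * process released the lockspace out from under us, reopen the
 * lockspace and retry once.  Returns the (possibly reopened)
 * handle; *rc holds the final result of the lock request. */
static void *lock_with_retry(void *ls, const char *name,
                             lock_fn try_lock, reopen_fn reopen, int *rc)
{
    *rc = try_lock(ls);
    if (*rc == -1 && errno == ENOENT) {
        /* Our handle was invalidated by another app's release:
         * close/reopen/recreate the lockspace and retry. */
        ls = reopen(name);
        if (ls != NULL)
            *rc = try_lock(ls);
    }
    return ls;
}
```

In application B this would wrap each lock acquisition, turning the one-shot ENOENT failure from the original report into a transparent (if slow) recovery.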