Description of problem:
If application A calls dlm_release_lockspace() in libdlm from userland, application B's reference to the open lockspace becomes invalid if application B has no locks granted at the time of the release.

e.g.

Program A            Program B
Create LS 'Foo'
                     Open LS 'Foo'
                     Acquire lock
                     Release lock
Release LS
                     Acquire lock  <-- returns -1 / ENOENT

Version-Release number of selected component (if applicable):
Current 1/13/2006 CVS - STABLE/RHEL4

How reproducible:
100%

Expected results:
Application B should not have its handle invalidated.

Additional info:
A simple, effective workaround is to have app. B detect the ENOENT failure, close/reopen/recreate the lockspace, and retry the lock request. This takes a few seconds, but works in testing.

Other possible solutions:
- Use AUTOFREE or be able to set this flag from libdlm?
- Have libdlm return EBUSY if there are other lockspace users when dlm_release_lockspace is called.
- Use a reference count on create/open for the number of local users of a LS, and decrement the count when it reaches 0.
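For anyone who needs the workaround in the meantime, here is a rough sketch of the detect-ENOENT-and-retry loop. It deliberately does not call libdlm: struct ls_ops, acquire_with_reopen(), and the fake_* helpers are hypothetical names made up for this illustration, and the two callbacks stand in for whatever wrappers the application already has around its lock request and its close/reopen/recreate sequence.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the application's own libdlm wrappers. */
struct ls_ops {
    int (*acquire)(void *ctx);  /* 0 on success, -1 with errno set on failure */
    int (*reopen)(void *ctx);   /* close stale handle, then reopen/recreate the LS */
};

/* Try to take a lock; on ENOENT, reopen the lockspace and retry,
   at most max_retries times. Returns 0 on success, -1 on failure. */
int acquire_with_reopen(const struct ls_ops *ops, void *ctx, int max_retries)
{
    assert(ops != NULL && ops->acquire != NULL && ops->reopen != NULL);

    for (int attempt = 0; ; attempt++) {
        if (ops->acquire(ctx) == 0)
            return 0;
        if (errno != ENOENT || attempt >= max_retries)
            return -1;          /* some other error, or out of retries */
        if (ops->reopen(ctx) != 0)
            return -1;          /* could not get a fresh handle */
    }
}

/* Fake lockspace used only to exercise the helper: the handle starts out
   stale (as after another program released the LS) and is valid after reopen. */
struct fake_ls {
    int handle_valid;
    int reopen_calls;
};

int fake_acquire(void *ctx)
{
    struct fake_ls *ls = ctx;
    if (!ls->handle_valid) {
        errno = ENOENT;         /* same failure as reported above */
        return -1;
    }
    return 0;
}

int fake_reopen(void *ctx)
{
    struct fake_ls *ls = ctx;
    ls->handle_valid = 1;
    ls->reopen_calls++;
    return 0;
}
```

Capping the number of retries matters because another program could release the lockspace again between the reopen and the retried lock request; an unbounded loop could spin in that case.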
"Use a reference count on create/open for the number of local users of a LS, and decrement the count when it reaches 0." should be: "Use a reference count on create/open for the number of local users of a LS, and decrement the count on release. Only actually fully release the LS when the refcnt reaches 0."
Created attachment 123300
Untested patch

Here's an (untested) patch that might do the job. It needs testing for all the open/close/delete, open/delete/close, etc. conditions of course.
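To make the reference-count proposal from the earlier comments concrete, here is a small sketch of the counting rule only. It is not the attached patch and not dlm-kernel code: struct ls_record, ls_get(), and ls_put() are hypothetical names for this illustration. The idea is that every local create/open takes a reference, every release drops one, and the lockspace is only really torn down when the count reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-node record for one lockspace; not a libdlm or
   dlm-kernel type. */
struct ls_record {
    int  users;   /* local create/open calls not yet released */
    bool exists;  /* false once the lockspace has really been torn down */
};

/* Called on create or open: register one more local user. */
void ls_get(struct ls_record *ls)
{
    ls->exists = true;
    ls->users++;
}

/* Called on release: drop one user and tear the lockspace down only when
   the last user leaves. Returns true if the lockspace was actually released. */
bool ls_put(struct ls_record *ls)
{
    assert(ls->users > 0);  /* a release without a matching create/open is a bug */
    if (--ls->users > 0)
        return false;       /* other local users remain; keep the lockspace */
    ls->exists = false;
    return true;
}
```

With this rule, the sequence from the original report (A creates, B opens, A releases) leaves the count at 1 after A's release, so B's handle would stay usable until B releases as well.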
Created attachment 123572
Tested Patch

I've tested this patch and it "works for me" (tm)
Patch works for me too.
Fix in -rSTABLE:

Checking in device.c;
/cvs/cluster/cluster/dlm-kernel/src/device.c,v  <--  device.c
new revision: 1.24.2.1.4.1.2.7; previous revision: 1.24.2.1.4.1.2.6
done

Fix in -rRHEL4 (for U4):

Checking in device.c;
/cvs/cluster/cluster/dlm-kernel/src/device.c,v  <--  device.c
new revision: 1.24.2.7; previous revision: 1.24.2.6
done
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0558.html