Description of problem:
I hit this while running the 4.7 rgmanager regression tests. It appears that gulm does not release a lock after the process holding it dies. Any later attempt to take the same lock then hangs (as happens during service relocation).

Steps to Reproduce:
1. magma_tool lock mylock
2. ^Z and kill the process
3. Try the lock again; it is stuck
RPM versions:
rgmanager-debuginfo-1.9.78-1
rgmanager-1.9.78-1
magma-debuginfo-1.0.8-1
magma-1.0.8-1
magma-plugins-1.0.14-1
magma-plugins-debuginfo-1.0.14-1
magma-devel-1.0.8-1
gulm-devel-1.0.10-0
gulm-1.0.10-0
gulm-debuginfo-1.0.10-0
So, this is a bug in gulm which I think is affecting rgmanager. If you kill a process which has a gulm lock, the lock is never released.
Note: this may be "works as intended" behavior that exists to support lock failover. I'm looking at the gulm code to see whether Slave-side caching of locks is done per connection (I expect it is, but I don't know). If so, I am contemplating a patch to gulm that would drop locks when the connection Slave->client (on the same host) or Master->client (on the same host) dies. We can't do the same for client->Master (on a different host), because supporting failover of the masters requires that clients be able to reconnect as needed.
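The policy contemplated above can be sketched as a toy model. This is not gulm code and does not reflect gulm's actual data structures: the LockTable class, its method names, and the connection ids are all hypothetical, and a real lock server would block the caller rather than return False. The sketch only shows the idea of tracking lock ownership per connection and dropping locks when a same-host connection dies, while leaving locks held by remote clients alone so they can reconnect after a master failover.

```python
class LockTable:
    """Toy model (hypothetical, not gulm) of per-connection lock ownership."""

    def __init__(self):
        self.owner = {}      # lock name -> id of the connection holding it
        self.is_local = {}   # connection id -> True if client is on this host

    def connect(self, conn_id, local):
        # Record whether the client lives on the same host as the server.
        self.is_local[conn_id] = local

    def lock(self, conn_id, name):
        # Grant the lock if it is free or already held by this connection.
        # Returns False where a real client would hang waiting.
        holder = self.owner.setdefault(name, conn_id)
        return holder == conn_id

    def unlock(self, conn_id, name):
        # Only the holder may release the lock.
        if self.owner.get(name) == conn_id:
            del self.owner[name]

    def connection_died(self, conn_id):
        # Remote clients keep their locks: they may reconnect after failover.
        if not self.is_local.get(conn_id, False):
            return []
        # Local client is gone for good, so drop everything it held.
        del self.is_local[conn_id]
        dropped = [n for n, c in self.owner.items() if c == conn_id]
        for n in dropped:
            del self.owner[n]
        return dropped
```

In this model, killing a local lock holder and calling connection_died for its connection frees the lock, so a second local client can take it. Without that call, the second client's lock attempt keeps failing, which corresponds to the hang in the steps to reproduce.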
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
http://sources.redhat.com/git/?p=cluster.git;a=commit;h=e18bde1e5732937e4c7b3e536dfea5bb183f14c2
Modified.
As it turns out, this bug was caused by an incorrect unlock check in rgmanager; a two-line patch fixes it. Gulm does keep locks held even if the owning process dies, but I do not believe that is a bug. Rather, I believe it works as intended.
Fix verified in rgmanager-1.9.80-1.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0791.html