Description of problem:

This is a bug in the way SM manages multiple recovery events. A specific arrangement is required to see it:

- nodes A,B,C,D,E are in the cluster
- nodes A,B,C,D,E are in the fence domain (FD)
- A,B,C are using gfs X
- C,D,E are using gfs Y

The bug is possible on node C if A fails, creating a recovery event for X (rev1), and just after that D fails, creating a recovery event for Y (rev2). If the two nodes fail at once there won't be a problem, because a single recovery event will be created. The timing of the consecutive failures would need to be just right.

The problem arises when the group representing the fence domain (FD) is moved from rev1 into rev2. This makes the groups in rev2 depend on FD recovery, but removes the dependency of the rev1 groups on FD recovery. In fact, both the rev1 and rev2 groups depend on FD recovery, but the code currently has no way to make two revs depend on the same group.

When the FD dependency is removed from rev1, recovery for the higher-level groups in rev1 (the dlm and gfs groups for X) goes ahead without waiting for FD recovery to finish. Both A and D will still be fenced, and given how recovery works, that is likely to happen before gfs recovery on X begins. But if gfs-X recovery happens to start before A is fenced, and A isn't really dead and comes back to life and writes to X, then X could be corrupted. If manual fencing is used, it becomes very likely that recovery for gfs-X happens before A is fenced, and you have to hope A won't come back to life and write to X.

Version-Release number of selected component (if applicable):

How reproducible:
I doubt anyone has seen this in practice. The arrangement of fs mounts is unusual, and there are multiple places in the process where a special timing of events is needed.

Steps to Reproduce:
1. see above; using fence_manual helps a lot
2.
3.

Actual results:
you'll see gfs-X recovery happen before A is fenced

Expected results:
gfs-X recovery won't happen until after A is fenced

Additional info:
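To make the dependency problem concrete, here is a minimal standalone sketch (hypothetical structs and names, not the actual SM/groupd code) in which each recovery event owns a set of groups and a higher-level group only waits on lower-level groups in the same rev. Moving the FD group from rev1 into rev2 then silently drops rev1's dependency on fencing, so the readiness check for gfs-X passes even though A has not been fenced, while gfs-Y in rev2 correctly waits.

    /*
     * Sketch only: illustrates how a rev-owned dependency list loses the
     * FD dependency when the FD group is moved between revs.
     */
    #include <stdio.h>

    #define MAX_GROUPS 8

    struct group {
            const char *name;
            int level;              /* 0 = fence domain, 1 = dlm, 2 = gfs */
            int recovered;
    };

    struct rev {
            struct group *groups[MAX_GROUPS];
            int ngroups;
    };

    static void rev_add(struct rev *r, struct group *g)
    {
            r->groups[r->ngroups++] = g;
    }

    static void rev_remove(struct rev *r, struct group *g)
    {
            int i, j;
            for (i = 0; i < r->ngroups; i++) {
                    if (r->groups[i] == g) {
                            for (j = i; j < r->ngroups - 1; j++)
                                    r->groups[j] = r->groups[j + 1];
                            r->ngroups--;
                            return;
                    }
            }
    }

    /* a higher-level group may recover once every lower-level group in the
       SAME rev has recovered -- the rev is the only place the dependency
       is recorded, which is the heart of the bug */
    static int deps_done(struct rev *r, struct group *g)
    {
            int i;
            for (i = 0; i < r->ngroups; i++)
                    if (r->groups[i]->level < g->level &&
                        !r->groups[i]->recovered)
                            return 0;
            return 1;
    }

    int main(void)
    {
            struct group fd    = { "fence-domain", 0, 0 };
            struct group gfs_x = { "gfs-X",        2, 0 };
            struct group gfs_y = { "gfs-Y",        2, 0 };

            struct rev rev1 = { { 0 }, 0 };
            struct rev rev2 = { { 0 }, 0 };

            /* A fails: rev1 covers the fence domain and gfs-X */
            rev_add(&rev1, &fd);
            rev_add(&rev1, &gfs_x);

            /* D fails shortly after: rev2 covers gfs-Y, and the FD group
               is moved from rev1 into rev2 */
            rev_add(&rev2, &gfs_y);
            rev_remove(&rev1, &fd);
            rev_add(&rev2, &fd);

            /* fencing has not completed yet */
            printf("gfs-X in rev1 may recover now: %s\n",
                   deps_done(&rev1, &gfs_x) ? "yes (BUG: A not fenced)" : "no");
            printf("gfs-Y in rev2 may recover now: %s\n",
                   deps_done(&rev2, &gfs_y) ? "yes" : "no");
            return 0;
    }

Running this prints that gfs-X may recover even though the fence domain has not, which is the window in which a not-quite-dead A could still write to X.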
I haven't come up with any simple fixes to this problem. We'll have to see how complex the solution I have in mind ends up being.
Added this description as a comment in the code it affects. This is fixed in the RHEL5 code.