Description of problem:
When a node joins or leaves the cluster, cman sends a notification callback for every remaining node in the cluster, rather than just one per transition.

Version-Release number of selected component (if applicable):
5.0+

How reproducible:
Every time

Steps to Reproduce:
1. Start a cluster
2. Run the cman/test/libtest program on one node
3. Start and stop nodes

Actual results:
You will see lots of "callback called reason = 1, arg=0" messages when a node leaves or joins. One per node in the cluster.

Expected results:
A single callback for each transition.

Additional info:
The new confchg callback does not suffer from this, and is recommended (though hardly used) in new code.
Created attachment 313972
Patch

This patch fixes the problem by using the sync() features of corosync. We can't simply apply this to RHEL5 (or maybe not even STABLE2) because it will break wire-protocol compatibility. Steve plans to fix this for corosync but maybe not for earlier openais releases, so if we want to fix this for versions before cluster3 some careful additional thought will be needed.
After much thought, I think that a 'fix' is not worth it. In fact a callback per node might actually be useful.
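For clients that do want a single notification per transition, filtering on the client side is one option. The following is only a minimal sketch and is not part of the attached patch: `transition_filter`, `filter_init`, `filter_on_callback` and `REASON_STATECHANGE` are hypothetical names, membership is assumed to fit in a 64-bit mask, and how the client obtains the current membership is left out. The value 1 is taken from the "reason = 1" in the libtest output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the state-change reason code; 1 matches the
 * "reason = 1" printed by libtest in the report above. */
#define REASON_STATECHANGE 1

/* Hypothetical client-side state: the last membership the client acted on. */
struct transition_filter {
    uint64_t last_members;  /* bitmask, bit n set means node n is a member */
    int      transitions;   /* number of distinct transitions handled */
};

void filter_init(struct transition_filter *f, uint64_t initial_members)
{
    assert(f != NULL);
    f->last_members = initial_members;
    f->transitions = 0;
}

/* Intended to be called from the notification callback with the membership
 * as seen at that moment. Returns 1 if the callback represents a new
 * transition, 0 if it repeats one that was already handled. */
int filter_on_callback(struct transition_filter *f, int reason,
                       uint64_t current_members)
{
    assert(f != NULL);

    if (reason != REASON_STATECHANGE)
        return 0;               /* not a membership notification */

    if (current_members == f->last_members)
        return 0;               /* same membership: a repeated callback */

    f->last_members = current_members;
    f->transitions++;
    return 1;
}
```

With three nodes (mask 0x7) and node 2 leaving, the two callbacks delivered for the two remaining nodes both see mask 0x3, so only the first returns 1. One limitation of comparing snapshots like this: a node that leaves and rejoins between two callbacks would go unnoticed.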