Description of problem:
When a process pauses for longer than the token timeout, the other processors in the system form a new ring. The paused processor is eventually rescheduled and then processes the stale membership multicast messages still pending in its kernel queues. This wreaks havoc on the membership of the other nodes.

While a proper kernel shouldn't pause for long periods, it's a reality that many kernels still spend long periods spinlocking without scheduling and have no proper preemption.

This patch resolves the scenario by creating a timer that records a timestamp at an interval of token timeout / 5. When a process executes the membership algorithm because it received a join message, the current time is retrieved and compared to the timestamp. If they differ by more than token timeout / 2, it is assumed the process could not be scheduled (because it could not trigger the timer callbacks via poll), and it calls totemnet to flush any pending multicasts in the file descriptor responsible for receiving multicast messages. The old membership messages are thrown away, which allows the new membership to form properly.

This can be tested by suspending a corosync process with ctrl-z in an 8-node cluster, then using fg to bring it back into the foreground. Before the patch, the membership breaks. With the patch, corosync prints a notice and proceeds properly.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Set up an 8-node cluster.
2. Suspend corosync on one node with ctrl-z.
3. Wait until the other nodes form a new ring.
4. Resume the suspended node with fg.

Actual results:
Totem membership explodes.

Expected results:
New ring is formed properly.

Additional info:
Patch posted to the mailing list.
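
For reference, the pause check described above can be sketched roughly as follows. This is only an illustration of the timestamp comparison, not the code from the patch: the struct and function names (pause_detector, pause_detector_tick, and so on) are hypothetical, times are passed in as plain millisecond values, and the actual flush performed by totemnet is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for the pause check; names are illustrative only. */
struct pause_detector {
    uint64_t token_timeout_ms; /* configured token timeout */
    uint64_t last_stamp_ms;    /* time recorded by the latest timer callback */
};

void pause_detector_init(struct pause_detector *pd,
                         uint64_t token_timeout_ms, uint64_t now_ms)
{
    assert(token_timeout_ms > 0);
    pd->token_timeout_ms = token_timeout_ms;
    pd->last_stamp_ms = now_ms;
}

/* Interval at which the timestamp timer fires: token timeout / 5. */
uint64_t pause_detector_interval_ms(const struct pause_detector *pd)
{
    return pd->token_timeout_ms / 5;
}

/* Timer callback: record the current time. If the process cannot be
 * scheduled, this callback does not run and the timestamp goes stale. */
void pause_detector_tick(struct pause_detector *pd, uint64_t now_ms)
{
    pd->last_stamp_ms = now_ms;
}

/* Called when a join message is received. Returns true when the gap
 * since the last timestamp exceeds token timeout / 2, meaning the
 * process was probably paused and pending multicasts should be flushed. */
bool pause_detector_should_flush(const struct pause_detector *pd,
                                 uint64_t now_ms)
{
    if (now_ms < pd->last_stamp_ms) {
        return false; /* clock went backwards; do not guess */
    }
    return (now_ms - pd->last_stamp_ms) > pd->token_timeout_ms / 2;
}
```

In this sketch the caller would arm a timer with the interval returned by pause_detector_interval_ms, call pause_detector_tick from the timer callback, and call pause_detector_should_flush at the start of join message handling, flushing the multicast receive descriptor when it returns true.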