Created attachment 518697
Backported patch from Corosync

Backport of Corosync b8a061ae28e7c874b66fa1d35ab01f53d1d36b42
Waiting for resolution of https://bugzilla.redhat.com/show_bug.cgi?id=722522
Created attachment 525057
2011-09-27-0001-Deliver-all-messages-from-my_high_seq_recieved-to-th

Deliver all messages from my_high_seq_recieved to the last gap

Backport of corosync 2ec4ddb039b310b308a8748c88332155afd62608

This patch passes two test cases:

-------
Test #1
-------
Two node cluster - run cpgbench on each node
modify totemsrp with following defines:

Two test cases:

-------
Test #2
-------
5 node cluster
start 5 nodes randomly at about same time,
wait 10 seconds and attempt to send a message. If the message blocks on
"TRY_AGAIN", a message loss has likely occurred. Wait a few minutes
without cycling the nodes and see if the TRY_AGAIN state becomes unblocked.
If it doesn't, the test case has failed.

Signed-off-by: Steven Dake <sdake>
Reviewed-by: Jan Friesse <jfriesse>
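For anyone re-running Test #2, the pass/fail check at the end of the procedure can be sketched as a small polling loop. This is only an illustration under stated assumptions: `send` is a hypothetical callable standing in for whatever client call is used to send the message, the `TRY_AGAIN` string is a stand-in for the library's real return code, and the 180-second default is just one reading of "a few minutes". None of these names come from the patch itself.

```python
import time

# Stand-ins for the real return codes (assumption, not the library's values).
TRY_AGAIN = "TRY_AGAIN"
OK = "OK"


def wait_until_unblocked(send, timeout_s=180.0, interval_s=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Call send() repeatedly until it stops returning TRY_AGAIN.

    Returns True if send() returned anything other than TRY_AGAIN before
    the deadline (the test passes), or False if it was still returning
    TRY_AGAIN once timeout_s had elapsed (the test fails).
    """
    deadline = clock() + timeout_s
    while True:
        if send() != TRY_AGAIN:
            return True   # the TRY_AGAIN state became unblocked
        if clock() >= deadline:
            return False  # still blocked after the wait: likely message loss
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting; in an actual run the defaults would be used and the nodes left alone while it polls.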
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, when OpenAIS was used in a lossy network, and a large number of configuration changes occurred, OpenAIS sometimes terminated unexpectedly. To solve this problem, the underlying source code has been modified, and OpenAIS no longer crashes in the scenario described.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0180.html
*** Bug 818644 has been marked as a duplicate of this bug. ***