Created attachment 518697 [details]
Backported patch from Corosync
Backport of Corosync b8a061ae28e7c874b66fa1d35ab01f53d1d36b42
Waiting for https://bugzilla.redhat.com/show_bug.cgi?id=722522 to be resolved.
Created attachment 525057 [details]
Deliver all messages from my_high_seq_recieved to the last gap
Backport of corosync 2ec4ddb039b310b308a8748c88332155afd62608
This patch passes two test cases:
Two node cluster - run cpgbench on each node
modify totemsrp with following defines:
Two test cases:
5 node cluster
start 5 nodes randomly at about the same time, wait 10 seconds and attempt
to send a message. If the message blocks
on "TRY_AGAIN", a message loss has likely occurred. Wait a few minutes without
cycling the nodes and see if the TRY_AGAIN state becomes unblocked.
If it doesn't, the test case has failed.
Signed-off-by: Steven Dake <firstname.lastname@example.org>
Reviewed-by: Jan Friesse <email@example.com>
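The pass/fail check in the 5 node test case could be automated roughly as below. This is only a sketch, not part of the patch: check_unblocks, send, and the "TRY_AGAIN"/"OK" strings are hypothetical stand-ins (the string mirrors the CS_ERR_TRY_AGAIN status a real send call would return), and the 180 second timeout is an assumed reading of "a few minutes".

```python
import time

# Hypothetical stand-ins for the status a real send call would report.
TRY_AGAIN = "TRY_AGAIN"
OK = "OK"


def check_unblocks(send, timeout_s=180.0, interval_s=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Return True if send() stops reporting TRY_AGAIN within timeout_s.

    send is any callable that attempts to send one message and returns
    a status; clock and sleep are injectable so the check can be tested
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if send() != TRY_AGAIN:
            return True   # message accepted, the cluster is not stuck
        if clock() >= deadline:
            return False  # still blocked after the timeout: test failed
        sleep(interval_s)
```

A caller would pass a function wrapping the actual message send and treat a False result as a failed test case.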
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
Previously, when OpenAIS was used in a lossy network, and a large number of configuration changes occurred, OpenAIS sometimes terminated unexpectedly. To solve this problem, the underlying source code has been modified, and OpenAIS no longer crashes in the scenario described.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
*** Bug 818644 has been marked as a duplicate of this bug. ***