Bug 830799
| Summary: | Nodes do not agree on CPG membership, messages lost | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Jan Friesse <jfriesse> |
| Component: | corosync | Assignee: | Jan Friesse <jfriesse> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.4 | CC: | abeekhof, agk, asalkeld, fdinitto, jerome.flesch, jfriesse, jkortus, jpallich, nstraz, sbradley, sdake, tlavigne |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | corosync-1.4.1-13.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 820821 | Environment: | |
| Last Closed: | 2013-02-21 07:50:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 820821, 863940, 869609 | | |
| Bug Blocks: | 895654 | | |
| Attachments: | | | |
Description
Jan Friesse
2012-06-11 12:47:15 UTC
Created attachment 591832 [details]
Proposed patch - part 1 - Never choose downlist with localnode
The test scenario is as follows:
- node 1, node 2
- node 1 is paused
- node 2 sees node 1 as dead
- node 1 is unpaused
- nodes 1 and 2 both choose the same downlist message, which includes
node 2 -> node 2 is effectively disconnected
The patch adds a check for whether left_node is the local node. If so,
such a downlist is ignored.
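The core of part 1 can be illustrated with a short C sketch. The struct layout and names (downlist_msg, my_nodeid) are assumptions for illustration, not the actual corosync definitions:

```c
/* Illustrative layout only; the real corosync downlist message differs. */
struct downlist_msg {
        unsigned int left_nodes;    /* number of nodes reported as down */
        unsigned int nodeids[64];   /* ids of those nodes */
};

/*
 * A downlist claiming the local node itself has left cannot be
 * trusted: a node alive enough to evaluate the message never left.
 * Such downlists are excluded from selection.
 */
static int downlist_contains_localnode(const struct downlist_msg *msg,
                                       unsigned int my_nodeid)
{
        unsigned int i;

        for (i = 0; i < msg->left_nodes; i++) {
                if (msg->nodeids[i] == my_nodeid)
                        return 1; /* ignore this downlist */
        }
        return 0;
}
```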
Created attachment 591833 [details]
Proposed patch - part 2 - Process join list after downlists
Let's say the following situation happens:
- we have 3 nodes
- on the wire, messages look like D1,J1,D2,J2,D3,J3 (D is a downlist,
J is a joinlist)
- let's say D1 and D3 contain node 2
- this means J2 is applied, but right after that D1 (or D3) is
applied, which means node 2 is again considered down
This is solved by collecting the joinlists and applying them after the
downlist, so the order is:
- apply the best matching downlist
- apply all joinlists
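A minimal sketch of that reordering, assuming illustrative names (pending_joinlists, apply_downlist, apply_joinlist) rather than the actual corosync sync internals:

```c
#include <stddef.h>

#define MAX_PENDING 16

struct joinlist_msg;   /* opaque here; carries the joined-group entries */
struct downlist_msg;

void apply_downlist(struct downlist_msg *d);
void apply_joinlist(struct joinlist_msg *j);

static struct joinlist_msg *pending_joinlists[MAX_PENDING];
static size_t pending_count;

/* Joinlists are buffered as they arrive instead of being applied
 * immediately. */
static void joinlist_received(struct joinlist_msg *j)
{
        if (pending_count < MAX_PENDING)
                pending_joinlists[pending_count++] = j;
}

/* Once the winning downlist is known, downs are applied first and the
 * buffered joinlists afterwards, so no later downlist can undo a join. */
static void sync_apply(struct downlist_msg *best_downlist)
{
        size_t i;

        apply_downlist(best_downlist);
        for (i = 0; i < pending_count; i++)
                apply_joinlist(pending_joinlists[i]);
        pending_count = 0;
}
```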
Created attachment 591834 [details]
Proposed patch - part 3 - Enhance downlist selection algorithm
Let's say we have 2 nodes:
- node 2 is paused
- node 1 creates a membership (one node)
- node 2 is unpaused
The result is that node 1's downlist is selected, which means that
from node 2's point of view, node 1 was never down.
The patch solves this by adding an additional check for the largest
previous membership.
So the current tests are:
1) largest (previous #nodes - #nodes known to have left)
2) (then) largest previous membership
3) (and last as a tie-breaker) node with smallest nodeid
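Those three tests translate naturally into a comparison function. A sketch, with field names (prev_members, left_nodes, sender_nodeid) assumed for illustration:

```c
/* Illustrative candidate summary; not corosync's actual structure. */
struct downlist_candidate {
        unsigned int prev_members;   /* size of the previous membership */
        unsigned int left_nodes;     /* nodes known to have left */
        unsigned int sender_nodeid;  /* originator of the downlist */
};

/* Returns nonzero if candidate a should be preferred over b. */
static int downlist_better(const struct downlist_candidate *a,
                           const struct downlist_candidate *b)
{
        unsigned int score_a = a->prev_members - a->left_nodes;
        unsigned int score_b = b->prev_members - b->left_nodes;

        if (score_a != score_b)                      /* test 1 */
                return score_a > score_b;
        if (a->prev_members != b->prev_members)      /* test 2 */
                return a->prev_members > b->prev_members;
        return a->sender_nodeid < b->sender_nodeid;  /* test 3 */
}
```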
Created attachment 640856 [details]
Proposed patch - part 4 - Fix problem with sync operations under very rare circumstances
This patch creates a special message queue for synchronization
messages. This prevents messages that are queued in the
new_message_queue, but have not yet been originated, from corrupting
the synchronization process.
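In outline, the idea looks like the sketch below. The field names and helper signatures are assumptions for illustration; new_message_queue is the only name taken from the text above:

```c
struct cs_queue;                      /* opaque FIFO */

void cs_queue_item_add(struct cs_queue *q, void *msg);

struct totem_instance {
        struct cs_queue *new_message_queue;       /* application messages */
        struct cs_queue *new_message_queue_sync;  /* sync-only messages */
};

/* Sync messages are originated from their own queue, so they can never
 * be stuck behind (or reordered with) not-yet-originated application
 * messages sitting in new_message_queue. */
static void message_originate(struct totem_instance *inst,
                              void *msg, int is_sync_msg)
{
        if (is_sync_msg)
                cs_queue_item_add(inst->new_message_queue_sync, msg);
        else
                cs_queue_item_add(inst->new_message_queue, msg);
}
```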
Created attachment 640861 [details]
Proposed patch - part 5 - Handle segfault in backlog_get
If instance->memb_state is neither OPERATIONAL nor RECOVERY, we were
passing NULL to the cs_queue_used call.
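A sketch of the guard; the state names mirror the ones in the comment above, while the simplified backlog_get signature is an assumption:

```c
enum memb_state {
        MEMB_STATE_OPERATIONAL,
        MEMB_STATE_GATHER,
        MEMB_STATE_COMMIT,
        MEMB_STATE_RECOVERY
};

struct cs_queue;
int cs_queue_used(const struct cs_queue *q);

/* Outside OPERATIONAL/RECOVERY there is no valid queue to inspect, so
 * report an empty backlog instead of passing NULL to cs_queue_used(). */
static int backlog_get(enum memb_state state, const struct cs_queue *queue)
{
        if (state != MEMB_STATE_OPERATIONAL &&
            state != MEMB_STATE_RECOVERY) {
                return 0;
        }
        return cs_queue_used(queue);
}
```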
Created attachment 640862 [details]
Proposed patch - part 6 - Add waiting_trans_ack also to fragmentation layer
The waiting_trans_ack support patch may fail if synchronization happens
between deliveries of the fragments of a message. In such a situation,
the fragmentation layer waits for the message with the expected
sequence number, but it will never arrive.
The solution is to handle the change of waiting_trans_ack (via a
callback) and use a different queue.
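A sketch of how the fragmentation layer can track the flag and pick its queue; the type and function names here are illustrative, not the actual totempg API:

```c
struct assembly_queue;   /* reassembly buffers for one delivery mode */

struct frag_layer {
        int waiting_trans_ack;               /* mirrored via callback */
        struct assembly_queue *q_regular;
        struct assembly_queue *q_trans;
};

/* Registered with the lower layer; invoked whenever waiting_trans_ack
 * changes, so the fragmentation layer always agrees with it. */
static void waiting_trans_ack_changed(void *context, int waiting)
{
        struct frag_layer *fl = context;

        fl->waiting_trans_ack = waiting;
}

/* Fragments delivered while waiting for the transitional ack go to a
 * separate queue, so a sync event arriving between fragments cannot
 * strand a half-reassembled message. */
static struct assembly_queue *frag_queue_get(struct frag_layer *fl)
{
        return fl->waiting_trans_ack ? fl->q_trans : fl->q_regular;
}
```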
*** Bug 889564 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0497.html

*** Bug 884770 has been marked as a duplicate of this bug. ***