Bug 758476 - When there are multiple node joins close together some nodes can be kicked out of the cluster.
Keywords:
Status: CLOSED EOL
Alias: None
Product: Corosync Cluster Engine
Classification: Retired
Component: cpg
Version: 1.3
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jan Friesse
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-29 21:49 UTC by John Thompson
Modified: 2020-03-27 19:09 UTC
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-27 19:09:25 UTC



Description John Thompson 2011-11-29 21:49:43 UTC
Description of problem:
We are using a 10-node cluster, and this problem occurs regularly when there are multiple node joins at about the same time.

When a node joins just as the cluster is finishing a join for another node (some nodes have already gone from RECOVERY to OPERATIONAL while others are still in RECOVERY), some of the nodes build a transitional membership in which other nodes are classified as leaving.

The problem is that we see a CLM & CPG config change with some of the nodes marked as having left. A second CLM config change is then seen in which they have joined again.

For CPG the left nodes don't rejoin the group: as shown by corosync-cpgtool, they are no longer group members. These nodes have not actually left; they are still part of the cluster, and further debug output shows them still in the cluster membership. They are removed because they were in a different state than one of the nodes in the cluster, and that node's downlist was chosen, so they were thought to have left.

The left nodes have a more advanced ring seq id because they had already gone operational. In this case the nodes that were still in RECOVERY for the last node join are on the previous ring seq id, which causes the transitional membership to be calculated such that the operational nodes are removed.
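
The trace in the additional info below is consistent with the transitional membership being, roughly, the members of the new ring that report the same previous ring as the local node; everyone else is reported to CLM/CPG as having left and then rejoined. A simplified sketch of that calculation, using the node ids and ring seq values from the trace (the real algorithm in totemsrp.c is more involved):

/* Simplified sketch of a transitional-membership calculation; the node
 * ids and ring seq values are taken from the trace below, the filter
 * rule is a simplification of what totemsrp.c actually does. */
#include <stdio.h>
#include <stdint.h>

struct member {
        uint32_t nodeid;        /* last octet of 192.168.255.x    */
        uint32_t prev_ring_seq; /* "previous ring seq" in the log */
};

int main(void)
{
        /* Members of the new ring, with the previous ring each one reports
         * (taken from the "position [n]" lines in the trace). */
        const struct member new_ring[] = {
                {  1, 20 }, {  3, 24 }, {  4, 24 }, {  5,  0 }, {  6, 24 },
                {  8, 24 }, {  9, 24 }, { 10, 20 }, { 11, 20 }, { 12, 20 },
        };
        const uint32_t my_prev_ring_seq = 20;   /* node-1's previous ring */

        /* Transitional membership: new-ring members that came from the same
         * previous ring as the local node.  Everyone else is reported to the
         * services as "left", even though they are still in the cluster. */
        printf("TRANS:");
        for (size_t i = 0; i < sizeof(new_ring) / sizeof(new_ring[0]); i++) {
                if (new_ring[i].prev_ring_seq == my_prev_ring_seq)
                        printf(" .%u", new_ring[i].nodeid);
        }
        printf("\n");   /* prints: TRANS: .1 .10 .11 .12, matching the log */
        return 0;
}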

Version-Release number of selected component (if applicable):
Corosync 1.3.4

How reproducible:
Very reproducible; roughly 1 in 5 startups.

Steps to Reproduce:
1. Have 10 nodes, all starting corosync at about the same time.
2. Look for CLM & CPG notifications that a node has left (see the watcher sketch below).
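
To watch for those notifications programmatically, a minimal CPG watcher along these lines can be run on each node while reproducing. This is a sketch assuming the standard libcpg API (link with -lcpg); exact prototypes may differ slightly between corosync versions, and the group name is arbitrary.

/* Minimal CPG watcher: joins a test group and prints every
 * configuration change (left/joined lists) it receives. */
#include <stdio.h>
#include <string.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

static void confchg_cb(cpg_handle_t handle, const struct cpg_name *group,
                       const struct cpg_address *members, size_t n_members,
                       const struct cpg_address *left, size_t n_left,
                       const struct cpg_address *joined, size_t n_joined)
{
        printf("confchg on %.*s: members=%zu left=%zu joined=%zu\n",
               (int)group->length, group->value, n_members, n_left, n_joined);
        for (size_t i = 0; i < n_left; i++)
                printf("  left:   nodeid %u pid %u reason %u\n",
                       left[i].nodeid, left[i].pid, left[i].reason);
        for (size_t i = 0; i < n_joined; i++)
                printf("  joined: nodeid %u pid %u\n",
                       joined[i].nodeid, joined[i].pid);
}

int main(void)
{
        cpg_callbacks_t callbacks = { .cpg_confchg_fn = confchg_cb };
        cpg_handle_t handle;
        struct cpg_name group;

        if (cpg_initialize(&handle, &callbacks) != CS_OK)
                return 1;

        strcpy(group.value, "bz758476-test");   /* arbitrary test group name */
        group.length = strlen(group.value);
        if (cpg_join(handle, &group) != CS_OK)
                return 1;

        /* Block and print configuration changes as they arrive. */
        cpg_dispatch(handle, CS_DISPATCH_BLOCKING);
        cpg_finalize(handle);
        return 0;
}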
  
Actual results:
Nodes that had joined a CPG group are marked as left and are not considered part of the CPG group membership.

Expected results:
I would expect that when there are node joins close together, CPG would not kick members out of the group, or would at least add them back in.

Additional info:
Here is some debug output from node-1 when the issue occurs. You can see the transitional membership, that member .5 is joining, and that node .12 joined just earlier. From the ring seq values you can see that some of the nodes must have transitioned to OPERATIONAL before this node did.

[TOTEM ] totemsrp.c:1998 entering RECOVERY state.
[TOTEM ] totemsrp.c:2040 TRANS [0] member 192.168.255.1:
[TOTEM ] totemsrp.c:2040 TRANS [1] member 192.168.255.10:
[TOTEM ] totemsrp.c:2040 TRANS [2] member 192.168.255.11:
[TOTEM ] totemsrp.c:2040 TRANS [3] member 192.168.255.12:
[TOTEM ] totemsrp.c:2044 position [0] member 192.168.255.1:
[TOTEM ] totemsrp.c:2048 previous ring seq 20 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 60 high delivered 60 received flag 1
[TOTEM ] totemsrp.c:2044 position [1] member 192.168.255.3:
[TOTEM ] totemsrp.c:2048 previous ring seq 24 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 0 high delivered 0 received flag 1
[TOTEM ] totemsrp.c:2044 position [2] member 192.168.255.4:
[TOTEM ] totemsrp.c:2048 previous ring seq 24 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 0 high delivered 0 received flag 1
[TOTEM ] totemsrp.c:2044 position [3] member 192.168.255.5:
[TOTEM ] totemsrp.c:2048 previous ring seq 0 rep 192.168.255.5
[TOTEM ] totemsrp.c:2054 aru 0 high delivered 0 received flag 1
[TOTEM ] totemsrp.c:2044 position [4] member 192.168.255.6:
[TOTEM ] totemsrp.c:2048 previous ring seq 24 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 0 high delivered 0 received flag 1
[TOTEM ] totemsrp.c:2044 position [5] member 192.168.255.8:
[TOTEM ] totemsrp.c:2048 previous ring seq 24 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 0 high delivered 0 received flag 1
[TOTEM ] totemsrp.c:2044 position [6] member 192.168.255.9:
[TOTEM ] totemsrp.c:2048 previous ring seq 24 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 0 high delivered 0 received flag 1
[TOTEM ] totemsrp.c:2044 position [7] member 192.168.255.10:
[TOTEM ] totemsrp.c:2048 previous ring seq 20 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 60 high delivered 60 received flag 1
[TOTEM ] totemsrp.c:2044 position [8] member 192.168.255.11:
[TOTEM ] totemsrp.c:2048 previous ring seq 20 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 60 high delivered 60 received flag 1
[TOTEM ] totemsrp.c:2044 position [9] member 192.168.255.12:
[TOTEM ] totemsrp.c:2048 previous ring seq 20 rep 192.168.255.1
[TOTEM ] totemsrp.c:2054 aru 60 high delivered 60 received flag 1

