Bug 472815 - Message continuation doesn't match previous frag e: 0 - a: 242
Status: CLOSED DUPLICATE of bug 261381
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openais
Version: 5.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Steven Dake
QA Contact: Cluster QE
Depends On:
Blocks:
Reported: 2008-11-24 15:13 EST by Nate Straz
Modified: 2016-04-26 11:10 EDT
Doc Type: Bug Fix
Last Closed: 2008-12-05 16:01:59 EST
Attachments:
core dump from tank-01, gzipped (71.62 KB, application/x-gzip)
2008-11-24 16:01 EST, Nate Straz
Description Nate Straz 2008-11-24 15:13:19 EST
Description of problem:

I saw the summary message, "Message continuation doesn't match previous frag e: 0 - a: 242", on one node while I was running revolver.  Four nodes out of a six-node cluster were shot.

[TOTEM] entering GATHER state from 8. 
[TOTEM] entering GATHER state from 11. 
[TOTEM] Saving state aru 0 high seq received 0 
[TOTEM] Storing new sequence id for ring e60 
[TOTEM] entering COMMIT state. 
[TOTEM] entering RECOVERY state. 
[TOTEM] position [0] member 10.15.89.61: 
[TOTEM] previous ring seq 3676 rep 10.15.89.61 
[TOTEM] aru 331 high delivered 287 received flag 1 
[TOTEM] position [1] member 10.15.89.63: 
[TOTEM] previous ring seq 3676 rep 10.15.89.61 
[TOTEM] aru 331 high delivered 287 received flag 1 
[TOTEM] position [2] member 10.15.89.64: 
[TOTEM] previous ring seq 3676 rep 10.15.89.61 
[TOTEM] aru 331 high delivered 287 received flag 1 
[TOTEM] position [3] member 10.15.89.91: 
[TOTEM] previous ring seq 3660 rep 10.15.89.91 
[TOTEM] aru 0 high delivered 0 received flag 1 
[TOTEM] position [4] member 10.15.89.93: 
[TOTEM] previous ring seq 3676 rep 10.15.89.61 
[TOTEM] aru 331 high delivered 287 received flag 1 
[TOTEM] position [5] member 10.15.89.94: 
[TOTEM] previous ring seq 3676 rep 10.15.89.61 
[TOTEM] aru 331 high delivered 287 received flag 1 
[TOTEM] Did not need to originate any messages in recovery. 
[CLM  ] CLM CONFIGURATION CHANGE 
[CLM  ] New Configuration: 
[CLM  ] Members Left: 
[CLM  ] Members Joined: 
[CLM  ] CLM CONFIGURATION CHANGE 
[CLM  ] New Configuration: 
[CLM  ]  r(0) ip(10.15.89.61)  
[CLM  ]  r(0) ip(10.15.89.63)  
[CLM  ]  r(0) ip(10.15.89.64)  
[CLM  ]  r(0) ip(10.15.89.91)  
[CLM  ]  r(0) ip(10.15.89.93)  
[CLM  ]  r(0) ip(10.15.89.94)  
[CLM  ] Members Left: 
[CLM  ] Members Joined: 
[CLM  ]  r(0) ip(10.15.89.61)  
[CLM  ]  r(0) ip(10.15.89.63)  
[CLM  ]  r(0) ip(10.15.89.64)  
[CLM  ]  r(0) ip(10.15.89.91)  
[CLM  ]  r(0) ip(10.15.89.93)  
[CLM  ]  r(0) ip(10.15.89.94)  
[SYNC ] This node is within the primary component and will provide service. 
[TOTEM] entering OPERATIONAL state. 
[CMAN ] quorum regained, resuming activity 
[CMAN ] quorum lost, blocking activity 
[TOTEM] Message continuation doesn't match previous frag e: 0 - a: 242 
[TOTEM] Throwing away broken message: continuation 0, index 0 

After this, aisexec was not running on the system.  The cman init script failed trying to start cman.
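
When reading a log like the one above, it can help to check whether every member reports the same previous ring during RECOVERY. The following is a minimal Python sketch, not part of openais or cman; the function names `parse_members` and `outliers` are hypothetical, and the sample lines are a subset copied from the log in this report. It only flags members whose "previous ring seq" differs from the most common value and makes no claim about the cause of the failure.

```python
import re
from collections import Counter

# Subset of the RECOVERY-state lines from the log in this report.
LOG = """\
[TOTEM] position [0] member 10.15.89.61:
[TOTEM] previous ring seq 3676 rep 10.15.89.61
[TOTEM] aru 331 high delivered 287 received flag 1
[TOTEM] position [1] member 10.15.89.63:
[TOTEM] previous ring seq 3676 rep 10.15.89.61
[TOTEM] aru 331 high delivered 287 received flag 1
[TOTEM] position [3] member 10.15.89.91:
[TOTEM] previous ring seq 3660 rep 10.15.89.91
[TOTEM] aru 0 high delivered 0 received flag 1
"""

MEMBER_RE = re.compile(r"\[TOTEM\] position \[(\d+)\] member (\S+):")
RING_RE = re.compile(r"\[TOTEM\] previous ring seq (\d+) rep (\S+)")


def parse_members(text):
    """Return {member_ip: previous_ring_seq} from TOTEM recovery lines."""
    members = {}
    current = None  # member whose "previous ring" line we expect next
    for line in text.splitlines():
        m = MEMBER_RE.search(line)
        if m:
            current = m.group(2)
            continue
        r = RING_RE.search(line)
        if r and current is not None:
            members[current] = int(r.group(1))
            current = None
    return members


def outliers(members):
    """Return members whose previous ring seq differs from the most common one."""
    if not members:
        return {}
    common, _ = Counter(members.values()).most_common(1)[0]
    return {ip: seq for ip, seq in members.items() if seq != common}


if __name__ == "__main__":
    print(outliers(parse_members(LOG)))
```

Run against the full log above, this would single out 10.15.89.91, which reports previous ring seq 3660 with aru 0 while the other five members report ring seq 3676 with aru 331.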

Version-Release number of selected component (if applicable):
openais-0.80.3-21.el5
cman-2.0.97-1.el5

How reproducible:
Unknown
Comment 1 Nate Straz 2008-11-24 15:16:55 EST
On other nodes I did see messages like this:

morph-03 openais[2707]: [CLM  ] got nodejoin message 10.15.89.93 
morph-03 openais[2707]: [CLM  ] got nodejoin message 10.15.89.94 
morph-03 openais[2707]: [CLM  ] got nodejoin message 10.15.89.61 
morph-03 openais[2707]: [CLM  ] got nodejoin message 10.15.89.63 
morph-03 openais[2707]: [CLM  ] got nodejoin message 10.15.89.64 
morph-03 openais[2707]: [EVT  ] Can't find cluster node at r(0) ip(10.15.89.91)  
morph-03 openais[2707]: [CPG  ] got joinlist message from node 4 
morph-03 openais[2707]: [CPG  ] got joinlist message from node 6 
morph-03 openais[2707]: [CPG  ] got joinlist message from node 7 
morph-03 openais[2707]: [CPG  ] got joinlist message from node 2
Comment 3 Nate Straz 2008-11-24 16:01:44 EST
Created attachment 324536
core dump from tank-01, gzipped

Here's the core dump from tank-01.  It's an i386 core of aisexec from package openais-0.80.3-21.el5.
Comment 4 Steven Dake 2008-12-05 16:01:59 EST
This is a duplicate of bug 261381.

*** This bug has been marked as a duplicate of bug 261381 ***