
Bug 647290

Summary: if a node originates more than 512 messages in recovery it will sigabort (assert)
Product: Red Hat Enterprise Linux 5
Reporter: Benjamin Kahn <bkahn>
Component: openais
Assignee: Steven Dake <sdake>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 5.5
CC: bkahn, cluster-maint, edamato, fnadge, jkortus, jwest, pm-eus, sdake
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: openais-0.80.3-22.el5_3.16
Doc Type: Bug Fix
Doc Text:
Previously, an abort signal caused the cluster to exit if more than 500 messages were originated in the RECOVERY state. With this update, the issue is resolved and openais behaves as expected in the RECOVERY state.
Last Closed: 2010-11-05 15:25:55 UTC
Bug Depends On: 588500

Description Benjamin Kahn 2010-10-27 19:54:44 UTC
This bug has been copied from bug #588500 and has been proposed
to be backported to 5.3 z-stream (EUS).

Comment 4 Jaroslav Kortus 2010-11-01 15:18:03 UTC
Recovered successfully with 687 messages:

Nov  1 10:08:41 z2 openais[9796]: [TOTEM] The token was lost in the OPERATIONAL state. 
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes). 
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes). 
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] entering GATHER state from 2. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] entering GATHER state from 11. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Creating commit token because I am the rep. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Saving state aru c2f7 high seq received c5ba 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Storing new sequence id for ring 84 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] entering COMMIT state. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] entering RECOVERY state. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] position [0] member 10.15.89.15: 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] previous ring seq 128 rep 10.15.89.14 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] aru c2f7 high delivered c2f7 received flag 0 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] position [1] member 10.15.89.16: 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] previous ring seq 128 rep 10.15.89.14 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] aru c2f7 high delivered c2f7 received flag 0 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] position [2] member 10.15.89.17: 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] previous ring seq 128 rep 10.15.89.14 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] aru c2f7 high delivered c2f7 received flag 0 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] copying all old ring messages from c2f8-c5ba. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Originated 687 messages in RECOVERY. 
[...]

openais-0.80.3-22.el5_3.16
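
For anyone re-checking a log like the one above, the following is a minimal Python sketch, not part of openais or its tooling. It pulls the old-ring copy range and the originated-message count out of a TOTEM log excerpt and compares the count with the 512 limit named in the bug summary. The names summarize_recovery, OLD_LIMIT, COPY_RE and ORIGINATED_RE are hypothetical, and the sketch assumes the sequence numbers are hexadecimal and the copy range is inclusive.

```python
import re

# Limit taken from the bug summary; an assumption about the pre-fix behaviour.
OLD_LIMIT = 512

# Patterns match the two log lines quoted in this comment.
COPY_RE = re.compile(
    r"\[TOTEM\] copying all old ring messages from ([0-9a-f]+)-([0-9a-f]+)"
)
ORIGINATED_RE = re.compile(r"\[TOTEM\] Originated (\d+) messages in RECOVERY")


def summarize_recovery(log_text):
    """Return (range_size, originated, exceeds_old_limit), or None if not found."""
    copy = COPY_RE.search(log_text)
    originated = ORIGINATED_RE.search(log_text)
    if copy is None or originated is None:
        return None
    # Assumption: sequence numbers are hex and the range includes both ends.
    low, high = (int(value, 16) for value in copy.groups())
    count = int(originated.group(1))
    return high - low + 1, count, count > OLD_LIMIT


sample = (
    "Nov  1 10:09:01 z2 openais[9796]: [TOTEM] copying all old ring messages "
    "from c2f8-c5ba. \n"
    "Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Originated 687 messages in "
    "RECOVERY. \n"
)

print(summarize_recovery(sample))  # (707, 687, True)
```

Under those assumptions the excerpt covers 707 sequence numbers, of which 687 messages were originated, which is above the 512 limit from the summary.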

Comment 6 errata-xmlrpc 2010-11-05 15:25:55 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0828.html

Comment 7 Florian Nadge 2011-01-03 09:44:46 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, an abort signal caused the cluster to exit if more than 500 messages were originated in the RECOVERY state. With this update, the issue is resolved and openais behaves as expected in the RECOVERY state.