Bug 547828 - There is an invalid assertion in totemsrp which can cause a sigabort.
Summary: There is an invalid assertion in totemsrp which can cause a sigabort.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openais
Version: 5.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Steven Dake
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-12-15 19:08 UTC by Steven Dake
Modified: 2016-04-26 13:42 UTC

Fixed In Version: openais-0.80.6-12.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:48:19 UTC
Target Upstream Version:
Embargoed:


Attachments
patch to remove assertions. (809 bytes, application/octet-stream)
2009-12-15 19:31 UTC, Steven Dake


Links
System: Red Hat Product Errata
ID: RHBA-2010:0180
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: openais bug fix update
Last Updated: 2010-03-29 12:18:57 UTC

Description Steven Dake 2009-12-15 19:08:57 UTC
Description of problem:
There is an invalid assertion in totemsrp. It assumes the membership will always contain 2 or more members, which is not true in some cases.

Version-Release number of selected component (if applicable):
openais-0.80.6-8.el5_4

How reproducible:
rare

Steps to Reproduce:
Unknown. It is not clear what was done to reproduce the issue.
  
Actual results:
An assertion failure occurred in the openais executive.

Expected results:
No assertion failure occurs in the openais executive.

Additional info:

Comment 1 Steven Dake 2009-12-15 19:10:25 UTC
Stack backtrace was found on Frantisek Reznicek's machine.  The backtrace is as follows:

Thread 1 (process 11529):
#0  0x0000003238630265 in raise () from /lib64/libc.so.6
#1  0x0000003238631d10 in abort () from /lib64/libc.so.6
#2  0x00000032386296e6 in __assert_fail () from /lib64/libc.so.6
#3  0x000000000040d867 in memb_state_gather_enter (instance=0x2aaaaaaad010, 
    gather_from=12) at totemsrp.c:1691
#4  0x000000000040e4b1 in message_handler_memb_join (instance=0x2aaaaaaad010, 
    msg=0x1c706284, msg_len=<value optimized out>, 
    endian_conversion_needed=<value optimized out>) at totemsrp.c:3971
#5  0x0000000000409f6e in rrp_deliver_fn (context=0x1c705bc0, msg=0x1c706284, 
    msg_len=112) at totemrrp.c:1319
#6  0x00000000004084fb in net_deliver_fn (handle=<value optimized out>, 
    fd=<value optimized out>, revents=<value optimized out>, data=0x1c705c00)
    at totemnet.c:695
#7  0x0000000000405d10 in poll_run (handle=0) at aispoll.c:402
#8  0x0000000000418834 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at main.c:620

Comment 2 Steven Dake 2009-12-15 19:12:53 UTC
The assertion at line 1691, and the one in the memb_join_message_send function,
are invalid because they rely on the membership count being 2 or greater.  The
assertion can trigger with only 1 member in the cluster: in that situation
my_proc_list[1] is not a valid data entry, so it may contain the same entry as
my_proc_list[0].
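
To illustrate the reasoning above, here is a minimal sketch of the failure mode. It is not the code from totemsrp.c and not the attached patch (which removes the assertions). The struct proc_entry type and the unguarded_check and guarded_check helpers are hypothetical names made up for this example. The sketch only shows why comparing entries 0 and 1 is meaningless unless the count is known to be at least 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process list entry; not the real openais type. */
struct proc_entry {
    unsigned int nodeid;
};

/*
 * Same shape as the problematic check: reads entries 0 and 1
 * unconditionally, so the result only means something when the
 * list holds at least 2 valid entries.
 */
static bool unguarded_check(const struct proc_entry *list)
{
    return list[0].nodeid != list[1].nodeid;
}

/*
 * Guarded variant (an assumption for illustration, not the shipped fix):
 * with fewer than 2 valid entries there is nothing to compare.
 */
static bool guarded_check(const struct proc_entry *list, size_t count)
{
    if (count < 2) {
        return true;
    }
    return list[0].nodeid != list[1].nodeid;
}
```

With a single-member list whose unused slot 1 happens to hold the same value as slot 0, unguarded_check returns false, so an assert() wrapped around it would abort. guarded_check returns true for the same input when called with a count of 1.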

Comment 3 Steven Dake 2009-12-15 19:31:39 UTC
Created attachment 378598
patch to remove assertions.

Comment 6 errata-xmlrpc 2010-03-30 07:48:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0180.html

