Bug 584559 - originating 206 or more messages in recovery causes totem to block
Summary: originating 206 or more messages in recovery causes totem to block
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openais
Version: 5.4
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Steven Dake
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 544680
Blocks:
 
Reported: 2010-04-21 20:53 UTC by Benjamin Kahn
Modified: 2016-04-26 13:39 UTC
CC List: 9 users

Fixed In Version: openais-0.80.3-22.el5_3.14
Doc Type: Bug Fix
Doc Text:
Previously, the openais protocol would fail to recover when originating more than 206 messages in the recovery state. The size of the recovery queue was increased to a more reasonable value. With this update, the protocol finishes the recovery as expected.
Clone Of:
Environment:
Last Closed: 2010-11-05 15:25:42 UTC
Target Upstream Version:
Embargoed:


Attachments
corefile of aisexec during/after recovery (2.22 MB, application/x-gzip)
2010-10-27 11:44 UTC, Jaroslav Kortus


Links
Red Hat Product Errata RHBA-2010:0828 (normal, SHIPPED_LIVE): openais bug fix update. Last updated: 2010-11-05 15:25:37 UTC

Description Benjamin Kahn 2010-04-21 20:53:47 UTC
This bug has been copied from bug #544680 and has been proposed
to be backported to the 5.3 z-stream (EUS).

Comment 4 Florian Nadge 2010-10-18 17:00:20 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, the openais protocol would fail to recover when originating more than 206 messages in the recovery state. The size of the recovery queue was increased to a more reasonable value. With this update, the protocol finishes the recovery as expected.
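
To illustrate the mechanism behind the limit: every message originated while the ring is in RECOVERY is parked in a fixed-size queue until the new ring forms, so a queue smaller than the recovery workload stalls the protocol. The sketch below is not the openais source; the names are hypothetical and the constants are taken from the queue printed in comment 9 below (size 500, 176 bytes per item).

#include <stdio.h>
#include <string.h>

#define RETRANS_QUEUE_SIZE 500   /* capacity; the fix raises this value  */
#define ITEM_SIZE          176   /* size_per_item, as seen in comment 9  */

struct queue {
        int head, tail, used;
        char items[RETRANS_QUEUE_SIZE][ITEM_SIZE];
};

static int queue_item_add (struct queue *q, const void *item)
{
        if (q->used >= RETRANS_QUEUE_SIZE - 1)
                return -1;               /* queue full: recovery stalls  */
        memcpy (q->items[q->tail], item, ITEM_SIZE);
        q->tail = (q->tail + 1) % RETRANS_QUEUE_SIZE;
        q->used++;
        return 0;
}

int main (void)
{
        static struct queue q;
        char msg[ITEM_SIZE] = { 0 };
        int i;

        for (i = 0; queue_item_add (&q, msg) == 0; i++)
                ;
        /* with too small a capacity this limit is hit mid-recovery and
           totem blocks instead of finishing */
        printf ("queue full after %d messages\n", i);
        return 0;
}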

Comment 5 Jaroslav Kortus 2010-10-27 11:43:49 UTC
There seems to be a problem with the recovery.

How it happened:
1. 4-node cluster, service cman start.
2. nice -19 cpgverify > /dev/null & (3 times on each node; see the sketch after this list)
3. simulated packet loss of up to 99% via netem on z1
4. With a small number of messages in recovery (<200) it behaved more or less as expected, but when I let the messages accumulate a bit more, it resulted in a cluster-wide meltdown of aisexec.
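
Step 2's load generator, cpgverify, floods the ring with closed process group messages. A minimal flooder along those lines, assuming the whitetank-era CPG API from openais/cpg.h (an illustrative sketch, not the actual cpgverify source; the group name and payload size are arbitrary choices):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>
#include <openais/cpg.h>

static void deliver_fn (cpg_handle_t handle, struct cpg_name *group_name,
                        uint32_t nodeid, uint32_t pid,
                        void *msg, int msg_len)
{
        /* a real verifier would checksum and compare the payload here */
}

static void confchg_fn (cpg_handle_t handle, struct cpg_name *group_name,
                        struct cpg_address *member_list, int member_list_entries,
                        struct cpg_address *left_list, int left_list_entries,
                        struct cpg_address *joined_list, int joined_list_entries)
{
        /* membership changes ignored in this sketch */
}

static cpg_callbacks_t callbacks = {
        .cpg_deliver_fn = deliver_fn,
        .cpg_confchg_fn = confchg_fn,
};

int main (void)
{
        cpg_handle_t handle;
        struct cpg_name group;
        char payload[128] = { 0 };
        struct iovec iov = { .iov_base = payload, .iov_len = sizeof (payload) };

        strcpy (group.value, "flood");
        group.length = strlen (group.value);

        if (cpg_initialize (&handle, &callbacks) != CPG_OK ||
            cpg_join (handle, &group) != CPG_OK) {
                fprintf (stderr, "cpg setup failed\n");
                return 1;
        }
        for (;;) {
                /* keep the ring saturated; TRY_AGAIN means flow control
                   has engaged, so drain pending deliveries and retry */
                if (cpg_mcast_joined (handle, CPG_TYPE_AGREED, &iov, 1) ==
                    CPG_ERR_TRY_AGAIN) {
                        cpg_dispatch (handle, CPG_DISPATCH_ONE);
                }
        }
}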

Oct 27 06:24:51 z2 openais[7139]: [TOTEM] Retransmit List: 574c 574d 56bd 56c0 5702 5704 5705 5706 5707 5729 572a  
Oct 27 06:24:51 z2 openais[7139]: [TOTEM] Retransmit List: 5471 54b8 569c 572b 574b  
Oct 27 06:25:01 z2 openais[7139]: [TOTEM] The token was lost in the OPERATIONAL state. 
Oct 27 06:25:01 z2 openais[7139]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes). 
Oct 27 06:25:01 z2 openais[7139]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes). 
Oct 27 06:25:01 z2 openais[7139]: [TOTEM] entering GATHER state from 2. 
Oct 27 06:25:15 z2 openais[7139]: [TOTEM] Saving state aru 5470 high seq received 5896 
Oct 27 06:25:15 z2 openais[7139]: [TOTEM] Storing new sequence id for ring 6c 
Oct 27 06:25:15 z2 openais[7139]: [TOTEM] entering COMMIT state. 
Oct 27 06:25:25 z2 openais[7139]: [TOTEM] entering GATHER state from 13. 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] entering GATHER state from 11. 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] Creating commit token because I am the rep. 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] Storing new sequence id for ring 70 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] entering COMMIT state. 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] entering RECOVERY state. 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] position [0] member 10.15.89.15: 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] previous ring seq 100 rep 10.15.89.14 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] aru 5470 high delivered 5470 received flag 0 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] position [1] member 10.15.89.16: 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] previous ring seq 100 rep 10.15.89.14 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] aru 5470 high delivered 5470 received flag 0 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] position [2] member 10.15.89.17: 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] previous ring seq 100 rep 10.15.89.14 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] aru 5470 high delivered 5470 received flag 0 
Oct 27 06:25:45 z2 openais[7139]: [TOTEM] copying all old ring messages from 5471-5896. 
Oct 27 06:25:45 z2 dlm_controld[7165]: cluster is down, exiting
Oct 27 06:25:45 z2 gfs_controld[7171]: cluster is down, exiting
Oct 27 06:25:45 z2 fenced[7159]: cluster is down, exiting
Oct 27 06:25:45 z2 kernel: dlm: closing connection to node 4
Oct 27 06:25:45 z2 kernel: dlm: closing connection to node 3
Oct 27 06:25:45 z2 kernel: dlm: closing connection to node 2
Oct 27 06:25:47 z2 kernel: dlm: closing connection to node 1
Oct 27 06:26:08 z2 ccsd[7133]: Unable to connect to cluster infrastructure after 30 seconds. 

$ rpm -q openais
openais-0.80.3-22.el5_3.15

I will attach the core file.

Comment 6 Jaroslav Kortus 2010-10-27 11:44:51 UTC
Created attachment 455953 [details]
corefile of aisexec during/after recovery

Comment 7 Jaroslav Kortus 2010-10-27 11:46:28 UTC
Linux z2 2.6.18-128.23.2.el5 #1 SMP Fri Sep 17 06:21:22 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

As it is a mixed cluster, the other nodes (ia64, i686) run their respective builds of the same kernel.

Comment 8 Steven Dake 2010-10-27 16:29:14 UTC
(gdb) where
#0  0x00000038cd630215 in raise () from /lib64/libc.so.6
#1  0x00000038cd631cc0 in abort () from /lib64/libc.so.6
#2  0x00000038cd629696 in __assert_fail () from /lib64/libc.so.6
#3  0x0000000000412047 in message_handler_memb_commit_token (
    instance=0x2aaaaaaae010, msg=<value optimized out>, 
    msg_len=<value optimized out>, 
    endian_conversion_needed=<value optimized out>) at ../include/queue.h:118
#4  0x0000000000409c6e in rrp_deliver_fn (context=0x2aaab0000a10, 
    msg=0x2aaab0056e08, msg_len=312) at totemrrp.c:1319
#5  0x00000000004080d6 in net_deliver_fn (handle=<value optimized out>, 
    fd=<value optimized out>, revents=<value optimized out>, 
    data=0x2aaab0056760) at totemnet.c:676
#6  0x0000000000405b60 in poll_run (handle=0) at aispoll.c:402
#7  0x0000000000418652 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at main.c:617
(gdb) up
#3  0x0000000000412047 in message_handler_memb_commit_token (
    instance=0x2aaaaaaae010, msg=<value optimized out>, 
    msg_len=<value optimized out>, 
    endian_conversion_needed=<value optimized out>) at ../include/queue.h:118
118		assert (queue->tail != queue->head);
(gdb) print *memb_commit_token
$1 = {header = {type = 4 '\004', encapsulated = 0 '\0', 
    endian_detector = 65314, nodeid = 4}, token_seq = 4, ring_id = {rep = {
      nodeid = 2, family = 2, 
      addr = "\n\017Y\017\b\000\002\000\n\017Y\017\b\000\004"}, seq = 112}, 
  retrans_flg = 0, memb_index = 3, addr_entries = 3, 
  end_of_commit_token = 0x2aaab0056e3e "\002"}

This reported core dump is a dup of Bug #631835.  Thanks for the test case; I'll give it a go on the parent bug, as it should help in understanding how to fix that existing issue.  Changing this bug back to ON_QA, since the bug described in this bugzilla is actually fixed and a separate issue is causing your test case to abort.

Comment 9 Steven Dake 2010-10-27 16:40:49 UTC
(gdb) print instance->retrans_message_queue
$3 = {head = 499, tail = 499, used = 499, usedhw = 499, size = 500, 
  items = 0x2aaab0000b40, size_per_item = 176, iterator = 0, mutex = {
    __data = {__lock = 1, __count = 0, __owner = 7139, __nusers = 1, 
      __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = "\001\000\000\000\000\000\000\000�\033\000\000\001", '\0' <repeats 26 times>, __align = 1}}

Comment 10 Steven Dake 2010-10-27 17:00:35 UTC
This SIGABRT is a dup of bug #588627, which was not copied to the 5.3 z-stream.  It is triggered when the retransmit list becomes full.  The bug described in this bugzilla is fixed.  Please ignore the comments about Bug #631835 in comment #8, although the stack trace is valid.

If you want 588627 backported to z stream, please speak with Dean Jansa.
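
The gdb data in comments 8 and 9 shows the abort mechanism directly: head, tail, and used have all reached the same value, so a head == tail test can no longer distinguish an empty queue from one that has wrapped around full, and the emptiness assert in queue.h aborts the process. A minimal sketch of that ambiguity (illustrative only, not the openais queue.h):

#include <assert.h>

#define SIZE 500

struct queue {
        int head, tail, used;
};

static void queue_item_get (struct queue *q)
{
        /* "not empty" check: meaningless once the queue has wrapped full */
        assert (q->tail != q->head);
        q->head = (q->head + 1) % SIZE;
        q->used--;
}

int main (void)
{
        /* the retransmit list from comment 9: full, yet head == tail */
        struct queue q = { .head = 499, .tail = 499, .used = 499 };

        queue_item_get (&q);     /* assert fires -> SIGABRT, as in frame #2 */
        return 0;
}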

Comment 11 Jaroslav Kortus 2010-11-01 15:18:07 UTC
Together with the fix for bug 647290, this now looks much better:
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] The token was lost in the OPERATIONAL state. 
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes). 
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes). 
Nov  1 10:08:41 z2 openais[9796]: [TOTEM] entering GATHER state from 2. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] entering GATHER state from 11. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Creating commit token because I am the rep. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Saving state aru c2f7 high seq received c5ba 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Storing new sequence id for ring 84 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] entering COMMIT state. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] entering RECOVERY state. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] position [0] member 10.15.89.15: 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] previous ring seq 128 rep 10.15.89.14 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] aru c2f7 high delivered c2f7 received flag 0 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] position [1] member 10.15.89.16: 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] previous ring seq 128 rep 10.15.89.14 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] aru c2f7 high delivered c2f7 received flag 0 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] position [2] member 10.15.89.17: 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] previous ring seq 128 rep 10.15.89.14 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] aru c2f7 high delivered c2f7 received flag 0 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] copying all old ring messages from c2f8-c5ba. 
Nov  1 10:09:01 z2 openais[9796]: [TOTEM] Originated 687 messages in RECOVERY. 

openais-0.80.3-22.el5_3.16

Comment 13 errata-xmlrpc 2010-11-05 15:25:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0828.html

