Bug 619496 - make corosync more resilient to delayed multicast packets
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: corosync
Version: 6.0
Hardware: All Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Steven Dake
QA Contact: Cluster QE
Keywords: ZStream
Depends On:
Blocks: 619536 638592
Reported: 2010-07-29 12:54 EDT by Steven Dake
Modified: 2016-04-26 11:20 EDT
Fixed In Version: corosync-1.2.3-22.el6
Doc Type: Bug Fix
Doc Text:
OpenAIS has been enabled to work in network environments wherein multicast messages are slightly delayed when compared to token messages.
Clones: 619536
Last Closed: 2011-05-19 10:24:04 EDT
Attachments
patch that introduces the tuneable (6.00 KB, patch)
2010-07-29 14:06 EDT, Steven Dake

Description Steven Dake 2010-07-29 12:54:11 EDT
Description of problem:
Many network switches "emulate multicast" in a software component: a multicast packet is sent to the switch, and the switch then forwards it to every member of the IGMP group.  This multicast path has extra latency compared to the unicast token (I've measured about 200 usec).  When a processor receives a token, it adds all messages it has not yet received to a retransmit list.  The resulting retransmits consume extra network bandwidth even though the regular multicast message was not lost, only delayed.
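
To make the effect concrete, here is a toy shell model of the behaviour described above. It is not corosync code: `retransmit_requests`, `delay` and `threshold` are names made up for this illustration. A threshold of 1 corresponds to requesting a retransmit on the first token that finds the message missing, as described in this report. Larger thresholds model one plausible kind of tuneable (tolerating a few token visits before asking for a retransmit); whether the attached patch works exactly this way is an assumption. As far as I know, upstream corosync.conf(5) documents a totem option named `miss_count_const` along these lines.

```shell
#!/bin/sh
# Toy model only (hypothetical names, not corosync source).
# retransmit_requests DELAY THRESHOLD
#   DELAY     - token visits that find the multicast message still missing
#   THRESHOLD - misses needed before a retransmit is requested
# Prints how many retransmit requests are issued for one delayed message.
retransmit_requests() {
    delay=$1
    threshold=$2
    misses=0
    requests=0
    while [ "$misses" -lt "$delay" ]; do
        misses=$((misses + 1))           # token arrived, message still missing
        if [ "$misses" -ge "$threshold" ]; then
            requests=$((requests + 1))   # message goes on the retransmit list
        fi
    done
    echo "$requests"
}

retransmit_requests 1 1   # request on the first miss: prints 1 (needless retransmit)
retransmit_requests 1 5   # tolerate up to four misses: prints 0
```

In this model a message that is merely late by one token visit costs a retransmit only when the threshold is 1; a message that is really lost still gets requested once the threshold is reached.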

Version-Release number of selected component (if applicable):
corosync-1.2.3-17.el6

How reproducible:
seems 100% using Cisco infrastructure in RH IT labs

Steps to Reproduce:
1. start two node corosync cluster with totem configured to output debug info
2. run cpgbench
3. see retransmits occur
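
For step 3, a small helper like the one below can count retransmit events in a captured log instead of eyeballing the debug output. This is a sketch under assumptions: it assumes totem logs lines containing the text "Retransmit List", and the log path in the usage comment is only an example; adjust both to match what your build and logging configuration actually produce. `count_retransmit_lines` is a name invented here.

```shell
#!/bin/sh
# Count log lines that mention a retransmit list.
# Assumption: totem debug output contains the text "Retransmit List".
# Usage: count_retransmit_lines LOGFILE
count_retransmit_lines() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the printed count and a zero exit status.
    grep -c 'Retransmit List' "$1" || true
}

# Example (path is an assumption, not taken from this report):
#   count_retransmit_lines /var/log/cluster/corosync.log
```

Running it before and after cpgbench gives a rough measure of how many retransmits the benchmark triggered.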

We can confirm that multicast is delayed by adding a small delay before transmitting the token.  Another mechanism is to use netem traffic shaping to delay the token, as follows:
tc qdisc add dev eth0 root handle 1: prio
tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 1ms
tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 match ip dst 10.16.144.40/32 flowid 1:3

(note 10.16.144.40 is the target of the next token).
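
The same three commands can be parameterised so that the device, destination address and delay are easy to change. The sketch below only prints the commands for review; it does not run them (they need root). The function name `netem_delay_cmds` is made up for this example, and the values in the usage line are the ones from this report.

```shell
#!/bin/sh
# Print (do not run) the tc commands that delay traffic to one destination.
# Usage: netem_delay_cmds DEV DST DELAY
netem_delay_cmds() {
    dev=$1
    dst=$2
    delay=$3
    # prio qdisc at the root gives three bands (1:1, 1:2, 1:3)
    echo "tc qdisc add dev $dev root handle 1: prio"
    # attach netem with the requested delay to band 1:3
    echo "tc qdisc add dev $dev parent 1:3 handle 30: netem delay $delay"
    # steer packets for the chosen destination into the delayed band
    echo "tc filter add dev $dev protocol ip parent 1:0 prio 3 u32 match ip dst $dst/32 flowid 1:3"
}

netem_delay_cmds eth0 10.16.144.40 1ms
# To undo the shaping afterwards: tc qdisc del dev eth0 root
```

Once the output looks right, it can be piped to `sh` as root to apply it.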
  
Actual results:
when multicast is delayed, totem retransmits messages unnecessarily

Expected results:
no messages should be transmitted unnecessarily

Additional info:
Comment 1 Steven Dake 2010-07-29 12:55:07 EDT
For those who don't see this problem with their switches, it is possible to emulate it via netem by changing the IP address above to the multicast address (hence introducing a 1ms multicast transmit delay).
Comment 2 Steven Dake 2010-07-29 14:06:42 EDT
Created attachment 435364
patch that introduces the tuneable
Comment 5 Douglas Silas 2011-01-11 18:11:46 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
OpenAIS has been enabled to work in network environments wherein multicast messages are slightly delayed when compared to token messages.
Comment 8 errata-xmlrpc 2011-05-19 10:24:04 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0764.html
