619496 – make corosync more resilient to delayed multicast packets

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 619496 - make corosync more resilient to delayed multicast packets

Summary: make corosync more resilient to delayed multicast packets

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	corosync
Sub Component:
Version:	6.0
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	rc
Target Release:	---
Assignee:	Steven Dake
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	619536 638592
TreeView+	depends on / blocked

Reported:	2010-07-29 16:54 UTC by Steven Dake
Modified:	2016-04-26 15:20 UTC (History)
CC List:	5 users (show)
Fixed In Version:	corosync-1.2.3-22.el6
Doc Type:	Bug Fix
Doc Text:	OpenAIS has been enabled to work in network environments wherein multicast messages are slightly delayed when compared to token messages.
Clone Of:
Clones:	619536 (view as bug list)
Environment:
Last Closed:	2011-05-19 14:24:04 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
patch that introduces the tuneable (6.00 KB, patch) 2010-07-29 18:06 UTC, Steven Dake	no flags	Details \| Diff
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2011:0764	0	normal	SHIPPED_LIVE	corosync bug fix update	2011-05-18 18:08:44 UTC

Description Steven Dake 2010-07-29 16:54:11 UTC

Description of problem:
Many network switches use a software component to "emulate multicast" by sending a multicast to the switch.  Then the switch sends to every member of the igmp group.  This multicast has extra latency compared to the unicast token (I've measured about 200 usec).  When a processor receives a token, it adds all unreceived messages to a retransmit list.  These retransmits result in extra network bandwidth consumption, when in fact the multicast regular message is not lost, but just delayed.

Version-Release number of selected component (if applicable):
corosync-1.2.3-17.e6

How reproducible:
seems 100% using Cisco infrastructure in RH IT labs

Steps to Reproduce:
1. start two node corosync cluster with totem configured to output debug info
2. run cpgbench
3. see retransmits occur

We can tell multicast is delayed by adding a small delay before transmitting the token.  Another mechanism is to use traffic shaping netem as follows to delay the token:
tc qdisc add dev eth0 root handle 1: prio
tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 1ms
tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 match ip dst 10.16.144.
40/32 flowid 1:3

(note 10.16.144.40 is the target of the next token).
  
Actual results:
when multicast is delayed, totem retransmits messages unnecessarily

Expected results:
no messages should be transmitted unnecessarily

Additional info:

Comment 1 Steven Dake 2010-07-29 16:55:07 UTC

For those that don't see this problem in their switches, it is possible to emulate via netem by changing the ip address above to the multicast address (hence introducing a 1ms multicast transmit delay).

Comment 2 Steven Dake 2010-07-29 18:06:42 UTC

Created attachment 435364 [details]
patch that introduces the tuneable

Comment 5 Douglas Silas 2011-01-11 23:11:46 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
OpenAIS has been enabled to work in network environments wherein multicast messages are slightly delayed when compared to token messages.

Comment 8 errata-xmlrpc 2011-05-19 14:24:04 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0764.html

Note You need to log in before you can comment on or make changes to this bug.