Bug 735912

Summary: Cman RRP - Include mcast per NIC patch and set default threshold to 3
Product: Red Hat Enterprise Linux 6
Reporter: Jan Friesse <jfriesse>
Component: cluster
Assignee: Fabio Massimo Di Nitto <fdinitto>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.2
CC: ccaulfie, cluster-maint, jkortus, lhh, rpeterso, teigland
Target Milestone: rc
Keywords: TechPreview
Target Release: 6.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: cluster-3.0.12.1-19.el6
Doc Type: Technology Preview
Doc Text:
Improve integration between cman and corosync for Redundant Ring Protocol
Story Points: ---
Clone Of:
: 917773
Last Closed: 2011-12-06 15:06:04 UTC
Bug Depends On:    
Bug Blocks: 743047, 917773    
Attachments:
Description      Flags
proposed patch   none
new patch        none

Description Jan Friesse 2011-09-06 06:55:14 UTC
Description of problem:
1) Corosync RRP needs a different mcast address for each NIC
2) The default threshold should be set to a more reasonable value of 3

Version-Release number of selected component (if applicable):
Newest for 6.2

How reproducible:
See https://bugzilla.redhat.com/show_bug.cgi?id=722469

Additional info:
Threshold key is totem.rrp_problem_count_threshold

Expected results:
corosync-objctl -a | grep rrp_proble
-> 3
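
A minimal sketch of how the expected result could be checked from a script, assuming the corosync-objctl output has been captured as text in the key=value form shown in this report. The helper name parse_objctl and the sample string are illustrative only and are not part of cman or corosync.

```python
def parse_objctl(text):
    """Parse key=value lines, as printed in this report, into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines that carry no '='
            values[key] = value
    return values


# Sample text copied from the output format used in this report.
sample = """\
totem.rrp_mode=passive
totem.rrp_problem_count_threshold=3
"""

settings = parse_objctl(sample)
print(settings.get("totem.rrp_problem_count_threshold"))  # prints 3
```

On a real node the text would come from the corosync-objctl command above instead of the sample string.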

Comment 8 Fabio Massimo Di Nitto 2011-09-08 08:28:28 UTC
Created attachment 522060 [details]
proposed patch

Unit test results:

0) no altname, no <cman/>

Multicast addresses: 239.192.99.73

[root@clusternet-node2 ~]# corosync-objctl |grep rrp
totem.rrp_mode=none

1) autoselection:

no <cman/>

      <altname name="clusternet-node1-eth2"/>
      <altname name="clusternet-node2-eth2"/>

Multicast addresses: 239.192.99.73 239.192.99.74

[root@clusternet-node2 ~]# corosync-objctl |grep rrp
totem.rrp_mode=passive
totem.rrp_problem_count_threshold=3

2) force cman main multicast on primary interface

  <cman>
   <multicast addr="239.192.100.1"/>
  </cman>
  <clusternodes>
    <clusternode name="clusternet-node1-eth1" votes="1" nodeid="1">
      <altname name="clusternet-node1-eth2"/>

[root@clusternet-node2 daemon]# cman_tool status
Multicast addresses: 239.192.100.1 239.192.99.74

[root@clusternet-node2 ~]# corosync-objctl |grep rrp
totem.rrp_mode=passive
totem.rrp_problem_count_threshold=3

3) force mcast on all interfaces (note that altname needs mcast on every node!)

  <cman>
   <multicast addr="239.192.100.1"/>
  </cman>

      <altname name="clusternet-node1-eth2" mcast="239.192.100.2"/>
      <altname name="clusternet-node2-eth2" mcast="239.192.100.2"/>

[root@clusternet-node2 daemon]# cman_tool status
Multicast addresses: 239.192.100.1 239.192.100.2

[root@clusternet-node2 ~]# corosync-objctl |grep rrp
totem.rrp_mode=passive
totem.rrp_problem_count_threshold=3
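
The note in case 3 (mcast is needed on the altname of every node) can be checked before starting the cluster. The sketch below is only an illustration: altname_mcast_problems is a hypothetical helper, the cluster name and the second node name are made up, and the rule that all altnames should name the same address is an assumption based on the config shown in case 3.

```python
import xml.etree.ElementTree as ET


def altname_mcast_problems(cluster_conf_xml):
    """Return a list of problems with <altname mcast=...> (empty if none)."""
    root = ET.fromstring(cluster_conf_xml)
    altnames = root.findall(".//altname")
    with_mcast = [a for a in altnames if a.get("mcast")]
    if not with_mcast:
        # No altname forces mcast: autoselection, as in case 1.
        return []
    # Once one node forces mcast, every altname is expected to carry it.
    problems = [
        "%s: missing mcast attribute" % a.get("name")
        for a in altnames
        if not a.get("mcast")
    ]
    # Assumption: all nodes should name the same address for the second ring.
    addresses = sorted({a.get("mcast") for a in with_mcast})
    if len(addresses) > 1:
        problems.append("altname mcast addresses differ: " + ", ".join(addresses))
    return problems


# Hypothetical config: node2 forgot the mcast attribute on its altname.
SAMPLE = """\
<cluster name="example" config_version="1">
  <cman>
    <multicast addr="239.192.100.1"/>
  </cman>
  <clusternodes>
    <clusternode name="clusternet-node1-eth1" votes="1" nodeid="1">
      <altname name="clusternet-node1-eth2" mcast="239.192.100.2"/>
    </clusternode>
    <clusternode name="clusternet-node2-eth1" votes="1" nodeid="2">
      <altname name="clusternet-node2-eth2"/>
    </clusternode>
  </clusternodes>
</cluster>
"""

print(altname_mcast_problems(SAMPLE))
```

With the sample above the helper reports the altname of node2, which lacks the mcast attribute.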

Comment 10 Fabio Massimo Di Nitto 2011-09-08 09:18:48 UTC
Created attachment 522068 [details]
new patch

The previous patch had an error when a <totem> tag was present in cluster.conf.

Previous unit test is still valid.

Add the following cases:

<totem rrp_mode="active"/>

[root@clusternet-node2 xml]# corosync-objctl |grep rrp
cluster.totem.rrp_mode=active
totem.rrp_mode=active
totem.rrp_problem_count_threshold=3


<totem rrp_mode="active" rrp_problem_count_threshold="10"/>

[root@clusternet-node2 xml]# corosync-objctl |grep rrp
cluster.totem.rrp_mode=active
cluster.totem.rrp_problem_count_threshold=10
totem.rrp_mode=active
totem.rrp_problem_count_threshold=10

Comment 11 Fabio Massimo Di Nitto 2011-09-08 09:21:24 UTC
The "today I just don't know how to copy/paste"

  <totem rrp_problem_count_threshold="10"/>

[root@clusternet-node2 xml]# corosync-objctl |grep rrp
cluster.totem.rrp_problem_count_threshold=10
totem.rrp_problem_count_threshold=10
totem.rrp_mode=passive
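
Taken together, the outputs in comments 8, 10 and 11 suggest a simple rule when altname is configured: rrp_mode falls back to passive and the threshold falls back to 3 unless the <totem> tag overrides them. The sketch below only models that observed behaviour. effective_rrp is a hypothetical name and this is not the code from the patch. The case without altname (rrp_mode=none) is left out on purpose.

```python
# Defaults as observed in the unit test output when altname is present.
DEFAULT_RRP_MODE = "passive"
DEFAULT_THRESHOLD = "3"


def effective_rrp(totem_attrs):
    """Model the totem.* values seen in this report for a given <totem> tag.

    totem_attrs holds the attributes of the <totem> tag from cluster.conf
    (an empty dict when the tag is absent).
    """
    return {
        "totem.rrp_mode": totem_attrs.get("rrp_mode", DEFAULT_RRP_MODE),
        "totem.rrp_problem_count_threshold": totem_attrs.get(
            "rrp_problem_count_threshold", DEFAULT_THRESHOLD
        ),
    }


# The cases from the unit tests above.
print(effective_rrp({}))
print(effective_rrp({"rrp_mode": "active"}))
print(effective_rrp({"rrp_mode": "active", "rrp_problem_count_threshold": "10"}))
print(effective_rrp({"rrp_problem_count_threshold": "10"}))
```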

Comment 15 Fabio Massimo Di Nitto 2011-10-03 06:55:15 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Improve integration between cman and corosync for Redundant Ring Protocol

Comment 17 errata-xmlrpc 2011-12-06 15:06:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1516.html