Bug 681269

Summary: Add UDPU support to corosync or rebase to 1.3.x in RHEL6
Product: Red Hat Enterprise Linux 6
Reporter: Don Hoover <donhoover>
Component: corosync
Assignee: Steven Dake <sdake>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.0
CC: cluster-maint
Target Milestone: rc
Keywords: Rebase
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Rebase: Bug Fixes and Enhancements
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-03-01 18:19:40 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Don Hoover 2011-03-01 15:52:19 UTC
Description of problem:
Rebase corosync to 1.3.


Version-Release number of selected component (if applicable):
corosync 1.3.x


Expected results:

We have been struggling with multicast issues for years, ever since we started using RHCS on RHEL5 and now on RHEL6.

The fact is that multicast is often poorly implemented, or at the very least implemented differently, across the various network switch hardware vendors.

Corosync 1.3 adds UDPU as a transport method.  This would give us another option in addition to the only two transports currently available: broadcast (works great but is not supported by RH) and multicast (spotty and unreliable, depending on the network hardware used at the site).  The dependency on multicast is one of the biggest sources of problems for new users and for supportability.

I would strongly request that RH either rebase corosync to 1.3 or fully backport the UDPU support as soon as possible.
 
I would think that since corosync was just introduced with RHEL 6.0, a rebase to follow the development tree would be a better option since RHEL 6.0 has a long life ahead of it.

Comment 3 Steven Dake 2011-03-01 18:19:40 UTC
This is already addressed for RHEL 6.1 in Bug #657041, Bug #624558, and Bug #568164.  The short answer is that UDPU is supported in "technical preview" mode for 6.1, with plans to bring it to full support status in 6.2 and later.  (As always, plans are subject to change.)

Regards
-steve

Comment 4 Steven Dake 2011-03-01 18:20:25 UTC

*** This bug has been marked as a duplicate of bug 568164 ***

Comment 5 Don Hoover 2011-04-01 15:56:36 UTC
So.  I am testing RHEL 6.1 Beta, and added <cman transport="udpu"/> to my cluster configuration, and cman accepts it, and it passes the xml check via ccs_config_validate correctly.

BUT, it does not actually seem to be taking effect in corosync.

I see this in the corosync objctl, but other than that I don't see anything that refers to udpu in the running corosync config.
>cluster.cman.transport=udpu


So to me, it looks like corosync is still trying to use multicast.

[root@uskysvlts01q0 (10.240.48.232) ] # tcpdump -vnn -i eth0 port 5405
    10.240.48.232.5404 > 10.240.48.233.5405: UDP, length 107
    10.240.48.234.5404 > 10.240.48.232.5405: UDP, length 107
    10.240.48.232.5404 > 239.192.92.34.5405: UDP, length 119
    10.240.48.232.5404 > 10.240.48.233.5405: UDP, length 107
    10.240.48.234.5404 > 10.240.48.232.5405: UDP, length 107
    10.240.48.232.5404 > 239.192.92.34.5405: UDP, length 119
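
A rough way to confirm this reading of the capture is to check each destination address against the IPv4 multicast range (224.0.0.0/4). The following is only a minimal sketch in Python: the function name `multicast_destinations` and the `SAMPLE` text are made up for illustration, and the regular expression assumes tcpdump summary lines in the same form as the output above.

```python
import ipaddress
import re

# Matches tcpdump summary lines such as
# "10.240.48.232.5404 > 239.192.92.34.5405: UDP, length 119".
LINE_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): UDP"
)


def multicast_destinations(capture_text):
    """Return the sorted multicast destination addresses seen in the capture."""
    found = set()
    for line in capture_text.splitlines():
        match = LINE_RE.search(line)
        if match and ipaddress.ip_address(match.group("dst")).is_multicast:
            found.add(match.group("dst"))
    return sorted(found)


# Hypothetical sample copied from the capture above.
SAMPLE = """\
    10.240.48.232.5404 > 10.240.48.233.5405: UDP, length 107
    10.240.48.234.5404 > 10.240.48.232.5405: UDP, length 107
    10.240.48.232.5404 > 239.192.92.34.5405: UDP, length 119
"""

print(multicast_destinations(SAMPLE))
```

An empty list would suggest that only unicast traffic was captured. A non-empty list, as with 239.192.92.34 here, means at least some packets are still being sent to a multicast group.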


RPMS are:
corosync-1.2.3-29.el6.x86_64
cman-3.0.12-35.el6.x86_64


Is udpu not working if you try to configure it from cman?  Is it only working if you do it via corosync.conf?

Comment 6 Steven Dake 2011-04-01 22:21:19 UTC
Don,

Thanks for the report.  The parent Bug #657041 is currently in ON_QA (meaning it hasn't passed QA yet).  I have asked the assigned engineer to take a look at your concern.

Thanks
-steve

Comment 7 Steven Dake 2011-04-01 22:51:09 UTC
Don,

Lon ran the udpu mode and got no multicast messages.

Could you attach your config file?  Perhaps there is some configuration error that our schema checker isn't processing.

Thanks

Comment 8 Lon Hohberger 2011-04-01 23:00:53 UTC
(In reply to comment #5)
> So.  I am testing RHEL 6.1 Beta, and added <cman transport="udpu"/> to my
> cluster configuration, and cman accepts it, and it passes the xml check via
> ccs_config_validate correctly.
> 
> BUT, it does not actually seem to be taking effect in corosync.

Seems to work for me:

https://bugzilla.redhat.com/show_bug.cgi?id=657041#c15

It requires a cluster-wide restart to change transport mechanisms.  Though the cluster configuration will change, and will appear changed in cluster.conf and in confdb, corosync does not change transports at run time.
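
Because of this, the configured value and the running transport can disagree until the restart. As a minimal sketch (not part of cman or corosync), the configured value can be read from the XML with Python's standard library. The helper name `configured_transport` and the sample document are hypothetical, and the sample assumes `<cman>` is a direct child of the `<cluster>` root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal cluster.conf; the cluster name and config_version
# are placeholders, not taken from the reporter's configuration.
SAMPLE_CONF = """<cluster name="example" config_version="2">
  <cman transport="udpu"/>
</cluster>
"""


def configured_transport(conf_xml):
    """Return the transport attribute of <cman>, or None if it is not set."""
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    if cman is None:
        return None
    return cman.get("transport")


print(configured_transport(SAMPLE_CONF))
```

This only reports what the configuration declares. Whether corosync is actually using that transport still has to be checked on the wire after the cluster-wide restart.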

Comment 9 Don Hoover 2011-04-12 15:51:31 UTC
(In reply to comment #7)
> Don,
> 
> Lon ran the udpu mode and got no multicast messages.
> 
> Could you attach your config file?  Perhaps there is some configuration error
> that our schema checker isn't processing.
> 
> Thanks

Steven, I attached the information in:
https://bugzilla.redhat.com/show_bug.cgi?id=657041#c17