Bug 657041

Summary: Cman doesn't allow user to select udpu (UDP unicast) corosync transport mechanism
Product: Red Hat Enterprise Linux 6
Reporter: Steven Dake <sdake>
Component: cluster
Assignee: Fabio Massimo Di Nitto <fdinitto>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: low
Docs Contact:
Priority: high
Version: 6.0
CC: agk, ccaulfie, cfeist, cluster-maint, djansa, donhoover, esammons, fdinitto, jfriesse, jkortus, lhh, riek, rpeterso, sdake, ssaha, swhiteho, teigland
Target Milestone: rc
Keywords: FutureFeature, TechPreview
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: cluster-3.0.12-27.el6
Doc Type: Technology Preview
Doc Text:
This technology is technical preview. To enable the UDPU transport, manually add the line <cman transport="udpu"/> to the cluster.conf file. Please note that because of BZ#695795, it is not possible to override totem defaults with the totem tag. A complete cluster restart is required when making this change. The udpu transport is not on-wire compatible with the default multicast transport.
Story Points: ---
Clone Of: 640995 Environment:
Last Closed: 2011-05-19 12:54:27 UTC Type: ---
Bug Depends On: 640995    
Bug Blocks:    
Attachments:
Description Flags
Corosync-objctl output, still using multicast. none

Description Steven Dake 2010-11-24 18:30:45 UTC
+++ This bug was initially created as a clone of Bug #640995 +++

Created attachment 452105 [details]
Proposed patch

Description of problem:
Corosync supports Infiniband and udpu (in a future version); cman doesn't. Add support for both to cman.

Version-Release number of selected component (if applicable):
trunk

How reproducible:
100%

Steps to Reproduce:
1. Try to configure cluster.conf with udpu
  
Actual results:
It isn't possible.

Expected results:
It should be possible.

Additional info:
Depends on corosync 2.x

--- Additional comment from lhh on 2010-10-07 13:26:24 EDT ---

+		[TX_MECH_UDP] = "udp",
+		[TX_MECH_UDPB] = "udp",
+		[TX_MECH_UDPU] = "udpu",
+		[TX_MECH_RDMA] = "iba",


Shouldn't the TX_MECH_UDPB = "udbp" ?

--- Additional comment from sdake on 2010-10-07 13:31:52 EDT ---

Not sure what udbp is, maybe an extra cut and paste.

Regards
-steve

--- Additional comment from lhh on 2010-10-07 14:03:18 EDT ---

UDP broadcast, according to the patch.

--- Additional comment from lhh on 2010-10-07 14:04:58 EDT ---

i.e.

udp = udp multicast
udpb = broadcast
udpu = udp unicast
iba = infiniband

--- Additional comment from sdake on 2010-10-07 14:09:52 EDT ---

In that case, udp is the correct transport.  The broadcast support is implemented in the udp multicast transport driver.

--- Additional comment from sdake on 2010-10-07 14:12:42 EDT ---

+     <attribute name="transport" rha:description="Specifies transport mechanism to use. Available values are udp (multicast default), udpb (broadcast), rdma (Infiniband).  corosync.conf(5)" rha:sample="">

should list udpu as well.

Regards
-steve
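
The naming worked out in the comments above can be summarized as a lookup table. What follows is a minimal Python sketch, not the C code from the patch. The dictionary and function names are hypothetical, and the assumption is that the keys are the cluster.conf values named in the schema description (plus udpu) while the values are the strings from the patch excerpt.

```python
# Hypothetical rendering of the transport table discussed in this thread.
# Keys: values assumed to be accepted in cluster.conf.
# Values: transport names from the patch excerpt.
CMAN_TO_COROSYNC = {
    "udp": "udp",    # UDP multicast (the default)
    "udpb": "udp",   # broadcast, implemented in the udp multicast driver
    "udpu": "udpu",  # UDP unicast
    "rdma": "iba",   # Infiniband
}


def corosync_transport(cman_value):
    """Return the corosync transport name for a cman transport value."""
    try:
        return CMAN_TO_COROSYNC[cman_value]
    except KeyError:
        raise ValueError("unknown transport: %r" % (cman_value,))


print(corosync_transport("udpb"))
```

The point of the sketch is that udpb deliberately maps to "udp" rather than to a separate name, which is what the "udbp" question above was about.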

--- Additional comment from jfriesse on 2010-10-08 03:38:37 EDT ---

Created attachment 452288 [details]
Proposed patch - try2

Same as the previous patch, but fixes the RELAX NG description.

--- Additional comment from jfriesse on 2010-11-10 09:39:15 EST ---

Patch committed to git branch STABLE31 as c399f4c0f0d7cc4467a68c41715c404b2afb9425

Comment 2 Steven Dake 2010-11-24 18:38:43 UTC
Honza,

What is the magic incantation to enable udpu from cman?

Comment 3 Fabio Massimo Di Nitto 2010-11-25 06:56:15 UTC
<cman transport="..."/> is the key. I quickly tested udpu and standard multicast udp. I don't have access to IBA to test that.

Fabio
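
As a rough illustration of the attribute mentioned above, the following Python sketch sets transport="udpu" on the <cman> element of a cluster.conf document held in a string. The function name and the sample document are made up for this example; it does not touch any file on disk, and it is not a substitute for whatever tooling normally manages cluster.conf.

```python
import xml.etree.ElementTree as ET


def set_cman_transport(conf_xml, transport="udpu"):
    """Return conf_xml with <cman transport="..."/> set, adding <cman> if absent."""
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    if cman is None:
        # No <cman> element yet: create one directly under <cluster>.
        cman = ET.SubElement(root, "cman")
    cman.set("transport", transport)
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal configuration, for illustration only.
sample = '<cluster config_version="1" name="example"><cman expected_votes="3"/></cluster>'
print(set_cman_transport(sample))
```

Existing attributes on <cman>, such as expected_votes in the sample, are left untouched.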

Comment 4 Perry Myers 2010-11-25 13:10:25 UTC
We actually don't want to support iba for Cluster.  We should just focus on enabling only udpu on top of standard mcast.

Comment 11 Perry Myers 2010-11-30 14:00:06 UTC
Removing infiniband from $subject since we don't intend to allow clusters over iba, just udpu and multicast

Comment 12 Fabio Massimo Di Nitto 2011-01-06 10:09:39 UTC
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=c399f4c0f0d7cc4467a68c41715c404b2afb9425

This commit has been upstream for some time.

Comment 14 Steven Dake 2011-04-01 22:16:55 UTC
Fabio,

According to Bug #681269 comment #5, cman transport=udpu is not working as intended.  I am losing internet access in about 1 hr for the weekend, but if you have an opportunity, can you have a look at that comment?

Thanks
-steve

Comment 15 Lon Hohberger 2011-04-01 22:53:46 UTC
Works for me:

18:49:31.761271 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 135)
    192.168.122.20.5405 > 192.168.122.21.5405: UDP, length 107
18:49:31.762334 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 135)
    192.168.122.21.5405 > 192.168.122.20.5405: UDP, length 107
18:49:31.971623 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 147)
    192.168.122.20.35029 > 192.168.122.21.5405: UDP, length 119

[root@snap ~]# rpm -q corosync cman
corosync-1.2.3-34.el6.x86_64
cman-3.0.12-41.el6.x86_64

...

<cluster config_version="20" name="lhh-rhel6">
        <cman expected_votes="3" two_node="0" transport="udpu"/>
        <quorumd label="cereal"/>
        ...
</cluster>

...

Apr 01 18:48:24 corosync [TOTEM ] Initializing transport (UDP/IP Unicast).
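
The packet capture above is evidence of unicast because every destination is a node address rather than a multicast group. A small Python sketch of that check follows; the function name and regular expression are assumptions made for this example, it handles only IPv4 address lines in the format shown above, and it does not distinguish broadcast from unicast.

```python
import ipaddress
import re

# Matches the address line of the capture format shown above, for example
# "192.168.122.20.5405 > 192.168.122.21.5405: UDP, length 107" (IPv4 only).
FLOW_RE = re.compile(r"(\d+(?:\.\d+){3})\.(\d+) > (\d+(?:\.\d+){3})\.(\d+):")


def destination_kinds(capture_text):
    """Return a set with 'multicast' and/or 'unicast' for the destinations seen."""
    kinds = set()
    for match in FLOW_RE.finditer(capture_text):
        dst = ipaddress.ip_address(match.group(3))
        kinds.add("multicast" if dst.is_multicast else "unicast")
    return kinds


# Two address lines copied from the capture in this comment.
capture = """\
    192.168.122.20.5405 > 192.168.122.21.5405: UDP, length 107
    192.168.122.21.5405 > 192.168.122.20.5405: UDP, length 107
"""
print(destination_kinds(capture))
```

A capture taken on a cluster that is still using multicast would be expected to include destinations in the 224.0.0.0/4 range, which this check reports as "multicast".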

Comment 17 Don Hoover 2011-04-12 15:46:58 UTC
Well, the latest versions we customers have access to are corosync-1.2.3-29 and cman-3.0.12-35, which are the versions in the RHEL6.1 Beta ISOs.  It's possible that it works with the newer versions you seem to have, which I assume will be in the 6.1GA, but it's not working as far as I can tell with the 6.1BETA rpms.

I did a full cluster restart and verified that the cluster.conf config validates properly, and it's still using multicast.

Here is my sanitized cluster.conf.
---------------------------------------------
<?xml version="1.0"?>
<cluster config_version="59" name="rhat6_test">
        <clusternodes>
                <clusternode name="node1" nodeid="1">
                        <fence>
                                <method name="ILO">
                                        <device name="ilo_node1" port="1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2" nodeid="2">
                        <fence>
                                <method name="ILO">
                                        <device name="ilo_node2" port="2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node3" nodeid="3">
                        <fence>
                                <method name="ILO">
                                        <device name="ilo_node3" port="3"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman transport="udpu"/>
        <fencedevices>
                <fencedevice agent="node1_ilo" ipaddr="x.x.x.x" login="x" name="ilo_node1" passwd="x"/>
                <fencedevice agent="node2_ilo" ipaddr="x.x.x.x" login="x" name="ilo_node2" passwd="x"/>
                <fencedevice agent="node3_ilo" ipaddr="x.x.x.x" login="x" name="ilo_node3" passwd="x"/>
        </fencedevices>
        <totem consensus="60000" join="6000" token="100000" token_retransmits_before_loss_const="20"/>
        <fence_daemon clean_start="0" post_fail_delay="20" post_join_delay="20"/>
        <logging>               
                <logging_daemon debug="on" name="rgmanager"/>
                <logging_daemon debug="on" name="fenced"/>
                <logging_daemon debug="on" name="qdiskd"/>
                <logging_daemon debug="on" name="groupd"/>
                <logging_daemon debug="on" name="corosync"/>
                <logging_daemon debug="on" name="dlm_controld"/>
                <logging_daemon debug="on" name="gfs_controld"/>
                <logging_daemon debug="on" name="corosync" subsys="QUORUM"/>
                <logging_daemon debug="on" name="corosync" subsys="CONFDB"/>
                <logging_daemon debug="on" name="corosync" subsys="CLM"/>
                <logging_daemon debug="on" name="corosync" subsys="CPG"/>
                <logging_daemon debug="on" name="corosync" subsys="MAIN"/>
                <logging_daemon debug="on" name="corosync" subsys="SERV"/>
                <logging_daemon debug="on" name="corosync" subsys="CMAN"/>
                <logging_daemon debug="on" name="corosync" subsys="TOTEM"/>
                <logging_daemon debug="on" name="corosync" subsys="CKPT"/>
                <logging_daemon debug="on" name="corosync" subsys="EVT"/>
        </logging>
        <dlm plock_ownership="1" plock_rate_limit="0"/>
        <gfs_controld plock_rate_limit="0"/>
</cluster>



----------------------------

corosync-objctl shows only one instance of udpu:

# corosync-objctl |grep udpu
cluster.cman.transport=udpu


I attached the whole sanitized corosync-objctl output for fun.


I realize that we are close to 6.1GA, but I am very surprised it's not working in the Beta as mentioned in the release.

Comment 18 Don Hoover 2011-04-12 15:48:43 UTC
Created attachment 491522 [details]
Corosync-objctl output, still using multicast.

Comment 19 Fabio Massimo Di Nitto 2011-04-12 17:50:06 UTC
Don: this appears to be related to
https://bugzilla.redhat.com/show_bug.cgi?id=689128 (reported against Fedora) and https://bugzilla.redhat.com/show_bug.cgi?id=695795 (RHEL6 counter part).

When <totem> is configured, cman will not configure the transport.

For your own testing, you can temporarily drop <totem> and see that the transport is set to udpu.

This will be fixed by RHEL6.2 GA, since udpu support is Tech Preview (not supported in 6.1).
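
Until that fix lands, a configuration can be checked for the combination described above. The Python sketch below flags a cluster.conf string that has both a <totem> element and a transport attribute on <cman>. The function name is hypothetical, and the rule it encodes is only what this comment states for the affected package versions, not a general property of cman.

```python
import xml.etree.ElementTree as ET


def transport_may_be_ignored(conf_xml):
    """True if <cman transport=...> is set while <totem> is also present."""
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    has_transport = cman is not None and cman.get("transport") is not None
    has_totem = root.find("totem") is not None
    return has_transport and has_totem


# Hypothetical fragments modelled on the configuration posted in comment 17.
with_totem = '<cluster name="t"><cman transport="udpu"/><totem token="100000"/></cluster>'
without_totem = '<cluster name="t"><cman transport="udpu"/></cluster>'
print(transport_may_be_ignored(with_totem), transport_may_be_ignored(without_totem))
```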

Comment 20 Don Hoover 2011-04-12 19:04:14 UTC
Fabio, Sure enough.

I can verify as well.  Removing the <TOTEM> from the config allows the <CMAN transport=> to work.


I now see UDPU happening in the tcpdump, as well as this in the corosync-objctl dump:
>cluster.cman.transport=udpu
>totem.transport=udpu


And I now see this in the corosync logs:
Apr 12 18:39:16 corosync [TOTEM ] Initializing transport (UDP/IP Unicast).




So, now I will have to decide if UDPU can be used in any limited deployments before 6.2. I was going to try to use it a little during the 6.1/tech preview because we have had nothing but trouble and instability with multicast on our large network over the time we have been using RHCS.  It's probably the number one thing that has nearly destroyed the reputation of RHCS here.  Multicast sounds great in theory, but no switch vendors implement it the same, and it's impossible in a multi-vendor, very large enterprise network to keep it running reliably.

Having to use the <TOTEM> defaults and giving up any timing tweaks might work, but we have found the totem defaults to be a bit 'touchy' and cause some instability themselves.  

But at last NOW I can do some testing, because it's actually working and that's a step in the right direction.

I will follow the progress of the bug 695795 and hope maybe it could be handled in an ERRATA before 6.2GA.
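
The difference between the two corosync-objctl dumps in this thread is whether totem.transport is present in addition to cluster.cman.transport. A minimal Python sketch of that comparison follows, assuming the dump is available as plain key=value text. The function names are made up for this example and it does not invoke corosync-objctl itself.

```python
def parse_objctl(text):
    """Parse key=value lines, as printed in the dumps above, into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            values[key] = value
    return values


def udpu_in_effect(text):
    """True only if the totem layer itself reports the udpu transport."""
    return parse_objctl(text).get("totem.transport") == "udpu"


# Lines taken from the two dumps quoted in this thread.
before = "cluster.cman.transport=udpu\n"
after = "cluster.cman.transport=udpu\ntotem.transport=udpu\n"
print(udpu_in_effect(before), udpu_in_effect(after))
```

Checking only cluster.cman.transport is not enough, since that key was set in the failing case as well.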

Comment 21 Fabio Massimo Di Nitto 2011-04-12 19:13:48 UTC
(In reply to comment #20)
> Fabio, Sure enough.
> 
> I can verify as well.  Removing the <TOTEM> from the config allows the <CMAN
> transport=> to work.

Thanks for cross checking.

> 
> So, now I will have to decide if UDPU can be used in any limited deployments
> before 6.2. I was going to try and use it a little during the 6.1/tech preview
> because we have had nothing but trouble and instability with multicast on our
> large network over the time we have been using RHCS.  It's probably the number
> one thing that has nearly destroyed the reputation of RHCS here.  Multicast
> sounds great in theory but no switch vendors implement it the same, and it's
> impossible in a multi-vendor, very large enterprise network to keep it running
> reliably.

I understand the issue with multicast/multi-vendor and it's not something we can address directly.

> Having to use the <TOTEM> defaults and giving up any timing tweaks might work,
> but we have found the totem defaults to be a bit 'touchy' and cause some
> instability themselves.  

Considering that udpu is not supported in 6.1 (and therefore you will be running an environment that is not considered production-ready by RH), I am sure it will be no extra pain for you to grab the patch to fix the <cman> + <totem> interaction once it's done and rebuild your packages.

It would be extremely helpful, though, if you could log a case with the issues you experienced with default totem timeouts (unless you have done so already).

> 
> But at last NOW I can do some testing, because it's actually working and that's a
> step in the right direction.
> 
> I will follow the progress of the bug 695795 and hope maybe it could be handled
> in an ERRATA before 6.2GA.

Unlikely to make it into an ERRATA since udpu is only Tech Preview in 6.1, but it's not a hard limit and I believe it can be discussed.

What worries me a bit (personally) is how much you plan to deploy into production based on UDPU before it is considered "supportable".

Comment 22 Fabio Massimo Di Nitto 2011-04-12 19:19:42 UTC
(In reply to comment #17)

> Here is my sanitized cluster.conf.
> ---------------------------------------------
>         <logging>               
>                 <logging_daemon debug="on" name="rgmanager"/>
>                 <logging_daemon debug="on" name="fenced"/>
>                 <logging_daemon debug="on" name="qdiskd"/>
>                 <logging_daemon debug="on" name="groupd"/>
>                 <logging_daemon debug="on" name="corosync"/>
>                 <logging_daemon debug="on" name="dlm_controld"/>
>                 <logging_daemon debug="on" name="gfs_controld"/>
>                 <logging_daemon debug="on" name="corosync" subsys="QUORUM"/>
>                 <logging_daemon debug="on" name="corosync" subsys="CONFDB"/>
>                 <logging_daemon debug="on" name="corosync" subsys="CLM"/>
>                 <logging_daemon debug="on" name="corosync" subsys="CPG"/>
>                 <logging_daemon debug="on" name="corosync" subsys="MAIN"/>
>                 <logging_daemon debug="on" name="corosync" subsys="SERV"/>
>                 <logging_daemon debug="on" name="corosync" subsys="CMAN"/>
>                 <logging_daemon debug="on" name="corosync" subsys="TOTEM"/>
>                 <logging_daemon debug="on" name="corosync" subsys="CKPT"/>
>                 <logging_daemon debug="on" name="corosync" subsys="EVT"/>
>         </logging>

Hi Don, sorry I just noticed this in your config...

you can make it a lot simpler by:

<logging debug="on"/>

and it will be used by all system/subsystems/daemons.

Fabio

Comment 23 Don Hoover 2011-04-12 19:27:28 UTC
>What worries me a bit (personally) is how much you plan to deploy into
>production based on UDPU before it is considered "supportable".

No, it would have been into some test/dev environments, to get comfortable and work through any issues or bugs.

We are looking at it as a better solution than setting transport to broadcast for our environment.  But we would naturally delay migration of production clusters to UDPU until the functionality is fully GA.

Of course, running broadcast transport is not considered 'supported' either from what I remember.  :)


>you can make it a lot simpler by:
><logging debug="on"/>
>and it will be used by all system/subsystems/daemons.

Right you are, I did that so I could play with each subsystem and see which ones I really wanted to leave on or off.

Thanks!

Comment 24 Steven Dake 2011-04-12 19:30:53 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
This technology is technical preview.  To enable the UDPU transport, manually add the line <cman transport="udpu"/> to the cluster.conf file.  Please note that because of BZ#695795, it is not possible to override totem defaults with the totem tag.  A complete cluster restart is required when making this change.  The udpu transport is not on-wire compatible with the default multicast transport.

Comment 28 errata-xmlrpc 2011-05-19 12:54:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0537.html