Bug 640311 - Multicast TTL is 1 - prevents use on a multicast routed network
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: corosync
Version: 6.0
Hardware: All Linux
Priority: low, Severity: medium
Target Milestone: rc
Assigned To: Angus Salkeld
QA Contact: Cluster QE
Depends On: 633415
Blocks: 684020 684305 684928 684930 688049
Reported: 2010-10-05 10:29 EDT by Steven Dake
Modified: 2016-04-26 12:29 EDT (History)
Fixed In Version: corosync-1.2.3-23.el6
Doc Type: Bug Fix
Clone Of: 633415
Clones: 684020 684305 684928 684930
Last Closed: 2011-05-19 10:24:12 EDT
Attachments
patch committed to upstream master (6.72 KB, patch)
2010-11-23 22:50 EST, Angus Salkeld

Description Steven Dake 2010-10-05 10:29:06 EDT
+++ This bug was initially created as a clone of Bug #633415 +++

Description of problem:

Multicast TTL is 1, so it can't be used on a routed network.


Version-Release number of selected component (if applicable):

corosync 1.2.0-0ubuntu1

How reproducible:

Hardcoded into system.


Steps to Reproduce:
1. Two hosts linked by an ethernet
2. Libvirt in routed mode.
3. Virtual machines sat on a 10.x.x.x/30 subnet.
4. Unicast routing in place (I'm using bird and OSPF).
5. Multicast routing in place (pimd).
6. Libvirt configured so that it doesn't filter multicast!
7. Tested as working with ssmping
  
Actual results:

Packets don't get beyond the first host because corosync sends them with a TTL of 1 (local subnet only).


Expected results:

TTL should really be 64 for the address range corosync is using.
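
For reference, the TTL of outgoing IPv4 multicast datagrams is a per-socket option, and on Linux it stays at 1 unless the application changes it. The following is a minimal sketch of raising and reading back that value with setsockopt()/getsockopt(). The helper names are hypothetical and this is not corosync's actual code.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper (not corosync code): set the TTL used for
 * outgoing IPv4 multicast datagrams on an open UDP socket.
 * Returns 0 on success, -1 on a bad value or setsockopt() failure. */
static int set_ipv4_mcast_ttl(int fd, int ttl)
{
    unsigned char value;

    if (ttl < 0 || ttl > 255) {
        return -1;
    }
    value = (unsigned char)ttl;
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL,
                      &value, sizeof(value));
}

/* Hypothetical helper: read back the multicast TTL currently
 * configured on the socket, or -1 if getsockopt() fails. */
static int get_ipv4_mcast_ttl(int fd)
{
    unsigned char value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &value, &len) < 0) {
        return -1;
    }
    return value;
}
```

Calling set_ipv4_mcast_ttl(fd, 64) on the sending socket would give the behaviour requested here; whether 64 is the right value depends on how many multicast routers sit between the nodes.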


Additional info:

This is an old issue from openais days by the look of it: 

http://www.mail-archive.com/linux-cluster@redhat.com/msg00548.html

--- Additional comment from neil@aldur.co.uk on 2010-09-13 14:19:17 EDT ---

Just in case it wasn't clear, each virtual machine has its own 10.x.x.x/30 subnet, and there is one on each host.
Comment 1 Steven Dake 2010-10-05 10:30:17 EDT
Angus,

What is needed is a configurable option per interface directive that allows override of the default TTL.

Regards
-steve
Comment 3 Angus Salkeld 2010-11-23 22:50:41 EST
Created attachment 462522
patch committed to upstream master
Comment 4 Angus Salkeld 2010-11-23 22:53:15 EST
I have tested that the config option is passed correctly into corosync.
The valid values are 1..255.

I have tested multicast (not udpu).
Comment 6 Jaroslav Kortus 2011-03-09 11:24:38 EST
$ cat /etc/cluster/cluster.conf 
<?xml version="1.0"?>
<cluster config_version="5" name="Z_Cluster4">
	<cman>
		<multicast addr="239.192.239.192"/>
	</cman>
	<totem>
		<interface ttl="24"/>
	</totem>
	<clusternodes>
		<clusternode name="z4" nodeid="2"/>
	</clusternodes>
</cluster>

$ ccs_config_validate 
Relax-NG validity error : Extra element totem in interleave
tempfile:6: element totem: Relax-NG validity error : Element cluster failed to validate content
Configuration fails to validate

cluster.rng indeed does not contain any reference to TTL settings in interface (I looked at the <totem> overrides). It works without the TTL directive (so as expected according to the RNG).

Is this usable with cman?
Comment 7 Fabio Massimo Di Nitto 2011-03-09 11:43:03 EST
(In reply to comment #6)
> $ cat /etc/cluster/cluster.conf 
> <?xml version="1.0"?>
> <cluster config_version="5" name="Z_Cluster4">
>  <cman>
>   <multicast addr="239.192.239.192"/>
>  </cman>
>  <totem>
>   <interface ttl="24"/>
>  </totem>
>  <clusternodes>
>   <clusternode name="z4" nodeid="2"/>
>  </clusternodes>
> </cluster>
> 
> $ ccs_config_validate 
> Relax-NG validity error : Extra element totem in interleave
> tempfile:6: element totem: Relax-NG validity error : Element cluster failed to
> validate content
> Configuration fails to validate
> 
> cluster.rng indeed does not contain any reference to TTL settings in
> interface (have looked to <totem> overrides). It works without the TTL
> directive (so as expected according to the RNG).
> 
> Is this usable with cman?

A warning/error from the Relax NG schema is a bug that should be reported against the cluster component. The warning/error can still be overridden in order to perform the test, and cman will allow the value to be propagated to corosync (where testing should happen for this specific BZ).

If ttl works, then please clone the bug to cluster to update the Relax NG schema.

To all corosync developers: please notify us via BZ of changes to options, or of new options, that affect cluster, so that we can update the config validation bits.
Comment 8 Jaroslav Kortus 2011-03-10 08:32:39 EST
I tried to disable the error reporting and corosync still does not seem to be happy:

$ cman_tool -Dnone join
corosync died: Could not read cluster configuration
$

Mar 10 07:14:54 z4 corosync[15070]:   [MAIN  ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
Mar 10 07:14:54 z4 corosync[15070]:   [MAIN  ] Corosync built-in features: nss dbus rdma snmp
Mar 10 07:14:54 z4 corosync[15070]:   [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
Mar 10 07:14:54 z4 corosync[15070]:   [MAIN  ] Successfully parsed cman config
Mar 10 07:14:54 z4 corosync[15070]:   [MAIN  ] parse error in config: No multicast address specified
Mar 10 07:14:54 z4 corosync[15070]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1679.


$ cat cluster.conf 
<?xml version="1.0"?>
<cluster config_version="5" name="Z_Cluster4">
	<cman>
		<multicast addr="239.192.239.192"/> 
	</cman>
	<totem>
		<interface ttl="33"/>
	</totem>
	<clusternodes>
		<clusternode name="z4" nodeid="2"/>
	</clusternodes>
</cluster>
Comment 9 Fabio Massimo Di Nitto 2011-03-10 23:23:56 EST
I have done a quick investigation and it looks like a possible bug in cman-preconfig. This needs more investigation, but it is possible that we will need to use a slightly different syntax to override ttl in totem from cman.

If I am correct, an objdb dump would show:

totem {
  interface {
    ttl: XX
  }
  interface {
    [all_other_parameters]
  }
}

while effectively those should be within the same interface { } stanza.
Comment 10 Fabio Massimo Di Nitto 2011-03-11 04:52:29 EST
OK, I can confirm this is indeed the problem. We need an extra fix for cman, and we will need to use <cman ttl="..."> or something similar. We can't use the <totem> section for any interface values within cman.

I'll let you know the correct config entry once I have a patch.
Comment 12 Fabio Massimo Di Nitto 2011-03-11 06:28:46 EST
I have done some investigation and I am not sure the patch proposed by Angus is
completely correct (based also on previous defaults).

+
+               /*
+                * Get the TTL
+                */
+               if (totem_config->interfaces[ringnumber].mcast_addr.family == AF_INET6) {
+                       totem_config->interfaces[ringnumber].ttl = 255;
+               } else {
+                       totem_config->interfaces[ringnumber].ttl = 1;
+               }

First of all, IPv6 has no reason to have its Hop Limit set to 255 by default (HL in
v6 == TTL in v4), as that makes it fully routable by default.

Both defaults should be set to 1 (local LAN), in line with the usual recommendation
to keep mcast traffic as local as possible.
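
To illustrate the point that the IPv6 Hop Limit plays the same role as the IPv4 TTL, here is a rough sketch of a single helper that applies one value to either address family. The function and constant names are hypothetical, and the default of 1 for both families reflects the suggestion in this comment rather than the code in the attached patch.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Suggested default for both families: stay on the local link. */
enum { MCAST_DEFAULT_HOPS = 1 };

/* Hypothetical helper: apply a multicast TTL (IPv4) or hop limit
 * (IPv6) to a socket of the given address family.
 * Returns 0 on success, -1 with errno set on failure. */
static int set_mcast_hops(int fd, int family, int hops)
{
    if (hops < 0 || hops > 255) {
        errno = EINVAL;
        return -1;
    }
    if (family == AF_INET6) {
        /* IPv6 expects the hop limit as an int. */
        return setsockopt(fd, IPPROTO_IPV6, IPV6_MULTICAST_HOPS,
                          &hops, sizeof(hops));
    }
    if (family == AF_INET) {
        /* IPv4 traditionally takes the TTL as an unsigned char. */
        unsigned char ttl = (unsigned char)hops;
        return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL,
                          &ttl, sizeof(ttl));
    }
    errno = EAFNOSUPPORT;
    return -1;
}
```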

+               if (totem_config->interfaces[i].ttl > 255 || totem_config->interfaces[i].ttl < 1) {
+                       error_reason = "Invalid TTL (should be 1..255)";
+                       goto parse_error;
+               }

This check could be improved. 0 is a valid value (based on
http://tldp.org/HOWTO/Multicast-HOWTO-2.html) and it basically means send only
to the localhost and never hit the wire. In a corosync cluster it doesn't make
a lot of sense, but it might for single node usage.
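
The difference between the two ranges under discussion can be sketched with a small standalone parser that takes the lower bound as a parameter. parse_ttl() is a hypothetical helper written for this illustration; it is not part of corosync or of the attached patch.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal TTL string and accept it only
 * if it lies in [min_ttl, 255]. Returns 0 and stores the value in
 * *ttl_out on success, -1 on malformed or out-of-range input. */
static int parse_ttl(const char *text, int min_ttl, int *ttl_out)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0') {
        return -1;
    }
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0') {
        return -1;              /* overflow or trailing garbage */
    }
    if (value < min_ttl || value > 255) {
        return -1;              /* outside the accepted range */
    }
    *ttl_out = (int)value;
    return 0;
}
```

With min_ttl set to 1 this matches the 1..255 check in the patch; with min_ttl set to 0 it also accepts the "never hit the wire" value described above.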

index 2cc2cb8..c1b9ed8 100644
--- a/exec/totemudpu.c
+++ b/exec/totemudpu.c
@@ -1304,6 +1304,7 @@ static int totemudpu_build_sockets_ip (
[SNIP]

Setting ttl on udpu is a dangerous option.

The kernel default is 64 which, as of today, is enough to cross the internet 2 or 3
times back and forth. Most applications (if not all) set it to 30 (see
traceroute for example).

Is there a specific use case for allowing TTL changes in udpu (other than "it can
be done, so let's do it")?

By allowing users to change unicast TTL, a route flap on the network could mean
cluster self implosion.
Comment 13 Fabio Massimo Di Nitto 2011-03-11 08:32:26 EST
http://publib.boulder.ibm.com/infocenter/zos/v1r9/index.jsp?topic=/com.ibm.zos.r9.hale001/ipv6d0151011131.htm

I also found this document for IPv6 multicast TTL.
Comment 14 Steven Dake 2011-03-11 13:09:03 EST
Fabio,

Regarding udpu, ideally nobody will be mucking with ttl options at all, unless they are operating in multicast routed environment.  We should document (via kbase arch review document) that ttl changes require architecture review so that we can identify the minimum ttl the deployment should use.  We should also place further restrictions that ttl is not meant for fully routed internet networks, but instead internal VLANs.

Angus,

Please investigate and address Fabio's comments related to IPV6 - and if necessary clone a new bugzilla with patch to change the ipv6 default from 255 to 1.  Please send me a mail as to results of this investigation so that appropriate bugzilla flogging can take place.
Comment 15 Fabio Massimo Di Nitto 2011-03-12 01:34:09 EST
(In reply to comment #14)
> Fabio,
> 
> Regarding udpu, ideally nobody will be mucking with ttl options at all, unless
> they are operating in multicast routed environment.  We should document (via
> kbase arch review document) that ttl changes require architecture review so
> that we can identify the minimum ttl the deployment should use.  We should also
> place further restrictions that ttl is not meant for fully routed internet
> networks, but instead internal VLANs.

OK, then I strongly recommend removing the ttl handling from udpu, setting the default to ttl = 1 for multicast, and not setting it at all in other transports, so that the kernel keeps using its current defaults.
Comment 17 Angus Salkeld 2011-03-14 18:23:06 EDT
(In reply to comment #14)
> Fabio,
> 
> Regarding udpu, ideally nobody will be mucking with ttl options at all, unless
> they are operating in multicast routed environment.  We should document (via
> kbase arch review document) that ttl changes require architecture review so
> that we can identify the minimum ttl the deployment should use.  We should also
> place further restrictions that ttl is not meant for fully routed internet
> networks, but instead internal VLANs.
> 
> Angus,
> 
> Please investigate and address Fabio's comments related to IPV6 - and if
> necessary clone a new bugzilla with patch to change the ipv6 default from 255
> to 1.  Please send me a mail as to results of this investigation so that
> appropriate bugzilla flogging can take place.

We have resolved this by creating 2 bugs: one for removing ttl changes in the
udpu case, and one for setting the IPv6 mcast ttl to 1 by default.
https://bugzilla.redhat.com/show_bug.cgi?id=684928
https://bugzilla.redhat.com/show_bug.cgi?id=684930
Comment 18 Fabio Massimo Di Nitto 2011-03-15 04:12:53 EDT
FYI I posted test results for cman+corosync with all patches applied in #684020

Note that I did not verify the totem functionality by capturing packets on the wire.
Comment 19 Jaroslav Kortus 2011-04-18 10:45:35 EDT
works as expected:
Default mcast TTL=1 (OK):
05:22:13.206428 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 147)
    10.15.89.15.5404 > 239.192.181.140.5405: UDP, length 119

Default override TTL=44 (OK):
05:38:12.450338 IP (tos 0x0, ttl 44, id 0, offset 0, flags [DF], proto UDP (17), length 147)
    10.15.89.15.5404 > 239.192.181.140.5405: UDP, length 119

Failure when TTL is set with udpu (OK):
Apr 18 05:42:04 z4 corosync[29781]:   [MAIN  ] parse error in config: Can only set ttl on multicast transport types

udpu TTL=64 (default) (OK):
05:43:33.090016 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 147)
    10.15.89.15.37226 > 10.15.89.17.5405: UDP, length 119

udpu IPv6 default TTL (OK):
09:02:40.795006 IP6 (hlim 64, next-header UDP (17) payload length: 127) fec0:0:a0e:5900:221:5eff:fe6f:6cc7.38318 > fec0:0:a0e:5900:222:19ff:fe02:ada8.5405: [udp sum ok] UDP, length 119
09:02:40.795024 IP6 (hlim 64, next-header UDP (17) payload length: 127) fec0:0:a0e:5900:221:5eff:fe6f:6cc7.46353 > fec0:0:a0e:5900:221:5eff:fe6f:6cc7.5405: [udp sum ok] UDP, length 119

IPv6 multicast override TTL=44 (OK):
09:04:13.418369 IP6 (hlim 44, next-header UDP (17) payload length: 127) fec0:0:a0e:5900:221:5eff:fe6f:6cc7.5404 > ff15::b5d6.5405: [udp sum ok] UDP, length 119

IPv6 multicast default TTL (OK):
09:05:27.510154 IP6 (hlim 1, next-header UDP (17) payload length: 127) fec0:0:a0e:5900:221:5eff:fe6f:6cc7.5404 > ff15::b5d6.5405: [udp sum ok] UDP, length 119
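
The behaviour verified above (default of 1 for multicast, user override honoured for multicast, ttl rejected for udpu, kernel default left alone for udpu) can be summarised in a short sketch. All type, constant, and function names below are hypothetical; only the error string is taken from the log line in this comment, and the real corosync code may be structured differently.

```c
#include <stddef.h>

/* Hypothetical transport classification for this sketch. */
enum transport_kind { TRANSPORT_MCAST, TRANSPORT_UDPU };

/* Marker meaning "do not set a TTL, let the kernel default apply". */
#define TTL_KERNEL_DEFAULT (-1)

struct iface_cfg {
    enum transport_kind transport;
    int ttl_set;   /* non-zero if the user supplied a ttl value */
    int ttl;       /* user value on input, effective value on output */
};

/* Resolve the effective TTL for one interface.
 * Returns NULL on success or an error string on invalid config. */
static const char *resolve_iface_ttl(struct iface_cfg *cfg)
{
    if (cfg->transport != TRANSPORT_MCAST) {
        if (cfg->ttl_set) {
            return "Can only set ttl on multicast transport types";
        }
        cfg->ttl = TTL_KERNEL_DEFAULT;  /* 64 was observed on the wire */
        return NULL;
    }
    if (!cfg->ttl_set) {
        cfg->ttl = 1;                   /* multicast default, v4 and v6 */
    }
    return NULL;
}
```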
Comment 20 errata-xmlrpc 2011-05-19 10:24:12 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0764.html
