Description of problem:
When two or more processes listen for multicast messages on the same port (but different multicast groups), they all receive messages sent to that port for any multicast group that any process on the host machine is a member of.

Version-Release number of selected component:
Verified on 2.6.9-42.0.2.ELsmp and 2.4.21-47.ELsmp, but other versions are affected too.

How reproducible:
Always

Steps to Reproduce:
1. Download and compile these simple C programs: http://www.nmsl.cs.ucsb.edu/MulticastSocketsBook/c_send_receive.tar.gz
2. Run "mcreceive 225.12.12.12 1444"
3. Run "mcreceive 225.12.12.14 1444"
4. Run "mcsend 225.12.12.12 1444"
5. Type something and hit "enter"

Actual results:
Both processes (from steps 2 and 3) have received the message.

Expected results:
Only the first process receives the message.

Additional example:
1. Run "mcreceive 225.12.12.12 2222"
2. Run "mcreceive 225.12.12.14 1444"
3. Run "mcsend 225.12.12.12 1444"
4. Type something and hit "enter"

The process from step 2 receives the message, when neither process should.
Created attachment 149864 [details] Simple programs to send/receive multicast messages
http://www.uwsg.iu.edu/hypermail/linux/net/0211.1/0003.html The patch mentioned there seems to address the same issue. There is also some additional information about binding.
This bug affects production deployments of JBoss on RHEL. The impact is usually several hours of frustration on the customer's end trying to determine the cause, followed by 1-2 hours of support cost for Red Hat.
More information: http://wiki.jboss.org/wiki/Wiki.jsp?page=PromiscuousTraffic
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
According to the email as I read it, and from what I understand of the RFC, this is working exactly as designed. The mcreceive program binds to INADDR_ANY before adding multicast membership. That means the socket in each mcreceive process receives all multicast traffic for every group that any process on the host has joined, which is exactly what you are seeing. If you want only the first process to receive multicast frames, then you should write a separate program to run in step 2 that binds to 225.12.12.14:1444 rather than to INADDR_ANY. The JBoss kbase entry should be fixed to reflect that.
You seem to be right. I modified mcreceive to bind to a specific multicast address (not INADDR_ANY) and it works as required:
1. Two processes can bind to the same multicast address and join the same multicast group.
2. A process will not receive multicast messages addressed to other groups.

But the email I mention in comment #2 suggests the current IPv6 implementation performs a check on whether a process has joined a multicast group before passing messages to it. So I think we should have consistent behavior between IPv4 and IPv6. Also, the change suggested in the mail does not seem intrusive. We currently do not seem to violate the RFC, but with the change we still would not violate it and would get more predictable behavior, right?
If I make the suggested change in mcreceive.c:

  /* construct a multicast address structure */
  memset(&mc_addr, 0, sizeof(mc_addr));
  mc_addr.sin_family = AF_INET;
  // mc_addr.sin_addr.s_addr = htonl(INADDR_ANY);
  mc_addr.sin_addr.s_addr = inet_addr(mc_addr_str); // <==== changed!
  mc_addr.sin_port = htons(mc_port);

where I bind to the multicast address and port, then this works on Linux (Fedora 7), but fails on Windows:

  $ ./mcreceive.exe 228.8.8.8 7500
  bind() failed: Cannot assign requested address

(my routing table is below). I thought that bind() had to be given an address associated with a local network interface, so a class D address would be invalid.

  Network Destination     Netmask          Gateway      Interface    Metric
  127.0.0.0               255.0.0.0        127.0.0.1    127.0.0.1         1
  192.168.5.0             255.255.255.0    192.168.5.2  192.168.5.2      10
  192.168.5.2             255.255.255.255  127.0.0.1    127.0.0.1        10
  192.168.5.255           255.255.255.255  192.168.5.2  192.168.5.2      10
  224.0.0.0               240.0.0.0        192.168.5.2  192.168.5.2      10
  255.255.255.255         255.255.255.255  192.168.5.2  192.168.5.2       1
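[Editorial note: a hedged portability sketch; this #ifdef is illustrative and not part of the original mcreceive.c. Since Windows rejects bind() to a class D (multicast) address, portable code typically binds to INADDR_ANY there and relies on the group join alone, accepting the promiscuous behavior on that platform:

  /* Hypothetical portability tweak, not in the original mcreceive.c:
     Windows rejects bind() to a class D address, so bind to INADDR_ANY
     there and rely solely on the IP_ADD_MEMBERSHIP join. */
  #ifdef _WIN32
      mc_addr.sin_addr.s_addr = htonl(INADDR_ANY);
  #else
      mc_addr.sin_addr.s_addr = inet_addr(mc_addr_str);
  #endif
]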
As per the conversation in bz 369591, I'm reopening this to add the IPv6 check that was previously discussed to the IPv4 code.
Created attachment 306308 [details] patch to filter out unjoined groups in IPv4

I've not tested it yet, but here's a patch that I think will properly filter out multicast group frames from sockets that haven't explicitly joined. Please give it a whirl and let me know the results.
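[Editorial note: for readers who want to see the intended effect from userspace without a custom kernel, here is a minimal sketch of the same filtering done in the application. It assumes the socket is bound to INADDR_ANY with IP_PKTINFO enabled; the helper name recv_filtered and the variable joined_group are illustrative, not part of the attached patch:

  /* Userspace analog of the kernel patch's filtering: inspect each
     datagram's destination group via IP_PKTINFO and drop anything
     this socket never joined. Enable the ancillary data first:
         int on = 1;
         setsockopt(sock, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));   */
  #define _GNU_SOURCE          /* for struct in_pktinfo in glibc */
  #include <sys/socket.h>
  #include <sys/uio.h>
  #include <netinet/in.h>

  static ssize_t recv_filtered(int sock, struct in_addr joined_group,
                               char *buf, size_t len)
  {
      char cbuf[CMSG_SPACE(sizeof(struct in_pktinfo))];
      struct iovec iov = { .iov_base = buf, .iov_len = len };
      struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                            .msg_control = cbuf,
                            .msg_controllen = sizeof(cbuf) };
      ssize_t n = recvmsg(sock, &msg, 0);
      if (n < 0)
          return n;
      for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
           c = CMSG_NXTHDR(&msg, c)) {
          if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
              struct in_pktinfo *pi = (struct in_pktinfo *)CMSG_DATA(c);
              if (pi->ipi_addr.s_addr != joined_group.s_addr)
                  return 0;  /* destined for a group we never joined: drop */
          }
      }
      return n;
  }
]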
Would it be possible to put that on some machine and give me access there? I'm in the JBoss QA team and I'm not used to building custom kernels... Or, if you can point me to simple instructions, I can take one machine down for a few hours to test it.
Sure, I'll build you a kernel. What arches do you need?
i386 will be fine, thanks
test kernel available here: http://people.redhat.com/nhorman/rpms/kernel-2.6.9-70.EL.bz231899.i686.rpm
I've tested the kernel, and a process now gets only the messages received on its interface for its group and port, no matter if another process is listening on the same group/port but on another interface. Also, when no sending interface is specified, the system routing table is consulted. However, I see something worrisome:

1. ./mcreceive 226.7.6.5 10000 0.0.0.0
2. ./mcsend 226.7.6.5 10000 127.0.0.1

The listener doesn't see any messages if the OS routing of mcast messages is not through lo. I expected to receive messages sent through any interface.
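[Editorial note: the third argument the modified programs take presumably selects the interface; on the sending side this is conventionally done with IP_MULTICAST_IF, roughly as sketched below, with send_if_str standing in for that argument (an assumption, not the attached code):

  /* Pin the outgoing multicast interface on the sender so the
     routing table is not consulted for the send. */
  struct in_addr send_if;
  send_if.s_addr = inet_addr(send_if_str);   /* e.g. "127.0.0.1" */
  if (setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF,
                 &send_if, sizeof(send_if)) < 0) {
      perror("setsockopt(IP_MULTICAST_IF) failed");
      exit(1);
  }
]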
Created attachment 307093 [details] new test patch

Hmm, I think this should allow multicast addresses to be received on any interface if the ip_mreqn.imr_address.s_addr field is set to INADDR_ANY. I'll build test kernels shortly.
http://people.redhat.com/nhorman/rpms/kernel-2.6.9-70.EL.bz231899.2i686.rpm New test kernel is there. Let me know if that corrects your INADDR_ANY problem from comment 15. Thanks!
404 :(
I see it is http://people.redhat.com/nhorman/rpms/kernel-2.6.9-70.EL.bz231899.2.i686.rpm
http://people.redhat.com/nhorman/rpms/kernel-2.6.9-70.EL.bz231899.2.i686.rpm Sorry, messed up the link previously. That one works.
Nope, it's the same as with the first version of the patch. It seems the listener binds to the interface that the OS multicast route goes through.
The above test, in comment 15:

Does it work if you specify 127.0.0.1 as the listening interface on the receiver?

Does it work if you specify a real IP address of one of the interfaces on the sender and INADDR_ANY on the receiver?
(In reply to comment #22)
> The above test, in comment 15:
>
> Does it work if you specify 127.0.0.1 as the listening interface on the receiver?

Yes.

> Does it work if you specify a real IP address of one of the interfaces on the
> sender and INADDR_ANY on the receiver?

If the IP is on the interface the OS mcast route goes through, then yes. But if it is not, then no. As far as I understood, it must always work.
Aleksandar, can you please attach your modified version (3 parameters) of the multicast test programs? Neil, any progress on this bug?
No, I've been occupied with other issues, I'll get back to this as soon as I can.
attachment (id=305466) and attachment (id=305204)
Out of curiosity, has this been discussed upstream, other than the 5-year-old thread mentioned in comment #2?
not yet.
Aleksandar, after reading comment #7, I decided to port the test programs (mcsend and mcreceive) to IPv6 for testing purposes. Your comment suggested that the Linux IPv6 stack did implement the multicast group filtering that is being discussed in this bug. However, my tests show that IPv6 behaves exactly as IPv4 as far as multicast groups are concerned (that is: two multicast receivers binding to in6addr_any on the same UDP port see each other's traffic.) I'll attach my IPv6 test programs if you want to give them a try.
Created attachment 311483 [details] IPv6 version of the multicast test programs

I am pretty new to IPv6 programming, so comments are welcome. One thing I noticed is that it is apparently not possible to bind an IPv6 socket to a multicast address, contrary to IPv4. So, unless I missed something, it is not possible to use that trick to filter IPv6 multicast packets.
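[Editorial note: for readers comparing the two stacks, in IPv6 the join is expressed with IPV6_JOIN_GROUP (formerly IPV6_ADD_MEMBERSHIP) rather than with bind(). A minimal sketch, where sock, mc_addr_str, and if_index are assumed inputs rather than names from the attached programs:

  #include <string.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  /* Join the IPv6 group named by mc_addr_str on interface if_index. */
  struct ipv6_mreq mreq6;
  memset(&mreq6, 0, sizeof(mreq6));
  inet_pton(AF_INET6, mc_addr_str, &mreq6.ipv6mr_multiaddr);
  mreq6.ipv6mr_interface = if_index;   /* 0 lets the kernel choose */
  if (setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP,
                 &mreq6, sizeof(mreq6)) < 0) {
      perror("setsockopt(IPV6_JOIN_GROUP) failed");
      exit(1);
  }
]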
Please open a separate bug for IPv6. I think the code to properly deliver multicast frames will not be common to both protocols.
I didn't do any personal IPv6 testing, so I have no idea about it. In comment #7 I'm referring to an external mailing-list post that is unverified. I just hope the result will be standards-compliant and compatible with other UNIXes and MS Windows, as well as consistent with IPv4 (or make IPv4 consistent with IPv6). I can help with testing how it behaves on HP-UX and AIX. But I think it's best for you to create the new issue, as you'll be able to describe it most accurately; I'm really lost with the addition of IPv6. Please let me know the issue number so I can watch and help with testing.
Neil, I didn't mean to suggest that there was a bug on the IPv6 side. I was simply testing what the current behavior was, because in comment #7, Aleksandar suggested that IPv6 was behaving differently from IPv4 and presented this fact as an incentive for changing the IPv4 behavior. My tests suggest that this statement was incorrect and that IPv4 and IPv6 are currently consistent in their behaviors. As an additional data point, I tested the IPv4 behavior on OpenBSD, FreeBSD and NetBSD earlier this week and they all behave the same as Linux does. So, as surprising as the behavior may be, it seems to be somewhat standard.
Ok, I think I've figured this out. We have two problems that we're dealing with here:

1) Multicast groups are not filtered properly when a socket is bound to INADDR_ANY. This is fixed by my patch in comment #10.

2) When you specify INADDR_ANY in an ip_mreqn structure's imr_address field (the source address), you do not receive multicast traffic on every interface. This is due to the fact that in ip_mc_find_dev, we do the following:
  a) If the ifindex is non-zero, we return the device specified by the index.
  b) If the ifindex is zero, we consult the routing table to find where frames bound for the specified _source_ address would be sent. Since the specified program supplies 0.0.0.0 as the imr_address parameter, the default route's interface is returned, which in most cases is eth0 (or some interface other than lo).

I think I can successfully push the patch for (1) upstream. (2) will be a little more complex, given that there isn't any real documentation that I can find for how to handle this case (other than the man page saying that supplying INADDR_ANY will cause the kernel to select an appropriate interface). I can see arguments for selecting all interfaces, and for selecting the interface specified by the routing table. So I'm not sure what to do. Given that it's possible to 'select all interfaces' by iterating over the ifindices of each interface in the program and doing a separate join on each, I'm inclined to leave this code alone, especially since there is precedent in other OSes for how to handle INADDR_ANY in this case, as comment #35 illustrates.

However, given some conversations I've had regarding multicast handling in general, I do think that IPv4 and IPv6 are (while consistent) both wrong in how they handle multicast group reception when bound to INADDR_ANY. I think what we need to do is push my patch from comment #10 upstream (and the IPv6 equivalent, given that it sounds like the problem occurs there as well), and then just be aware (perhaps by updating the man page, to be more clear) that when doing an IP_ADD_MEMBERSHIP, specifying INADDR_ANY doesn't mean all interfaces in that case. Thoughts?
So you're saying that if one wants to listen or send on all interfaces, one should join/send on each interface separately, and INADDR_ANY means the OS chooses one interface. Right? In the case of INADDR_ANY, is it possible that the OS chooses one interface for the receiver based on the default route and another for the sender based on the multicast routing? Or do I confuse things?
(In reply to comment #31)
> However, my tests show that IPv6 behaves exactly as IPv4 as far as multicast
> groups are concerned (that is: two multicast receivers binding to
> in6addr_any on the same UDP port see each other's traffic.)

Did you really use an IPv6 multicast address? If you use an IPv4 mcast address (like 239.1.2.3), then this will fail. If you use (e.g.) FF01:0:0:0:0:0:0:2, then it passes.
(In reply to comment #36)
> I can see arguments for selecting all interfaces, and for selecting the
> interface specified by the routing table. So I'm not sure what to do. Given
> that it's possible to 'select all interfaces' by iterating over the
> ifindices of each interface in the program and doing a separate join on
> each, I'm inclined to leave this code alone.

To my knowledge, providing the INADDR_ANY address means that the kernel will select the interface. If you want to receive multicast traffic on ALL interfaces, you have to iterate over them and JOIN on all of them. Stevens and Comer agree, but I haven't read the relevant RFCs. Stevens (19.5) states that "if the local interface is specified as the wildcard address (INADDR_ANY for IPv4) or an index of 0 for IPv6, then the local interface is chosen by the kernel".
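[Editorial note: a concrete sketch of "iterate over them and JOIN on all of them", using getifaddrs(3); sock and group (a struct in_addr already holding the multicast address) are assumed to be set up beforehand:

  #include <ifaddrs.h>
  #include <net/if.h>
  #include <netinet/in.h>

  struct ifaddrs *ifa0, *ifa;
  if (getifaddrs(&ifa0) == 0) {
      for (ifa = ifa0; ifa != NULL; ifa = ifa->ifa_next) {
          if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
              continue;
          if (!(ifa->ifa_flags & IFF_MULTICAST))
              continue;
          struct ip_mreq mreq;
          mreq.imr_multiaddr = group;
          mreq.imr_interface =
              ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
          /* one explicit join per multicast-capable interface */
          setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                     &mreq, sizeof(mreq));
      }
      freeifaddrs(ifa0);
  }
]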
In response to comment 37: yes, it's possible to select different receiving interfaces. During the IP_ADD_MEMBERSHIP call, specifying INADDR_ANY as the local address causes a route lookup using INADDR_ANY as the destination (effectively selecting the default route). The interface returned from that route lookup is the local interface that the join listens on. During a send, the multicast address that is the destination of the packet being sent is what's used for the route lookup. If you have a multicast route for the group you are sending to, and the interface in that route differs from the interface in the default route, then you have a mismatch and will not see those packets locally. I know it's a bit counter-intuitive, but I looked at the Stevens text, and it's correct. It's further explained by some documentation on his website: http://www.kohala.com/start/mcast.api.txt (see the section on receiving multicast datagrams).

The only bug that I see here currently is the one originally reported: when you _bind_ to INADDR_ANY on a receiving socket, you should not see multicast traffic from groups that you did not join. I believe that my patch in comment 10 corrects that. If you can confirm that for me, I'll push this upstream.
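[Editorial note: for completeness, a hedged sketch of avoiding that route lookup altogether on Linux by naming the interface index explicitly in struct ip_mreqn; "eth1" and the surrounding setup are illustrative, not from the test programs:

  #include <string.h>
  #include <net/if.h>        /* if_nametoindex() */
  #include <netinet/in.h>
  #include <arpa/inet.h>

  /* Join 226.7.6.5 on a named interface so no route lookup happens;
     "eth1" is a hypothetical interface name. */
  struct ip_mreqn mreqn;
  memset(&mreqn, 0, sizeof(mreqn));
  mreqn.imr_multiaddr.s_addr = inet_addr("226.7.6.5");
  mreqn.imr_address.s_addr   = htonl(INADDR_ANY);
  mreqn.imr_ifindex          = if_nametoindex("eth1");
  if (setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                 &mreqn, sizeof(mreqn)) < 0)
      perror("setsockopt(IP_ADD_MEMBERSHIP) failed");
]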
I confirm that the patch from comment 10 works for the test in comment 15.
Good, thank you. I've sent the patch upstream for review. Assuming it is accepted, I'll post it for RHEL inclusion shortly thereafter.
Ok, I'm done. Apparently there was a misunderstanding about my questions previously. The behavior as it currently stands is asserted to be absolutely correct, and isn't going to change. For an in-depth description: http://marc.info/?l=linux-netdev&m=121579002105636&w=2 My patch was of course rejected upstream, and so I can't backport it. Sorry.
Hmm, actually on the IPv6 side one cannot bind a socket to a multicast address, as comment #32 explains, so to me it seems we don't have consistent behavior. Do we change how IPv6 works? Also, it seems to me that in order to save somebody a few lines of code, every program must now take care of this "well thought out behavior"...
Sorry for the late answer. Thanks for all the work done by Red Hat engineering and for driving the bug to resolution.

(In reply to comment #38)
> Did you really use an IPv6 multicast address? If you use an IPv4 mcast
> address (like 239.1.2.3), then this will fail. If you use (e.g.)
> FF01:0:0:0:0:0:0:2, then it passes.

I used multicast address FF02:0:0:0:0:3:1:2. Feel free to try the attached programs yourself if you have any doubt about my results.
A modified version of the IPv6 source code appears to work fine with ff05::1 and ff05::2 on Fedora 9 and RHEL5. On RHEL4, the send part fails with EADDRNOTAVAIL:

  sendto(3, "aaa\n", 4, 0, {sa_family=AF_INET6, sin6_port=htons(1234),
    inet_pton(AF_INET6, "ff05::1", &sin6_addr), sin6_flowinfo=0,
    sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)

Sending works to any address for which there are no listeners. IPv4 (tested with ttcp, though) also works on RHEL4. So, I think there _is_ a bug in the RHEL4 kernel, in its IPv6 multicast code, but it's in a different place than demonstrated by the original code. You can see the problem with the following kind of bound multicast sockets once you try to sendto(2) to the address:

  $ netstat -an | grep 1234
  udp        0      0 ff05::2:1234        :::*
  udp        0      0 ff05::1:1234        :::*

  $ netstat -gn | grep ff05
  eth0            1      ff05::2
  eth0            1      ff05::1

I could not find any relevant bug numbers on this. By quickly looking at the kernel source, I also couldn't find the cause, but that might be because I don't know where to look. I've opened a separate bug, ID 472200, on this, so the existing references to this bug don't get confused if this new problem is resolved.

=========

On modifying the IPv6 source code posted above: on Fedora 9, the key point is to patch the code to bind to just the multicast address that you're interested in, not in6addr_any. You have to make sure that sending is done on the same interface as receiving; with multiple interfaces, this is not necessarily true. You may also need to exclude multicast in your ip6tables rules (the default rules block incoming multicast). The patch used was:

  @@ -63,8 +63,8 @@
       /* construct a multicast address structure */
       memset(&mc_addr, 0, sizeof(mc_addr));
       mc_addr.sin6_family = AF_INET6;
  -    mc_addr.sin6_addr = in6addr_any;
  -    /* inet_pton(AF_INET6, mc_addr_str, mc_addr.sin6_addr.s6_addr); Doesn't work? */
  +//  mc_addr.sin6_addr = in6addr_any;
  +    inet_pton(AF_INET6, mc_addr_str, mc_addr.sin6_addr.s6_addr);
       mc_addr.sin6_port = htons(mc_port);

       /* bind to multicast address to socket */
In order to share the exchange with Red Hat support, I'm posting a possible answer to this problem. Since kernel 2.6.31 (and therefore since RHEL6), it is possible to configure this behavior using the IP_MULTICAST_ALL option (see http://man7.org/linux/man-pages/man7/ip.7.html). Adding this setting in mcreceive.c produces the expected result:

  #ifdef IP_MULTICAST_ALL
      int mc_all = 0;
      if ((setsockopt(sock, IPPROTO_IP, IP_MULTICAST_ALL,
                      (void*) &mc_all, sizeof(mc_all))) < 0) {
          perror("setsockopt() failed");
          exit(1);
      }
  #endif
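[Editorial note: per the ip(7) man page, IP_MULTICAST_ALL defaults to enabled, so existing binaries keep the historical behavior; clearing it per socket, as above, restricts delivery to the groups that particular socket has explicitly joined.]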
Created attachment 807077 [details] Modified test program that sends/receives multicast, resetting the IP_MULTICAST_ALL option

Needs kernel 2.6.31 or later (the commit that introduced this option is https://github.com/torvalds/linux/commit/f771bef98004d9d141b085d987a77d06669d4f4f).