Description of problem:
Whilst developing a multicast routing application, we found a bug in the 2.6.9 kernel in the ip_mc_msfilter code. This bug prevented setipv4sourcefilter from switching from source-specific to any-source mode. To get around this problem, which was blocking our code development, we upgraded to the 2.6.15 kernel, which has fixes for it.

Since the upgrade we have been experiencing network problems on this server. Under high network load, eth0 starts to drop all packets. It does this for a period of time and then recovers. Occasionally, but not always, restarting the network service or removing a VLAN causes the packets to flow again. A reboot always solves the problem. The problem does not occur when running the 2.6.9 kernel again.

Version-Release number of selected component (if applicable):
Dell PowerEdge 1850 server
Kernel 2.6.15

How reproducible:
Very

Steps to Reproduce:
1. Start receiving a network video stream with VLC
2. Cause a CPU load (make the VLC window larger)

Actual results:
eth0 drops all packets

Expected results:
Normal operation

Additional info:
Please let me know what debug information you will need, and how to capture it.
OS = Red Hat Enterprise Linux 4
uname -a = Linux BM3 2.6.15.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux
In the RHEL4 context, we would like to fix the routing bug that you encountered, so any more details about that would be helpful. A 2.6.15 bug should be filed either against Fedora or against the upstream kernel...
The error is in the function ip_mc_msfilter in the file linux/net/ipv4/igmp.c. When this function is passed an any-source filter (exclude mode with no sources), it returns an EADDRNOTAVAIL error. This function is used to set the filter for source addresses. To allow any source, we pass an empty filter set in exclude mode, which equates to excluding no addresses (and therefore allowing any). The function should not return the above error when an empty source address list is passed. The 2.6.15 kernel behaves correctly because of changes to that file. I could not isolate the exact changes made to the multicast implementation in order to provide you with just those.
It's most probably this changeset:

diff-tree 8713dbf05754aa777f31bf491cb60a111f7ad828 (from ec1890c5df451799dec969a
Author: Yan Zheng <yanzheng>
Date:   Fri Oct 28 08:02:08 2005 +0800

    [MCAST]: ip[6]_mc_add_src should be called when number of sources is zero

    And filter mode is exclude.

    Further explanation by David Stevens:

    Multicast source filters aren't widely used yet, and that's really the only
    feature that's affected if an application actually exercises this bug, as far
    as I can tell. An ordinary filter-less multicast join should still work, and
    only forwarded multicast traffic making use of filters and doing empty-source
    filters with the MSFILTER ioctl would be at risk of not getting multicast
    traffic forwarded to them because the reports generated would not be based on
    the correct counts.

    Signed-off-by: Yan Zheng <yanzheng
    Acked-by: David L Stevens <dlstevens.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme>

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 8b6d393..c6247fc 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1908,8 +1908,11 @@ int ip_mc_msfilter(struct sock *sk, stru
 			sock_kfree_s(sk, newpsl, IP_SFLSIZE(newpsl->sl_max));
 			goto done;
 		}
-	} else
+	} else {
 		newpsl = NULL;
+		(void) ip_mc_add_src(in_dev, &msf->imsf_multiaddr,
+			msf->imsf_fmode, 0, NULL, 0);
+	}
 	psl = pmc->sflist;
 	if (psl) {
 		(void) ip_mc_del_src(in_dev, &msf->imsf_multiaddr, pmc->sfmode,
The above all looks correct. Is there a chance we could receive an interim patch to fix this? This is blocking our customer acceptance testing, which is due to complete next week. Thanks in advance.
Please can someone advise how I can now progress this issue? My company is waiting on this issue to be resolved before we can gain customer acceptance.
The above patch did not work completely. The error flag needs to be reset too:

			sock_kfree_s(sk, newpsl, IP_SFLSIZE(newpsl->sl_max));
			goto done;
		}
	} else {
		newpsl = NULL;
		(void) ip_mc_add_src(in_dev, &msf->imsf_multiaddr,
			msf->imsf_fmode, 0, NULL, 0);
		err=0;
	}
	psl = pmc->sflist;
	if (psl) {
		(void) ip_mc_del_src(in_dev, &msf->imsf_multiaddr, pmc->sfmode,
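To show why the extra reset matters, here is a toy user-space model of the return-value flow. This is not the kernel code. It assumes that the function starts with err preset to -EADDRNOTAVAIL and only overwrites it on the non-empty source list path, which is what the behaviour reported here suggests. The names model_msfilter, model_add_src and enum fix_level are invented for the illustration.

```c
#include <errno.h>

/* Toy stand-in for ip_mc_add_src(); it always succeeds in this model. */
static int model_add_src(int numsrc)
{
    (void)numsrc;
    return 0;
}

/* Which of the changes discussed in this report the model applies. */
enum fix_level {
    FIX_NONE,            /* original behaviour as reported */
    FIX_ADD_SRC_ONLY,    /* changeset only: call add_src for empty lists */
    FIX_ADD_SRC_AND_ERR  /* changeset plus resetting err */
};

/*
 * Simplified model of how the return value is produced.
 * joined: non-zero if the socket has already joined the group.
 * numsrc: number of sources in the requested filter.
 */
int model_msfilter(int joined, int numsrc, enum fix_level fix)
{
    int err = -EADDRNOTAVAIL;  /* assumption: pessimistic default */

    if (!joined)
        goto done;             /* no prior join: the default error applies */

    if (numsrc > 0) {
        err = model_add_src(numsrc);  /* this path overwrites err */
        if (err)
            goto done;
    } else {
        if (fix != FIX_NONE)
            (void)model_add_src(0);   /* result discarded, err untouched */
        if (fix == FIX_ADD_SRC_AND_ERR)
            err = 0;                  /* clear the stale default */
    }
    /* ... swapping in the new source list is omitted ... */
done:
    return err;
}
```

In this model, model_msfilter(1, 0, FIX_ADD_SRC_ONLY) still returns -EADDRNOTAVAIL even though the filter was updated, while model_msfilter(1, 0, FIX_ADD_SRC_AND_ERR) returns 0. That matches the observation that the changeset alone was not enough for the empty source list case.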
The above changes to igmp.c have fixed the problem. Please can someone issue me with a compiled kernel with this patch in place, so that I can upgrade the servers with an official release? The servers need to be running an official Red Hat released kernel for our customer acceptance testing. Thank you.
Created attachment 127661 Updated igmp.c
Created attachment 127662 A simple program to test the problem/fix
RHEL4 has entered the Extended Life Phase. There will be no more minor releases. I'm closing this bug due to inactivity. Please reopen and provide an explanation if you need this issue to be addressed in RHEL4. Please note that only security and critical bugfixes are considered at this point.