Red Hat Bugzilla – Bug 145551
Use of bonding driver in mode 5 can cause multicast packet loss
Last modified: 2007-11-30 17:07:06 EST
Description of problem:
Multicast packet loss was found when using multicast heartbeating in
clumanager_1.2.16 from the Red Hat Cluster Suite in the following
"no single point of failure" environment:
2 Dell 2650 nodes, each with 2 GigE NICs connected into
a dual Cisco Catalyst 3750 Series switch environment,
and 1 NIC from each node connected into a SAN.
By default, the Cisco 3750 switch uses IGMP snooping, which causes
multicast packets to be delivered only to hosts that have dynamically
joined a multicast group by sending an IGMP packet. The switch
uses a timeout to allow members to quietly leave the group.
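The dynamic join the switch snoops for is visible from user space: joining a group with IP_ADD_MEMBERSHIP is what causes the kernel to emit the IGMP membership report. A minimal Python sketch (group address and port are arbitrary choices for illustration):

```python
import socket
import struct

def join_multicast_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join a multicast group.

    The IP_ADD_MEMBERSHIP setsockopt triggers the kernel to send an
    IGMP membership report -- exactly the packet an IGMP-snooping
    switch watches for when deciding which ports get multicast."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # group address + local interface (INADDR_ANY lets the kernel pick)
    mreq = struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock
```

With bonding in the picture, the report above may leave on a different slave than the one configured to receive, which is the crux of this bug.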
With bonding mode 5, there is only 1 interface configured to receive
packets. If both Ethernet interfaces are up, the load balancing
capability in mode 5 can move the sending of IGMP packets to the 2nd
interface. If the switch ends up timing out the interface configured
to receive packets, multicast packets are then delivered only
to the 2nd interface and are dropped, since that interface is not
set up to receive them.
Disabling IGMP snooping on the Cisco switches works around the
problem as it forces the switch to always send multicast packets
to all interfaces on the switch. This workaround isn't desirable
for performance reasons though, and some switches may not offer
the ability to disable IGMP snooping.
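For reference, disabling IGMP snooping on a Catalyst switch looks roughly like the following (IOS command syntax varies by platform and release; treat this as a sketch, and keep the performance caveat above in mind):

```
Switch# configure terminal
! disable IGMP snooping globally...
Switch(config)# no ip igmp snooping
! ...or only on the VLAN carrying the cluster traffic
Switch(config)# no ip igmp snooping vlan 10
```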
Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux WS release 3 (Taroon Update 2)
Red Hat Enterprise Linux AS release 3.90 (Nahant)
This problem can be recreated by sending multicast packets
between two hosts with enough traffic on the interfaces to
force the bonding driver to load balance between them.
Steps to Reproduce:
1. Connect hosts into a dual switch environment for no single
point of failure, switch must have IGMP snooping enabled
2. Send traffic between hosts to create a load so that
bonding driver load balances between 2 interfaces
3. Send multicast packets between 2 hosts
Actual results:
If the switch has timed out the interface configured by
the bonding driver to receive packets, multicast packets are
delivered only to an interface that cannot receive them, and dropped.
Expected results:
No multicast packet loss
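The reproduction can be scripted: a sender numbers its datagrams and a receiver joined to the same group counts what arrives, so any gap shows up as loss. A user-space Python sketch (group, port, and packet count are arbitrary; on a healthy path, or looped back on a single host, it should report little or no loss):

```python
import socket
import struct

GROUP, PORT = "224.0.51.51", 15151  # arbitrary test group/port

def make_receiver() -> socket.socket:
    """Join the test group so the switch forwards it to us."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind(("", PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                       socket.inet_aton("0.0.0.0"))
    rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    rx.settimeout(2.0)
    return rx

def probe(n: int = 100) -> int:
    """Send n numbered datagrams to the group; return how many arrived."""
    rx = make_receiver()
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # loop sent packets back so a single host can run both ends
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    for i in range(n):
        tx.sendto(struct.pack("!I", i), (GROUP, PORT))
    got = set()
    try:
        while len(got) < n:
            data, _ = rx.recvfrom(4)
            got.add(struct.unpack("!I", data)[0])
    except socket.timeout:
        pass  # anything still missing was lost
    rx.close()
    tx.close()
    return len(got)
```

Run the receiver end on one node and the sender on the other while step 2's background load keeps the bonding driver balancing; a shrinking return value from probe() corresponds to the loss described above.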
Created attachment 109979 [details]
I don't really see any way to "fix" this while maintaining the design
of mode 5 of the bonding module. By definition:
balance-tlb or 5
Adaptive transmit load balancing: channel bonding that does
not require any special switch support. The outgoing
traffic is distributed according to the current load
(computed relative to the speed) on each slave. Incoming
traffic is received by the current slave. If the receiving
slave fails, another slave takes over the MAC address of
the failed receiving slave.
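For completeness, the mode is selected when the module loads; on a RHEL 3 (2.4 kernel) system that is typically a modules.conf entry along these lines (device name and miimon value are illustrative):

```
# /etc/modules.conf -- illustrative
alias bond0 bonding
# mode=5 is balance-tlb; miimon enables link monitoring
options bond0 mode=5 miimon=100
```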
Have you tried mode 6 (balance-alb)? I think this has a better chance
of working for you, although I'll have to do more research to know for
sure that transmits and receives will be balanced to the same link for
a given partner host.
Give it a shot and let me know the results? I don't have a switch w/
the IGMP snooping behaviour at my disposal...
Thanks for the help so far and the quick response!
I checked, and I have the same problem with mode 6.
Multicast packets can be lost if the switch has IGMP snooping
enabled. (With mode 6 in Red Hat EL3, the multicast mode is
still set to "active slave only.")
I realize that there may be no easy "fix" for this problem. I
guess I think of it as a problem with the design of mode 5 and
support of multicast. I'm not sure if you want to start identifying
packets to ensure that IGMP packets are sent down the same interface
that is configured to receive multicast packets. If you want to
say that multicast isn't supported with mode 5, or that switch
modification is possibly required, that's fine. We just wanted
to make you aware of the problem, and wanted to be advised as
to how we should be using mode 5 of the bonding driver with
multicast.
Mode 5 is documented as:
"Adaptive transmit load balancing: channel bonding that does
not require any special switch support."
This is a bit misleading, since many switches do support
IGMP snooping and have it enabled by default. In order to
use multicast with bonding mode 5, we need to modify our switch
configuration.
I have to agree with you that the design of mode 5 (and 6) seems a
little fragile. I'll try to do some more research to see if the
maintainers have considered and/or dealt with this type of situation.
A patch specific to IGMP may even be appropriate.
It seems to me that mode 1 or (w/ appropriate switch support) mode 3
would fulfill your needs. If not, please elaborate?
I would really appreciate any further research you could do to see if
the maintainers know about, and possibly have plans to address this
situation. Thanks so much for your help so far!
We are hoping to be able to support a "no single point of failure"
system using the Ethernet bonding driver with no switch support, and
get the added benefit of increased aggregate bandwidth when possible.
From the documentation, it looked like mode 5 would provide what we
are looking for. Resilience has a higher priority than performance
for us, but performance is important too, so I think we'll continue to
use mode 5 with IGMP snooping disabled at the switch and hope that we
can turn IGMP snooping on again at some point. Falling back to mode 1
is also an option if we find that disabling IGMP snooping is causing
too many performance problems.
I put together a patch which forces IGMP transmits to go out the
primary bonding port. I have built a test kernel w/ that patch and
made it available here:
(That page is new -- feel free to give feedback!)
Please give that a try and let me know the results. Thanks!
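The idea behind the patch can be illustrated in user space: classify each outgoing frame, and pin anything carrying IGMP (IP protocol 2) to the primary slave, so the snooping switch learns group membership on the port that actually receives multicast. A Python sketch (function and port names are made up for illustration; the real patch does the equivalent inside the bonding driver's transmit path):

```python
import struct

ETH_P_IP = 0x0800   # ethertype for IPv4
IPPROTO_IGMP = 2    # IP protocol number for IGMP

def is_igmp_frame(frame: bytes) -> bool:
    """Return True if an Ethernet frame carries an IGMP packet.

    Checks the ethertype at offset 12 and the IP protocol byte at
    offset 9 of the IP header (which starts at offset 14)."""
    if len(frame) < 14 + 20:
        return False
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != ETH_P_IP:
        return False
    return frame[14 + 9] == IPPROTO_IGMP

def choose_slave(frame: bytes, primary: str, balanced: str) -> str:
    """Toy transmit-slave selection: IGMP always goes out the primary
    port; everything else follows the normal tlb balancing decision
    (stubbed here as a single pre-chosen slave)."""
    return primary if is_igmp_frame(frame) else balanced
```

Because the IGMP reports now always leave on the receiving slave, the switch's snooping table keeps pointing multicast at the port that can actually accept it.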
Heidi, any word on the results with this patch?
Patch posted upstream on 3/15...
A fix for this problem has just been committed to the RHEL3 U6
patch pool this evening (in kernel version 2.4.21-32.2.EL).
(In reply to comment #12)
> A fix for this problem has just been committed to the RHEL3 U6
> patch pool this evening (in kernel version 2.4.21-32.2.EL).
Is there a standalone patch, apart from the newer kernel revision,
that can fix this issue? We are seeing similar issues where multicast
packets are being dropped by the interfaces while bonded.
Simple network statistics do not show the interfaces dropping traffic,
but the application and messaging daemons report inbound packet loss.
Created attachment 115194 [details]
This is the patch in question...
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.