Description of problem:
Bonded network interface using 802.3ad does not balance outbound traffic to the local network.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-55.0.2.EL

How reproducible:

Steps to Reproduce:
1. Create /etc/modprobe.conf as follows:

alias eth0 tg3
alias eth1 tg3
alias scsi_hostadapter cciss
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000
alias usb-controller ohci-hcd
alias bond0 bonding
options bond0 mode=4 miimon=100

2. Create /etc/sysconfig/network-scripts/ifcfg-eth0 as follows:

DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ETHTOOL_OPTS="speed 1000 duplex full autoneg off"

3. Create /etc/sysconfig/network-scripts/ifcfg-eth1 as follows:

DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ETHTOOL_OPTS="speed 1000 duplex full autoneg off"

4. Create /etc/sysconfig/network-scripts/ifcfg-bond0 as follows:

DEVICE=bond0
ONBOOT=YES
STARTMODE=onboot
BOOTPROTO=static
IPADDR=172.17.145.27
NETMASK=255.255.255.0
GATEWAY=172.17.145.1
BONDING_MASTER="yes"
BONDING_SLAVE0="eth0"
BONDING_SLAVE1="eth1"

Actual results:
Once put into service, the network devices receive traffic in a "balanced way", but all transmitted traffic seems to go out the eth0 interface.

Expected results:
After reading /usr/share/doc/kernel-doc-2.6.9/Documentation/networking/bonding.txt, paying particular attention to the 802.3ad section (numbered 13.1), we verified that the machine was talking to hosts on the local segment and not through a gateway. Based on the documentation, we expected outbound traffic to be balanced instead of all going out through eth0.

Additional info:
This is interesting. Can you post the contents of /proc/net/bonding/bond0?
nymsgs21 # cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v2.6.3-rh (June 8, 2005)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
        Aggregator ID: 2
        Number of ports: 2
        Actor Key: 17
        Partner Key: 3
        Partner Mac Address: 00:19:07:a0:8c:00

Slave Interface: eth0
MII Status: up
Link Failure Count: 3
Permanent HW addr: 00:17:a4:a7:95:bc
Aggregator ID: 2

Slave Interface: eth1
MII Status: up
Link Failure Count: 4
Permanent HW addr: 00:17:a4:a7:95:bb
Aggregator ID: 2
Well, that looks fine. One thing you can try is the 'xmit_hash_policy' module parameter. RHEL4.5 should support options of '0' or 'layer2' for MAC-based hashing and '1' or 'layer3+4' for hashing based on host and TCP/UDP port info.
We are concerned that this will make the setup non-compliant with the 802.3ad spec (according to the docs). The applications are sensitive to packet retransmissions, which is what this is likely to cause. Notably, the switches on this network are Cisco and are set up to support 802.3ad.
How is choosing a different transmit hashing algorithm going to make it non-standards compliant? Please attach or provide a link to the document that claims anything but MAC-based hashing isn't going to comply fully with the standard. Many switch manufacturers use hashing based on L2/L3/L4 info, and I believe some of them even allow the type of hashing to be configured. This form of hashing should not cause retransmissions due to out-of-order packets because each session (TCP/UDP) will always hash to the same outgoing interface. The advantage of layer3+4 hashing is that if the first TCP session to a given host chooses eth0, the next TCP session to the exact same host has the chance to pick eth1 for its traffic.
An example of Cisco allowing you to choose the hashing algorithm:
http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/channel.html#wp1052523
Per the doc, /usr/share/doc/kernel-doc-2.6.9/Documentation/networking/bonding.txt:

IEEE 802.3ad Dynamic link aggregation... Slave selection for outgoing traffic is done according to the transmit hash policy... Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing peer implementations will have varying tolerances for noncompliance.

...

layer3+4 ... This algorithm is not fully 802.3ad compliant. A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets striped across two interfaces. This may result in out of order delivery. Most traffic types will not meet this criteria, as TCP rarely fragments traffic, and most UDP traffic is not involved in extended conversations. Other implementations of 802.3ad may or may not tolerate this noncompliance.
I think the statements referenced in comment #7 are overblown, but if you are concerned we can address why layer2 hashing doesn't work. So you say that all the traffic always flows on eth0. Do you have locally administered MAC addresses on this network? I ask because the layer2 hashing is pretty simple:

static int bond_xmit_hash_policy_l2(struct sk_buff *skb,
                                    struct net_device *bond_dev, int count)
{
        struct ethhdr *data = (struct ethhdr *)skb->data;

        return (data->h_dest[5] ^ bond_dev->dev_addr[5]) % count;
}

So if you've got preset MACs on that network where the last byte is the same on all of them, they will always XOR to the same value and every frame will go out the same slave (count is simply the number of members in the bond).
Okay... upon further review, it looks like almost all of the traffic is coming from the switched gateway. If that's the case, the MAC issue would definitely come into play. If so, can we almost certainly mark this "Not A BUG"?
Sounds good. Feel free to re-open if you have any more problems.