586557 – Bonding with LACP does not work

Summary: Bonding with LACP does not work

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel-xen
Sub Component:
Version:	5.4
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Xen Maintainance List
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-04-27 20:44 UTC by Simon Gao
Modified:	2010-05-17 17:10 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-05-17 17:10:04 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Description Simon Gao 2010-04-27 20:44:03 UTC

Description of problem:

Network traffic only passes through one physical NIC instead of being
distributed across both interfaces.


Version-Release number of selected component (if applicable):

RHEL 5.4 kernel 2.6.18-164.15.1xen

How reproducible:

Steps to Reproduce:
1. Configure bond0, eth0 and eth1 as follows:

/etc/sysconfig/network-scripts/ifcfg-bond0:
# Bonding interface
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.15.34
NETMASK=255.255.255.0
USERCTL=no

/etc/sysconfig/network-scripts/ifcfg-eth0:
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
HWADDR=xx:xx:xx:xx:xx:xx
MASTER=bond0
SLAVE=yes
ETHTOOL_OPTS="autoneg off speed 1000 duplex full"

/etc/sysconfig/network-scripts/ifcfg-eth1:
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
HWADDR=xx:xx:xx:xx:xx:xx
MASTER=bond0
SLAVE=yes
ETHTOOL_OPTS="autoneg off speed 1000 duplex full"

Add following to /etc/modprobe.conf:

alias bond0 bonding
options bond0 miimon=80 mode=4

$ cat /proc/net/bonding/pbond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 80
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 9
        Partner Key: 3
        Partner Mac Address: 00:13:80:xx:xx:xx

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: xx:xx:xx:xx:xx:xx
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: xx:xx:xx:xx:xx:xx
Aggregator ID: 1

2. Copy files to the machine from multiple hosts


3. Check amount of network traffic on eth0 and eth1
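
One way to carry out step 3 is to compare the per-interface byte counters in /proc/net/dev before and after the copy. The following is only a minimal sketch: the function names are hypothetical, and it assumes the usual /proc/net/dev layout in which each interface line carries 16 counters, with received bytes first and transmitted bytes ninth.

```python
def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not a counter line in the assumed layout
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def read_counters(path="/proc/net/dev"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_proc_net_dev(handle.read())


def byte_deltas(before, after, interfaces):
    """Return {interface: (rx_delta, tx_delta)} between two snapshots."""
    return {
        name: (after[name][0] - before[name][0],
               after[name][1] - before[name][1])
        for name in interfaces
    }
```

Taking one read_counters() snapshot before the copy and one after, then calling byte_deltas(before, after, ["eth0", "eth1"]), shows how many bytes each slave received and transmitted during the test.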
  

Actual results:

All of the file-copy traffic is carried over eth0. Eth1 does not carry any of it.

Expected results:

Roughly equal traffic should be seen on eth0 and eth1.

Additional info:

The switch is a Cisco 4500 with an LACP-enabled port channel for the eth0 and eth1 links.

Comment 1 Andrew Jones 2010-04-28 09:00:15 UTC

I'm guessing there's no guarantee that both NICs will be involved in a single transfer. A better test would be to generate enough network traffic that the bandwidth of both is needed, and then check that both are handling the traffic. But I'll pass this over to kernel for the bonding guys to comment.

Comment 3 Andy Gospodarek 2010-05-17 17:10:04 UTC

There is no guarantee that traffic will flow on both bonding interfaces. A hash based on the contents of the frame is computed and used to determine output port selection. The default hash policy only examines Layer 2 data (source and destination MAC addresses) in the frame. If all traffic is destined for a single host, or for multiple hosts on a separate network reached via a router, only one link will be used with Layer 2 hashing. For that reason, I would suggest using Layer2+3 or Layer3+4 hashing instead, depending on your traffic pattern.

I would suggest adding the option:

"xmit_hash_policy=layer2+3"

to your bonding options.

More details about these options can be found in the bonding documentation about the xmit_hash_policy option:

xmit_hash_policy

Selects the transmit hash policy to use for slave selection in
balance-xor and 802.3ad modes. Possible values are:

layer2

Uses XOR of hardware MAC addresses to generate the
hash. The formula is

(source MAC XOR destination MAC) modulo slave count

This algorithm will place all traffic to a particular
network peer on the same slave.

This algorithm is 802.3ad compliant.
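
As a rough illustration of the layer2 formula above, the sketch below applies it literally to whole MAC addresses. The function names and MAC addresses are hypothetical, and the real driver may only use part of each address, so this is a model of the documented formula rather than of the kernel code.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def layer2_hash(src_mac, dst_mac, slave_count):
    """(source MAC XOR destination MAC) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


# Hypothetical addresses: one local bond and one gateway.
bond_mac = "02:00:00:00:00:01"
gateway_mac = "02:00:00:00:00:fe"

# Every frame sent via the gateway has the same MAC pair, so every
# remote destination maps to the same slave index.
slave = layer2_hash(bond_mac, gateway_mac, 2)
```

With these inputs every routed frame lands on the same slave, which matches the behaviour described in this bug.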

layer2+3

This policy uses a combination of layer2 and layer3
protocol information to generate the hash.

Uses XOR of hardware MAC addresses and IP addresses to
generate the hash. The formula is

(((source IP XOR dest IP) AND 0xffff) XOR
( source MAC XOR destination MAC ))
modulo slave count

This algorithm will place all traffic to a particular
network peer on the same slave. For non-IP traffic,
the formula is the same as for the layer2 transmit
hash policy.

This policy is intended to provide a more balanced
distribution of traffic than layer2 alone, especially
in environments where a layer3 gateway device is
required to reach most destinations.

This algorithm is 802.3ad compliant.
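
The layer2+3 formula above can be sketched the same way. This is a literal reading of the documented formula, not the driver source; the function names, MAC addresses and destination IP addresses are hypothetical, and only the source IP is taken from the configuration in this report.

```python
import ipaddress


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def layer2_3_hash(src_mac, dst_mac, src_ip, dst_ip, slave_count):
    """(((src IP XOR dst IP) AND 0xffff) XOR (src MAC XOR dst MAC))
    modulo slave count, for IPv4 traffic."""
    ip_part = (int(ipaddress.IPv4Address(src_ip))
               ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    mac_part = mac_to_int(src_mac) ^ mac_to_int(dst_mac)
    return (ip_part ^ mac_part) % slave_count


# Hypothetical setup: two remote hosts reached through one gateway MAC.
bond_mac = "02:00:00:00:00:01"
gateway_mac = "02:00:00:00:00:fe"
first = layer2_3_hash(bond_mac, gateway_mac, "192.168.15.34", "10.0.0.1", 2)
second = layer2_3_hash(bond_mac, gateway_mac, "192.168.15.34", "10.0.0.2", 2)
```

Here the two destinations share a gateway MAC but differ in IP address, so they map to different slave indexes, which layer2 hashing alone cannot do.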

layer3+4

This policy uses upper layer protocol information,
when available, to generate the hash. This allows for
traffic to a particular network peer to span multiple
slaves, although a single connection will not span
multiple slaves.

The formula for unfragmented TCP and UDP packets is

((source port XOR dest port) XOR
((source IP XOR dest IP) AND 0xffff))
modulo slave count

For fragmented TCP or UDP packets and all other IP
protocol traffic, the source and destination port
information is omitted. For non-IP traffic, the
formula is the same as for the layer2 transmit hash
policy.

This policy is intended to mimic the behavior of
certain switches, notably Cisco switches with PFC2 as
well as some Foundry and IBM products.

This algorithm is not fully 802.3ad compliant. A
single TCP or UDP conversation containing both
fragmented and unfragmented packets will see packets
striped across two interfaces. This may result in out
of order delivery. Most traffic types will not meet
this criteria, as TCP rarely fragments traffic, and
most UDP traffic is not involved in extended
conversations. Other implementations of 802.3ad may
or may not tolerate this noncompliance.
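
For the layer3+4 policy, a minimal sketch of the documented formula for unfragmented TCP and UDP packets might look like the following. The function name, peer address and port numbers are hypothetical, and other bonding versions may compute the hash differently.

```python
import ipaddress


def layer3_4_hash(src_ip, dst_ip, src_port, dst_port, slave_count):
    """((src port XOR dst port) XOR ((src IP XOR dst IP) AND 0xffff))
    modulo slave count, for unfragmented IPv4 TCP/UDP packets."""
    ip_part = (int(ipaddress.IPv4Address(src_ip))
               ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_part) % slave_count


# Two connections between the same pair of hosts, differing only in
# the (hypothetical) ephemeral source port.
conn_a = layer3_4_hash("192.168.15.34", "192.168.15.10", 40000, 22, 2)
conn_b = layer3_4_hash("192.168.15.34", "192.168.15.10", 40001, 22, 2)
```

The two connections go to the same peer but hash to different slave indexes, while each individual connection always stays on one slave.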

The default value is layer2. This option was added in bonding
version 2.6.3. In earlier versions of bonding, this parameter
does not exist, and the layer2 policy is the only policy. The
layer2+3 value was added for bonding version 3.2.2.

I would also suggest adding these options to ifcfg-bond0:

BONDING_OPTS="miimon=80 mode=4 xmit_hash_policy=layer2+3"

and removing this line and only this line from modprobe.conf:

options bond0 miimon=80 mode=4
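
Once the change is in place, one way to confirm that the new policy took effect is to read the bond's status file under /proc/net/bonding and check the "Transmit Hash Policy" line. The sketch below uses a hypothetical helper name and assumes the status text has the same "Key: value" layout as the output shown in the description.

```python
def bonding_summary(text):
    """Extract mode, hash policy and slave names from bonding status text."""
    summary = {"mode": None, "xmit_hash_policy": None, "slaves": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines and section headings without a colon
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            summary["mode"] = value
        elif key == "Transmit Hash Policy" and value:
            # e.g. "layer2 (0)": keep only the policy name
            summary["xmit_hash_policy"] = value.split()[0]
        elif key == "Slave Interface":
            summary["slaves"].append(value)
    return summary
```

Passing the contents of the status file to bonding_summary() and comparing the "xmit_hash_policy" entry with the requested value gives a quick check that the option was picked up.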

Each time someone has asked about this issue, it turns out to be a configuration issue, so I am going to close this bug. Feel free to re-open it if you are still having problems after switching to layer2+3 or layer3+4 and provide more details about the traffic being transmitted by the bond.
