Bug 586557 - Bonding with LACP does not work
Summary: Bonding with LACP does not work
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Xen Maintainance List
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-04-27 20:44 UTC by Simon Gao
Modified: 2010-05-17 17:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-05-17 17:10:04 UTC
Target Upstream Version:
Embargoed:



Description Simon Gao 2010-04-27 20:44:03 UTC
Description of problem:

Network traffic only passes through one physical NIC instead of being
distributed across both interfaces.


Version-Release number of selected component (if applicable):

RHEL 5.4 kernel 2.6.18-164.15.1xen

How reproducible:

Steps to Reproduce:
1. Configure bond0, eth0 and eth1 as follows:

/etc/sysconfig/network-scripts/ifcfg-bond0:
# Bonding interface
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.15.34
NETMASK=255.255.255.0
USERCTL=no

/etc/sysconfig/network-scripts/ifcfg-eth0:
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
HWADDR=xx:xx:xx:xx:xx:xx
MASTER=bond0
SLAVE=yes
ETHTOOL_OPTS="autoneg off speed 1000 duplex full"

/etc/sysconfig/network-scripts/ifcfg-eth1:
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
HWADDR=xx:xx:xx:xx:xx:xx
MASTER=bond0
SLAVE=yes
ETHTOOL_OPTS="autoneg off speed 1000 duplex full"

Add the following to /etc/modprobe.conf:

alias bond0 bonding
options bond0 miimon=80 mode=4

$ cat /proc/net/bonding/pbond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 80
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 9
        Partner Key: 3
        Partner Mac Address: 00:13:80:xx:xx:xx

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: xx:xx:xx:xx:xx:xx
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: xx:xx:xx:xx:xx:xx
Aggregator ID: 1

2. Copy files to the machine from multiple hosts


3. Check amount of network traffic on eth0 and eth1
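
One possible way to carry out step 3 is to take two snapshots of /proc/net/dev, one before and one after the copy, and compare the per-interface byte counters. The sketch below assumes the usual /proc/net/dev layout (receive bytes in the first column after the interface name, transmit bytes in the ninth). The function names and the sample snapshots are made up for illustration and are not measurements from this report.

```python
def parse_proc_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # the two header lines contain no colon
        fields = rest.split()
        if len(fields) < 16:
            continue  # not a complete counter line
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def traffic_delta(before, after, interfaces):
    """Return bytes received and sent on each interface between two snapshots."""
    return {
        ifname: (after[ifname][0] - before[ifname][0],
                 after[ifname][1] - before[ifname][1])
        for ifname in interfaces
    }


# Hypothetical snapshots; on a real system read /proc/net/dev twice instead.
BEFORE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0
  eth1: 500 5 0 0 0 0 0 0 700 7 0 0 0 0 0 0
"""
AFTER = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 901000 910 0 0 0 0 0 0 2500 25 0 0 0 0 0 0
  eth1: 500 5 0 0 0 0 0 0 700 7 0 0 0 0 0 0
"""

delta = traffic_delta(parse_proc_net_dev(BEFORE), parse_proc_net_dev(AFTER),
                      ["eth0", "eth1"])
for ifname, (rx, tx) in sorted(delta.items()):
    print(f"{ifname}: rx +{rx} bytes, tx +{tx} bytes")
```

Receive and transmit counters are worth reading separately, since the bond's hash policy only governs the frames this host sends.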
  

Actual results:

The file-copy traffic is all carried over eth0; eth1 does not get any of the traffic.

Expected results:

Should see roughly equal traffic on eth0 and eth1.

Additional info:

The switch is a Cisco 4500 with an LACP-enabled port channel for the eth0 and eth1 links.

Comment 1 Andrew Jones 2010-04-28 09:00:15 UTC
I'm guessing there's no guarantee that both NICs will be involved in a single transfer. A better test would be to create enough network traffic that you need the bandwidth from both, and then check that both are handling the traffic. But I'll pass this over to kernel for the bonding guys to comment.

Comment 3 Andy Gospodarek 2010-05-17 17:10:04 UTC
There is no guarantee that traffic will flow on both bonding interfaces.  A hash based on the contents of the frame is computed and used to determine output port selection.  The default hash policy only examines Layer 2 (source and destination MAC addresses) data in the frame.  If all traffic is destined for a single host, or for multiple hosts on a separate network reached via a router, only one link will be used with Layer 2 hashing.  For that reason, I would suggest using Layer2+3 or Layer3+4 hashing instead, depending on your traffic pattern.

I would suggest adding the option:

"xmit_hash_policy=layer2+3"

to your bonding options.

More details about these options can be found in the bonding documentation about the xmit_hash_policy option:

xmit_hash_policy

        Selects the transmit hash policy to use for slave selection in
        balance-xor and 802.3ad modes.  Possible values are:

        layer2

                Uses XOR of hardware MAC addresses to generate the
                hash.  The formula is

                (source MAC XOR destination MAC) modulo slave count

                This algorithm will place all traffic to a particular
                network peer on the same slave.

                This algorithm is 802.3ad compliant.

        layer2+3

                This policy uses a combination of layer2 and layer3
                protocol information to generate the hash.

                Uses XOR of hardware MAC addresses and IP addresses to
                generate the hash.  The formula is

                (((source IP XOR dest IP) AND 0xffff) XOR
                        ( source MAC XOR destination MAC ))
                                modulo slave count

                This algorithm will place all traffic to a particular
                network peer on the same slave.  For non-IP traffic,
                the formula is the same as for the layer2 transmit
                hash policy.

                This policy is intended to provide a more balanced
                distribution of traffic than layer2 alone, especially
                in environments where a layer3 gateway device is
                required to reach most destinations.

                This algorithm is 802.3ad compliant.

        layer3+4

                This policy uses upper layer protocol information,
                when available, to generate the hash.  This allows for
                traffic to a particular network peer to span multiple
                slaves, although a single connection will not span
                multiple slaves.

                The formula for unfragmented TCP and UDP packets is

                ((source port XOR dest port) XOR
                         ((source IP XOR dest IP) AND 0xffff))
                                modulo slave count

                For fragmented TCP or UDP packets and all other IP
                protocol traffic, the source and destination port
                information is omitted.  For non-IP traffic, the
                formula is the same as for the layer2 transmit hash
                policy.

                This policy is intended to mimic the behavior of
                certain switches, notably Cisco switches with PFC2 as
                well as some Foundry and IBM products.

                This algorithm is not fully 802.3ad compliant.  A
                single TCP or UDP conversation containing both
                fragmented and unfragmented packets will see packets
                striped across two interfaces.  This may result in out
                of order delivery.  Most traffic types will not meet
                this criteria, as TCP rarely fragments traffic, and
                most UDP traffic is not involved in extended
                conversations.  Other implementations of 802.3ad may
                or may not tolerate this noncompliance.

        The default value is layer2.  This option was added in bonding
        version 2.6.3.  In earlier versions of bonding, this parameter
        does not exist, and the layer2 policy is the only policy.  The
        layer2+3 value was added for bonding version 3.2.2.
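
As a rough illustration of the formulas quoted above, the following sketch applies them literally to a host that reaches several remote peers through one router. It is not the kernel's implementation, which may use only part of each address. The MAC addresses, remote IP addresses and port numbers are hypothetical; 192.168.15.34 is the bond address from the report.

```python
import ipaddress


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(ip))


def layer2(src_mac, dst_mac, slaves):
    # (source MAC XOR destination MAC) modulo slave count
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slaves


def layer2_3(src_mac, dst_mac, src_ip, dst_ip, slaves):
    # (((source IP XOR dest IP) AND 0xffff) XOR (source MAC XOR dest MAC))
    #     modulo slave count
    ip_part = (ip_to_int(src_ip) ^ ip_to_int(dst_ip)) & 0xFFFF
    mac_part = mac_to_int(src_mac) ^ mac_to_int(dst_mac)
    return (ip_part ^ mac_part) % slaves


def layer3_4(src_ip, dst_ip, src_port, dst_port, slaves):
    # ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff))
    #     modulo slave count
    ip_part = (ip_to_int(src_ip) ^ ip_to_int(dst_ip)) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_part) % slaves


# Hypothetical addresses: every remote peer sits behind the same gateway,
# so every outgoing frame carries the gateway's MAC as its destination.
LOCAL_MAC = "02:00:00:00:00:01"
GATEWAY_MAC = "02:00:00:00:00:fe"
LOCAL_IP = "192.168.15.34"
PEERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
SLAVES = 2

l2 = [layer2(LOCAL_MAC, GATEWAY_MAC, SLAVES) for _ in PEERS]
l23 = [layer2_3(LOCAL_MAC, GATEWAY_MAC, LOCAL_IP, peer, SLAVES) for peer in PEERS]
# Two connections to the same peer from different (hypothetical) source ports.
l34 = [layer3_4(LOCAL_IP, PEERS[0], port, 22, SLAVES) for port in (40000, 40001)]

print("layer2   slave per peer:", l2)
print("layer2+3 slave per peer:", l23)
print("layer3+4 slave per connection:", l34)
```

With these made-up inputs, layer2 maps every peer to the same slave because only the gateway MAC is visible, whereas layer2+3 separates peers by IP address and layer3+4 can separate individual connections to one peer.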

I would also suggest adding these options to ifcfg-bond0:

BONDING_OPTS="miimon=80 mode=4 xmit_hash_policy=layer2+3"

and removing this line and only this line from modprobe.conf:

options bond0 miimon=80 mode=4
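
Once the bond has been restarted with the new options, the active policy should be visible in the bonding status file under /proc/net/bonding/. A small sketch for checking it follows; the function name is made up, and the sample text is an excerpt of the status output pasted in the description, so it still shows the old layer2 policy.

```python
def parse_bonding_status(text):
    """Extract mode, transmit hash policy and slave names from bonding status text."""
    info = {"mode": None, "policy": None, "slaves": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines and section titles without a colon
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Transmit Hash Policy":
            # The value looks like "layer2 (0)"; keep only the policy name.
            info["policy"] = value.split()[0]
        elif key == "Slave Interface":
            info["slaves"].append(value)
    return info


# Excerpt of the status output from the description (before the change).
SAMPLE = """\
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up

Slave Interface: eth0
MII Status: up
Permanent HW addr: xx:xx:xx:xx:xx:xx

Slave Interface: eth1
MII Status: up
Permanent HW addr: xx:xx:xx:xx:xx:xx
"""

status = parse_bonding_status(SAMPLE)
print(status["mode"], "/", status["policy"], "/", status["slaves"])
if status["policy"] == "layer2":
    print("still using the default layer2 policy")
```

To run it against a live system, pass the contents of the status file to parse_bonding_status in place of SAMPLE.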

Each time someone has asked about this issue, it turns out to be a configuration issue, so I am going to close this bug.  Feel free to re-open it if you are still having problems after switching to layer2+3 or layer3+4 and provide more details about the traffic being transmitted by the bond.

