Bug 619070

Summary: 802.3ad link aggregation won't work with newer (2.6.18-194.8.1.el5) kernel and ixgbe driver
Product: Red Hat Enterprise Linux 5 Reporter: Doug Wandell <doug>
Component: kernel Assignee: Andy Gospodarek <agospoda>
Status: CLOSED ERRATA QA Contact: Network QE <network-qe>
Severity: high Docs Contact:
Priority: urgent    
Version: 5.5 CC: alexander.h.duyck, andriusb, arozansk, bandan.das, bluca, cevich, chas.horvath, cward, cww, dan.duval, dhoward, greg.procunier, hjia, jesse.brandeburg, john.ronciak, jonathansturges, jparadis, jpirko, peterm, robert.evans, syeghiay
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, 802.3ad link aggregation did not work properly with the ixgbe driver because 802.3ad-based bonds could not be formed. With this update, the issue has been fixed.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-01-13 21:45:34 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 644822    
Attachments:
Description | Flags
Dmesg tail after issuing 'ifup bond0' | none
New kernel dmesg tail after issuing 'ifup bond0' | none
possible missing fix from ixgbe | none
bonding-correct-LACPDUs-that-are-in-non-linear-skbs.patch | none

Description Doug Wandell 2010-07-28 14:01:41 UTC
Description of problem:

802.3ad (link aggregation) breaks with 2.6.18-194.8.1.el5 when using the ixgbe driver.

Version-Release number of selected component (if applicable):

Name        : kernel                       Relocations: (not relocatable)
Version     : 2.6.18                            Vendor: Red Hat, Inc.
Release     : 194.8.1.el5                   Build Date: Wed 23 Jun 2010 

How reproducible:

Consistently reproduced. 

Steps to Reproduce:

1. Running RHEL5.4, create a bonded interface using two 10G ports on an Intel 82599EB NIC.
2. Connect to twin switches configured for MLAG/LACP.
3. Confirm the MLAG is established; ping/mount from other devices on the LAN.
4. Update the system to RHEL5.5, boot into 2.6.18-194.8.1.el5, and try to bring up bond0.
  
Actual results:

The dynamic LAG is never established with the new kernel.

Expected results:

The bonded interface should continue to work.

Additional info:

Switches are Arista 7000-series in a multi-chassis LAG configuration.  Server is a Dell PowerEdge R610 with Intel 10G NICs:

04:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)

/proc/net/bonding/bond0 with 2.6.18-164:

Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 150
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 2
	Actor Key: 33
	Partner Key: 25
	Partner Mac Address: 00:1c:73:08:17:1b

Slave Interface: eth8
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:54:f7:c0
Aggregator ID: 1

Slave Interface: eth9
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:54:f7:c1
Aggregator ID: 1


/proc/net/bonding/bond0 with 2.6.18-194.8.1:

Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 150
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 1
	Actor Key: 33
	Partner Key: 1
	Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth8
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:54:f7:c0
Aggregator ID: 1

Slave Interface: eth9
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:54:f7:c1
Aggregator ID: 2
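
The difference between the two listings can be checked mechanically. The following is a minimal Python sketch, not part of the original report, that parses text in the /proc/net/bonding/bond0 format shown above and reports whether every slave joined the active aggregator and a real partner MAC was learned. The function names summarize_bond and lag_is_healthy are hypothetical helpers introduced for illustration.

```python
def summarize_bond(text):
    """Return (active_aggregator_id, partner_mac, {slave_name: aggregator_id})."""
    active_id = None
    partner_mac = None
    slaves = {}
    current_slave = None  # None while we are still in the bond-wide section
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current_slave = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:"):
            agg = int(line.split(":", 1)[1])
            if current_slave is None:
                active_id = agg          # "Active Aggregator Info" block
            else:
                slaves[current_slave] = agg
        elif line.startswith("Partner Mac Address:"):
            partner_mac = line.split(":", 1)[1].strip()
    return active_id, partner_mac, slaves


def lag_is_healthy(text):
    """True if a partner was learned and all slaves share the active aggregator."""
    active_id, partner_mac, slaves = summarize_bond(text)
    if partner_mac in (None, "00:00:00:00:00:00"):
        return False
    return bool(slaves) and all(agg == active_id for agg in slaves.values())
```

On a live system this could be called as lag_is_healthy(open("/proc/net/bonding/bond0").read()). Applied to the two listings above, the first would pass and the second would fail on both counts: the partner MAC is all zeros and eth9 sits alone in aggregator 2.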

Comment 1 Doug Wandell 2010-07-28 14:02:46 UTC
Created attachment 435018 [details]
Dmesg tail after issuing 'ifup bond0'

Comment 2 Doug Wandell 2010-07-28 14:03:41 UTC
Created attachment 435019 [details]
New kernel dmesg tail after issuing 'ifup bond0'

Comment 3 Greg Procunier 2010-08-17 16:17:49 UTC
I am also getting this exact same problem trying to enable mode=4 (802.3ad link aggregation).

cat /proc/net/bonding/bond0

http://img839.imageshack.us/img839/2317/8023adproblem.jpg

service network stop

http://img412.imageshack.us/img412/8517/netstop.jpg

service network start

http://img96.imageshack.us/img96/1560/netstart.jpg

Running 2.6.18-194.el5 RHEL5-u5 x86_64 smp kernel.

Comment 4 Greg Procunier 2010-08-18 21:08:13 UTC
I updated to the following driver from intel and my trunking issue was fixed:

# modinfo ixgbe
filename:       /lib/modules/2.6.18-194.8.1.el5/kernel/drivers/net/ixgbe/ixgbe.ko
version:        2.0.84.11-NAPI
license:        GPL
description:    Intel(R) 10 Gigabit PCI Express Network Driver
author:         Intel Corporation, <linux.nics>
srcversion:     4E1775748069A498875EA2E

# cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 9
        Partner Key: 2
        Partner Mac Address: 00:24:98:ed:2a:80

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:66:a0:dc
Aggregator ID: 1

Slave Interface: eth2
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:66:a4:04
Aggregator ID: 1

Conclusion: the problem exists in the ixgbe driver that Red Hat bundles with their kernel.

Comment 5 Andy Gospodarek 2010-08-23 18:02:04 UTC
I realize it seems like this is something that is fixed with the latest ixgbe driver from Intel, but this appears to be way too much like bug 567604 for me to discredit this patch as a fix:

http://people.redhat.com/agospoda/rhel5/0208-bonding-Fix-updating-of-speed-duplex-changes.patch

My test kernels not only contain that patch but an ixgbe update too, so I suspect the issue would be resolved if running those kernels.  They can be found here:

http://people.redhat.com/agospoda/#rhel5

The latest development kernels, which contain only the fix for bug 567604 (as of right now), can be found here:

http://people.redhat.com/jwilson/el5/

If you are able to test both and report the results here I would *really* appreciate it.

Comment 6 Doug Wandell 2010-09-07 20:32:50 UTC
Andy,
I tested both kernels and neither made a difference for me. 

# uname -a
Linux rhev-prod-node6.mitre.org 2.6.18-212.el5.gtest.89 #1 SMP Mon Aug 16 14:01:15 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

# modinfo  ixgbe
filename:       /lib/modules/2.6.18-212.el5.gtest.89/kernel/drivers/net/ixgbe/ixgbe.ko
version:        2.0.84-k2
...

The /proc/net/bonding/bond0 information looks the same as before:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 150
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 1
	Actor Key: 33
	Partner Key: 1
	Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth8
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:54:f7:5c
Aggregator ID: 1

Slave Interface: eth9
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1b:21:54:f7:5d
Aggregator ID: 2

Comment 7 Andy Gospodarek 2010-09-07 21:30:31 UTC
Thanks for testing that Doug -- sorry neither kernel worked for you.  I think I see what's wrong and will test and post a patch shortly.

Comment 8 Andy Gospodarek 2010-09-08 17:38:45 UTC
It turns out the problem I thought I saw was not really a problem.  I don't have a switch for testing, but I can confirm with another system that the 802.3ad bonds are not getting set up correctly.

Comment 9 Andy Gospodarek 2010-09-08 20:21:48 UTC
It looks like something is incorrect with the packet-split code in the driver as skb->data is wrong when the 802.3ad bonding code begins to inspect it.

Comment 10 Andy Gospodarek 2010-09-08 21:28:15 UTC
I've even found that if I clear IXGBE_PSRTYPE_L2HDR from psrtype that everything works, so I'm looking around for changes in rx buffer setup that may be causing this.
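
Clearing one flag from a bit mask, as described in this comment, amounts to a bitwise AND with the complement of that flag. A small Python sketch of the idea follows. The numeric values are believed to match the definitions in the ixgbe driver's ixgbe_type.h, but they are reproduced from memory and should be treated as assumptions; without_l2_split is a hypothetical helper, not a driver function.

```python
# Assumed PSRTYPE flag values (check against ixgbe_type.h before relying on them).
IXGBE_PSRTYPE_TCPHDR = 0x00000010
IXGBE_PSRTYPE_UDPHDR = 0x00000020
IXGBE_PSRTYPE_IPV4HDR = 0x00000100
IXGBE_PSRTYPE_IPV6HDR = 0x00000200
IXGBE_PSRTYPE_L2HDR = 0x00001000


def without_l2_split(psrtype):
    """Return psrtype with the L2 header-split flag cleared, other flags untouched."""
    return psrtype & ~IXGBE_PSRTYPE_L2HDR


# Hypothetical starting configuration with every header type enabled.
all_headers = (IXGBE_PSRTYPE_TCPHDR | IXGBE_PSRTYPE_UDPHDR |
               IXGBE_PSRTYPE_IPV4HDR | IXGBE_PSRTYPE_IPV6HDR |
               IXGBE_PSRTYPE_L2HDR)
no_l2 = without_l2_split(all_headers)
```

With the L2 flag cleared, the hardware would no longer split a frame right after the Ethernet header, which is consistent with the observation that LACP frames are parsed correctly again.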

Comment 11 Alexander Duyck 2010-09-09 17:00:33 UTC
I think this is probably a bug in the link aggregation code that probably also needs to be fixed in the upstream kernel.

The issue is that bond_3ad_lacpdu_recv/bond_3ad_rx_indication are expecting a linear skb, but the 82599 is splitting the packet at the L2 header and is then placing the LACP data in a separate page.  If you add a call to skb_linearize in bond_3ad_lacpdu_recv before you call the bond_3ad_rx_indication it should resolve the issue.
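
To illustrate why a handler that assumes a linear skb goes wrong here, the following toy Python model (not kernel code) splits an LACP frame at the end of the Ethernet header, the way this comment describes the 82599 doing, and compares a parser that only looks at the linear area with one that linearizes first. ToySkb, build_lacp_frame and both parser functions are hypothetical names; the field offsets follow the start of the LACPDU layout in IEEE 802.3ad (subtype, version, actor TLV). The real code read whatever skb->data pointed at rather than returning None; None is used here only to make the missing data visible.

```python
import struct

ETH_HLEN = 14                                   # dst MAC + src MAC + EtherType
SLOW_PROTOCOLS_MAC = bytes.fromhex("0180c2000002")
ETH_P_SLOW = 0x8809                             # Slow Protocols EtherType
# subtype, version, TLV type, TLV length, system priority, system MAC, key
ACTOR_FMT = "!BBBBH6sH"


def build_lacp_frame(actor_mac, actor_key):
    """Build a 124-byte LACP frame (without FCS) carrying only the actor fields."""
    eth = SLOW_PROTOCOLS_MAC + actor_mac + struct.pack("!H", ETH_P_SLOW)
    lacpdu = struct.pack(ACTOR_FMT, 1, 1, 1, 20, 0xFFFF, actor_mac, actor_key)
    return eth + lacpdu.ljust(110, b"\x00")     # pad to the 110-byte LACPDU size


class ToySkb:
    """Toy buffer: a linear 'head' plus optional page fragments."""

    def __init__(self, frame, split_at=None):
        cut = len(frame) if split_at is None else split_at
        self.head = bytearray(frame[:cut])
        self.frags = [bytes(frame[cut:])] if cut < len(frame) else []

    def linearize(self):
        """Copy every fragment into the linear area."""
        for frag in self.frags:
            self.head += frag
        self.frags = []


def actor_key_from_head(skb):
    """Parse the actor key, looking only at the linear area."""
    payload = bytes(skb.head[ETH_HLEN:])
    if len(payload) < struct.calcsize(ACTOR_FMT):
        return None                             # LACPDU is not in the linear area
    return struct.unpack_from(ACTOR_FMT, payload)[6]


def actor_key_after_linearize(skb):
    """Linearize first, then parse."""
    skb.linearize()
    return actor_key_from_head(skb)
```

A frame built with build_lacp_frame and wrapped as ToySkb(frame, split_at=ETH_HLEN) has an empty payload in its linear area, so actor_key_from_head cannot see the partner's key, while actor_key_after_linearize recovers it.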

Comment 12 Andy Gospodarek 2010-09-09 17:14:58 UTC
I would tend to agree that a call to skb_linearize would resolve this.  The fact that the reporter tested ixgbe-2.0.84.11 on RHEL5.4 and RHEL5.5 and it worked makes me wonder if we've got something wrong in our backport.

Comment 13 Jesse Brandeburg 2010-09-09 17:25:08 UTC
Created attachment 446309 [details]
possible missing fix from ixgbe

Andy, does your code have this patch or similar?

Comment 14 Jesse Brandeburg 2010-09-09 17:28:56 UTC
instead of calling linearize, what about just pulling the bytes into the skb->data using pskb_pull or skb_pull or whatever call?

Comment 15 Alexander Duyck 2010-09-09 17:50:25 UTC
That should work too.  You could just pull the sizeof(struct lacpdu).  The issue is that the L2 header split was added to support FCoE but it is going to expose any protocol handlers that don't correctly handle non-linear frames.

Comment 16 Andy Gospodarek 2010-09-09 18:43:59 UTC
Jesse, we do have the patch added in comment #13.

Comment 17 Andy Gospodarek 2010-09-09 21:33:24 UTC
After some more testing it seems the SF driver (2.0.84.11) doesn't actually enable packet split, so that explains things. :)

Comment 18 Alexander Duyck 2010-09-10 00:13:36 UTC
Andy, do you want to submit the upstream patch for this or should I?

Basically all that needs to be done is to add the following snippet to bond_3ad_lacpdu_recv before it grabs the bond->lock:
	if (!pskb_may_pull(skb, sizeof(struct lacpdu)))
		goto out;

I don't have the setup here in front of me to test it so it might be easier for you to reproduce it, verify the fix, and send the patch from your end.
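
The guard proposed above can be pictured with a toy Python model (an illustration only, not the kernel implementation). may_pull imitates the general contract of pskb_may_pull: succeed immediately if the requested bytes are already in the linear area, fail if the packet is shorter than requested, and otherwise copy bytes from the fragments into the linear area. ToySkb, may_pull and lacpdu_recv are hypothetical names, and LACPDU_LEN = 110 assumes the standard 802.3ad LACPDU size that sizeof(struct lacpdu) is expected to equal.

```python
LACPDU_LEN = 110  # assumed size of struct lacpdu (standard LACPDU length)


class ToySkb:
    """Toy buffer positioned after the Ethernet header: linear head plus fragments."""

    def __init__(self, head, frags=()):
        self.head = bytearray(head)
        self.frags = [bytes(f) for f in frags]

    def __len__(self):
        return len(self.head) + sum(len(f) for f in self.frags)


def may_pull(skb, need):
    """Ensure the first `need` bytes are in skb.head; False if the packet is too short."""
    if need <= len(skb.head):
        return True
    if need > len(skb):
        return False
    missing = need - len(skb.head)
    while missing > 0:
        frag = skb.frags.pop(0)
        take = frag[:missing]
        skb.head += take
        rest = frag[len(take):]
        if rest:
            skb.frags.insert(0, rest)   # keep the unread tail as a fragment
        missing -= len(take)
    return True


def lacpdu_recv(skb):
    """Return the LACPDU bytes, or None when the frame is dropped."""
    if not may_pull(skb, LACPDU_LEN):
        return None                     # corresponds to "goto out" in the snippet
    return bytes(skb.head[:LACPDU_LEN])
```

Compared with linearizing the whole buffer, this only copies as many bytes as the handler is about to read, and it also rejects truncated frames before any field is touched.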

Comment 19 Andy Gospodarek 2010-09-10 02:38:01 UTC
Alexander, I actually tested this against an upstream kernel and did not find it to be broken, so I'd like to figure out why everything works there before posting a fix upstream.

I tested something similar to what is posted in comment#18 on RHEL5.6 development kernels and as you suspected linearizing the skb resolved the issue.

Comment 20 Andy Gospodarek 2010-09-10 15:11:51 UTC
(In reply to comment #19)
> Alexander, I actually tested this against an upstream kernel and did not find
> it to be broken, so I'd like to figure out why everything works there before
> posting a fix upstream.
> 

I did a bit more snooping around and my upstream kernel on this system was old enough that it did not have this fix:

commit 486545216472d67c16e3d3d60c5f21f60959c855
Author: Alexander Duyck <alexander.h.duyck>
Date:   Thu Aug 19 13:36:27 2010 +0000

    ixgbe: pull PSRTYPE configuration into a separate function

so packet split was only enabled on queue 0 and there seemed to be some cases where the frames were being received on a queue other than queue 0 (at least that had to be the case for it to work in the past).

I definitely agree that we need a solution to handle the non-linear frames now coming out of the ixgbe interfaces upstream (I didn't doubt Alex before, but I wanted to be sure I understood why my upstream stuff appeared to be working before), but I question whether or not this is the correct solution for RHEL5 as I could just turn off packet split for L2 frames since we aren't supporting FCoE on RHEL5.

Comment 21 Andy Gospodarek 2010-09-10 15:22:26 UTC
Created attachment 446535 [details]
bonding-correct-LACPDUs-that-are-in-non-linear-skbs.patch

This will likely be the patch to add to RHEL5 and upstream.

Thanks to Intel for the help on this.

Comment 22 Andy Gospodarek 2010-09-10 21:08:42 UTC
patch posted upstream:

http://marc.info/?l=linux-netdev&m=128415190309022&w=2

Comment 23 Andy Gospodarek 2010-09-23 22:21:26 UTC
My test kernels have been updated to include a patch for this bugzilla.
Please test them and report back your results.

http://people.redhat.com/agospoda/#rhel5

Without immediate feedback there is a good chance this or any other fix for this driver will not be included in the upcoming update.  Please test them and report back your results.

Comment 24 Jonathan S. 2010-10-13 20:55:25 UTC
I've been having fits trying to get LACP to work using some 82599EB cards and Cisco Nexus 5010 switches in recent weeks.  I'd tried the stock RHEL5.5 ixgbe (2.0.44-k2) and also a 2.1.4 driver from Intel with no luck.
Though I've just begun testing the kernels linked to in Comment 23, they do indeed seem to fix the problem!

# lspci | grep 82599
08:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)
08:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)
09:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)
09:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)

# uname -a
Linux test_node 2.6.18-223.el5.gtest.90 #1 SMP Thu Sep 23 11:18:30 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2+3 (2)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 33
        Partner Key: 32970
        Partner Mac Address: 00:23:04:ee:be:03

Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:e0:ed:0e:cd:ee
Aggregator ID: 1

Slave Interface: eth4
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:e0:ed:19:47:60
Aggregator ID: 1

Comment 25 Andy Gospodarek 2010-10-14 15:02:43 UTC
(In reply to comment #24)
> I've been having fits trying to get LACP to work using some 82599EB cards and
> Cisco Nexus 5010 switches in recent weeks.  I'd tried the stock RHEL5.5 ixgbe
> (2.0.44-k2) and also a 2.1.4 driver from Intel with no luck.
> Though I've just begun testing the kernels linked to in Comment 23, they do
> indeed seem to fix the problem!
> 

Glad to hear it!

The problem was actually in the bonding driver, not the ixgbe driver, so until Intel puts a workaround in their SourceForge driver, it will still be broken.  I'll see what I can do to get this into RHEL5.6 (no promises though).

Comment 28 RHEL Program Management 2010-10-14 15:50:27 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 33 Robert N. Evans 2010-10-28 19:49:24 UTC
Stratus has encountered this problem also.

Comment 34 Jarod Wilson 2010-10-28 20:08:34 UTC
in kernel-2.6.18-229.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 36 bluca 2010-11-10 17:29:57 UTC
(In reply to comment #34)
> in kernel-2.6.18-229.el5
> You can download this test kernel (or newer) from
> http://people.redhat.com/jwilson/el5
> 
> Detailed testing feedback is always welcomed.

I was stumped today by a possibly related problem.
My environment is based on NX3031 NICs (netxen_nic driver) with bonding in active-backup mode.

At system boot the bond is apparently created, but no traffic comes through.
Putting the interface in promiscuous mode (e.g. with tcpdump) makes it work again until promiscuous mode is disabled;
issuing 'service network restart' also fixes it.

I tried updating to the kernel from http://people.redhat.com/jwilson/el5/231.el5/ and the problem seems solved for good.

L.

Comment 37 Martin Prpič 2010-11-11 13:55:28 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, 802.3ad link aggregation did not work properly with the ixgbe driver because 802.3ad-based bonds could not be formed. With this update, the issue has been fixed.

Comment 40 errata-xmlrpc 2011-01-13 21:45:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html