Bug 625061 - igb doesn't see link status changes on 82580 NIC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Stefan Assmann
QA Contact: Network QE
URL:
Whiteboard:
Depends On:
Blocks: 547220
Reported: 2010-08-18 14:08 UTC by Martin Wilck
Modified: 2018-11-14 19:16 UTC (History)
CC List: 9 users

Fixed In Version: kernel-2.6.18-215.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-13 21:10:15 UTC
Target Upstream Version:
Embargoed:


Attachments
messages file (106.69 KB, text/plain)
2010-08-18 14:12 UTC, Martin Wilck
Network configuration, as requested. (864 bytes, application/x-gzip)
2010-08-23 08:55 UTC, Martin Wilck


Links
System: Red Hat Product Errata
ID: RHSA-2011:0017
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update
Last Updated: 2011-01-13 10:37:42 UTC

Description Martin Wilck 2010-08-18 14:08:29 UTC
Description of problem:
On a system with an Intel 82580 ("Barton Hills") NIC, bonding failover doesn't work because the igb driver doesn't correctly determine the link state on this hardware. The link is always reported as "up".

How reproducible:
always

Steps to Reproduce:
1. Install on a server with an 82580 ("Barton Hills") NIC.
2. Activate an 82580 network interface.
3. Pull the cable.

Actual results:
ethtool shows "Link detected: yes"
In a bonding setup, no failover occurs because the link down event is not noticed. cat /proc/net/bonding/bond<X> shows all slaves as "up" even if cables are pulled.

Expected results:
ethtool shows "Link detected: no". In a bonding setup, the bonding driver fails over to another slave.
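
To compare actual and expected results across several slaves at once, the per-slave state can be read out of /proc/net/bonding/bond<X>. The following is only a minimal sketch: the helper names slave_mii_status and read_bond are made up for illustration, and it assumes the usual "Slave Interface:" / "MII Status:" line layout of that file.

```python
def slave_mii_status(bonding_text):
    """Map each slave interface name to the MII status the bonding driver reports.

    Assumes the common /proc/net/bonding/bond<X> layout, where every
    "Slave Interface: <name>" line is followed by an "MII Status: <state>"
    line for that slave.
    """
    status = {}
    current = None  # slave whose section we are currently inside
    for line in bonding_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value" form
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # The bond-level "MII Status" appears before any slave
            # section, so it is skipped while current is None.
            status[current] = value
    return status


def read_bond(path="/proc/net/bonding/bond0"):
    """Hypothetical convenience wrapper: parse a bonding status file."""
    with open(path) as f:
        return slave_mii_status(f.read())
```

With the bug present, a slave on an 82580 port would be expected to stay "up" in this mapping after its cable is pulled, while a slave on an 82576 port should change to "down".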

Additional info:
This does not occur with 82576 (Kawela) or 82575 (Zoar) NICs. This does not occur on RHEL6 beta although the igb driver version is the same.

This problem is solved in the Intel OEM driver 2.2.9.

This should be fixed in 5.6. We'd also like to request a DUD (driver update disk) for 5.5 once the problem is fixed for 5.6.

Comment 1 Martin Wilck 2010-08-18 14:12:31 UTC
Created attachment 439393
messages file

At the bottom you can see sections '#### bond0 test ###' and '### bond 1 test ###'. During the tests I pulled and re-plugged cables on the two bonds. bond0 is a bond with 4 Barton Hills ports, and bond1 is a bond with 2 Kawela ports. You can see the status changes on Kawela, but you see nothing for Barton Hills.
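
A quick way to check a messages file for the presence or absence of link events per interface is sketched below. This is an illustration only: it assumes the driver logs lines containing "NIC Link is Up" or "NIC Link is Down" and that interfaces are named eth<N>. The exact wording in a given kernel may differ, and the function name link_events is hypothetical.

```python
import re

# Assumed wording of the driver's link messages; adjust if the log differs.
EVENT_RE = re.compile(r"NIC Link is (Up|Down)")
# Assumed interface naming scheme (eth0, eth1, ...).
IFACE_RE = re.compile(r"\b(eth\d+)\b")


def link_events(log_text):
    """Return {interface: ["Up" or "Down", ...]} in the order the events were logged."""
    events = {}
    for line in log_text.splitlines():
        event = EVENT_RE.search(line)
        if not event:
            continue
        iface = IFACE_RE.search(line)
        name = iface.group(1) if iface else "unknown"
        events.setdefault(name, []).append(event.group(1))
    return events
```

An interface whose cable was pulled and re-plugged during a test but which has no entry in the result would match the behaviour described for the Barton Hills ports.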

Comment 2 Stefan Assmann 2010-08-22 18:06:56 UTC
Hi Martin,

please try the latest kernel from
http://people.redhat.com/sassmann/kernel/#rhel5

Also, as requested in the other bonding bug: Please provide the exact steps you took to do the bonding setup.

Comment 3 Martin Wilck 2010-08-23 08:53:47 UTC
(In reply to comment #2)

> please try the latest kernel from
> http://people.redhat.com/sassmann/kernel/#rhel5

The problem is solved in that kernel.

Comment 4 Martin Wilck 2010-08-23 08:55:44 UTC
Created attachment 440325
Network configuration, as requested.

Comment 5 Martin Wilck 2010-08-23 08:56:34 UTC
(In reply to comment #3)
> The problem is solved in that kernel.

I used kernel-2.6.18-212.el5.sassmann_igb56_02.x86_64.rpm

Comment 8 Stefan Assmann 2010-09-02 07:42:48 UTC
This bug should be solved along with #566024.

Please verify that everything works as expected with kernel 2.6.18-215.el5 or above when it becomes available at:
http://people.redhat.com/jwilson/el5/

Thanks!

Comment 10 Martin Wilck 2010-09-07 12:01:10 UTC
This will take some time, as I just returned from vacation. Stefan, since I already verified the fix in your test kernel (comment #5), I think you know whether the relevant fix is included in the -215 kernel.

Comment 11 Stefan Assmann 2010-09-07 12:16:18 UTC
Martin,
the changes from my test kernel are included in -215. This is just to verify that everything is working as you would expect it. Take your time. Thanks!

Comment 16 errata-xmlrpc 2011-01-13 21:10:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

