Description of problem:
On a system with Intel 82580 ("Barton Hills") NIC, bonding failover doesn't work because the igb driver doesn't correctly determine link state on this hardware. The link is always considered "up".

How reproducible:
always

Steps to Reproduce:
1. install on server with 82580 ("Barton Hills") NIC
2. activate 82580 network interface
3. pull cable

Actual results:
ethtool shows "Link detected: yes". In a bonding setup, no failover occurs because the link down event is not noticed. cat /proc/net/bonding/bond<X> shows all slaves as "up" even if cables are pulled.

Expected results:
ethtool shows "Link detected: no". In a bonding setup, bonding driver does a fail over to another slave.

Additional info:
This does not occur with 82576 (Kawela) or 82575 (Zoar) NICs. This does not occur on RHEL6 beta although the igb driver version is the same. This problem is solved in the Intel OEM driver 2.2.9. This should be fixed in 5.6. We'd also like to request a DUD for 5.5 when the problem is fixed for 5.6.
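For anyone re-running the cable-pull test, the per-slave state in /proc/net/bonding/bond<X> can be checked with a small script rather than by eye. The following is only a rough sketch: it assumes the usual "Slave Interface:" / "MII Status:" lines printed by the bonding driver, the names parse_bond_status and read_bond are made up for this example, and SAMPLE is an invented excerpt, not output from the affected system.

```python
def parse_bond_status(text):
    """Return {slave_name: mii_status} from the text of /proc/net/bonding/<bond>.

    Assumes each slave section starts with a "Slave Interface:" line and
    contains an "MII Status:" line. The bond-level "MII Status:" line that
    appears before the first slave section is ignored.
    """
    slaves = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            slaves[current] = value
    return slaves


def read_bond(path):
    """Read a bonding status file and return its parsed slave states."""
    with open(path) as handle:
        return parse_bond_status(handle.read())


# Invented sample for illustration only (not taken from this bug report).
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: down
Link Failure Count: 1
"""
```

Calling read_bond("/proc/net/bonding/bond0") before and after pulling a cable and comparing the two dictionaries shows whether the link down event reached the bonding driver. With the problem described here, the pulled slave would still be reported as "up".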
Created attachment 439393
messages file

At the bottom you can see sections '#### bond0 test ###' and '### bond 1 test ###'. During the tests I pulled and re-plugged cables on the two bonds. bond0 is a bond with 4 Barton Hills ports, and bond1 is a bond with 2 Kawela ports. You can see the status changes on Kawela, but you see nothing for Barton Hills.
Hi Martin, please try the latest kernel from http://people.redhat.com/sassmann/kernel/#rhel5 Also, as requested in the other bonding bug: Please provide the exact steps you took to do the bonding setup.
(In reply to comment #2)
> please try the latest kernel from
> http://people.redhat.com/sassmann/kernel/#rhel5

The problem is solved in that kernel.
Created attachment 440325
Network configuration, as requested.
(In reply to comment #3)
> The problem is solved in that kernel.

I used kernel-2.6.18-212.el5.sassmann_igb56_02.x86_64.rpm
This bug should be solved along with #566024. Please verify that everything works as expected with kernel 2.6.18-215.el5 or above when it becomes available at: http://people.redhat.com/jwilson/el5/ Thanks!
This will need time as I just returned from vacation. Stefan, since I already verified the fix in your test kernel (comment #5), I think you know whether the relevant fix is included in the -215 kernel.
Martin, the changes from my test kernel are included in -215. This is just to verify that everything is working as you would expect it. Take your time. Thanks!
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0017.html