Bug 452292

Summary: [Intel 4.8 FEAT] ixgbe driver update to latest upstream
Product: Red Hat Enterprise Linux 4 Reporter: John Ronciak <john.ronciak>
Component: kernel Assignee: Andy Gospodarek <agospoda>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: high    
Version: 4.8 CC: bugproxy, cward, dougal, ivecera, jane.lv, jfeeney, jvillalo, keve.a.gabbert, ltroan, luyu, martinez, martin.wilck, peterm, riek, rlerch, thomas_chenault, vgoyal
Target Milestone: rc Keywords: FutureFeature, HardwareEnablement, OtherQA
Target Release: 4.8   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-05-18 19:14:18 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 435612, 435614, 450719, 452249, 458123, 458752, 461297    
Attachments:
Description Flags
Proposed patch none

Description John Ronciak 2008-06-20 19:03:11 UTC
Description of problem:
Update the ixgbe driver to the latest upstream version.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 John Jarvis 2008-07-10 20:28:01 UTC
*** Bug 454813 has been marked as a duplicate of this bug. ***

Comment 2 IBM Bug Proxy 2008-07-10 20:33:42 UTC
=Comment: #0=================================================
Emily J. Ratliff <emilyr.com> - 2008-07-09 17:30 EDT
1. Feature Id:	[201288]
Feature Name:	Driver update for Intel 10GB
Sponsor:	xSeries
Category:	Kernel
Request Type:	Driver - Update Version

2. Short Description
Driver updates to support the Intel 10GB NICS.  The drivers are called ixgbe and ixgb. This feature
request is specifically for ixgbe. (There is an additional bugzilla/feature request for ixgb.)

3. Business Case
Future option support of Intel 10GB adapter will be available on several systems and blades.  These
drivers need to be updated to support the high speed adapters.

4. Sponsor Priority	1
IBM Confidential:	yes
Code Contribution:	no
Upstream Acceptance:	Accepted
Component Version Target:	2.6.24
Performance Assistance:	no

5. PM Contact:	Monte Knutson, mknutson.com, 877-894-1495

6. Technical contact(s):
Kevin Stansell, kstansel.com
Chris McDermott, mcdermoc.com

7. LTC Manager: Julio Alvarez, julioa.com

*** This bug has been marked as a duplicate of 452292 ***

Comment 3 Kevin Stansell 2008-07-23 18:29:28 UTC
We are requesting this driver update to include support for the Intel Dorado chip.

Comment 4 Ronald Pacheco 2008-07-24 19:15:19 UTC
Per the 3-way conversation, Dorado is based on Niantic chipset, which is
supported with ixgbe.

Comment 6 Ronald Pacheco 2008-07-29 02:21:46 UTC
We have a request to ensure that the Copper Pond NIC is supported.

Comment 9 John Ronciak 2008-07-29 17:26:31 UTC
The currently scheduled driver update will include CuPond support.

Comment 10 John Ronciak 2008-08-26 21:01:13 UTC
Are these changes also in the latest test kernel from Andy?  If so we can start testing this as well.

Comment 11 RHEL Program Management 2008-09-03 12:58:59 UTC
Updating PM score.

Comment 12 Peter Martuccelli 2008-09-22 17:29:27 UTC
*** Bug 454827 has been marked as a duplicate of this bug. ***

Comment 13 IBM Bug Proxy 2008-09-22 17:33:07 UTC
Emily J. Ratliff <emilyr.com> - 2008-07-09 23:58 EDT
Feature Name:	Driver update for Intel 10GB - ixgb

Driver updates to support the Intel 10GB NICS.  The drivers are called ixgbe and
ixgb. This feature request is specifically for ixgb. (There is an additional bugzilla/feature
request for ixgbe.)

Future option support of Intel 10GB adapter will be available on several systems
and blades.  These drivers need to be updated to support the high speed adapters.





There are no plans to update this driver but if you have a list of specific bugs
you would like to make sure are fixed we can see about getting those into 4.8

John, we need the driver (ixgb/ixgbe) updated to the latest version to support
the Intel 10GB (Dorado) based on Intel Niantic chipset.

I would expect a similar requirement to come in from the Intel NIC group and for
them to provide the driver details.

Intel did not submit a BZ for ixgb, but that may have been an oversight.  I will
check this with them today.

They did, however, submit a request for ixgbe: BZ459292


Hmm, BZ459292 is not pulling up for me... Was this the correct number?

We are requesting updates to support the Intel Dorado 10GB chip (I've seen
conflicting statements saying this was an update to ixgb or ixgbe...Intel would
have the final say).

correcting the BZ in comment 3, it is BZ 452292

Comment 14 John Villalovos 2008-12-10 16:42:51 UTC
Andy,

Any status on this bug?  We would like to make sure we meet the deadline next week.

Comment 15 Andy Gospodarek 2008-12-10 16:58:03 UTC
John, I just haven't gotten to it yet, I'm hoping to have it done by the
deadline.

Comment 16 Andy Gospodarek 2008-12-23 02:34:33 UTC
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel4

Please test them and report back your results.

Comment 18 Andy Gospodarek 2009-01-07 02:56:11 UTC
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel4

Please test them and report back your results.

Comment 19 Andy Gospodarek 2009-01-07 21:41:48 UTC
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel4

Please test them and report back your results.  Without immediate
feedback there is a good chance this or any other fix for this driver
will not be included in the upcoming update.

Comment 20 Vivek Goyal 2009-01-12 14:50:09 UTC
Committed in 78.27.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 23 IBM Bug Proxy 2009-02-20 00:30:32 UTC
ixgbe driver v1.3.18-k4 hung the system when 82598EB (Oplin) NIC was enabled. The machine is running RHEL 4.8 Alpha with 2.6.9-81.

Installed kernel 2.6.28.6 with 1.3.30-k2 driver which worked.

Comment 24 Chris Ward 2009-02-20 13:31:09 UTC
~~ Attention Partners!  ~~
RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should
be a fix present in the Beta, which addresses this bug. If you have already completed testing your other URGENT priority bugs, and you still haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. Further questions can be directed to your
Red Hat Partner Manager.

Thanks, more information about Beta testing to come.
 - Red Hat QE Partner Management

Comment 25 Andy Gospodarek 2009-02-20 14:09:41 UTC
(In reply to comment #23)
> ixgbe driver v1.3.18-k4 hung the system when 82598EB (Oplin) NIC was enabled.
> The machine is running RHEL 4.8 Alpha with 2.6.9-81.
> 
> Installed kernel 2.6.28.6 with 1.3.30-k2 driver which worked.

Can you try my latest test kernels?

http://people.redhat.com/agospoda/#rhel4

I've added some new patches to address some issues with the original backport and so far they have been working much better than what is in 2.6.9-81.  Any feedback would be welcome.

Comment 26 IBM Bug Proxy 2009-02-20 17:40:47 UTC
(In reply to comment #24)
> (In reply to comment #23)
> > ixgbe driver v1.3.18-k4 hung the system when 82598EB (Oplin) NIC was enabled.
> > The machine is running RHEL 4.8 Alpha with 2.6.9-81.
> >
> > Installed kernel 2.6.28.6 with 1.3.30-k2 driver which worked.
>
> Can you try my latest test kernels?
>
> http://people.redhat.com/agospoda/#rhel4
>
> I've added some new patches to address some issues with the original backport
> and so far they have been working much better than what is in 2.6.9-81.  Any
> feedback would be welcome.
>

Progress !!
I was able to activate the oplin NIC and run tests using test kernel 2.6.9-81.EL.gtest.60.

Comment 27 Andy Gospodarek 2009-02-20 21:50:44 UTC
(In reply to comment #26)
> Progress !!
> I was able to activate the oplin NIC and run tests using test kernel
> 2.6.9-81.EL.gtest.60.

Excellent!

If you find any problems with those kernels please let me know as soon as possible, as I need to get these into the next official kernel build.

Comment 28 Dougal Ballantyne 2009-03-10 12:00:56 UTC
(In reply to comment #24)
> ~~ Attention Partners!  ~~
> RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should
> be a fix present in the Beta, which addresses this bug. If you have already
> completed testing your other URGENT priority bugs, and you still haven't had a
> chance yet to test this bug, please do so at your earliest convenience, to
> ensure that only the highest possible quality bits are shipped in the upcoming
> public Beta drop.
> 
> If you encounter any issues, please set the bug back to the ASSIGNED state and
> describe the issues you encountered. Further questions can be directed to your
> Red Hat Partner Manager.
> 
> Thanks, more information about Beta testing to come.
>  - Red Hat QE Partner Management  

Chris,

I am a RH partner but cannot find anything about RHEL 4.8 Partner Alpha on the partner site. Could you email me details about accessing it? I have a customer who is preparing to test the new Intel chipsets and would very much like to verify whether RHEL 4.8 has support for the new 10GE cards.

-Dougal

Comment 30 Chris Ward 2009-03-13 14:02:39 UTC
~~ Attention Partners!  ~~
RHEL 4.8 Beta has been released on partners.redhat.com. There should
be a fix present, which addresses this bug. Please test and report back results on this OtherQA Partner bug at your earliest convenience.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe any issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you've encountered. Further questions can be directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above. Please leave a comment with your test results details. Include which arches tested, package version and any applicable logs.

 - Red Hat QE Partner Management

Comment 33 Chris Ward 2009-04-03 10:29:37 UTC
~~ Attention Partners! Snap 2 *Kernel Only* Released ~~
RHEL 4.8 Snapshot 2 *kernel* has been released on partners.redhat.com. There should be a fix present, which addresses this bug. NOTE: there is only a short amount of time left to test, please test and report back results on this OtherQA Partner bug at your earliest convenience.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have found a NEW bug, clone this
bug and describe the issues you encountered. Further questions can be
directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the
Verified field above. Please leave a comment with your test results details.
Include which arches tested, package version and any applicable logs.

Comment 34 Chris Ward 2009-04-09 07:42:48 UTC
~~ Attention Partners! Snap 3 Released ~~
RHEL 4.8 Snapshot 3 has been released on partners.redhat.com. There
should be a fix present that resolves this bug.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have found a NEW bug, clone this
bug and describe the issues you encountered. Further questions can be
directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the
Verified field above. Please leave a comment with your test results details.
Include which arches tested, package version and any applicable logs.

Comment 35 John Ronciak 2009-04-10 22:22:17 UTC
The following Device ID's are missing from the latest ixgbe driver as compared to upstream:

#define IXGBE_DEV_ID_82598                 0x10B6
#define IXGBE_DEV_ID_82598_BX              0x1508
#define IXGBE_DEV_ID_82598EB_SFP_LOM       0x10DB
#define IXGBE_DEV_ID_82598_DA_DUAL_PORT    0x10F1
#define IXGBE_DEV_ID_82598_SR_DUAL_PORT_EM 0x10E1
#define IXGBE_DEV_ID_82598EB_XF_LR         0x10F4


These need to be added to the driver.  These are for the Oplin HW which has been shipping for some time now.
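For readers unfamiliar with how a PCI driver claims hardware, here is a minimal user-space sketch of the idea behind the request above: a driver carries a table of (vendor, device) pairs, and a NIC is only bound if its pair appears in that table. The struct nic_id type, the requested_ids table and the nic_id_supported() helper are hypothetical stand-ins for illustration, not the kernel's struct pci_device_id or the actual ixgbe table. The device IDs are the ones listed above, and 0x8086 is Intel's PCI vendor ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INTEL_VENDOR_ID 0x8086 /* Intel's PCI vendor ID */

/* Device IDs as listed in the comment above. */
#define IXGBE_DEV_ID_82598                 0x10B6
#define IXGBE_DEV_ID_82598_BX              0x1508
#define IXGBE_DEV_ID_82598EB_SFP_LOM       0x10DB
#define IXGBE_DEV_ID_82598_DA_DUAL_PORT    0x10F1
#define IXGBE_DEV_ID_82598_SR_DUAL_PORT_EM 0x10E1
#define IXGBE_DEV_ID_82598EB_XF_LR         0x10F4

/* Hypothetical stand-in for a driver's PCI ID table entry. */
struct nic_id {
    uint16_t vendor;
    uint16_t device;
};

/* Hypothetical table holding only the six requested IDs. */
static const struct nic_id requested_ids[] = {
    { INTEL_VENDOR_ID, IXGBE_DEV_ID_82598 },
    { INTEL_VENDOR_ID, IXGBE_DEV_ID_82598_BX },
    { INTEL_VENDOR_ID, IXGBE_DEV_ID_82598EB_SFP_LOM },
    { INTEL_VENDOR_ID, IXGBE_DEV_ID_82598_DA_DUAL_PORT },
    { INTEL_VENDOR_ID, IXGBE_DEV_ID_82598_SR_DUAL_PORT_EM },
    { INTEL_VENDOR_ID, IXGBE_DEV_ID_82598EB_XF_LR },
};

#define N_REQUESTED_IDS (sizeof(requested_ids) / sizeof(requested_ids[0]))

/* Return 1 if (vendor, device) appears in the table, 0 otherwise. */
static int nic_id_supported(const struct nic_id *table, size_t count,
                            uint16_t vendor, uint16_t device)
{
    size_t i;

    assert(table != NULL);
    for (i = 0; i < count; i++) {
        if (table[i].vendor == vendor && table[i].device == device)
            return 1;
    }
    return 0;
}
```

Calling nic_id_supported(requested_ids, N_REQUESTED_IDS, 0x8086, 0x10DB) returns 1 because 0x10DB is in the table. An ID that is absent from a driver's table returns 0, which corresponds to the driver never binding to that adapter.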

Comment 36 Andy Gospodarek 2009-04-13 13:38:35 UTC
John, according to comment #26 at least some of the Oplin NICs were working fine with this driver version.

At a minimum it sounds like you want us to try and pull these patches?

commit 1e336d0fc99f159ed636ffb9128bc84e09ccc279
Author: Don Skidmore <donald.c.skidmore>
Date:   Mon Jan 26 20:57:51 2009 -0800

    ixgbe: add support KX/KX4 device

commit 2f21bdd3542838dc5513a585a32aa13f01b019e7
Author: Don Skidmore <donald.c.skidmore>
Date:   Sun Feb 1 01:18:23 2009 -0800

    ixgbe: Add 82598 support for BX mezzanine devices

commit c4900be053d376dfe4f603d000aa5e4c60745dec
Author: Donald Skidmore <donald.c.skidmore>
Date:   Thu Nov 20 21:11:42 2008 -0800

    ixgbe: add SFP+ driver support

commit b95f5fcb8ba6073a652927d232a7a7cb552afe62
Author: Jesse Brandeburg <jesse.brandeburg>
Date:   Thu Sep 11 19:58:59 2008 -0700

    ixgbe: add device support for XF LR adapters

I'm not sure if we can satisfy that request this late in the game, but I'll see what we can do....

Comment 37 John Ronciak 2009-04-13 16:23:12 UTC
These newer patches add more NIC support.  So to match what's upstream I think these should be added as well, especially the last 2.  Since it's getting close to the end of RHEL4, maybe all of the most recent devices should get into it so all of the OEMs will have device support in 4.8 for all of the NICs and devices rolling out this year.  Was Niantic support picked up for 4.8 as well?  It's upstream now, at least for basic support.

Comment 38 Andy Gospodarek 2009-04-13 18:20:48 UTC
John, my only concern with pulling only the 4 patches listed above is that I'm not sure if other fixes/supporting code were added in between.  It makes me and others here a bit nervous.

Comment 39 John Ronciak 2009-04-13 18:29:48 UTC
So what commits from the list I sent are already included?  We could take a look and see what's already been included and see if other things are missing.  We don't know what's been added and what has not so we can't really do this at this point without a list of what has been included.

Comment 40 Chris Ward 2009-04-14 09:13:48 UTC
FYI, if there will be any additional changes to this request, you're going to need to request blocker status and it'll need to be done ASAP.

Comment 41 Thomas Chenault 2009-04-14 23:22:19 UTC
I have performed basic testing, including network stress testing and driver load/unload testing, on ixgbe-1.3.18-k4 as provided in kernels 2.6.9-86.EL, 2.6.9-86.ELsmp, 2.6.9-86.ELlargesmp, and 2.6.9-87.ELsmp, x86_64. Testing has included 8086:10c7 and 8086:10c8 adapters. I have not encountered any errors.
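The adapters above are identified in the usual vendor:device notation, where both halves are 16-bit hexadecimal values and 8086 is Intel's PCI vendor ID. As a small illustration of that notation only, the sketch below splits such a string into its two numbers. The parse_pci_id() helper is a hypothetical name introduced here and is not part of any tool mentioned in this report.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse a "vvvv:dddd" hexadecimal pair such as "8086:10c7".
 * Returns 0 on success and -1 if the text is not exactly one pair.
 * Hypothetical helper for illustration only.
 */
static int parse_pci_id(const char *text, uint16_t *vendor, uint16_t *device)
{
    unsigned int v, d;
    char trailing;

    assert(text != NULL && vendor != NULL && device != NULL);

    /* Exactly two conversions must succeed; a third means trailing junk. */
    if (sscanf(text, "%4x:%4x%c", &v, &d, &trailing) != 2)
        return -1;

    *vendor = (uint16_t)v;
    *device = (uint16_t)d;
    return 0;
}
```

For the two adapters named in the test report, this yields vendor 0x8086 with device 0x10c7 and 0x10c8 respectively.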

Comment 42 Chris Ward 2009-04-15 09:18:09 UTC
Andy, what's the current status regarding patch inclusion?

Comment 46 Chris Ward 2009-04-16 13:13:19 UTC
~~ Attention! Snap 4 Released ~~
RHEL 4.8 Snapshot 4 has been released on partners.redhat.com. There
should be a fix present that resolves this bug. There's not much more time to test. Please report back results ASAP.

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have found a NEW bug, clone this
bug and describe the issues you encountered. Further questions can be
directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix. Please select your PartnerID from the
Verified field above. Please leave a comment with your test results details.
Include which arches tested, package version and any applicable logs.

Comment 47 Ivan Vecera 2009-04-16 14:48:25 UTC
Created attachment 339851 [details]
Proposed patch

Hi John,
I have prepared the backport of the requested patches. We need to test it against the real hardware. Testing has to be done by 2 PM EDT; otherwise this patch cannot be included in the RHEL 4.8 release. I'm now building the test kernel packages for you, but if you don't want to wait you can build them yourself using Vivek's SRPM available at:
http://people.redhat.com/vgoyal/rhel4/SRPMS.kernel/kernel-2.6.9-88.EL.src.rpm

My builds (for i686 and x86_64) will be available ASAP.

Comment 48 Ivan Vecera 2009-04-16 15:54:09 UTC
SRPM and RPMs for i686 are now at:
http://people.redhat.com/ivecera/rhel-4-ivtest/

Comment 49 John Ronciak 2009-04-16 17:21:23 UTC
So what patches were included?  We can check for the missing ID inclusion but can do no real meaningful testing in less than 4 hours.  How can we be asked to test a set of unknown patches in 4 to 5 hours?

Comment 50 Andy Gospodarek 2009-04-16 18:25:36 UTC
(In reply to comment #49)
> So what patches were included?  We can check for the missing ID inclusion but
> no real meaningful testing in lass than 4 hours.  How can we be asked to set
> of unknown patches in 4 to 5 hours?  

John, in comment #35 you gave us no clear knowledge of what patches we should add.  You just wanted us to add some PCI IDs.  I took a look at what patches added those and came up with this list of commits and posted the list in comment #36.  For completeness here is the list again.  These are the patches that have been used to generate the patch in comment #47.

commit 1e336d0fc99f159ed636ffb9128bc84e09ccc279
Author: Don Skidmore <donald.c.skidmore>
Date:   Mon Jan 26 20:57:51 2009 -0800

    ixgbe: add support KX/KX4 device

commit 2f21bdd3542838dc5513a585a32aa13f01b019e7
Author: Don Skidmore <donald.c.skidmore>
Date:   Sun Feb 1 01:18:23 2009 -0800

    ixgbe: Add 82598 support for BX mezzanine devices

commit c4900be053d376dfe4f603d000aa5e4c60745dec
Author: Donald Skidmore <donald.c.skidmore>
Date:   Thu Nov 20 21:11:42 2008 -0800

    ixgbe: add SFP+ driver support

commit b95f5fcb8ba6073a652927d232a7a7cb552afe62
Author: Jesse Brandeburg <jesse.brandeburg>
Date:   Thu Sep 11 19:58:59 2008 -0700

    ixgbe: add device support for XF LR adapters

You emailed me a list of commits for 5.4 a few weeks ago and that full list is NOT going into 4.8.  We were supposed to have frozen this code long ago so we can release soon, so including those changes into RHEL4.8 wasn't ever an option.

I can understand your frustration when asking for a short turnaround time on testing, but the deadline for 4.8 was months ago and with a moving target like 'upstream' it's hard for us to know exactly what we should pull into RHEL at all times.  Unfortunately we have to draw the line somewhere no matter where we get the driver.

In comment #2 someone requested the driver version that was in 2.6.24 and I'm quite sure we have satisfied that request, so that's probably where we will stand since we can't really get these verified with our current schedule.

Comment 51 John Ronciak 2009-04-16 18:40:46 UTC
I was confused by Ivan's comments about the patches.  If these are the patches talked about, we are good.  The kernel is under test now and we are checking the device ID's.  I'll get another reply out as soon as we know, hopefully in the next hour or so.

Comment 52 John Ronciak 2009-04-16 18:51:32 UTC
OK the device ID's have made it.  So they now match upstream, the driver is loaded and testing has started on it.  I think we should go ahead and include this version of the driver.  We'll have more testing over the next day or so with some results by COB today our time.

Comment 53 John Ronciak 2009-04-16 21:07:10 UTC
It looks like at least one of the new ID's is broken.  We see an issue when loading the driver on Oplin LOM (10DB): we get "EEPROM is not valid".  Maybe some of the SFP+ support didn't get backported correctly?  Our internal 2.0.2x driver works fine.  These are usually easy to fix, but if we are out of time I'm not sure what gets done about this.
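For context on what an EEPROM validity check of this kind typically involves, here is a simplified sketch of a 16-bit word checksum: the words before a stored checksum word are summed, and the stored word must bring the total to a fixed target value. This is not the code from the backport or from upstream. The constants EEPROM_CHECKSUM_WORD and EEPROM_SUM_TARGET and both function names are assumptions made for illustration, and a real driver may fold additional EEPROM regions into the sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for illustration: checksum stored at word 0x3F. */
#define EEPROM_CHECKSUM_WORD 0x3F
/* Assumed target that all summed words plus the checksum must reach. */
#define EEPROM_SUM_TARGET    0xBABA

/* Compute the checksum word that would make the image valid. */
static uint16_t eeprom_calc_checksum(const uint16_t *words)
{
    uint16_t sum = 0;
    size_t i;

    assert(words != NULL);
    /* Sum every word before the checksum word, wrapping at 16 bits. */
    for (i = 0; i < EEPROM_CHECKSUM_WORD; i++)
        sum = (uint16_t)(sum + words[i]);

    return (uint16_t)(EEPROM_SUM_TARGET - sum);
}

/* Return 1 if the stored checksum matches the computed one, else 0. */
static int eeprom_checksum_valid(const uint16_t *words)
{
    return eeprom_calc_checksum(words) == words[EEPROM_CHECKSUM_WORD];
}
```

Under these assumptions, a driver that reads the wrong words for a given device (for example because device-specific EEPROM access was not backported) would compute a mismatching checksum and reject an otherwise healthy adapter, which would fit the symptom of one driver build failing where another works on the same hardware.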

Comment 54 John Ronciak 2009-04-16 23:58:00 UTC
In testing the kernel listed above for the LongCove DA (10F1) and KX4 (10B6) devices, the driver loads OK but we cannot get link.  It looks like the driver is broken, at least for some devices.

Comment 55 Andy Gospodarek 2009-04-17 20:40:16 UTC
Thanks for the feedback, John.  I will open a new bugzilla to address this hardware support so we can try and get it working.

Comment 56 John Ronciak 2009-04-17 21:25:27 UTC
We'll test as soon as we have a new kernel so just let us know.

Thanks Andy.

Comment 57 Chris Ward 2009-04-20 10:24:35 UTC
Pointer to the new bugzilla?

Comment 58 Chris Ward 2009-04-20 10:25:45 UTC
I suppose it's this one?  Bug 496331 -  [RHEL4] ixgbe: add additional hardware support.

Comment 59 Chris Ward 2009-04-20 13:49:47 UTC
HP, bug 496331 will track the additional requests for this update. They will be deferred until 4.9, where they will be reviewed for inclusion.

To confirm, other than what's been deferred to bug #496331, /this/ bug has been resolved?

Comment 60 Andy Gospodarek 2009-04-20 14:43:49 UTC
I believe so.

Comment 61 Chris Ward 2009-04-28 07:26:41 UTC
John, re comment #56, the latest kernel can be found in the usual location: 

http://people.redhat.com/vgoyal/rhel4/RPMS.kernel/

Besides the additional requests that have been deferred to bug 496331, has this driver update request been resolved?

Comment 62 John Ronciak 2009-04-28 18:41:06 UTC
All of the older devices that were tested worked.  So if the LongCove DA (10F1) and KX4 (10B6) devices are being pushed out, we are OK with closing this BZ while work continues on the 496331 BZ.

Comment 64 Andy Gospodarek 2009-04-29 13:42:36 UTC
(In reply to comment #62)
> All of the older devices that were tested worked.  so if the LongCove DA (10F1)
> and KX4 (10B6)devices are being pushed out, we are OK with closing this BZ
> while work continues on the the 496331 BZ.  

Thanks, John.  We'll close this one and track the other devices in bug 496331.

Comment 66 errata-xmlrpc 2009-05-18 19:14:18 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html