Bug 556476

Summary: Update sfc driver (add SFC9000 support)
Product: Red Hat Enterprise Linux 5
Reporter: Ben Hutchings <bhutchings>
Component: kernel
Assignee: Michal Schmidt <mschmidt>
Status: CLOSED ERRATA
QA Contact: Network QE <network-qe>
Severity: low
Docs Contact:
Priority: high
Version: 5.6
CC: andriusb, coughlan, cward, dhoward, hjia, marting, mschmidt, msnitzer, xdl-redhat-bugzilla
Target Milestone: rc
Keywords: OtherQA, ZStream
Target Release: 5.6
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 556563
Environment:
Last Closed: 2011-01-13 21:00:56 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 658520, 1300182    
Attachments:
Description | Flags
patchset to update sfc up to v2.6.36-rc1 for RHEL5.6 | none
extra patches | none
net: Fix test for VLAN TX checksum offload capability | none
patchset to update sfc up to v2.6.36-rc1 for RHEL5.6 (bwh) | none

Description Ben Hutchings 2010-01-18 15:01:43 UTC
In Linux 2.6.33 the sfc driver adds support for the Solarstorm SFC9000 family of Ethernet controllers.

We would like this to be backported in RHEL 5.6 and in RHEL 6 (if it uses a kernel version prior to 2.6.33). I am available to help with this, and can probably arrange to send hardware samples for your own testing.

Comment 1 Michal Schmidt 2010-01-18 18:03:34 UTC
I've cloned this bug to track RHEL5 and RHEL6 separately. Bug 556563 is the one for RHEL6.

Comment 2 Michal Schmidt 2010-08-18 16:40:58 UTC
Created attachment 439444 [details]
patchset to update sfc up to v2.6.36-rc1 for RHEL5.6

These patches will update the sfc driver in RHEL5 to the version in v2.6.36-rc1.

They apply on kernel-2.6.18-212.el5 which is available at http://people.redhat.com/jwilson/el5/

The patchset is completely untested; I have just started a Brew build:
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2690223

Comment 3 Michal Schmidt 2010-08-20 12:49:27 UTC
I uploaded the built kernel to:
http://people.redhat.com/mschmidt/kernel/bz556476/

Could you please test it?

Comment 4 Ben Hutchings 2010-08-25 23:52:44 UTC
(In reply to comment #3)
> I uploaded the built kernel to:
> http://people.redhat.com/mschmidt/kernel/bz556476/
> 
> Could you please test it?

I've reviewed your backported patches and they looked correct. However, the kernel panics when probing! I am still investigating this.

Comment 5 Ben Hutchings 2010-08-26 19:45:30 UTC
Created attachment 441318 [details]
extra patches

I've attached a tarball containing the following patches:

- 0000a-vlan-pull-fix.patch
- 0000b-vlan-gro-fix.patch

Networking core fixes for VLAN RX with page buffers. Without these, all 802.1q packets received over an sfc device will be dropped unless the driver is configured to pre-allocate skbs (rx_alloc_method=1). This may be true for RHEL 5.5 as well.

I have not tested the br_netfilter changes in 0000a-vlan-pull-fix.patch so you may wish to remove them.

- 0082a-sfc-Add-power-management-and-wake-on-LAN-support-cleanup.patch

This makes the power management functions identical to upstream and adds wrapper functions like those we use in our out-of-tree driver. Please fold it into 0082-sfc-Add-power-management-and-wake-on-LAN-support.patch

- 0150a-sfc-Create-multiple-TX-queues-not-really.patch

This corrects the number of TX queues used, and fixes the panic I reported previously. Please fold it into 0150-sfc-Create-multiple-TX-queues.patch

- 0166a-sfc-Implement-message-level-control-cleanup.patch

This removes some unnecessary divergence from upstream. Please fold it into 0166-sfc-Implement-message-level-control.patch

There is still a performance-killing bug affecting VLAN TX which I will try to track down and fix tomorrow.

Comment 6 Michal Schmidt 2010-08-27 14:56:54 UTC
Awesome! Thank you, Ben.

Note for self: new Brew task 2715961

Comment 7 RHEL Program Management 2010-08-27 15:09:43 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Michal Schmidt 2010-09-03 00:20:45 UTC
(In reply to comment #5)
> There is still a performance-killing bug affecting VLAN TX which I will try to
> track down and fix tomorrow.

Hello Ben,
do you have any progress on the VLAN TX performance bug?

Comment 9 Ben Hutchings 2010-09-03 16:06:57 UTC
Created attachment 442930 [details]
net: Fix test for VLAN TX checksum offload capability

(In reply to comment #8)
> (In reply to comment #5)
> > There is still a performance-killing bug affecting VLAN TX which I will try to
> > track down and fix tomorrow.
> 
> Hello Ben,
> do you have any progress on the VLAN TX performance bug?

I started back on this today. There's another fix to the networking
core needed to make VLAN TX acceleration work properly, which I'm
attaching now.

Comment 10 Ben Hutchings 2010-09-03 16:45:27 UTC
Given that the SFC9000 controllers support {TCP,UDP}/IPv6 checksum offload as well as IPv4, I think we can include NETIF_F_HW_CSUM in their features. This is what we do in the OOT version of this driver. That should also remove the need for this last patch.
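
The relationship between the feature flags mentioned here can be sketched roughly as follows. This is a simplified model, not the kernel's code or the attached patch: it assumes the capability test keys on the outer EtherType, so a VLAN-tagged frame (ETH_P_8021Q) matches neither the IPv4 nor the IPv6 case unless the device advertises generic checksum offload. The helper name `can_checksum` and the flag values are illustrative assumptions; the EtherType values are the standard ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values only; the real definitions are in
 * <linux/netdevice.h> and may differ between kernel versions. */
#define NETIF_F_IP_CSUM   0x02u  /* TCP/UDP over IPv4 only */
#define NETIF_F_HW_CSUM   0x08u  /* generic: any protocol */
#define NETIF_F_IPV6_CSUM 0x10u  /* TCP/UDP over IPv6 only */

/* Standard EtherType values. */
#define ETH_P_IP    0x0800
#define ETH_P_IPV6  0x86DD
#define ETH_P_8021Q 0x8100

/* Hypothetical helper: can a device with these feature bits offload the
 * checksum for a frame whose outer protocol is 'protocol'? */
static bool can_checksum(uint32_t features, uint16_t protocol)
{
    if (features & NETIF_F_HW_CSUM)
        return true;                               /* protocol-agnostic */
    if (protocol == ETH_P_IP)
        return (features & NETIF_F_IP_CSUM) != 0;
    if (protocol == ETH_P_IPV6)
        return (features & NETIF_F_IPV6_CSUM) != 0;
    return false;  /* e.g. ETH_P_8021Q: falls back to software checksum */
}
```

Under this model, `can_checksum(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM, ETH_P_8021Q)` is false while `can_checksum(NETIF_F_HW_CSUM, ETH_P_8021Q)` is true, which is consistent with the suggestion that advertising NETIF_F_HW_CSUM would make the separate core fix unnecessary for this driver.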

Comment 11 Ben Hutchings 2010-09-03 18:34:42 UTC
Created attachment 442955 [details]
patchset to update sfc up to v2.6.36-rc1 for RHEL5.6 (bwh)

Here's the final patch series I've ended up with.

Comment 12 Michal Schmidt 2010-09-03 18:39:49 UTC
Thank you very much. I'm going to review your modifications and then post the series to our internal mailing list.

Comment 15 Ben Hutchings 2010-09-06 15:25:04 UTC
One more thing: The patch
"linux-2.6-misc-add-thread-core-_siblings_list-to-sys.patch" renames the macros topology_core_siblings and topology_thread_siblings to topology_core_cpumask and topology_thread_cpumask.

sfc attempts to allocate one set of queues per package, falling back to one per core. It uses the macro topology_core_siblings, if defined, and therefore will always use the fallback now. Also, the topology_*_cpumask macros return a different type from their mainline implementations, so I don't think the renaming makes sense.
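
A minimal sketch of the compile-time fallback described above, assuming the driver guards its per-package logic with a check that the macro is defined. The `topology_core_cpumask` definition is a hypothetical stand-in for the renamed kernel macro, and `queue_strategy` is an illustrative name, not a function from the driver.

```c
/* Hypothetical stand-in for the topology header after the rename:
 * only the *_cpumask name exists, so topology_core_siblings is undefined. */
#define topology_core_cpumask(cpu) (0xfUL)

/* Report which allocation strategy the preprocessor selects. */
static const char *queue_strategy(void)
{
#ifdef topology_core_siblings
    /* Macro available: cores can be grouped by package. */
    return "one queue set per package";
#else
    /* Macro missing: the code silently takes the per-core fallback. */
    return "one queue set per core (fallback)";
#endif
}
```

Because the test is on the macro name alone, the rename changes the driver's behaviour without any build error or warning, which is why it is easy to miss.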

Comment 16 Chris Ward 2010-09-08 08:55:03 UTC
@solarflare, please confirm that you expect to test and verify this request is complete with RHEL 5.6.0 beta bits when they are available via RHN.

Otherwise, we will need to arrange for delivery of hardware before we can accept this into the 5.6 release

I would also like to understand whether there is an Issue Tracker associated with this request or if this request is not being pushed through our standard support request workflows?

Comment 17 Ben Hutchings 2010-09-08 13:08:49 UTC
(In reply to comment #16)
> @solarflare, please confirm that you expect to test and verify this request is
> complete with RHEL 5.6.0 beta bits when they are available via RHN.

I expect to do that anyway.

> Otherwise, we will need to arrange for delivery of hardware before we can
> accept this into the 5.6 release

We do have an outstanding order to send hardware to Michal. This has unfortunately been delayed but I understand he should receive it next week.

> I would also like to understand whether there is an Issue Tracker associated
> with this request or if this request is not being pushed through our standard
> support request workflows?

I'm not aware of any support request.

Comment 18 Chris Ward 2010-09-08 14:57:51 UTC
Okay, I ask because normally Feature requests come through our support group. 

This bug is also possibly labelled incorrectly; I believe it should have FutureFeature in the keywords field.

Comment 22 Michal Schmidt 2010-09-09 12:30:59 UTC
Good news. The two SFL9021 cards have already arrived. Thanks!

Comment 29 Michal Schmidt 2010-09-13 17:24:14 UTC
(In reply to comment #15)
> One more thing: The patch
> "linux-2.6-misc-add-thread-core-_siblings_list-to-sys.patch" renames the macros
> topology_core_siblings and topology_thread_siblings to topology_core_cpumask
> and topology_thread_cpumask.
> 
> sfc attempts to allocate one set of queues per package, falling back to one per
> core. It uses the macro topology_core_siblings, if defined, and therefore will
> always use the fallback now. Also, the topology_*_cpumask macros return a
> different type from their mainline implementations, so I don't think the
> renaming makes sense.

Ben, I have created bug 633388 for this.

Comment 30 Jarod Wilson 2010-09-15 14:00:01 UTC
Included in kernel-2.6.18-221.el5.
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 32 Ben Hutchings 2010-10-07 15:28:11 UTC
The Solarflare test group has completed testing of the sfc driver in kernel-2.6.18-222.el5 (which has the fix for bug 633388) and found no issues with it.

Comment 33 Martin George 2010-11-15 15:23:29 UTC
The fix for this bug contains the jiffies related fix required for the ALUA transitioning patch described in bug 619361. 

And we need the ALUA transitioning fix to be backported to 5.5.z. So could we please have this jiffies related fix backported to 5.5.z as well?

Comment 34 Ben Hutchings 2010-11-15 16:17:36 UTC
(In reply to comment #33)
> The fix for this bug contains the jiffies related fix required for the ALUA
> transitioning patch described in bug 619361. 
> 
> And we need the ALUA transitioning fix to be backported to 5.5.z. So could we
> please have this jiffies related fix backported to 5.5.z as well?

This comment doesn't seem to have anything to do with bug 556476.

Comment 35 Michal Schmidt 2010-11-15 16:24:05 UTC
I believe Martin is interested in the patch "add round_jiffies_up and related routines" which was added to the RHEL kernel because the updated sfc driver depended on it.

Comment 36 Martin George 2010-11-15 17:32:14 UTC
(In reply to comment #35)
> I believe Martin is interested in the patch "add round_jiffies_up and related
> routines" which was added to the RHEL kernel because the updated sfc driver
> depended on it.

Yes, that's right. This 'round_jiffies_up' routine is used in the ALUA transitioning fix described in bug 619361. Hence the request.
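
For readers unfamiliar with the routine, the idea behind rounding a jiffies value up can be sketched as below. This is a simplified model only: it ignores the per-CPU skew and the "already in the past" check that the kernel helper is understood to apply, and `SKETCH_HZ` and `round_up_to_second` are illustrative names with an assumed tick rate, not kernel definitions.

```c
/* Assumed tick rate for illustration; the real HZ depends on the kernel config. */
#define SKETCH_HZ 1000UL

/* Round a tick count up to the next whole-second boundary so that timers
 * with loose deadlines tend to expire together. In this sketch an
 * already-aligned value also moves forward by one full second. */
static unsigned long round_up_to_second(unsigned long j)
{
    unsigned long rem = j % SKETCH_HZ;

    return j - rem + SKETCH_HZ;
}
```

For example, with the assumed rate of 1000 ticks per second, a timeout at tick 1250 is pushed to tick 2000; it is never moved earlier, which matters when the timeout is a minimum wait.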

Comment 41 errata-xmlrpc 2011-01-13 21:00:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html