Description of problem:
While testing Mellanox 10GbE NICs, the customer discovered a major issue with VLANs. The parent NIC driver cannot pass TCP segmentation offload through to the VLAN driver included in RHEL, and the line speed drops to less than 1GB. The request is that the VLAN device perform as well as the native device. Mellanox has stated that this is achievable, as long as all available features are propagated.

TSO and CSUM have been propagated (by backporting 3 patches), and while they see an improvement, they do not see the same performance. Scatter Gather IO has not been backported yet.

These are the upstream commits backported so far:

commit 75b8846acd11ad3fc736d4df3413fe946bbf367c
Author: Patrick McHardy <kaber>
Date:   Tue Jul 8 03:22:42 2008 -0700

    vlan: Add ethtool support

    Add ethtool support for querying the device for offload settings.

commit 5fb13570543f4ae022996c9d7c0c099c8abf22dd
Author: Patrick McHardy <kaber>
Date:   Tue May 20 14:54:50 2008 -0700

    [VLAN]: Propagate selected feature bits to VLAN devices

    Propagate feature bits from the NETDEV_FEAT_CHANGE notifier. For now
    only TSO is propagated for devices that announce their ability to
    support TSO in combination with VLAN accel by setting the
    NETIF_F_VLAN_TSO flag.

commit 289c79a4bd350e8a25065102563ad1a183d1b402
Author: Patrick McHardy <kaber>
Date:   Fri May 23 00:22:04 2008 -0700

    vlan: Use bitmask of feature flags instead of seperate feature bits

    Herbert Xu points out that the use of seperate feature bits for
    features to be propagated to VLAN devices is going to get messy real
    soon. Replace the VLAN feature bits by a bitmask of feature flags to
    be propagated and restore the old GSO_SHIFT/MASK values.

The last commit breaks kABI, so it needs more work.

Additional info:
They also tested another vendor's NIC that is TOE capable, and it is *not* susceptible to the same performance issue. Perhaps the TOE driver is performing some of the functionality that the VLAN driver is meant to provide. Unfortunately, the Mellanox NICs do not offer TOE and are the only KR NICs available in blade form factor.
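For reference, the bitmask scheme described in the third commit message can be sketched in plain userspace C. This is a simplified illustration, not the kernel code: the struct, the function names, and the bit values are hypothetical and do not match the kernel's net_device or NETIF_F_* definitions. The only idea it shows is that a VLAN device ends up with the parent's enabled features masked by the set the parent driver declares safe for VLAN traffic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only.
 * They do NOT match the kernel's NETIF_F_* values. */
#define DEMO_F_SG       (1u << 0)  /* scatter/gather I/O */
#define DEMO_F_IP_CSUM  (1u << 1)  /* TX checksum offload */
#define DEMO_F_TSO      (1u << 2)  /* TCP segmentation offload */
#define DEMO_F_VLAN_TX  (1u << 3)  /* hardware VLAN tag insertion */

/* Hypothetical stand-in for the two fields that matter here. */
struct demo_netdev {
    uint32_t features;       /* offloads currently enabled on the device */
    uint32_t vlan_features;  /* offloads the driver allows on VLAN children */
};

/* A VLAN child may only use offloads that are both enabled on the
 * parent and listed in the parent's vlan_features mask. */
uint32_t demo_vlan_child_features(const struct demo_netdev *parent)
{
    return parent->features & parent->vlan_features;
}

/* Recompute the child's features, e.g. after the parent reports a
 * feature change. */
void demo_vlan_transfer_features(const struct demo_netdev *parent,
                                 struct demo_netdev *vlan)
{
    vlan->features = demo_vlan_child_features(parent);
    vlan->vlan_features = 0;  /* nested VLANs are out of scope here */

    /* The child can never advertise more than the parent has enabled. */
    assert((vlan->features & ~parent->features) == 0);
}
```

In this sketch, a parent with SG, checksum, and TSO enabled but a vlan_features mask listing only checksum and TSO yields a VLAN device without SG, which mirrors the state described above where Scatter Gather has not been propagated yet.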
Created attachment 385670
0001-Propagate-selected-feature-bits-to-VLAN-devices.patch
Created attachment 385671
0002-vlan-Add-ethtool-support.patch
Created attachment 385672
0003-mlx4_en-Added-vlan_features-support.patch
Created attachment 385673
screenshot of the performance results
Have you tested upstream kernel? Does it have the desired performance?
(In reply to comment #5)
> Have you tested upstream kernel? Does it have the desired performance?

Yevgeny from Mellanox has tried it and said that .32 performance was good.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
@Mellanox, we need to confirm that, if we accept this updated patch set into the release, you will be able to give us a quick turnaround on testing so that we know whether it properly addresses the issues as reported.
in kernel-2.6.18-191.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days