Bug 510818 - cxgb3 driver fixes [NEEDINFO]
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: 5.5
Assigned To: Doug Ledford
QA Contact: Red Hat Kernel QE team
Keywords: OtherQA
Depends On:
Blocks: 533192
Reported: 2009-07-10 18:42 EDT by Divy Le Ray
Modified: 2010-03-30 03:41 EDT
Doc Type: Bug Fix
Last Closed: 2010-03-30 03:41:32 EDT
cward: needinfo? (divy)


Attachments
tar ball containing a series of patches against RHEL5.4 (build 154) cxgb3 driver (5.27 KB, application/octet-stream)
2009-07-10 19:17 EDT, Divy Le Ray
Description Divy Le Ray 2009-07-10 18:42:52 EDT
Hi,

I am opening this bug to track inclusion of the following 7 bug fixes committed to kernel.org:

commit 34701fde8f4bf207ca96d10b8700a8667157854c
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:48:32 2009 +0000

    cxgb3: Drain Mac Tx fifo when the port goes down.

commit 88045b3cf0f8981129cb489c7b6bc36c21dd33a7
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:49:04 2009 +0000

    cxgb3: fix mac index mapping

    Override the mac index computation for the gen2 adapter,
    as each port is expected to use index 0.

commit dce7d1d031aeaa8c65bd37ff2480dc450a68185e
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:48:59 2009 +0000

    cxgb3: Fix mss table initialization

commit 5e659515569220701bfe3c8936dcab67554cc286
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:48:43 2009 +0000

    cxgb3: AEL2020 phy support update

commit cfe2462c6af309ee70e4aeefa55cae976071b9e2
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:48:38 2009 +0000

    cxgb3: Fix T3C MAC max packet size access

commit 619f05cf690149bef1f15cd0cec6a31b40d96951
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:48:53 2009 +0000

    cxgb3: fix phy power down

commit 2c3d50f7db6c4aa85b099613aba8660da6de75d4
Author: Divy Le Ray <divy@chelsio.com>
Date:   Tue Jul 7 19:48:48 2009 +0000

    cxgb3: AQ100X phy support update


Cheers,
Divy
Comment 1 Divy Le Ray 2009-07-10 19:17:15 EDT
Created attachment 351313
tar ball containing a series of patches against RHEL5.4 (build 154) cxgb3 driver

Hi,

The attached tar ball contains patches updating the cxgb3 driver to
the kernel.org level.
It is a bug fix series.
All the changes have been committed to kernel.org.
The tar ball contains a directory of the individual patches as they appear in
net-next-2.6, and a global patch for convenience.

Cheers,
Divy
Comment 2 Andrius Benokraitis 2009-07-22 00:13:51 EDT
Submitting a bugzilla doesn't automatically notify targeted folks - please CC your partner manager in the future for any new bugzillas created (I've added him).
Comment 3 Doug Ledford 2009-07-23 16:25:54 EDT
We are already well beyond the kernel freeze deadline.  In order to even consider these patches, we will need both justification and a risk assessment for inclusion.
Comment 4 Divy Le Ray 2009-07-23 17:54:46 EDT
(In reply to comment #3)
> We are already well beyond the kernel freeze deadline.  In order to even
> consider these patches, we will need both justification and a risk assessment
> for inclusion.  

Hi Doug,

I fully understand the concern.
These patches are narrowly targeted at issues seen in the field, and should
present very little risk. Justifications are inline:

1.  commit 34701fde8f4bf207ca96d10b8700a8667157854c
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:48:32 2009 +0000

        cxgb3: Drain Mac Tx fifo when the port goes down.

Seen in the field. On a two-port adapter with each port running RDMA traffic,
shutting down one port would block the chip because the Tx channels were not
being drained. This fix drains the Tx channels when a port is shut down.
Risk assessment: low. Verified to fix the issue, and touches the ifup/ifdown
path only.
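
For illustration only, and not taken from the patch, the drain-on-shutdown idea can be sketched in C against a simulated MAC: stop accepting new frames, then poll a bounded number of times until the Tx FIFO is empty. `struct sim_mac`, `sim_mac_tick()` and `drain_tx_fifo()` are hypothetical names; the real driver operates on T3 hardware registers.

```c
#include <stdbool.h>

/* Hypothetical, simulated MAC state; not a cxgb3 structure. */
struct sim_mac {
    unsigned int tx_fifo_frames; /* frames still queued in the Tx FIFO */
    bool tx_enabled;             /* false once the port is going down */
};

/* Simulate the hardware sending one queued frame per polling tick. */
static void sim_mac_tick(struct sim_mac *mac)
{
    if (mac->tx_fifo_frames > 0)
        mac->tx_fifo_frames--;
}

/*
 * Stop accepting new frames, then poll until the FIFO is empty.
 * Returns 0 when drained, -1 if frames remain after max_polls ticks.
 */
static int drain_tx_fifo(struct sim_mac *mac, unsigned int max_polls)
{
    mac->tx_enabled = false;
    for (unsigned int i = 0; i < max_polls && mac->tx_fifo_frames > 0; i++)
        sim_mac_tick(mac);
    return mac->tx_fifo_frames == 0 ? 0 : -1;
}
```

The bounded poll count is a design choice of this sketch: a shutdown path that waits forever on a stuck FIFO would trade one hang for another.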

2.  commit 88045b3cf0f8981129cb489c7b6bc36c21dd33a7
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:49:04 2009 +0000

        cxgb3: fix mac index mapping

        Override the mac index computation for the gen2 adapter,
        as each port is expected to use index 0.

Seen in the lab. The Gen2 adapter was not addressing the right MAC for port 1,
leading to driver crashes.
Risk assessment: low. Affects the Gen2 adapter only.

3.  commit dce7d1d031aeaa8c65bd37ff2480dc450a68185e
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:48:59 2009 +0000

        cxgb3: Fix mss table initialization

Seen in the field. Some iWARP connections would get random mtus.
cxgb3_main.c::write_smt_entry() has always initialized the mtu index properly,
but init_tp_parity(), which was added later, replays the settings and did not
(re-)initialize the mtu index.
Risk assessment: low.
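
The bug pattern described here, a later "replay" path that rebuilds an entry but skips one field, can be sketched as follows. This is a simplified illustration, not the driver code: `struct smt_entry`, `write_entry()`, `replay_entry()` and the value of `DEFAULT_MTU_IDX` are all assumptions made for the example.

```c
#include <string.h>

/* Hypothetical table entry; field names are illustrative only. */
struct smt_entry {
    unsigned int iff;     /* interface the entry belongs to */
    unsigned int mtu_idx; /* index into the mtu table */
};

#define DEFAULT_MTU_IDX 15u /* assumed value, for illustration only */

/* Original path: fills in every field, including the mtu index. */
static void write_entry(struct smt_entry *e, unsigned int iff)
{
    memset(e, 0, sizeof(*e));
    e->iff = iff;
    e->mtu_idx = DEFAULT_MTU_IDX;
}

/*
 * Replay path added later. The buggy pattern is a replay that rebuilds
 * the entry but leaves mtu_idx untouched; this sketch avoids that by
 * reusing the original path so both stay in sync.
 */
static void replay_entry(struct smt_entry *e, unsigned int iff)
{
    write_entry(e, iff);
}
```

Routing the replay through the same helper is one way to keep the two paths from drifting apart; whether the actual fix does this or sets the field directly is not stated in this report.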

4.  commit 5e659515569220701bfe3c8936dcab67554cc286
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:48:43 2009 +0000

        cxgb3: AEL2020 phy support update

This phy's first link status interrupt was not always detected, leaving the
driver/OS believing the link was down and remaining in this state until the
link actually changed. The fix checks the link status right after the phy is
reset.
The link status LED was also inconsistent; this is fixed by setting the right
set of registers.
Risk assessment: low. Only affects the boards shipping this phy.
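
As a rough sketch of the "check the link right after reset" idea, the following C fragment models a phy whose first link interrupt is lost and a driver that samples the link state directly instead of waiting for it. All names (`struct sim_phy`, `struct sim_port`, `poll_link()`, `reset_phy_and_check_link()`) are hypothetical and the hardware is simulated.

```c
#include <stdbool.h>

/* Hypothetical, simulated phy state. */
struct sim_phy {
    bool link_up;     /* what the phy currently reports */
    bool irq_pending; /* set only when a link interrupt is delivered */
};

/* Hypothetical driver-side view of the port. */
struct sim_port {
    bool link_up;
};

/* Read the current link state straight from the phy into the port. */
static void poll_link(struct sim_port *port, const struct sim_phy *phy)
{
    port->link_up = phy->link_up;
}

/*
 * Reset the phy and, rather than waiting for the first link interrupt
 * (which may never arrive), sample the link state immediately.
 */
static void reset_phy_and_check_link(struct sim_port *port, struct sim_phy *phy)
{
    phy->irq_pending = false; /* simulate the lost first interrupt */
    poll_link(port, phy);
}
```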

5.  commit cfe2462c6af309ee70e4aeefa55cae976071b9e2
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:48:38 2009 +0000

        cxgb3: Fix T3C MAC max packet size access

Seen in the field. The max packet size access was incorrect because the value
read from the register was improperly masked.
Risk assessment: low.
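
A minimal sketch of correct field masking, assuming a made-up register layout: the field position and width below (`MAX_PKT_SIZE_SHIFT`, `MAX_PKT_SIZE_MASK`) are hypothetical and do not describe the T3C register. The point is only that a read must shift and mask, and a write must clear the field before OR-ing in the new value.

```c
#include <stdint.h>

/* Hypothetical layout: max packet size field in bits 0..13. */
#define MAX_PKT_SIZE_SHIFT 0
#define MAX_PKT_SIZE_MASK  0x3fffu

/* Extract the field: shift it down, then mask off unrelated bits. */
static unsigned int get_max_pkt_size(uint32_t reg)
{
    return (reg >> MAX_PKT_SIZE_SHIFT) & MAX_PKT_SIZE_MASK;
}

/* Update only the field, preserving the other bits of the register. */
static uint32_t set_max_pkt_size(uint32_t reg, unsigned int mtu)
{
    reg &= ~((uint32_t)MAX_PKT_SIZE_MASK << MAX_PKT_SIZE_SHIFT);
    reg |= ((uint32_t)mtu & MAX_PKT_SIZE_MASK) << MAX_PKT_SIZE_SHIFT;
    return reg;
}
```

Without the mask on the read side, unrelated bits sharing the register would leak into the returned size.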

6.  commit 619f05cf690149bef1f15cd0cec6a31b40d96951
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:48:53 2009 +0000

        cxgb3: fix phy power down

Seen in the field. Our CX4 phy and one optical phy were missing one mdio bit
setting on ifdown, leading to potential link status inconsistencies on the
peer.
The fix gets all phys to call the same port down function, which was already
in place for the other phys.
Risk assessment: low. Affects CX4 and AEL1006 phy based adapters only.
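
The "every phy calls the same routine" approach can be illustrated with a shared handler behind a function pointer. This is a sketch on a simulated phy, not the driver's code: the struct, the init functions and the bit position `CTRL_POWER_DOWN` are assumptions chosen for the example, and the report does not say which mdio bit was missing.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTRL_POWER_DOWN 0x0800u /* assumed bit position, illustration only */

/* Hypothetical phy model: one control register plus an ops pointer. */
struct sim_phy;
typedef void (*power_down_fn)(struct sim_phy *phy, bool enable);

struct sim_phy {
    uint16_t ctrl;
    power_down_fn power_down;
};

/* The single, shared routine every phy type points at. */
static void common_power_down(struct sim_phy *phy, bool enable)
{
    if (enable)
        phy->ctrl |= CTRL_POWER_DOWN;
    else
        phy->ctrl &= (uint16_t)~CTRL_POWER_DOWN;
}

/* Hypothetical per-phy setup routines sharing the same handler. */
static void init_cx4_phy(struct sim_phy *phy)
{
    phy->ctrl = 0;
    phy->power_down = common_power_down;
}

static void init_optical_phy(struct sim_phy *phy)
{
    phy->ctrl = 0;
    phy->power_down = common_power_down;
}
```

Sharing one handler means a register bit cannot be set for some phy types and forgotten for others.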

7.  commit 2c3d50f7db6c4aa85b099613aba8660da6de75d4
    Author: Divy Le Ray <divy@chelsio.com>
    Date:   Tue Jul 7 19:48:48 2009 +0000

        cxgb3: AQ100X phy support update

The BT phy has a new FW flashed in manufacturing. The driver needs to be
updated to check for the right on-phy FW.
Risk assessment: low. Affects 10G BT adapters only.


It is unfortunate that these patches show up this late in the release.
I waited for net-next-2.6 to open before submitting these fixes. They could
have been candidates for net-2.6, but I had pushed the envelope in the 2.6.30
merge window and was more cautious this time around. Once they were
backported, tested and posted to RH, I did not realize they were not on the
radar.
However, I really hope they can make it. They are all low risk, targeted at
specific issues, and do not touch the data path.

Cheers,
Divy
Comment 5 Peter Martuccelli 2009-07-24 14:14:58 EDT
Doug has reviewed the patch set and even though the risk is low we are too far along in the RHEL 5.4 schedule to take this patch in.  Development will target this patch set for inclusion in RHEL 5.5.
Comment 8 Chris Ward 2009-10-13 11:12:23 EDT
Chelsio,

We need to confirm that there is commitment to test 
for the resolution of this request during the RHEL 5.5 test
phase, if it is accepted into the release. 

Please post a confirmation before Oct 16th, 2009, 
including the contact information for testing engineers.
Comment 9 Divy Le Ray 2009-10-13 17:23:47 EDT
(In reply to comment #8)
> Chelsio,
> 
> We need to confirm that there is commitment to test 
> for the resolution of this request during the RHEL 5.5 test
> phase, if it is accepted into the release. 
> 
> Please post a confirmation before Oct 16th, 2009, 
> including the contact information for testing engineers.  

Hi Chris,

Chelsio is committed to testing these patches.
Indranil (added to the CC list) is in charge of QA at Chelsio.

Cheers,
Divy
Comment 10 Don Zickus 2009-11-17 16:56:04 EST
Included in kernel-2.6.18-174.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
Comment 12 Chris Ward 2010-02-11 05:22:22 EST
~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this 
release that addresses your request. Please test and report back results 
here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update 
the Verified field in Bugzilla with the appropriate value.

If you encounter any issues while testing, please describe them and set 
this bug into NEED_INFO. If you encounter new defects or have additional 
patch(es) to request for inclusion, please clone this bug per each request
and escalate through your support representative.
Comment 14 Petr Beňas 2010-03-15 10:23:06 EDT
Bug state changed from ON_QA to VERIFIED.
Sanity only. 
The patch is actually being applied.
Comment 16 errata-xmlrpc 2010-03-30 03:41:32 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html
