Bug 568601

Summary: [Broadcom 5.6 FEAT] Update bnx2 to 2.0.8+
Product: Red Hat Enterprise Linux 5 Reporter: Michael Chan <mchan>
Component: kernelAssignee: John Feeney <jfeeney>
Status: CLOSED ERRATA QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: high    
Version: 5.6CC: aaswath, adaora.onyia, agospoda, anderson, andrew.patterson, andriusb, benlu, bnagendr, bugproxy, bzeranski, cward, enarvaez, gideonn, jburke, jjarvis, joerg.roedel, joseph.szczypek, jtorrice, ltroan, niran, nobody+PNT0273897, noboru.obata.ar, sandy.garza, sbest, seiji.aguchi.tr, tyasui
Target Milestone: alphaKeywords: FutureFeature, OtherQA
Target Release: 5.6   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 617024 641495 (view as bug list) Environment:
Last Closed: 2011-01-13 20:35:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 528587, 531800, 537251, 557597, 562684, 566321, 566322, 566323, 580580, 600363, 617024, 641495    
Attachments:
Description Flags
Old format bnx2_fw2.h firmware file for RHEL5 none

Description Michael Chan 2010-02-26 06:13:21 UTC
1.  Feature Overview:
     a. Update bnx2 to at least 2.0.8 version

     b. Feature Description
	General bug fixes and feature enhancements.  Newer iSCSI firmware.

2.  Feature Details:
     a. Architectures:
         32-bit x86
         64-bit Intel EM64T/AMD64
         64-bit Itanium2

     b. Upstream acceptance information:
        2.0.8 is upstream, patch submission will be ongoing.

3. Business Justification:
     a. Why is this feature needed?
	General driver update, bug fixes, etc.

4. Primary contact at Broadcom, email, phone
    mchan
    (949)926-6170

Comment 1 Andrius Benokraitis 2010-03-01 15:18:41 UTC
*** Bug 562934 has been marked as a duplicate of this bug. ***

Comment 2 Andrius Benokraitis 2010-03-01 16:33:03 UTC
*** Bug 562965 has been marked as a duplicate of this bug. ***

Comment 3 Andrius Benokraitis 2010-03-04 04:09:34 UTC
Michael, is this bugzilla also going to house AER support also?

Comment 4 Michael Chan 2010-03-04 06:43:06 UTC
Yes, we can include AER.  Just need to figure out how to test it.

Comment 5 Andrius Benokraitis 2010-03-04 14:15:32 UTC
I believe there are ways to test this using standard error injection tools. I'm going to include some additional OEMs to this bugzilla anyway, but I think they'll have the ability to test this.

Comment 6 Andrius Benokraitis 2010-03-04 15:34:45 UTC
*** Bug 570367 has been marked as a duplicate of this bug. ***

Comment 7 Andrius Benokraitis 2010-03-04 15:34:55 UTC
*** Bug 513936 has been marked as a duplicate of this bug. ***

Comment 8 Andrius Benokraitis 2010-03-09 20:35:05 UTC
*** Bug 563582 has been marked as a duplicate of this bug. ***

Comment 9 IBM Bug Proxy 2010-04-06 23:11:00 UTC
------- Comment From mreed10.com 2010-04-06 19:00 EDT-------
I have verified that that the bnx2 driver for BCM 5709 Ethernet device on Power is functioning properly on RHEL 5.5.

Comment 10 John Jarvis 2010-07-09 21:00:51 UTC
This enhancement request was evaluated by the full Red Hat Enterprise Linux 
team for inclusion in a Red Hat Enterprise Linux minor release.   As a 
result of this evaluation, Red Hat has tentatively approved inclusion of 
this feature in the next Red Hat Enterprise Linux Update minor release.   
While it is a goal to include this enhancement in the next minor release 
of Red Hat Enterprise Linux, the enhancement is not yet committed for 
inclusion in the next minor release pending the next phase of actual 
code integration and successful Red Hat and partner testing.

Comment 11 Michael Chan 2010-08-06 20:15:22 UTC
Created attachment 437253 [details]
Old format bnx2_fw2.h firmware file for RHEL5

This firmware is equivalent to the upstream firmware:

#define FW_MIPS_FILE_09         "bnx2/bnx2-mips-09-5.0.0.j15.fw"
#define FW_RV2P_FILE_09_Ax      "bnx2/bnx2-rv2p-09ax-5.0.0.j10.fw"
#define FW_RV2P_FILE_09         "bnx2/bnx2-rv2p-09-5.0.0.j10.fw"

for 5709.  The 5708 firmware (bnx2_fw.h) has not changed since RHEL5.5.

Comment 12 John Feeney 2010-08-13 19:41:24 UTC
Please find rpms on my people page that fulfill the requirements of this bz. See http://people.redhat.com/jfeeney/.bz568601. Testing feedback would be greatly appreciated. Note: I have sanity tested these and found no issues but more testing is certainly needed.

The following upstream patches have been applied to these rpms.
  1. netdev: Remove redundant checks for CAP_NET_ADMIN
  2. bnx2: avoid compiler errors
  3. bnx2: EEH is failing without timout
  4. bnx2: protect tx_timeout_reset with rtnl_lock()
  5. bnx2: dump some state during tx_timeout()
  6. bnx2: print warning when unable to allocate
  7. bnx2: read firmware version from VPD
  8. bnx2: update version to 2.0.3
  9. bnx2: Refine VPD logic
 10. bnx2: reset_task is crashing the kernel
 11. bnx2: Fix bnx2_netif_stop() merge error
 12. bnx2: Flush the register writes which setup the MSI-X
 13. bnx2: Refine statistics code
 14. bnx2: Save statistics during reset
 15. net: remove HAVE_ leftovers
 16. bnx2: Check BNX2_FLAG_USING_MSIX flag
 17. bnx2: Need to call cnic_setup_cnic_irq_info() after
 18. bnx2: Adjust flow control water marks
 19. bnx2: Allow for user-specific multiple advertisement speed
 20. bnx2: Fix bug when saving statistics
 21. bnx2: Update firmwares and update version to 2.0.8

To get the "2.0.8+" part, incorporate patches in RHEL6:
  1. bnx2: Fix netpoll crash
  2. bnx2: Use proper handler during netpoll
  3. bnx2: Update 5709 MIPS firmware mips = 5.0.0j15 rv2p = j10
  4. bnx2: Fix hang during rmmod bnx2
  5. bnx2: Always enable MSI-X on 5709

Comment 13 John Feeney 2010-08-23 19:05:18 UTC
I was wondering if there was any initial feedback on testing.

Comment 14 Larry Troan 2010-09-02 13:53:29 UTC
> I was wondering if there was any initial feedback on testing.
Setting NEEDINFO so our partners see this.

Comment 16 Jarod Wilson 2010-09-15 14:00:40 UTC
in kernel-2.6.18-221.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 18 Dave Anderson 2010-09-21 20:36:38 UTC
> I was wondering if there was any initial feedback on testing.

As it turns out, this patch fails badly, at least on an AMD Dinar
machine.

While testing an unrelated issue on amd-dinar-02.lab.bos.redhat.com
with a kernel-2.6.18-222.el5 kernel, I noted that every attempt to
scp an ~88MB kernel src.rpm file would "hang" about half way through
the copy.  The session running the scp, as well as all other ssh
sessions all become non-responsive, and it is impossible to ssh
into the system.  The only way to communicate with the box was
via a serial console connection.

Narrowing it down, the scp hang was reproducible only with
kernel-2.6.18-221.el5, kernel-2.6.18-222.el5 and kernel-2.6.18-223.el5 
kernels.  Looking at the changelog for kernel-2.6.18-221.el5, the
patch associated with this BZ looked like a suspect. 

So the first thing we tried was to boot kernel-2.6.18-223.el5,
remove the bnx2 driver, force it to use the bnx2 driver from
the last-known "good" kernel kernel-2.6.18-220.el5, and restart
networking.  When that was done, the scp hang did not occur.

Just to verify that, I rebuilt a kernel-2.6.18-222.el5 with the
linux-2.6-net-bnx2-update-to-v2-0-8-with-new-5709-firmware-j15.patch
removed, and the problem goes away.

Here is what it looks like when it fails.  The system is running
2.6.18-222.el5, and the attempt to scp kernel-2.6.18-222.el5.bnx2rev.src.rpm
hangs like so, invariably at around 40%-50%:

# uname -a
Linux amd-dinar-02.lab.bos.redhat.com 2.6.18-222.el5 #1 SMP Wed Sep 15 22:23:52 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
# scp kernel-2.6.18-222.el5.bnx2rev.src.rpm anderson.redhat.com:
anderson.redhat.com's password: 
kernel-2.6.18-222.el5.bnx2rev.src.rpm                                                            46%   39MB   5.9MB/s   00:07 ETA
<hang>

But here when I reboot with kernel-2.6.18-222.el5.bnx2rev kernel,
(222.el5 with the patch commented out), it works OK:

# uname -a
Linux amd-dinar-02.lab.bos.redhat.com 2.6.18-222.el5.bnx2rev #1 SMP Tue Sep 21 15:29:00 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@amd-dinar-02 ~]# scp kernel-2.6.18-222.el5.bnx2rev.src.rpm anderson.redhat.com:
anderson.redhat.com's password: 
kernel-2.6.18-222.el5.bnx2rev.src.rpm                                                           100%   82MB   4.6MB/s   00:18    
#

The system is an RHTS/beaker machine that I have reserved:

  amd-dinar-02.lab.bos.redhat.com  (root/redhat)

FWIW, these are the kernels that are installed on the machine,
and the results of the simple "scp" test:
  
  2.6.18-219.el5          OK
  2.6.18-220.el5          OK
  2.6.18-221.el5          FAIL
  2.6.18-222.el5          FAIL
  2.6.18-223.el5          FAIL
  2.6.18-222.el5.bnx2rev  OK
  
I will keep the machine reserved if you would like to keep it
around for further testing.  I have no further use for it, so
please let me know if you want me to keep extending my 
reservation.

Comment 19 John Feeney 2010-09-21 22:14:16 UTC
Yes Dave, I would like to use amd-dinar-02 to figure out what went wrong. 

(Thanks for all the info.)

Comment 20 Joe T 2010-09-22 03:55:00 UTC
We don't have a AMD Dinar reference system in our inventory. We tried this with a PowerEdge T710 (Intel Xeon E5520) production system using the 2.6.18-221.el5 x86-64 kernel; did not see the issue.

Comment 21 Dave Anderson 2010-09-22 13:16:16 UTC
(In reply to comment #19)
> Yes Dave, I would like to use amd-dinar-02 to figure out what went wrong. 
> 
> (Thanks for all the info.)

Ok John, it's all yours.  BTW, just to clarify, the problem only occurs
upon doing the first large-file scp *from* the system.  Prior to doing
that, you can scp large files *to* the system with no problem.  But
once the hang occurs during an out-bound scp, no further network
related communication, in or out, is possible.

Comment 22 John Feeney 2010-09-22 23:05:51 UTC
I determined which patch in this bnx2 patch set caused the trouble. It appears as though the Dinar gets an error when calling pci_map_page() in bnx2_start_xmit(). This pci_map_page() call was added by the patch. Have not determined why there was an error. Will also look at a non-Dinar to see if pci_map_page() returns a similar error.

Note: As reported elsewhere, I did demonstrate that large files can be transmitted via scp from a non-Dinar system (x86_64) without trouble, just in case someone was wondering if there was something germaine to our lab.

Comment 23 Michael Chan 2010-09-22 23:17:32 UTC
I think may be we didn't check for dma mapping errors before.  pci_map_page() in one form or another has been in the driver's bnx2_start_xmit() function to handle S/G fragments.  So does it work if we turn off S/G?

Comment 24 John Feeney 2010-09-23 16:32:19 UTC
Some pieces of information:
1. non-Dinar system (Dell per610) did not have a dma error and hence worked well. Non-Dinar system had S/G on (by default).
2. Dinar system with slab corruption patch provided for bz530619 still exhibited the problem (But did register only one dma_error before hanging as opposed to two in previous attempts).
3. Dinar system with S/G turned off did not exhibit problem (and no dma_error).
4. After S/G was turned back on, the problem did not manifest.
5. Turn off S/G and then back on after reboot but before transmission, resulted in no problem manifestation.

So it looks like Michael's comment about pci_map_page() was dead on. The initial setup of dma memory for S/G is leading to trouble. Once it is disabled and even reset there is no dma_error reported.

Comment 25 John Feeney 2010-09-23 19:50:27 UTC
One additional piece of info...when the dma_error occurs on the Dinar, the value of mapping returned by pci_map_page() in bnx2_start_xmit() is zero.

Comment 26 John Feeney 2010-09-24 18:12:49 UTC
I found that the inclusion following code can induce the problem on the amd-dinar system:
[root@amd-dinar-02 net]# diff bnx2.c bnx2.c.jjf
5353,5356c5353,5355
<                       last = tx_buf->nr_frags;
<                       j++;
<                       for (k = 0; k < last; k++, j++) {
<                               tx_buf = &txr->tx_buf_ring[TX_RING_IDX(j)];
---
>                       last = skb_shinfo(skb)->nr_frags;
>                       for (k = 0; k < last; k++) {
>                               tx_buf = &txr->tx_buf_ring[j + k + 1];
5362a5362
>                       j += k + 1;
6459,6462c6459
<               mapping = pci_map_page(bp->pdev, frag->page, frag->page_offset,
<                       len, PCI_DMA_TODEVICE);
<               if (pci_dma_mapping_error(mapping))
<                       goto dma_error;
---
>               mapping = dma_maps[i + 1];
6505,6528d6501
< dma_error:
<       /* save value of frag that failed */
<       last_frag = i;
<
<       /* start back at beginning and unmap skb */
<       prod = txr->tx_prod;
<       ring_prod = TX_RING_IDX(prod);
<       tx_buf = &txr->tx_buf_ring[ring_prod];
<       tx_buf->skb = NULL;
<       pci_unmap_single(bp->pdev, pci_unmap_addr(tx_buf, mapping),
<       skb_headlen(skb), PCI_DMA_TODEVICE);
<
<       /* unmap remaining mapped pages */
<       for (i = 0; i < last_frag; i++) {
<               prod = NEXT_TX_BD(prod);
<               ring_prod = TX_RING_IDX(prod);
<               tx_buf = &txr->tx_buf_ring[ring_prod];
<               pci_unmap_page(bp->pdev, pci_unmap_addr(tx_buf, mapping),
<                       skb_shinfo(skb)->frags[i].size,
<                       PCI_DMA_TODEVICE);
<       }
<
<       dev_kfree_skb(skb);
<       return NETDEV_TX_OK;

This code was taken from the upstream patch "bnx2: remove skb_dma_map/unmap calls from driver" (commit: e95524a726904a1d2b91552f0577838f67d53c6c).
It should be noted that this upstream patch reverted another patch (the one that added the skb_dma_unmap() calls) but that other patch was never included in RHEL5, so there was not anything to revert. Still, it did add the code above and was in the patch posted for this bz (568601). 

Also, I think it just about time to create a new bugzilla to document this problem with S/G on amd-dinars. I will do so soon.

Comment 27 Michael Chan 2010-09-24 18:52:54 UTC
John, the code above adds the recovery logic in case the DMA mapping fails.  The DMA error checking is relatively new in the bnx2 driver (added about 2 - 3 years ago).

So on this AMD system, pci_map_page() occasionally returns error with 0 as the DMA mapping.  Without this new code in bnx2, we would just transmit the packet and parts of the packet would contain garbage as the chip would DMA a fragment from DMA address 0.

With the new code, bnx2 will check for DMA mapping errors.  If an error is encountered, it will unwind all mappings for this skb and free the skb.

John, are you saying that the unwind logic causes additional problems?

Comment 28 John Feeney 2010-09-24 21:01:42 UTC
Right, so you get a dma error (i.e. mapping = 0), detect it, and handle it with dma_error code. But with the above code, a. I get mapping = 0 and b. the scp hangs. 

So which of the three parts (the bnx2_free_tx_skbs() change; the detection of the error with pci_map_page() and then pci_dma_mapping_error(); or the dma_error logic) is causing the trouble? If I don't use pci_map_page() and pci_dma_mapping_error() then a. I don't detect an error so no dma_error logic BUT b. there is no mapping = 0 either. Thus, it would appear as though the bnx2_free_tx_skbs() change MIGHT be okay. 

If I call pci_map_page() and then pci_dma_mapping_error() but don't jump to dma_error, I get mapping = 0 and IO_PAGE_FAULTS and hang. If I do jump to dma_error, I get the mapping = 0 and hang.

So, to answer your question, Michael, I am not sure pci_map_page() is getting the correct data because with it we get mapping = 0 and without it we don't, in this scenario.  Of course it should be noted that this code works okay on non-Dinar systems and there is a known issue with memory on Dinars. I hate to use that as an excuse at this point.

Comment 29 Michael Chan 2010-09-24 22:34:31 UTC
bnx2_free_tx_skbs() is only called when resetting the device.  And the logic in question will only be executed when there are still buffers queued up in the tx ring.  So that logic is rarely used.

The only new logic in bnx2_start_xmit() is to handle dma mapping errors on the pages in the frags.  I think it was determined that an earlier version of the driver would not hang.  The earlier version should still support S/G, so it should still call pci_map_page (or skb_dma_map() which does the same thing).  But John, you seem to say that we don't get any dma mapping errors at all (even though we don't check) with the earlier version.  I'm very puzzled.

BTW, the dma_error logic has been tested by simulating dma errors some time ago.  I just reviewed the code and don't see any obvious bugs in the logic.

Comment 30 John Feeney 2010-09-24 22:51:51 UTC
>But John, you seem to say that we don't get any dma mapping errors at all (even
>though we don't check) with the earlier version.  I'm very puzzled.

That makes two of us. Without the call to pci_map_page(), mapping was never 0 in the simple test case (scp a rpm from the Dinar). Perhaps more intricate tests scenarios are needed for dma errors(?). With pci_map_page(), I would get at least one, possibly two mapping = 0 per run. 

Good to know that bnx2_free_tx_skbs() is run infrequently. I had to take it into account because it was in the patch that results in a hang. Also, good to know that nothing blatant appears to be wrong with this patch, too.

Comment 31 John Feeney 2010-09-28 22:14:31 UTC
Okay, I changed the system from one Dinar to another to make sure there wasn't some hardware or bios config issue. I found that yes, 2.6.18-222 still failed, but I found that 2.6.18-225 worked. The patch for this bnx2 bz is in both. So what changed? (2.6.18-223 did not work and there was no 2.6.18-224 posted.)

Well, 2.6.18-225 had several AMD patches. After adding some patches (the slab corruption patch for bz530619, the x86_64 specific patches for bz610199 and "Add passthrough mode support" bz561127) and not finding success, I added the patch for bz628018 "change default to passthrough mode" and the scp finished without hanging.
 
So, does this still fail if passthrough is not used? Yes, so it hasn't really been fixed by these IOMMU passthrough patches, just hidden. 

So to summarize, transmit does not effectively work on a Dinar system with the bnx2 driver, scatter gather enabled, and in non-passthrough mode with 2.6.18-222 onward. Can AMD provide any insight?

Comment 32 Gideon Naim 2010-10-12 00:27:27 UTC
John,

What is the status on this one? do you need any help from us?

Thanks,
Gidi

Comment 33 Andrius Benokraitis 2010-10-12 00:34:16 UTC
(In reply to comment #32)
> John,
> 
> What is the status on this one? do you need any help from us?
> 
> Thanks,
> Gidi

Please test the 5.6 Partner Alpha when it is released soon, I think that is the best you can do in the short-term.

Comment 34 John Feeney 2010-10-12 16:52:35 UTC
Just wanted to add a comment stating that I cloned this bz so the clone could document the problem found on AMD Dinars with S/G and non-Passthrough enabled. See bz641495.

Comment 35 Beth Zeranski 2010-10-19 18:18:25 UTC
*** Bug 568836 has been marked as a duplicate of this bug. ***

Comment 36 Chris Ward 2010-11-09 13:27:14 UTC
~~ Attention Customers and Partners - RHEL 5.6 Public Beta is now available on RHN ~~

A fix for this 'OtherQA' BZ should be present and testable in the release. 

If this Bugzilla is verified as resolved, please update the Verified field above with an appropriate value and include a summary of the testing executed and the results obtained.

If you encounter any issues or have questions while testing, please describe them and set this bug into NEED_INFO. 

If you encounter new defects or have additional patches to request for inclusion, promptly escalate the new issues through your support representative.

Finally, future Beta kernels can be found here:
 http://people.redhat.com/jwilson/el5/

Note: Bugs with the 'OtherQA' keyword require Third-Party testing to confirm the request has been properly addressed. See: https://bugzilla.redhat.com/describekeywords.cgi#OtherQA ).

Comment 37 John Jarvis 2010-11-18 21:05:12 UTC
This enhancement request was evaluated by the full Red Hat Enterprise Linux 
team for inclusion in a Red Hat Enterprise Linux minor release.   As a 
result of this evaluation, Red Hat has tentatively approved inclusion of 
this feature in the next Red Hat Enterprise Linux Update minor release.   
While it is a goal to include this enhancement in the next minor release 
of Red Hat Enterprise Linux, the enhancement is not yet committed for 
inclusion in the next minor release pending the next phase of actual 
code integration and successful Red Hat and partner testing.

Comment 38 IBM Bug Proxy 2010-11-20 12:48:03 UTC
------- Comment From jmtt.com 2010-11-19 20:58 EDT-------
The bnx2 driver worked OK for us during pounder testing on HS21-piranha and HX5-hammerhead.

Comment 39 Gideon Naim 2010-11-23 18:22:30 UTC
Our QA has tested with the test Kernel and didn't see any issue. We do not have this specific platform mentioned.

Comment 40 Michael Chan 2010-12-15 19:55:03 UTC
IBM noticed a serious problem in bnx2_start_xmit().  It appears that we are calling pci_map_page() twice on each frag.

Comment 41 Andrius Benokraitis 2010-12-15 23:17:25 UTC
(In reply to comment #40)
> IBM noticed a serious problem in bnx2_start_xmit().  It appears that we are
> calling pci_map_page() twice on each frag.

Please file a new bugzilla ASAP, thanks!

Comment 43 errata-xmlrpc 2011-01-13 20:35:23 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html