Bug 491752 - For Broadcom(r) BCM57710, modprobe bnx2* fails citing memory allocation failures
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All Linux
Priority: urgent Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Andy Gospodarek
QA Contact: Red Hat Kernel QE team
Keywords: ZStream
Depends On: 453305
Blocks:
Reported: 2009-03-23 16:28 EDT by Flavio Leitner
Modified: 2014-06-29 19:01 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-06-02 12:29:38 EDT
Description Flavio Leitner 2009-03-23 16:28:19 EDT
On a Dell PowerEdge R805, insert a Broadcom® BCM57710 10GBase-T Copper Single 
Port NIC and install RHEL 4.7 32-bit on the system. The installer recognizes 
the card and lspci shows it, but ifup <ethX> fails for that card because 
the kernel module fails to load.

How reproducible:
Always with this card on any system which supports the card

Steps to Reproduce:

1. Install RHEL 4.7 32-bit
2. Run lspci post-install. The card is shown
3. Try to do an ifup for the card. It fails
4. Check dmesg. Loading the bnx2x module fails, citing a vmalloc failure

Actual results:
The card is not able to get an IP

Expected results:
The card should work fine, module should get loaded correctly

Additional info:
sosreport for the system is attached

Customer reports that this card works fine on the 64-bit architecture 
and on both architectures for RHEL5. They are only seeing this issue 
on RHEL4.7, and they have a one-line patch from 4.8 that supposedly 
fixes this.

Customer also says the issue is related to memory allocation for BAR2, which 
requests 128MB of memory to be allocated. The allocation fails. It is fixed 
in RHEL 4.8. The patch that fixes it is below:

- bp->regview = ioremap_nocache(dev->base_addr,
    pci_resource_len(pdev, 0));

+ bp->doorbells = ioremap_nocache(pci_resource_start(pdev, 2),
min_t(u64, BNX2X_DB_SIZE,
    pci_resource_len(pdev, 2)));


From the other bugzilla (bz#453305 comment#49):
--
It sounds like they want a backport of this hunk:

@@ -7298,8 +8276,9 @@ static int __devinit bnx2x_init_board(struct pci_dev *pdev,
                goto err_out_release;
        }

-       bp->doorbells = ioremap_nocache(pci_resource_start(pdev , 2),
-                                       pci_resource_len(pdev, 2));
+       bp->doorbells = ioremap_nocache(pci_resource_start(pdev, 2),
+                                       min_t(u64, BNX2X_DB_SIZE,
+                                             pci_resource_len(pdev, 2)));
        if (!bp->doorbells) {
                printk(KERN_ERR PFX "Cannot map doorbell space, aborting\n");
                rc = -ENOMEM;

from this upstream commit.

commit 34f80b04f325078ff21123579343d99756ad8d0e
Author: Eilon Greenstein <eilong@broadcom.com>
Date:   Mon Jun 23 20:33:01 2008 -0700

    bnx2x: Add support for BCM57711 HW

It would be good to paste the contents of this comment as well as the previous
two into a new bug and copy @broadcom.com, so we can get some endorsement
from Broadcom that we can just take that hunk in 4.7.z.

This patch would be what I would propose for 4.7.z:

diff --git a/drivers/net/bnx2x.c b/drivers/net/bnx2x.c
index d9a9407..764db31 100644
--- a/drivers/net/bnx2x.c
+++ b/drivers/net/bnx2x.c
@@ -9753,8 +9753,9 @@ static int __devinit bnx2x_init_board(struct pci_dev *pdev,
                goto err_out_release;
        }

-       bp->doorbells = ioremap_nocache(pci_resource_start(pdev , 2),
-                                       pci_resource_len(pdev, 2));
+       bp->doorbells = ioremap_nocache(pci_resource_start(pdev, 2),
+                                       min_t(u64, BNX2X_DB_SIZE,
+                                             pci_resource_len(pdev, 2)));
        if (!bp->doorbells) {
                printk(KERN_ERR PFX "Cannot map doorbell space, aborting\n");
                rc = -ENOMEM;  
--
Comment 2 Eilon Greenstein 2009-03-23 16:47:31 EDT
Hi,

This is a very impressive and thorough analysis, and the “one line change” is indeed all that is needed. There are no expected side effects from this change and I strongly recommend adopting it.

Regards,
Eilon
Comment 5 Andy Gospodarek 2009-03-24 10:43:45 EDT
Thanks for the feedback, Eilon!

I didn't see any problem taking just that hunk, and I'm glad it looks fine to you as well.
Comment 9 Narendra K 2009-03-25 13:03:46 EDT
>>Customer reports that this card works just fine in the 64bit architecture 
>> and on both architectures for RHEL5.

Andy,

The issue is present in RHEL 5.2 32-bit, but it is fixed in RHEL 5.3 with the above-mentioned patch. So on RHEL 5.3 the issue will not be seen on either 32-bit or 64-bit.
Comment 10 Andy Gospodarek 2009-03-25 14:24:08 EDT
Narendra, thanks for letting us know that this exists on RHEL5.2 32-bit kernels.  Is there any workaround for RHEL4 or RHEL5 that you know about?

Does it happen on all memory configuration sizes or does it seem to happen only on some (like ones that are over 4G)?
Comment 13 Andy Gospodarek 2009-03-25 17:21:58 EDT
So the #define for BNX2X_DB_SIZE is set to 32k, what does pci_resource_len(pdev, 0) return?  Do you have any debug output during this failure?
Comment 15 Narendra K 2009-03-26 13:35:11 EDT
(In reply to comment #10)
> Narendra, thanks for letting us know that this exists on RHEL5.2 32-bit
> kernels.  There is no known workaround for RHEL4 or RHEL5 that you know about
> is there?

Andy,

With RHEL 4.7 hugemem 32 bit kernel, the workaround "vmalloc=256M" worked for me without any "mem=" options.

> 
> Does it happen on all memory configuration sizes or does is seem to happen only
> on some (like ones that are over 4G)?  

On hugemem kernels, it failed with mem= set to anything greater than 4G. On the smp kernel it failed on all memory configs (such as mem=1G).
Comment 16 Narendra K 2009-03-26 13:56:58 EDT
(In reply to comment #13)
> So the #define for BNX2X_DB_SIZE is set to 32k, what does
> pci_resource_len(pdev, 0) return?  Do you have any debug output during this
> failure?  

As of now I have only the dmesg output of the failure. It looks like this:

allocation failed: out of vmalloc space - use vmalloc=<size> to increase size
bnx2x: Cannot map doorbell space, aborting
bnx2x: probe of 0000:05:00.0 failed with error -12.

I don't have the hardware with me right now. I will provide the return value of pci_resource_len(pdev, 0) when I get the hardware.
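
The -12 in the probe failure above is a negated errno value, which matches the rc = -ENOMEM assignment in the quoted hunk. A minimal userspace sketch of that decoding, assuming a Linux system (where ENOMEM is 12); probe_error_text is a hypothetical helper written for illustration and is not part of the bnx2x driver:

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not driver code): translate a negative
 * kernel-style return code, such as the -12 printed by the failed
 * bnx2x probe, into the libc description of the matching errno. */
const char *probe_error_text(int rc)
{
    /* Kernel functions return -errno on failure, so negate it back. */
    return strerror(-rc);
}
```

On Linux with glibc, probe_error_text(-12) should return the ENOMEM description, "Cannot allocate memory", which agrees with the "out of vmalloc space" line in the dmesg output.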
Comment 17 Narendra K 2009-04-24 11:00:41 EDT
I had a few debug statements put in the probe code and here is what I get in a failed bnx2x load:

pci_resource_len(pdev, 0) - 8388608 bytes 
pci_resource_len(pdev, 2) - 134217728 bytes
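
These two lengths line up with the failure: the unpatched call asks ioremap_nocache() to map all of BAR 2 (134217728 bytes, i.e. 128 MiB), while the patched call clamps the request to BNX2X_DB_SIZE. A minimal userspace sketch of that arithmetic, using the 32k value mentioned in comment 13 and the BAR 2 length reported here. The constant and function names are made up for illustration, and this is not the driver code itself:

```c
#include <stdint.h>

/* Values taken from this bug report, not read from hardware. */
#define DB_SIZE_BYTES   (32ull * 1024ull)   /* BNX2X_DB_SIZE per comment 13 */
#define BAR2_LEN_BYTES  134217728ull        /* pci_resource_len(pdev, 2) */

/* Illustrative stand-in for min_t(u64, BNX2X_DB_SIZE, bar_len):
 * map no more of the BAR than the doorbell region actually needs. */
uint64_t doorbell_map_len(uint64_t db_size, uint64_t bar_len)
{
    return db_size < bar_len ? db_size : bar_len;
}
```

With the reported values, doorbell_map_len(DB_SIZE_BYTES, BAR2_LEN_BYTES) is 32768, so the patched driver requests 32 KiB of vmalloc space instead of 128 MiB. Assuming the usual 128 MiB default vmalloc area on 32-bit x86 (an assumption, not measured on this system), the unpatched request could not fit alongside any other mapping, which would also be consistent with the vmalloc=256M workaround from comment 15.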
Comment 21 Vitaly Mayatskikh 2009-05-19 10:45:56 EDT
Committed in 78.0.24.EL
Comment 25 errata-xmlrpc 2009-06-02 12:29:38 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1077.html
