Bug 242060 - Panic in bnx2_poll - needs patch from bug #212055?
Status: CLOSED DUPLICATE of bug 225350
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All Linux
Priority: low
Severity: high
Assigned To: Ivan Vecera
QA Contact: Martin Jenner
Reported: 2007-06-01 10:23 EDT by Kenn Humborg
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2007-08-13 14:02:59 EDT

Attachments
bnx2 kernel panic dump in CentOS5 (43.61 KB, image/png)
2007-06-20 06:59 EDT, Edward Fjellskål
Description Kenn Humborg 2007-06-01 10:23:08 EDT
Description of problem:

I've got a Dell PE2950 running CentOS5 (I know!!!) with
their kernel-PAE-2.6.18-8.1.4.el5 RPM.  It's running as a
fairly busy NFS server.  About once or twice a week it 
crashes with a panic in bnx2_poll.  The stack trace looks
very similar to the problem described in bug #212055.

Looking at the sources for this kernel, the patch 
that fixes the problem:

   http://bugzilla.redhat.com/bugzilla/attachment.cgi?id=143559

is not included in RHEL5.

Version-Release number of selected component (if applicable):

kernel-PAE-2.6.18-8.1.4.el5.i386

How reproducible:

No reliable reproduction is known, other than letting the server run in
production for a week or so.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

The patch mentioned above was applied to the Linus kernel last December:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=faac9c4b753f420c02bdce0785d2657087830a12;hp=a3d384029aa304f8f3f5355d35f0ae274454f7cd
Comment 1 Kenn Humborg 2007-06-01 10:48:30 EDT
Apologies - that should be:

> Version-Release number of selected component (if applicable):
> 
> kernel-PAE-2.6.18-8.1.4.el5.i686

(not kernel-PAE-2.6.18-8.1.4.el5.i386).


Comment 2 Edward Fjellskål 2007-06-20 06:59:47 EDT
Created attachment 157446
bnx2 kernel panic dump in CentOS5

For those who want to look :)
Comment 3 Edward Fjellskål 2007-06-20 07:02:57 EDT
Same issue on an IBM eSERVER BC 2x2000 with kernel:
2.6.18-8.1.4.el5 #1 SMP Thu May 17 03:16:52 EDT 2007 x86_64 x86_64 x86_64
GNU/Linux
See the attachment with id=157446 for the dump.

NFS reads from clients end in a kernel panic :(
Comment 4 Kenn Humborg 2007-06-20 18:31:52 EDT
By the way, I've rebuilt kernel-PAE-2.6.18-8.1.4.el5 to include this patch:
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 7d824cf..f296c37 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -217,9 +217,16 @@ static inline u32 bnx2_tx_avail(struct b
 	u32 diff;
 
 	smp_mb();
-	diff = TX_RING_IDX(bp->tx_prod) - TX_RING_IDX(bp->tx_cons);
-	if (diff > MAX_TX_DESC_CNT)
-		diff = (diff & MAX_TX_DESC_CNT) - 1;
+
+	/* The ring uses 256 indices for 255 entries, one of them
+	 * needs to be skipped.
+	 */
+	diff = bp->tx_prod - bp->tx_cons;
+	if (unlikely(diff >= TX_DESC_CNT)) {
+		diff &= 0xffff;
+		if (diff == TX_DESC_CNT)
+			diff = MAX_TX_DESC_CNT;
+	}
 	return (bp->tx_ring_size - diff);
 }
 

It has been running continuously since Jun 2 for 18 days with no crashes.
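
For illustration, the difference between the two calculations in the patch can be reproduced in user space. The sketch below is not driver code: the constants (256 indices, 255 usable entries), the ring size of 255, and the function names tx_avail_old and tx_avail_new are assumptions based on the patch and its comment. It also assumes that the producer index skips index 255, so that queuing 255 descriptors from an empty ring moves the producer from 0 to 256.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the patch comment: 256 indices for 255 usable entries. */
#define TX_DESC_CNT      256u
#define MAX_TX_DESC_CNT  (TX_DESC_CNT - 1u)
#define TX_RING_IDX(x)   ((uint32_t)(x) & MAX_TX_DESC_CNT)

/* Calculation removed by the patch: both indices are masked first. */
uint32_t tx_avail_old(uint16_t prod, uint16_t cons, uint32_t ring_size)
{
    uint32_t diff = TX_RING_IDX(prod) - TX_RING_IDX(cons);

    if (diff > MAX_TX_DESC_CNT)
        diff = (diff & MAX_TX_DESC_CNT) - 1u;
    return ring_size - diff;
}

/* Calculation added by the patch: subtract the raw 16-bit indices. */
uint32_t tx_avail_new(uint16_t prod, uint16_t cons, uint32_t ring_size)
{
    uint32_t diff = (uint32_t)prod - (uint32_t)cons;

    if (diff >= TX_DESC_CNT) {
        diff &= 0xffffu;
        if (diff == TX_DESC_CNT)
            diff = MAX_TX_DESC_CNT;
    }
    return ring_size - diff;
}

/* Hypothetical check: ring filled from empty, producer at 256, consumer at 0. */
void check_full_ring(void)
{
    /* Masking maps 256 to 0, so the old code sees an empty ring. */
    assert(tx_avail_old(256, 0, 255) == 255);
    /* The new code reports no free descriptors. */
    assert(tx_avail_new(256, 0, 255) == 0);
}
```

Under these assumptions, the old calculation reports a completely full ring as completely empty, which would let the transmit path keep queuing onto descriptors that are still in use. Whether this is the exact sequence behind the panic in bnx2_poll is not established here.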
Comment 5 WhidbeyNet 2007-07-01 07:22:08 EDT
We experience this bug on Dell PowerEdge 1950's and 2950's.  Booting with the
kernel flag "pci=nomsi" does not help.  
Comment 6 Kenn Humborg 2007-08-10 10:02:35 EDT
To follow up on comment #4, I'm still running kernel-PAE-2.6.18-8.1.4.el5 with
that patch.  No crashes since June 2.
Comment 7 Ivan Vecera 2007-08-13 04:49:26 EDT
It seems the problem is already fixed in BZ225350
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=225350).
The patch for BZ225350 already contains the proposed patch from here.
I think this is a duplicate.
Comment 8 Kenn Humborg 2007-08-13 11:19:11 EDT
Bug #225350 is a private bug that I can't read.  Has the fix 
for #225350 been included in a released kernel update?

Comment 9 Andy Gospodarek 2007-08-13 14:02:59 EDT
Yes it has.

*** This bug has been marked as a duplicate of 225350 ***
Comment 10 Kenn Humborg 2007-09-17 07:27:57 EDT
Are there any plans to release a -36.el5 or later kernel (as 
mentioned in bug #225350, comment #31) with the updated bnx2
driver, or will the updated driver be back-ported to 
the -8.1.x.el5 series?

I have also been wondering whether this could be a security issue.
Might it be possible for a local non-root user to cause enough 
TX traffic to trigger the panic?  If so, you've got a local DoS 
vulnerability.

Comment 11 David Rees 2007-11-07 20:09:22 EST
Kenn, it looks like this is in 2.6.18-53.el5.
