Bug 242060 - Panic in bnx2_poll - needs patch from bug #212055?
Keywords:
Status: CLOSED DUPLICATE of bug 225350
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Ivan Vecera
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-06-01 14:23 UTC by Kenn Humborg
Modified: 2007-11-30 22:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-08-13 18:02:59 UTC
Target Upstream Version:
Embargoed:


Attachments
bnx2 kernel panic dump in CentOS5 (43.61 KB, image/png)
2007-06-20 10:59 UTC, Edward Fjellskål

Description Kenn Humborg 2007-06-01 14:23:08 UTC
Description of problem:

I've got a Dell PE2950 running CentOS5 (I know!!!) with
their kernel-PAE-2.6.18-8.1.4.el5 RPM.  It's running as a
fairly busy NFS server.  About once or twice a week it 
crashes with a panic in bnx2_poll.  The stack trace looks
very similar to the problem described in bug #212055.

Looking at the sources for this kernel, the patch 
that fixes the problem:

   http://bugzilla.redhat.com/bugzilla/attachment.cgi?id=143559

is not included in RHEL5.

Version-Release number of selected component (if applicable):

kernel-PAE-2.6.18-8.1.4.el5.i386

How reproducible:

No reliable reproduction known, other than letting the server run in
production for a week or so.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

The patch mentioned above was applied to the Linus kernel last December:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=faac9c4b753f420c02bdce0785d2657087830a12;hp=a3d384029aa304f8f3f5355d35f0ae274454f7cd

Comment 1 Kenn Humborg 2007-06-01 14:48:30 UTC
Apologies - that should be:

> Version-Release number of selected component (if applicable):
> 
> kernel-PAE-2.6.18-8.1.4.el5.i686

(not kernel-PAE-2.6.18-8.1.4.el5.i386).




Comment 2 Edward Fjellskål 2007-06-20 10:59:47 UTC
Created attachment 157446
bnx2 kernel panic dump in CentOS5

For those who want to look :)

Comment 3 Edward Fjellskål 2007-06-20 11:02:57 UTC
Same issue on IBM eSERVER BC 2x2000 with kernel:
2.6.18-8.1.4.el5 #1 SMP Thu May 17 03:16:52 EDT 2007 x86_64 x86_64 x86_64
GNU/Linux
See attachment with id=157446 for dump.

NFS reads from clients end in a kernel panic :(

Comment 4 Kenn Humborg 2007-06-20 22:31:52 UTC
By the way, I've rebuilt kernel-PAE-2.6.18-8.1.4.el5 to include this patch:
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 7d824cf..f296c37 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -217,9 +217,16 @@ static inline u32 bnx2_tx_avail(struct b
 	u32 diff;
 
 	smp_mb();
-	diff = TX_RING_IDX(bp->tx_prod) - TX_RING_IDX(bp->tx_cons);
-	if (diff > MAX_TX_DESC_CNT)
-		diff = (diff & MAX_TX_DESC_CNT) - 1;
+
+	/* The ring uses 256 indices for 255 entries, one of them
+	 * needs to be skipped.
+	 */
+	diff = bp->tx_prod - bp->tx_cons;
+	if (unlikely(diff >= TX_DESC_CNT)) {
+		diff &= 0xffff;
+		if (diff == TX_DESC_CNT)
+			diff = MAX_TX_DESC_CNT;
+	}
 	return (bp->tx_ring_size - diff);
 }
 

It has been running continuously since Jun 2 for 18 days with no crashes.
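
To illustrate what the patch above changes, here is a minimal user-space sketch of the old and new index arithmetic. It is not driver code: the struct tx_ring_state type and the tx_avail_old/tx_avail_new names are hypothetical, the constants 256 and 255 are assumed from the patch comment ("256 indices for 255 entries"), the 16-bit width of the indices is assumed from the 0xffff mask, and the smp_mb() barrier is left out.

```c
#include <stdint.h>

/* Assumed values, based on the patch comment: 256 indices, 255 entries. */
#define TX_DESC_CNT      256u
#define MAX_TX_DESC_CNT  (TX_DESC_CNT - 1u)
#define TX_RING_IDX(x)   ((uint32_t)(x) & MAX_TX_DESC_CNT)

/* Hypothetical stand-in for the three driver fields the patch touches. */
struct tx_ring_state {
	uint16_t tx_prod;      /* free-running producer index (assumed 16-bit) */
	uint16_t tx_cons;      /* free-running consumer index (assumed 16-bit) */
	uint32_t tx_ring_size; /* usable entries, at most MAX_TX_DESC_CNT */
};

/* Pre-patch arithmetic: both indices are masked to the ring before
 * subtracting, so indices that are exactly 256 apart look identical. */
uint32_t tx_avail_old(const struct tx_ring_state *r)
{
	uint32_t diff = TX_RING_IDX(r->tx_prod) - TX_RING_IDX(r->tx_cons);

	if (diff > MAX_TX_DESC_CNT)
		diff = (diff & MAX_TX_DESC_CNT) - 1;
	return r->tx_ring_size - diff;
}

/* Post-patch arithmetic: subtract the raw indices, fold a 16-bit
 * wraparound back into range, and count a distance of 256 as 255
 * used entries because one index per lap is skipped. */
uint32_t tx_avail_new(const struct tx_ring_state *r)
{
	uint32_t diff = (uint32_t)r->tx_prod - (uint32_t)r->tx_cons;

	if (diff >= TX_DESC_CNT) {
		diff &= 0xffff;
		if (diff == TX_DESC_CNT)
			diff = MAX_TX_DESC_CNT;
	}
	return r->tx_ring_size - diff;
}
```

With tx_prod = 256, tx_cons = 0 and tx_ring_size = 255, tx_avail_old() returns 255 (the ring looks empty) while tx_avail_new() returns 0 (the ring is full). If the driver behaves like this sketch, the old calculation could let the transmit path reuse entries that are still in flight, which would be consistent with the panics reported here.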


Comment 5 WhidbeyNet 2007-07-01 11:22:08 UTC
We experience this bug on Dell PowerEdge 1950s and 2950s.  Booting with the
kernel flag "pci=nomsi" does not help.  

Comment 6 Kenn Humborg 2007-08-10 14:02:35 UTC
To follow up on comment #4, I'm still running kernel-PAE-2.6.18-8.1.4.el5 with
that patch.  No crashes since June 2.


Comment 7 Ivan Vecera 2007-08-13 08:49:26 UTC
It seems the problem is already fixed in BZ225350
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=225350).
The patch for BZ225350 already contains the proposed patch from here.
I think this is a duplicate.

Comment 8 Kenn Humborg 2007-08-13 15:19:11 UTC
Bug #225350 is a private bug that I can't read.  Has the fix 
for #225350 been included in a released kernel update?



Comment 9 Andy Gospodarek 2007-08-13 18:02:59 UTC
Yes it has.

*** This bug has been marked as a duplicate of 225350 ***

Comment 10 Kenn Humborg 2007-09-17 11:27:57 UTC
Are there any plans to release a -36.el5 or later kernel (as 
mentioned in bug #225350, comment #31) with the updated bnx2
driver, or will the updated driver be back-ported to 
the -8.1.x.el5 series?

I have also been wondering whether this could be a security issue.
Might it be possible for a local non-root user to cause enough 
TX traffic to trigger the panic?  If so, you've got a local DoS 
vulnerability.



Comment 11 David Rees 2007-11-08 01:09:22 UTC
Kenn, it looks like this is in 2.6.18-53.el5.

