Bug 83565

Summary: bcm5700 driver support dropped/tg3 driver continues to crash on kernel-smp-2.4.18-24.x
Product: [Retired] Red Hat Linux
Component: kernel
Version: 8.0
Hardware: i686
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Amit Bhutani <amit_bhutani>
Assignee: Arjan van de Ven <arjanv>
QA Contact: Brian Brock <bbrock>
CC: aander07, dale_kaisner, john_hull, kambiz, matt_domsch, redhat, robert_hentosh
Doc Type: Bug Fix
Last Closed: 2003-03-06 05:39:33 UTC

Description Amit Bhutani 2003-02-05 17:34:42 UTC
Description of problem:
So far at least two people have reported the following:

1) The tg3 driver in Red Hat errata kernel-smp-2.4.18-24.7.x
(https://rhn.redhat.com/errata/RHSA-2003-025.html) continues to crash the system
within a few hours.
2) The workaround driver 'bcm5700.o' has been left out of the kernel.

This is very bad and unacceptable. Support for the bcm5700 driver
*cannot* be dropped in an errata cycle, especially when the default tg3 driver
is still suspect.

I am posting the two messages that confirm what I have said above.

MESSAGE 1:
------- Additional Comment #55 From grmansell on 2003-02-05 09:14 -------

I can confirm that the latest production released Redhat kernel 2.4.18-24.7.xsmp
does not fix the problem. My PE2650 crashed in the usual manner after about 5
hours of normal (minimal) activity.

I am concerned that the bcm5700 modules (the only work around) do not exist in
/lib/modules for this new kernel - it would appear that they have been
deprecated. This is unacceptable to me as my machine has run for two months on
these modules perfectly fine. Hence I cannot run the latest kernel and have had
to revert my machine to the 2.4.18-18.7.xsmp kernel with the bcm5700 kernel 
module.

I also have a call (ref #222224) logged with Redhat's Patrick Ernzer
(pernzer), who has been working with Dell UK for the last 4 months to try to
resolve this issue for me.

I will also submit this report to bug #79997 on bugzilla.redhat.com as I am not
sure which bug I am actually suffering from.

MESSAGE 2:
-----Original Message-----
From: Mansell, Gary [GRMansell]
Sent: Wednesday, February 05, 2003 9:27 AM
To: linux-poweredge.com
Subject: tg3 problems and the latest errata kernel.....


Dear all

Don't get too excited about the new Redhat errata kernel (2.4.18-24.7.xsmp)
fixing your tg3 issues !!

I have been experiencing problems with the tg3 module for about 6 months and
have been successfully running with the bcm5700 module for the last two months
whilst waiting for a working fix to be released. Believe me, I have tried many
now....

The new kernel crashed in the usual manner after about 5 hours on my PE2650
with 2x2.4 GHz Xeon processors. The machine was under a minimal load for this
time - NFS fileserving to about a dozen UNIX boxes.

I am also concerned to see that the bcm5700 module has been removed 
from /lib/modules for this new kernel (have they been deprecated?). This means 
that I cannot run this new kernel with the bcm5700 module and hence I have had 
to go back to my stable solution: 2.4.18-18.7.xsmp with the bcm5700 module.

Does anyone know why the bcm5700 module has been removed from /lib/modules when 
Redhat know that the use of the bcm5700 module is the only work around for the 
tg3 bugs???


Gary Mansell
Senior Technical Analyst
IT Department
Ricardo Consulting Engineers Ltd.


Version-Release number of selected component (if applicable):
kernel-smp-2.4.18-24.7.x

How reproducible:
Easy

Steps to Reproduce:
1. Install RH 7.3 stock on a system with Broadcom gig-e NIC
2. Upgrade to the 2.4.18-24.7.x.smp kernel
3. Run low to medium network load
    
Actual results:
Server crashed in approx 5 hours

Expected results:
No crash expected.

Additional info:
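A quick way to confirm the second point above on an affected machine is to
search the module tree of the installed kernel for 'bcm5700.o'. The following is
only a minimal sketch: the helper name find_module is hypothetical, it assumes
2.4-style modules stored as .o files somewhere under /lib/modules/<release>,
and it walks the whole tree rather than assuming a particular subdirectory.

```python
import os


def find_module(modules_root, kernel_release, module_name):
    """Return sorted paths of <module_name>.o under modules_root/kernel_release.

    Hypothetical helper: assumes 2.4-style modules shipped as .o files.
    """
    base = os.path.join(modules_root, kernel_release)
    target = module_name + ".o"
    hits = []
    # os.walk yields nothing if the directory does not exist, so a missing
    # kernel tree simply produces an empty result.
    for dirpath, _dirnames, filenames in os.walk(base):
        if target in filenames:
            hits.append(os.path.join(dirpath, target))
    return sorted(hits)


if __name__ == "__main__":
    release = os.uname().release  # e.g. the output of `uname -r`
    for name in ("tg3", "bcm5700"):
        found = find_module("/lib/modules", release, name)
        print(name, "->", found if found else "not found")
```

If bcm5700 comes back as "not found" for the errata kernel while it is present
for the older kernel, that matches the reports quoted above.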

Comment 1 Tibor SANDOR 2003-02-11 14:14:28 UTC
I have the same problem with Compaq Proliant ML370 (I think it has the same 
chipset), and I'm waiting for the solution.

Comment 2 T 2003-02-12 05:28:42 UTC
I confirm that starting with the November kernel update 2.4.18-18
and the automatic switch to tg3 I have had random crashes within a few hours
on several Dell 2650 servers in various configurations. These all run fully
updated redhat 7.3 installations. The latest Dell 2650 with redhat 8.0
pre-installed also crashes within 2-3 hours -- unmodified and as delivered by Dell!

Switching to bcm5700 instead of tg3 solves this and leads to 100+ days of uptime.

The 2.4.18-24 kernel is no different: the servers froze within hours.
Reverting to 2.4.18-19 and changing /etc/modules.conf to use bcm5700
gives stable operation.

T.
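
For anyone wanting to script the /etc/modules.conf change described in the
comment above, here is a rough sketch. It assumes the interfaces are bound with
lines of the form "alias ethN tg3", which is the usual modules.conf convention
but is not shown in this report; the function name switch_driver is
hypothetical. It only transforms text, so writing the result back to the file
and reloading the module are left to the admin.

```python
import re

# Matches lines such as "alias eth0 tg3" (assumed modules.conf layout).
ALIAS_RE = re.compile(r"^(\s*alias\s+eth\d+\s+)(\S+)(.*)$")


def switch_driver(conf_text, old="tg3", new="bcm5700"):
    """Return conf_text with every 'alias ethN <old>' changed to '<new>'.

    Hypothetical helper: comment lines and unrelated aliases are left alone.
    """
    out = []
    for line in conf_text.splitlines():
        m = ALIAS_RE.match(line)
        if m and m.group(2) == old:
            line = m.group(1) + new + m.group(3)
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")


if __name__ == "__main__":
    sample = "alias eth0 tg3\nalias eth1 e100\n"
    print(switch_driver(sample), end="")
```

This only helps on kernels that still ship the bcm5700 module, which is the
point of this bug.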

Comment 3 Scott 2003-02-16 22:47:08 UTC
Ditto problem here: RH 7.3 on an ASUS board with dual 1.5 MP Athlons, though we
stay up for a few days to a week before crashing. I tried to compile
BCM5700-2.2.27 from
ftp.caldera.com/pub/OpenLinux3.1.1/drivers/network/bcm5700-2.27-1.scr.rpm and
.txt but it bombed.  Rolling back to kernel 2.4.19.7xsmp appears to be the only
solution for now.  At least tg3 should not load for a bcm5700 NIC!

Comment 4 Matt Domsch 2003-03-06 05:39:33 UTC
Fixed in kernel 2.4.18-26 released via RHN yesterday.  Closing.
Please open a new bug if this kernel does not solve your problems.
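
To check whether a given machine is already on the fixed kernel, the release
string from `uname -r` can be compared against 2.4.18-26. The sketch below is an
illustration only: the names parse_release and has_fix are hypothetical, it
assumes release strings of the form "2.4.18-24.7.xsmp" as quoted in this bug,
and it treats any later version as containing the fix, which this report does
not confirm.

```python
import re

# First fixed release according to the closing comment in this bug.
FIXED_IN = (2, 4, 18, 26)


def parse_release(release):
    """Parse '2.4.18-24.7.xsmp' into (2, 4, 18, 24); return None if no match."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    return tuple(int(g) for g in m.groups()) if m else None


def has_fix(release):
    """True if the release parses and is at or after FIXED_IN (an assumption)."""
    parsed = parse_release(release)
    return parsed is not None and parsed >= FIXED_IN


if __name__ == "__main__":
    for rel in ("2.4.18-18.7.xsmp", "2.4.18-24.7.xsmp", "2.4.18-26.7.x"):
        print(rel, "fixed" if has_fix(rel) else "not fixed")
```

Strings that do not follow the version-dash-release pattern are reported as not
fixed rather than guessed at.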