Bug 245430

Summary: b44 module for a Broadcom NIC locks machine under heavy load
Product: Red Hat Enterprise Linux 5 Reporter: David Bunt <dbunt>
Component: kernel Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: low Docs Contact:
Priority: low    
Version: 5.0 CC: jarod, jfeeney, sgruszka
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-10-06 20:56:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Bunt 2007-06-23 04:53:43 UTC
Description of problem:
The b44 NIC module causes a hard lock under sustained high network
utilization.

Version-Release number of selected component (if applicable):
2.6.18-8.1.6.el5

How reproducible:
Every time.

Steps to Reproduce:
1. Use a machine with a Broadcom NIC
2. Load the b44 module
3. Transfer something large over the network and wait for the lockup.
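
For anyone trying to reproduce step 3 in a repeatable way, here is a rough
sketch of a load generator. It is not part of the original report. It assumes
a Python 3 interpreter on both ends (a stock RHEL 5 install would not provide
one), and the names run_sink and send_load are made up for illustration. The
self-check at the bottom only uses loopback, which never touches the b44
hardware; for a real test the sink would have to run on a second machine
reached through the Broadcom NIC.

```python
import socket
import threading

CHUNK = 64 * 1024  # bytes handed to each send/recv call


def run_sink(server_sock, result):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # peer closed the connection
                break
            received += len(data)
    result["received"] = received


def send_load(host, port, total_bytes):
    """Stream total_bytes of zeros to host:port and return the bytes sent."""
    payload = bytes(CHUNK)
    sent = 0
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return sent


if __name__ == "__main__":
    # Loopback self-check only: confirms the sender and sink agree on the
    # byte count. It does not exercise any physical NIC.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    result = {}
    sink = threading.Thread(target=run_sink, args=(server, result))
    sink.start()
    sent = send_load("127.0.0.1", server.getsockname()[1], 8 * 1024 * 1024)
    sink.join()
    server.close()
    print("sent", sent, "received", result["received"])
```

Raising total_bytes to several gigabytes would approximate the "something
large" transfer described above.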
  
Actual results:
The system locks up with no kernel panic message and no log entries, so there
is no way to capture useful debug data.

Expected results:
No lockups

Additional info:
I'm using a Dell Inspiron 5150 laptop with a stock onboard Broadcom NIC. This
problem also occurred on a previous install of Fedora Core 3 on the same laptop.
I was hoping that upgrading to RHEL 5 would fix it. I'm completely open to
accepting I may just have bad hardware as I'm not that confident in Dell, but I
don't have another 5150 to test it on.

Comment 1 Jarod Wilson 2007-09-02 03:33:03 UTC
For RHEL 5.1, the b44 driver has been updated. The changes that were pulled in carry the following description:

Backport b44 from 2.6.22-rc4 upstream.  Changes:

* endianness fixes, types
* use DMA_30BIT_MASK constant
* use skb_copy_from_linear_data_offset()
* safer interrupt disabling with spin_lock_irqsave()
* chip reset may take longer than previously thought
* multicast fixes
* simple error checking during resume
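
As a side note on the DMA_30BIT_MASK item: a 30-bit mask covers bus addresses
in the first 1 GiB only. The arithmetic can be sketched as below. This is an
illustration in Python, not code from the driver, and the helper name
fits_dma_mask is hypothetical.

```python
# A 30-bit mask has the low 30 bits set: 0x3fffffff.
DMA_30BIT_MASK = (1 << 30) - 1


def fits_dma_mask(addr, length, mask=DMA_30BIT_MASK):
    """Return True if the buffer [addr, addr + length) lies within the mask.

    Hypothetical helper: the last byte of the buffer must not exceed the
    highest address the mask can express.
    """
    return addr + length - 1 <= mask


if __name__ == "__main__":
    print(hex(DMA_30BIT_MASK))
    # A 4 KiB buffer ending exactly at the 1 GiB boundary still fits.
    print(fits_dma_mask(0x3FFFF000, 0x1000))
    # One byte more crosses the boundary and no longer fits.
    print(fits_dma_mask(0x3FFFF000, 0x1001))
```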

Please give the latest RHEL 5.1 beta kernel a try and report back the results:

http://people.redhat.com/dzickus/el5/