Bug 134790

Summary: Inspiron 8500 practically hangs when configuring b44 NIC with 1.5G memory
Product: Red Hat Enterprise Linux 4
Reporter: John Haxby <jch>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 4.0
CC: barryn, davej
Hardware: i686
OS: Linux
Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Last Closed: 2005-10-05 12:30:26 UTC
Bug Blocks: 156322
Attachments: jwltest-b44-bounce-bufs.patch (flags: none)

Description John Haxby 2004-10-06 11:47:14 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.7.2)
Gecko/20040806

Description of problem:
This is a very odd problem.

When I first installed RHEL4-b1 on a spare disk in my laptop, it
installed and booted just fine.   I had 768MB of memory.

Subsequently I got a 1GB memory stick (so I now have a total of 1.5GB
of memory) and I decided to re-install to choose a different disk layout.
The installation went swimmingly until it came to rebooting for the
firstboot run.

The first thing I noticed was that it was taking a very long time to
configure the network.   I rebooted and brought the machine up without
the "quiet" and "rhgb" options.

The machine appeared to hang at the point at which it attempted to
configure the network driver (a Broadcom 93406 which uses the b44
driver).   I say "appeared to hang" because hitting "x" (say)
eventually echoed the character to the screen some tens of seconds
after hitting the key in question.

Interestingly, in its normal guise (with the usual disk), this machine
is running a stock 2.6.8.1 with the acpi-20040816-26-stable-release
patches and it's just fine.

When I boot the machine with the RHEL4b1 kernel (2.6.8-1.528.2.10) and
start bringing up services one by one, it is fine until I attempt to
run ifconfig on the newly loaded b44 module.   Even more interesting,
the LEDs on the back of the laptop didn't appear to light up at all.
If I remove the 512MB memory stick, everything is fine and dandy --
well, at least the network works.

Perhaps interestingly, a colleague who has an identical machine running
the FC2 2.6.8-1.521 kernel has exactly the same problem when he has
1.5GB of memory installed.

It's going to be interesting getting an updated kernel on to this
machine as I don't really want to keep levering memory out, but I
really do want to get a fix and I'm happy to install any test kernel
or run some kind of diagnostics as this is pretty much a showstopper
for me.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Boot
2. Wait for network configuration
    

Actual Results:  Observe machine essentially hung

Expected Results:  No hang :-)

Comment 1 Barry K. Nathan 2004-10-06 13:56:03 UTC
This sounds like bug 118165 (or a variant thereof).

Comment 2 John Haxby 2004-10-06 16:17:08 UTC
That was a useful hint.   I took the patch from bug #118615 and I'm
now up and running.   The final comment:

> Just a status update, the fix is in 2.6.9-rc2-mm2 (bk-netdev.patch),
> hopefully will propagate to the Linus (and thus Fedora) tree 
> soon.

suggests that a fix is on its way.   When it hits the RHEL4b1 kernel
I'm ready to try it out.

Comment 3 Tim Burke 2004-10-28 00:08:15 UTC
We have included everything in 2.6.9 into B2.  Please retry when it's
available in a few weeks.

Comment 4 John Haxby 2004-10-29 08:31:57 UTC
I await the new kernel with bated breath.  However, in the 2.6.10-rc1
changelog I notice:

----
<pp.fi>
	[PATCH] b44: use bounce buffers to workaround chip DMA
        bug/limitations
	
	Signed-off-by: Pekka Pietikainen <pp.fi>
----

A quick look at the corresponding code for this change shows that this
is indeed a fix for the DMA problems I experienced.   Is this the fix
that will be in the new kernel?
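
For readers unfamiliar with the technique named in that changelog entry, here is a minimal user-space sketch of the bounce-buffer idea. It is not the actual b44 patch. It assumes the chip can only reach the first 1 GiB of bus addresses, and the names DMA_LIMIT, needs_bounce, struct sim_buf and prepare_tx are hypothetical, invented for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the device can only DMA to bus addresses below 1 GiB. */
#define DMA_LIMIT ((uint64_t)1 << 30)

/* Nonzero when any byte of [addr, addr + len) lies at or above the limit. */
static int needs_bounce(uint64_t addr, size_t len)
{
    return addr + len > DMA_LIMIT;
}

/* Hypothetical buffer: a simulated bus address plus a small payload. */
struct sim_buf {
    uint64_t bus_addr;
    size_t len;
    unsigned char data[64];
};

/*
 * Choose the buffer to hand to the device. If the original is fully
 * reachable it is used directly; otherwise its payload is copied into
 * 'bounce', which the caller must have placed below DMA_LIMIT.
 */
static const struct sim_buf *prepare_tx(const struct sim_buf *orig,
                                        struct sim_buf *bounce)
{
    if (!needs_bounce(orig->bus_addr, orig->len))
        return orig;

    assert(orig->len <= sizeof bounce->data);
    assert(!needs_bounce(bounce->bus_addr, orig->len));
    memcpy(bounce->data, orig->data, orig->len);
    bounce->len = orig->len;
    return bounce;
}
```

The cost of this approach is one extra copy per affected packet, which is why a driver would normally take the bounce path only when the address check fails.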


Comment 5 John Haxby 2004-11-24 21:36:07 UTC
Finally got around to testing this.   Sorry.   There's good news and
bad news: kernel-2.6.9-1.648_EL and kernel-2.6.9-1.675_EL both appear
to run without problems, but kernel-hugemem-2.6.9-1.648_EL and
kernel-hugemem-2.6.9-1.675_EL both hang; the former hangs when
configuring eth0 (b44) and the latter hangs during the udev device
initialisation when the b44 module is first loaded.

A quick look at the kernel source shows that the fix for this problem
didn't make it into the new kernel (unless I've missed it).  You can
see this in the diffs for 2.6.10-rc1.  I think the link is
http://www.kernel.org/diff/diffview.cgi?file=%2Fpub%2Flinux%2Fkernel%2Fv2.6%2Ftesting%2Fpatch-2.6.10-rc1.bz2
(the diffviewer isn't working at the moment).  Just look at the diffs
for b44.c and there's a comment in there about DMA above 1G not
working for some hardware.
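
A 1G DMA limit would fit the memory sizes reported above: 768MB sits entirely below it, while 1.5GB does not. The arithmetic can be sketched as follows, under the simplifying assumption that RAM is mapped contiguously from address 0 (real machines have holes in the map). DMA_LIMIT_MB and ram_above_limit_mb are hypothetical names used only for this illustration.

```c
#include <stdio.h>

/* Assumption: the device cannot DMA at or above 1 GiB (1024 MiB). */
#define DMA_LIMIT_MB 1024u

/* MiB of RAM above the limit, assuming RAM starts at 0 with no holes. */
static unsigned ram_above_limit_mb(unsigned total_mb)
{
    return total_mb > DMA_LIMIT_MB ? total_mb - DMA_LIMIT_MB : 0;
}

/* Print how much of a given configuration the device could not reach. */
static void report(unsigned total_mb)
{
    printf("%u MiB installed: %u MiB above the DMA limit\n",
           total_mb, ram_above_limit_mb(total_mb));
}
```

Calling report(768) and report(1536) covers the two configurations from the original description.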

Comment 6 John W. Linville 2005-04-25 14:04:58 UTC

*** This bug has been marked as a duplicate of 145109 ***

Comment 7 John W. Linville 2005-04-25 14:08:55 UTC
Ooop...bug 145109 is for Fedora...reopening... 

Comment 8 John W. Linville 2005-04-25 14:22:53 UTC
Created attachment 113632 [details]
jwltest-b44-bounce-bufs.patch

Comment 9 John W. Linville 2005-04-25 15:19:32 UTC
Pre-built test kernels are available here: 
 
   http://people.redhat.com/linville/kernels/rhel4/ 
 
Please give them a try and report the results.  Thanks! 

Comment 10 John Haxby 2005-05-03 14:01:36 UTC
I can confirm that kernel-2.6.9-6.41.EL.jwltest.20 works for me.

Comment 16 Red Hat Bugzilla 2005-10-05 12:30:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html