Bug 118165 - (NET B44) (4G/4G?) Hangs on bootup - b44 (Broadcom) network driver cannot deal with > 1Gb ram
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: John W. Linville
Blocks: FC2Blocker FC3Target FC4Target
Reported: 2004-03-12 13:49 EST by Craig Cruden
Modified: 2007-11-30 17:10 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-09-06 21:23:55 EDT


Attachments
nomemnob44 - no mem, no acpi=off, no b44 loaded in modprobe (8.41 KB, text/plain)
2004-05-07 16:24 EDT, Craig Cruden
mem=1024m option (8.41 KB, text/plain)
2004-05-07 16:25 EDT, Craig Cruden
mem=1024m acpi=off (8.41 KB, text/plain)
2004-05-07 16:26 EDT, Craig Cruden
Bring out the barf-bags (2.11 KB, patch)
2004-05-31 16:00 EDT, Pekka Pietikäinen
The barf-bags strike back! (3.29 KB, patch)
2004-06-05 16:10 EDT, Pekka Pietikäinen
Return of the barf-bags (3.66 KB, patch)
2004-06-09 08:33 EDT, Pekka Pietikäinen
b44 update from -mm tree (12.99 KB, patch)
2004-10-20 18:19 EDT, Pekka Pietikäinen

Description Craig Cruden 2004-03-12 13:49:18 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.1) Opera 7.01  [en]

Description of problem:
I have not been able to properly boot on any kernel after ".118".

Currently the latest set of kernels is hanging on bootup when it gets
to the "b44" broadcom network driver.

The computer it is installed on is a Dell Inspiron 8500.

The network card is Broadcom 4401.

Version-Release number of selected component (if applicable):
kernel-2.6.3-2.1.253

How reproducible:
Always

Steps to Reproduce:
1. Install updated kernel
2. Boot
3. It will stop when it gets to b44 (so you have to power off)

Actual Results:  The bootup logs were not updated after rebooting and
running the util to ensure the partition journals are ok.

Additional info:
Comment 1 Anders Joergensen 2004-03-18 10:19:49 EST
This was a problem in 2.6.1-1.65 as well. This is only ipv4, since
ipv6 seems to work.
Comment 2 Warren Togami 2004-04-10 22:04:53 EDT
Craig, I have a suspicion.  You normally use i686 kernels right?  Try
the i586 latest rawhide kernel and see if behavior is any different.
Comment 3 Craig Cruden 2004-04-11 23:35:04 EDT
I tried the .i586 kernel (.315) and the b44 loads and I am able to
connect to the outside world from that computer.  

BTW:  I never did have the problem with .65 kernel.  

Anyways, I am guessing that this means it is a simple fix to make :p
Comment 4 Warren Togami 2004-04-12 02:03:33 EDT
Okay just as I suspected, 4G/4G enabled in all kernels after .118 may
be interfering with your network driver.  i586 has 4G/4G disabled
while i686 is enabled.  You can further confirm this by using the .315
sources and the standard i686 config file, but disabling only 4G/4G
and rebuilding it.  I suspect it will work then too.

Did you try .315 i686 though?  A major 4G/4G patch went in that may
make a difference.  Please report your results.
Comment 5 Arjan van de Ven 2004-04-12 03:37:22 EDT
warren: there's also differences acpi wise.... not full conclusion as
of yet...
Comment 6 Warren Togami 2004-04-12 04:50:01 EDT
Hence the question mark and suggestion of further more specific testing.
Comment 7 Craig Cruden 2004-04-12 12:19:40 EDT
Before installing the .i586 kernel, I had installed the .i686 kernel 
and did a "modprobe b44" to test it before adding it back to the 
modprobe.conf.  It hung the machine in about the time it took to hit 
the enter key the fourth time (so I did not try it at boot time).  I 
did the same thing during the testing of the .i586 and "modprobe b44" 
did not hang the machine so I added it in for boot and it worked.  

I dare not add it into boot time right now since I do not have a 
recovery disk with me at this time :p

I am working on compiling a custom kernel now....
Comment 8 Craig Cruden 2004-04-12 13:33:53 EDT
Compiled a custom kernel with the 4G/4G turned off.  The b44 driver 
seems to work with that option checked off.

Rebooted with stock .i686 kernel and it hangs....

------------------------------------------

config differences

***************
*** 91,101 ****
  CONFIG_X86_GOOD_APIC=y
  CONFIG_X86_INTEL_USERCOPY=y
  CONFIG_X86_USE_PPRO_CHECKSUM=y
! CONFIG_X86_4G=y
! CONFIG_X86_SWITCH_PAGETABLES=y
! CONFIG_X86_4G_VM_LAYOUT=y
! CONFIG_X86_UACCESS_INDIRECT=y
! CONFIG_X86_HIGH_ENTRY=y
  CONFIG_HPET_TIMER=y
  CONFIG_HPET_EMULATE_RTC=y
  # CONFIG_SMP is not set
--- 91,101 ----
  CONFIG_X86_GOOD_APIC=y
  CONFIG_X86_INTEL_USERCOPY=y
  CONFIG_X86_USE_PPRO_CHECKSUM=y
! # CONFIG_X86_4G is not set
! # CONFIG_X86_SWITCH_PAGETABLES is not set
! # CONFIG_X86_4G_VM_LAYOUT is not set
! # CONFIG_X86_UACCESS_INDIRECT is not set
! # CONFIG_X86_HIGH_ENTRY is not set
  CONFIG_HPET_TIMER=y
  CONFIG_HPET_EMULATE_RTC=y
  # CONFIG_SMP is not set
Comment 9 Warren Togami 2004-04-14 21:46:04 EDT
http://people.redhat.com/wtogami/temp/
Just a FYI.  If other people want to test i686 kernel-2.6.5-1.322, I
have made test i686 RPMS at the above URL for your convenience.  Only
difference in configuration is the disabled 4G/4G memory split.
Comment 10 Craig Cruden 2004-04-15 03:27:59 EDT
Double checked with:

kernel-2.6.5-1.322
kernel-2.6.5-1.322.disabled4G

precompiled kernels and I have the same result of:

kernel-2.6.5-1.322 - hanging
kernel-2.6.5-1.322.disabled4G - working
Comment 11 Pekka Pietikäinen 2004-04-16 04:07:17 EDT
Works with 4G/4G (1.305) for me (ASUS A7V8X, 1GB of memory)...

One thing that would be interesting to know is whether bcm4400
(the broadcom driver) works. There's one other known problem
(shown as timeout waiting for bit xxx when loading the driver),
that seems to only be a "problem"  for dell inspiron users.
On my A7V8X it only gets triggered when I load the broadcom driver
and then load b44 afterwards (and is fixed by adding a return; at a
certain point in the init code of bcm4400, or by a full powerdown :-) )

There are certainly some hardware quirks in the chip that need to 
be worked around, and how they manifest themselves might be 
related to how the OEM has wired the chip...
Comment 12 Ingo Molnar 2004-05-01 13:43:34 EDT
Craig, how much RAM does your system have?

one side-effect of 4G/4G is that it enables much more RAM to be used
for 'lowmem' - which is the place where network buffers (skbs) go. The
3:1 kernel has a lowmem range of 0-960MB, while the 4:4 one can have
up to 3GB of lowmem RAM. If the card hardware has a bug in that it can
only do DMA up to say 2 GB (or 1 GB) then such a bug would only
trigger under the 4:4 kernel.
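
The reasoning above can be checked with a little arithmetic. This is only a rough sketch, not kernel code: the 960 MB and 3 GB lowmem ceilings are the figures quoted in this comment, the 1 GB DMA limit is the hypothesis under test, and the name lowmem_exceeds_dma_limit is made up for the illustration.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Lowmem ceilings as quoted in the comment above (not read from a real kernel).
LOWMEM_CEILING = {"3:1": 960 * MB, "4:4": 3 * GB}


def lowmem_exceeds_dma_limit(split, ram_bytes, dma_limit_bytes):
    """Return True if lowmem (where skbs are allocated) can reach past the DMA limit."""
    lowmem_top = min(ram_bytes, LOWMEM_CEILING[split])
    return lowmem_top > dma_limit_bytes


# Hypothetical card that can only DMA below 1 GB, in a machine with 2 GB of RAM.
print(lowmem_exceeds_dma_limit("3:1", 2 * GB, 1 * GB))  # False
print(lowmem_exceeds_dma_limit("4:4", 2 * GB, 1 * GB))  # True
```

Under these assumptions a 3:1 kernel never hands the card a buffer above 960 MB, so a DMA limit at 1 GB would stay hidden until the 4:4 layout is used.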
Comment 13 Craig Cruden 2004-05-01 14:26:31 EDT
It is a Dell Inspiron 8500 - memory at maximum (2GB).
Comment 14 Jochen Weiss 2004-05-01 21:15:11 EDT
FYI, I am getting the same problems on an Inspiron 9100 with 1 GB
memory (BTW: I am using the 2.6.5-1.327 kernel).
Comment 15 Alan Cox 2004-05-03 13:53:28 EDT
Random possible connection - both b44 and acenic don't use the ethtool
hooks. Both b44 and acenic have a problem with 4G/4G. Don't understand
why, however.
Comment 16 Ingo Molnar 2004-05-06 06:01:39 EDT
does the problem trigger if you boot with mem=512m?
Comment 17 Arjan van de Ven 2004-05-06 06:04:12 EDT
the 351 kernel in http://people.redhat.com/arjanv/2.6 has b44 using
the ethtool ops infrastructure....
Comment 18 Craig Cruden 2004-05-06 10:39:30 EDT
Little more testing.  I installed .352 kernel and booted.

It hung.

I then added mem=512 to the 352 kernel boot parameters....  

It worked.  
Comment 19 Jochen Weiss 2004-05-07 13:44:18 EDT
Latest development kernel (.351) solved the problem for me (Inspiron 
9100, 1GB mem) even without the mem=512 option... 
Comment 20 Craig Cruden 2004-05-07 14:52:52 EDT
So, I am assuming that the closing of this issue means that you 
believe there is NO solution to this problem for the Dell Inspiron 
8500?
Comment 21 Craig Cruden 2004-05-07 14:54:20 EDT
So, I am assuming that the closing of this issue means that you 
believe there is NO solution to this problem for the Dell Inspiron 
8500 - with 2GB of memory?
Comment 22 Craig Cruden 2004-05-07 15:17:32 EDT
More info -- tested again to make sure that I was not mistaken:

Testing with kernel options of:

    mem=512m  - worked
    mem=960m  - worked
    mem=1024m - worked
    mem=1536m - failed - hung during bootup
    <omitted> - failed - hung during bootup 
Comment 23 Craig Cruden 2004-05-07 15:23:59 EDT
Reopen - pending comment.
Comment 24 Alan Cox 2004-05-07 15:26:38 EDT
It was closed because it was assumed to be fixed by the 351 kernel
since our internal bugs and Jochen's problem went away. Clearly that
wasn't a correct assumption.

Does acpi=off help, and can you attach an lspci -vxx from a working one?
Comment 25 Arjan van de Ven 2004-05-07 15:27:41 EDT
if the boundary is exactly 1Gb, can we rule out someone forgetting to
wire up the highest 2 address lines ?
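
To spell out the arithmetic behind that question: if the top 2 of 32 address lines were not connected, the device could only drive 30 address bits, which is exactly 1 GB. The sketch below is an illustration of that hypothesis only; the constant and function names are invented here, and nothing in it is taken from the hardware documentation.

```python
ADDRESS_BITS = 32        # 32-bit bus addresses
UNWIRED_TOP_LINES = 2    # the hypothesis raised in the comment above

reachable_bits = ADDRESS_BITS - UNWIRED_TOP_LINES
dma_mask = (1 << reachable_bits) - 1


def address_seen_by_device(bus_addr):
    """Address the device would actually drive if the top lines were dropped."""
    return bus_addr & dma_mask


print(hex(dma_mask))                            # 0x3fffffff
print(hex(address_seen_by_device(0x40001000)))  # 0x1000
```

In this model a buffer just above 1 GB silently aliases to a low address, so the device would read or overwrite unrelated memory. That would fit the mem= results in comment 22, where 1024m works and 1536m hangs.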
Comment 26 Craig Cruden 2004-05-07 15:53:00 EDT
took away the mem option, put 

   acpi=off

and 

   acpi=off lspci -vxx

both of which hung at b44

I am not familiar with the lspci -vxx option, do I attach it when I 
have mem=1024m  and let it properly boot, then hunt around for a 
file.... or has it already generated a file which I have not found?

"if the boundary is exactly 1Gb, can we rule out someone forgetting to
wire up the highest 2 address lines ?"

don't know anything about the quote above...
Comment 27 Warren Togami 2004-05-07 15:59:28 EDT
lspci -vxx is a command you run at the shell prompt.  You can
redirect its output to a text file and attach that file here.

lspci -vxx > output.txt
Comment 28 Craig Cruden 2004-05-07 16:24:54 EDT
Created attachment 100095 [details]
nomemnob44 - no mem, no acpi=off, no b44 loaded in modprobe

Did not know what/why contents differ but included some from several different
boots:
    nomemnob44 - no mem, no acpi=off, no b44 loaded in modprobe
    mem1024    - mem=1024m
    mem1024acpioff - mem=1024m acpi=off
Comment 29 Craig Cruden 2004-05-07 16:25:54 EDT
Created attachment 100096 [details]
mem=1024m option
Comment 30 Craig Cruden 2004-05-07 16:26:40 EDT
Created attachment 100097 [details]
mem=1024m acpi=off
Comment 31 Alan Cox 2004-05-07 17:31:40 EDT
Arjan I see one thing suspicious there. The pci spaces seem to be hard
up against the end of memory - but only for the ones Linux assigned. I
wonder if we are getting memory and other resources overlapping due to
a PCI resource handling bug or E820 data funnies ?
Comment 32 nazeman 2004-05-28 04:16:25 EDT
I have an A7V8X motherboard with 1.5 GB of memory.
I have the same problem:
my system freezes on ifup eth1
(eth0 is 3com, eth1 is bcm44)
Comment 33 Pekka Pietikäinen 2004-05-28 05:54:40 EDT
-sigh-

I'll try to borrow some memory for my A7V8X to hunt this down. 
Could you confirm the situation is the same with the latest kernel
from updates-testing or http://people.redhat.com/arjanv/ and whether
mem=1024m changes the situation.

Comment 34 Ingo Molnar 2004-05-28 06:02:22 EDT
We fixed a couple of 4/4 bugs lately, these fixes should be in Arjan's
.391 kernel:

http://people.redhat.com/arjanv/2.6/RPMS.kernel/kernel-2.6.6-1.391.i686.rpm

could you try this kernel?

Pekka, if this kernel shows the problem too then indeed it would be
nice if you could check it out - we've run out of ideas. There are no
other known 4:4 related bugs or weirdnesses pending, other than this one.
Comment 35 Craig Cruden 2004-05-28 08:17:13 EDT
Tried the "391" kernel and the same problem persists.  

   i.e. when mem=1024m is not added to the kernel options, the
system freezes on b44; when mem=1024m is added, it boots.

Comment 36 Pekka Pietikäinen 2004-05-28 16:22:11 EDT
I managed to borrow some extra memory (1.25G total) and was able to
reproduce the bug with 1.397. Debugging time, the night is still
young... :-)
Comment 37 Pekka Pietikäinen 2004-05-29 05:49:04 EDT
Initial results show that the broadcom bcm4400 driver is affected as
well and pci_set_dma_mask(pdev, (u64) 0x3fffffff); did not fix the
problem as it might have. 

Will continue poking around.
Comment 38 Pekka Pietikäinen 2004-05-31 15:59:27 EDT
Close your eyes and find a barf-bag... The following patch (a bit of
overkill; GFP_DMA for the rx skbs, illegal_highdma() == 1 and the pci
dma masks should be enough) makes the chip receive fine. Transmitting
still breaks after a short while, I assume once it hits an skb that is
located above 1GB.

So infrastructure changes are needed if this is ever to work, if I
understood correctly.
Comment 39 Pekka Pietikäinen 2004-05-31 16:00:46 EDT
Created attachment 100723 [details]
Bring out the barf-bags
Comment 40 Alan Cox 2004-05-31 16:23:00 EDT
Yuck yuck 8)

You might want to steal the logic from the old ISA bus drivers like
lance. Those use fixed rings for RX and bounce buffer tx sk_buffs. 
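
For readers unfamiliar with the technique, the bounce-buffer idea for transmit can be modelled roughly as below. This is a toy Python model, not driver code: the 1 GB limit is the boundary observed earlier in this bug, and DMA_LIMIT, BounceTx and the addresses used are all invented for the illustration.

```python
DMA_LIMIT = 1 << 30  # assumed: the device can only address the first 1 GB


class BounceTx:
    """Toy transmit path: copy packets the device cannot reach into a low buffer."""

    def __init__(self, bounce_addr, bounce_size):
        # The bounce buffer itself must lie entirely below the limit.
        if bounce_addr + bounce_size > DMA_LIMIT:
            raise ValueError("bounce buffer is not DMA-reachable")
        self.bounce_addr = bounce_addr
        self.bounce = bytearray(bounce_size)

    def map_for_tx(self, phys_addr, payload):
        """Return (address handed to the device, whether a copy was needed)."""
        if phys_addr + len(payload) <= DMA_LIMIT:
            return phys_addr, False            # device can read it in place
        if len(payload) > len(self.bounce):
            raise ValueError("packet larger than bounce buffer")
        self.bounce[:len(payload)] = payload   # copy down into reachable memory
        return self.bounce_addr, True


tx = BounceTx(bounce_addr=0x00100000, bounce_size=2048)
print(tx.map_for_tx(0x20000000, b"low packet"))   # used in place, no copy
print(tx.map_for_tx(0x50000000, b"high packet"))  # copied to the bounce buffer
```

The cost is one extra copy for every packet that lands above the limit, which is part of why this kind of workaround gets called ugly.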

Comment 41 Pekka Pietikäinen 2004-06-05 16:09:55 EDT
Here's a
"works-for-me-but-I'm-still-not-sure-I-want-to-admit-writing-this-patch"
patch, which seems to be the best that can be done with just driver
changes (I've posted this to netdev/l-k too)
Comment 42 Pekka Pietikäinen 2004-06-05 16:10:44 EDT
Created attachment 100897 [details]
The barf-bags strike back!
Comment 43 Ludovic Coumétou 2004-06-06 12:58:40 EDT
Hello,

I experience the same problem with my BCM4401 on my Asus Pundit, but I
only have 512Mo RAM is it the same problem or is it another bug??

I use kernel 2.6.6 compiled with 4kstacks off.

Regards,
Ludovic C.
Comment 44 Pekka Pietikäinen 2004-06-06 15:45:28 EDT
Sounds like a different bug. If it's a self-compiled kernel
bugzilla.redhat.com is the wrong place (bugme.osdl.org would be 
more appropriate). If you file a bug there, please include
details such as whether there is some combination that does work?
(2.4, the bcm4400 driver from broadcom, some earlier 2.6
version/vanilla 2.6.7 release candidates, acpi=off?)

and notify me of the bug so I can start tracking it there
and hopefully be able to reproduce it, in which case there is some
hope that I might manage to fix it as well :-)

Comment 45 Pekka Pietikäinen 2004-06-09 08:33:17 EDT
Created attachment 100988 [details]
Return of the barf-bags

This is the "final" version of the patch, which I've submitted to netdev.
Comment 46 Brad Clements 2004-06-19 20:57:20 EDT
I have a Dell Dimension 2400 with a bcm4401 (rev 1) ethernet adapter.
This machine has 256 meg of ram. I just installed stock fedora core 2
from iso. On startup, modprobe b44 works. But initializing eth0
results in many lines of BUG! Timeout waiting for bit 80000000 of
register 428 to clear.

ifconfig eth0 shows no packets sent or received.

This machine was running RedHat 9 (stock kernel), I did an upgrade
install to Fedora Core 2 and changed "alias eth0 bcm4400" to "alias
eth0 b44".

The bcm4400 driver worked fine under RedHat 9.

Shall I try the patch above? I guess I'll need to install kernel
source from CD. If someone has a binary they'd care to share, it only
has to work long enough for me to get on the 'net.
Comment 47 Pekka Pietikäinen 2004-06-19 22:06:17 EDT
Different bug, was fixed a few weeks ago in 2.6 (should be in the fc2
kernel update errata candidate too). Workaround is to not run any
drivers from broadcom (either windows or the linux one) before b44, as 
they flip the magic bit that makes b44 not work (by powering down the
PHY on shutdown)
A full power-down (physically removing the cable is the surest) should
put the chip into a sane state, just pressing the reset button isn't
enough. 

Comment 48 Bertil Askelid 2004-07-13 00:10:57 EDT
I see this also in the latest released FC2 kernel 2.6.6-1.435.2.3 on an
Inspiron 8500 with the BCM 4401 100Base-T 10/100 ethernet card. Either
using b44.ko from FC2 or bcm4401 driver from Broadcom, my system
freezes when I do ifup. The mem=1024m solves the problem on my 2G RAM
system. Now, I really would like to be able to get my full RAM back
AND still be connected!
Comment 49 Pekka Pietikäinen 2004-07-13 18:27:59 EDT
As a temporary workaround, grab http://www.ee.oulu.fi/~pp/b44-4g4g.tgz
untar, compile (there's a script, too lazy to figure out 
how to make "make" do the right thing vs. having to give it
arguments). Then just replace your b44.ko with the one it creates and
you should be a-ok. This is the same patch as is attached to this bug,
this is just in a more user-friendly format. I briefly looked at
creating a src.rpm but that looked _WAY_ too tricky :-)

Getting the patch included (which would happen if this got merged
upstream) is a bit tricky. The workaround isn't
necessary with a non-fedora kernel since it only gets triggered with a
4:4 layout (possibly something less exotic like 2:2 too, but that'd
be a non-standard config too), and patches for either are unlikely
to make it to 2.6.

Well ok, corner-cases where this fix or something similar would be
required even on a vanilla kernel (x86-64's with a bcm4401) could be
imagined but in reality there's really only a few types of x86
motherboards/laptops with the chip out there.

Anyway, the fix is pretty ugly, which is why I'm somewhat reluctant to
push it too much, but there's not really much that can be done about
that due to the nature of the (apparent hardware) bug.

Option B is making this a fedora-specific patch, but that'll increase
the maintenance of the already-overworked RH kernel people,
so I can see why they're reluctant to add it either.
Comment 50 Barry K. Nathan 2004-07-13 19:16:47 EDT
> Getting the patch included (which would happen if this got merged
> upstream) is a bit tricky. The workaround isn't
> necessary with a non-fedora kernel since it only gets triggered with a
> 4:4 layout (possibly something less exotic like 2:2 too, but that'd
> be a non-standard config too), and patches for either are unlikely
> to make it to 2.6.

"I plan to merge the 4g split immediately after 2.7 forks."

So says Andrew Morton, here:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0402.3/1351.html

And here's a post by Andrew Morton explaining the logic behind doing
it that way:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0308.0/0365.html

So, unless Andrew Morton has changed his mind recently and I haven't
noticed, merging of 4:4 into the mainline kernel is almost a certainty.
Comment 51 Barry K. Nathan 2004-07-13 19:20:36 EDT
On second thought, maybe "almost a certainty" is too strong a phrase,
but it's still quite likely to happen unless I've missed something
recent...
Comment 52 Marc Schwartz 2004-07-14 01:11:25 EDT
Just a confirmation that Pekka's fix in #49 above works.

This is on a new Dell Inspiron 5150 laptop with a 3.2 GHz P4 with 2 GB
RAM running FC2 with the 2.6.6-1.435.2.3 kernel.  The internal NIC is
a BCM4401 100Base-T (rev 01).

Yeah!  I can now put away my PCMCIA NIC.

Prior to this I would get a hard lock up when trying to bring up the
internal NIC.

What can I send you for Christmas Pekka?  :-)

Thanks!
Comment 53 Bertil Askelid 2004-07-14 15:12:26 EDT
Pekka's fix (see #49 above) works just fine.

How can something that works be ugly? Something that fixes this
problem has to be released to the next kernel! Or do I have to switch
to another distribution?

When something that works is seen as ugly, I always fault the theory
for being incomplete when it doesn't take reality into account!
Comment 54 Barry K. Nathan 2004-07-14 22:15:10 EDT
> When something that works is seen as ugly, I always fault the theory
> for being incomplete when it doesn't take reality into account!

If ugly hardware necessitates ugly code, that doesn't magically make
the code not-ugly...
Comment 55 Pekka Pietikäinen 2004-08-03 14:57:06 EDT
Could you try http://www.ee.oulu.fi/~pp/b44-095.tgz ? That one includes
a cleaned up version and bcm47xx support which someone just submitted
(you'll need to add a #define PCI_DEVICE_ID_BCM4713 0x4713; I know,
that goes in include/linux/pci_ids.h in principle :-) ). If it works
in the > 1GB of ram case (had to return the memory I borrowed so can't
verify easily myself :-( ) I'll go into patch-bomb mode until it goes
into the mainstream kernel.
Comment 56 Marc Schwartz 2004-08-03 15:29:59 EDT
Pekka,

I am getting an Access Forbidden error message when trying to get the
new file. Can you verify the URL?

Thanks.
Comment 57 Pekka Pietikäinen 2004-08-03 16:16:24 EDT
yikes, fixed :-)
Comment 58 Marc Schwartz 2004-08-03 16:48:31 EDT
Pekka,

It is working here with the #define change made. Same system as
defined in comment #52. Same kernel as well.

That's one... :-)

Thanks!
Comment 59 Marc Schwartz 2004-08-03 18:29:25 EDT
One more data point. 

The latest FC2 kernel hit my mirror (2.6.7-1.494.2.2). I have
installed it and rebuilt the b44 driver.

It works.
Comment 60 Bertil Askelid 2004-08-07 11:49:19 EDT
Works fine on my Inspiron 8500 w/ 2 GByte memory and running
linux-2.6.7-1.494.2.2.

Thanks!
Comment 61 Marc Schwartz 2004-08-15 15:59:15 EDT
Just updated to 2.6.8-1.520 on FC2. The patch does not appear to work
now. Trying to bring up the NIC now results in X locking up. I can get
out of X back to a console.

There were no errors during the compilation and nothing reported when
using ifup eth0 in a console.

Anyone else try this with the new kernel?

I am temporarily back to 2.6.7-1.494.2.2 for the moment.

Thanks.
Comment 62 Pekka Pietikäinen 2004-08-16 07:07:15 EDT
Works for me (tm) but this is with 1GB of memory only (but with
things modified so the workaround code gets used for every packet :-)
). Suppose I'll have to find some extra memory again to try things out...

Could you try http://www.ee.oulu.fi/~pp/b44-095-2.tgz with and without
mem=1024m .

Other things to try, 
line 644
if(mapping+RX_PKT_BUF_SZ > ..) -> if(1 || ...) 
and on line 939
if (mapping+len > B44_DMA_MASK) -> if( 1 || ...)  to make it 
always use memory under 16 MB. Current logic _should_ be fine though.

 
Comment 63 Marc Schwartz 2004-08-16 10:57:12 EDT
Pekka,

The updated file works, both with and without mem=1024m.

So...back to 2.6.8-1.520.

Thanks!
Comment 64 Pekka Pietikäinen 2004-08-16 12:14:38 EDT
Whew, 095-2 is what was committed upstream (drivers touching skb->data
is a no-no apparently :-) ). Supposedly the fix might go in for 2.6.9,
remains to be seen...
Comment 65 Pekka Pietikäinen 2004-09-24 05:04:53 EDT
Just a status update, the fix is in 2.6.9-rc2-mm2 (bk-netdev.patch),
hopefully will propagate to the Linus (and thus Fedora) tree 
soon.
Comment 66 Pekka Pietikäinen 2004-10-20 18:16:58 EDT
Didn't make it into Linus's tree in time for FC3 to get it that way,
but hopefully the fix can still be included (davej added to Cc: list
since otherwise this would probably get missed).

I've attached yet-another-patch, it's the patches related to b44.c and
b44.h (and nothing else) from bk-netdev.patch in 2.6.9-rc4-mm1.
Also standalone version of the same in
http://www.ee.oulu.fi/~pp/b44-bk.tgz for those who just want a quick
fix. Patches go into 2.6.9-1.639 just fine. No real changes wrt. the
previous one, it's just the version that will end up upstream eventually.

The one known issue that this patch creates is that loading the module
long after booting might fail if GFP_DMA is used by other stuff, this
is due to x86 pci_alloc_consistent() limitations, it uses GFP_DMA if
the mask is anything other than 4GB... We load network modules early
on boot tho, so this shouldn't be a big problem.

Comment 67 Pekka Pietikäinen 2004-10-20 18:19:50 EDT
Created attachment 105554 [details]
b44 update from -mm tree
Comment 68 Jwahar Bammi 2004-11-11 19:07:29 EST
One additional data point: Applicable to Shuttle XPC systems, with
FT61 motherboards. They have the Broadcom chip. Thanks to Pekka for
the patch. Applied to stock Fedora FC3 kernel-sources with no problems.
Comment 69 Dave Jones 2004-11-19 19:56:26 EST
fixed in cvs, will be in next build.
Comment 70 Bertil Askelid 2004-11-28 09:21:45 EST
When using the standalone package http://www.ee.oulu.fi/~pp/b44-bk.tgz
with linux-2.6.9-1.6_FC2 I get the following error

   # ifup /etc/sysconfig/network-scripts/ifcfg-eth0
   Determining IP information for eth0...
   SIOCSIFFLAGS: Cannot allocate memory
   SIOCSIFFLAGS: Cannot allocate memory

This happens when I have loaded all my usual applications and am using
up my RAM of 2 GByte. With no applications, after having rebooted, b44
works fine.
Comment 71 Pekka Pietikäinen 2004-11-28 13:23:44 EST
See last paragraph of comment #66. The problem is that the driver
needs about 750k of memory that has to be located under 1GB physically
to not trigger the hardware bug that causes crashes and other fun. The
driver tries to allocate that kind of memory
(pci_set_consistent_dma_mask(pdev, 0x3fffffff) ). There should be
plenty, right?

Unfortunately the way it's implemented right now in the generic x86
pci code is that if you ask for some memory with a dma mask of < 4GB,
it falls back to giving you memory from the first 16MB. Now that's a
pretty limited resource :-(. There seem to be 3 drivers that need
similar workarounds (wanxl, aacraid and b44).
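
The fallback described above can be summarised in a few lines. This is a simplified model of the behaviour as stated in this comment, not the actual x86 allocator; allocation_ceiling and the constant names are hypothetical.

```python
MB = 1 << 20
FULL_32BIT_MASK = 0xFFFFFFFF
LOW_DMA_ZONE_TOP = 16 * MB   # the "first 16MB" mentioned above


def allocation_ceiling(dma_mask):
    """Upper bound (exclusive) of the memory the allocator would pick from."""
    if dma_mask < FULL_32BIT_MASK:
        return LOW_DMA_ZONE_TOP      # any narrower mask falls back to the small zone
    return FULL_32BIT_MASK + 1       # anywhere in the 32-bit range


b44_mask = 0x3FFFFFFF  # the 1 GB mask the driver asks for
print(allocation_ceiling(b44_mask) // MB)         # 16
print(allocation_ceiling(FULL_32BIT_MASK) // MB)  # 4096
```

So although the chip could use anything below 1 GB, the roughly 750k the driver needs has to come out of a 16 MB pool. Once that pool has been used up or fragmented by a long-running system, loading the driver late would be expected to fail with "Cannot allocate memory".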
Comment 72 Marc Schwartz 2004-12-04 16:06:29 EST
I just installed the latest test kernel 2.6.9-1.698_FC3 on the system
referenced in comment #52 and my b44 is working fine without having to
utilize the external patch.

Between the now included patch and the 4g/4g fix, it looks like we
might be able to put this one to bed.  :-)

Thanks to everyone who has spent time on the patches and fixes!

Marc
Comment 73 Need Real Name 2005-01-13 11:46:02 EST
I'm running FC3 w/ kernel-2.6.10-1.737 and I'm getting the "SIOCSIFFLAGS:
Cannot allocate memory" error when trying to activate "eth0" (i.e. the
b44 driver).   Here's what I'm doing:

in the morning, boot up:
eth0 (the b44) is active on boot
eth1 (ipw2100, wireless) NOT active

everything is fine, in the evening, go home (NO poweroff/reboot):
% ifdown eth0 (to get route table default route cleared, etc)
% ifup eth1 (to use my wireless access at home)

still everything ok, UNTIL I come back to work in the morning (NO
poweroff/reboot):
% ifdown eth1 (to clear wireless route stuff)
% ifup eth0  <-- HERE's where I get the "SIOCSIFFLAGS: Cannot allocate
memory" error

Any suggestions?
thanks,
russ
Comment 74 John W. Linville 2005-01-13 13:14:49 EST
What about if you do a "modprobe -r b44" in the evening:

% ifdown eth0
% modprobe -r b44
% ifup eth1

Does that change anything?
Comment 75 Need Real Name 2005-01-14 17:42:06 EST
thanks for the response, John,

that actually worked for a quick test.  I did notice that the "mii"
module has a dependency on the b44 module, perhaps that was
introducing some problems?  

I'll try it again over a longer period on monday and see how it works
and report back here.

thanks again.
Comment 76 Need Real Name 2005-01-17 11:05:41 EST
Well, sadly it just happened again.  Here's the deal this time:

Beginning of weekend
% ifdown eth0
% modprobe -r b44
% ifup eth1

This morning:
% ifup eth0  <--   "SIOCSIFFLAGS: Cannot allocate memory" error again..

It was working in short timeframes, but does unloading over long
periods of time (workday > 7 hours?) change conditions in the kernel
sufficiently that it can't get reloaded?

btw, this behavior has been consistent even over the last handful of
updated kernels for FC3 over the last 2 weeks or so.
Comment 77 Pekka Pietikäinen 2005-01-17 11:12:40 EST
See #66 and #71 :-) This is being tracked as #145109 and on netdev.
There's an untested patch at http://www.ee.oulu.fi/~pp/b44hack
I no longer have the hardware so testing it is a bit hard :-)
Comment 78 D. Hugh Redelmeier 2005-04-04 02:08:37 EDT
If I understand the kernel's Documentation/DMA-mapping.txt (and I may not),
would it not be simpler to allocate buffers for the device using a suitable dma
mask?

(I'm visiting this bug entry because the Broadcom BCM94306 802.11g WiFi chip
seems to have a similar problem with 1G+ physical addresses.  Things are made
more interesting because the driver is a 64-bit MS Windows driver + ndiswrapper64.)
Comment 79 John W. Linville 2005-04-04 11:16:28 EDT
I see that you answered your own question in bug 145109 comment 28...posting 
here so that future searchers find the same answer... :-) 
Comment 80 Pekka Pietikäinen 2005-09-06 21:23:55 EDT
Just triaging some of the bugs I've commented on. This definitely is fixed upstream
and in all possibly relevant errata kernels these days, and even the #145109 spin-off
bug is, so I'll close the bug. If there's anything new that's broken in b44,
please file a separate bug. And someone recently broke ACPI, so try acpi=off if
it breaks for you even now! And even that should be fixed now! This driver is
officially bug-free (tm)!
