Bug 145109
Summary: b44 network interfaces can't be started after system boot
Product: [Fedora] Fedora
Component: kernel
Version: 3
Hardware: All
OS: Linux
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium
Reporter: Miloslav Trmač <mitr>
Assignee: John W. Linville <linville>
CC: hugh, jch, pp, wtogami
Doc Type: Bug Fix
Last Closed: 2005-04-25 14:00:58 UTC

Description
Miloslav Trmač
2005-01-14 14:43:49 UTC
Possible workaround posted to netdev; someone needs to test it and figure out whether it helps or not. http://www.ee.oulu.fi/~pp/b44hack has it too.

I'm running the "b44hack" patch right now against 2.6.10_741_FC3 and I'll report findings. Stupid question: what is netdev? I'm completely oblivious to kernel development groups. I'm just a user, man! :-)

Thanks for testing. netdev is the mailing list where most network code discussion happens. It would be nice to get the 1GB workaround path tested as well. That should happen by either having > 1GB of memory, running a kernel with the 4:4 split, or just changing

    if(mapping+len > B44_DMA_MASK) { /* Chip can't handle DMA to/from >1GB, use bounce buffer */

to something like

    if(1 || mapping+len > B44_DMA_MASK) {

If that works as well and the driver loads and unloads happily, I'll prod upstream to merge the patch.

Well, the patch in 'b44hack' doesn't seem to work either. In my case, I left my wireless interface (eth1) up for the duration of the night. This morning I woke up and:

    # ifdown eth1
    # ifup eth0    <-- SIOCSIFLAGS: Cannot allocate memory

There appears to be something related to how long the module stays unloaded before it is reloaded, as if those memory windows shrink the longer the wait. BTW, the laptop I'm testing this on is a Dell Inspiron 8600 with 1GB RAM. Is that 1GB path workaround something I should try too (i.e. the "1 ||" force)?

Just to try something, I went ahead and made the tweak you described above:

    if(1 || mapping+len > B44_DMA_MASK) {

And I'll report the findings.

Might also be that B44_BOUNCEBUF_SHIFT needs to be upped a bit. The problem is finding 700k of memory that is physically located under 16MB (< 1GB would be enough, but the generic x86 PCI code can only do < 16M and < 4GB).
The first version tried to find one contiguous chunk; the b44hack patch tries to allocate multiple smaller chunks (8 of them with SHIFT==3), which should be easier to find (this does waste some memory though).

Well, the above tweak "if (1 || mapping+len > B44_DMA_MASK)" didn't work either after a full night of being unloaded. Same error as before: "SIOCSIFLAGS: Cannot allocate memory". Regarding your latest post, I experimented with values of B44_BOUNCEBUF_SHIFT <= 12 and nothing has worked so far. I even put some printk's in there to see at what point in the bouncebuf allocation loop things were failing. In the last test, where the shift == 12, it failed on the 230th iteration trying to allocate 190 (bytes? I presume), so I figure around ~43K, which is significantly less than the needed 770k :-)

Let me back up a few steps: if I exit out of X or anything like that, will it free up some mem? Is there ever going to be hope of fixing this? Are there some "stupid user tricks" I can do to keep the mem the module allocated at bootup (which is successful) around? Thanks!

Created attachment 110033 [details]
b44hack modified to allocate the bounce buffer at device probe time
The attached patch moves the allocation to the time the device is probed.
This works Well Enough(tm) for me because the module is loaded at boot time,
but it is not a general solution.
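
To make the probe-time idea concrete, here is a minimal userspace sketch, not the attached patch. All names (fake_dev, fake_probe, fake_open and so on) are hypothetical, malloc() stands in for the kernel's low-memory allocator, and the chunk size is an assumption; only the "8 chunks with SHIFT==3" figure comes from this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed values for illustration; only SHIFT == 3 -> 8 chunks is from the thread. */
#define BOUNCEBUF_SHIFT       3
#define BOUNCEBUF_CHUNKS      (1u << BOUNCEBUF_SHIFT)
#define BOUNCEBUF_CHUNK_BYTES (96u * 1024u)

/* Hypothetical stand-in for the driver's private state. */
struct fake_dev {
    void *chunk[BOUNCEBUF_CHUNKS];
    int   have_bufs;
    int   is_up;
};

/* Runs once when the device is found (at boot, while memory is still easy to get). */
int fake_probe(struct fake_dev *dev)
{
    unsigned i;

    dev->have_bufs = 0;
    dev->is_up = 0;
    for (i = 0; i < BOUNCEBUF_CHUNKS; i++) {
        dev->chunk[i] = malloc(BOUNCEBUF_CHUNK_BYTES);
        if (dev->chunk[i] == NULL) {
            while (i--)             /* undo the chunks already obtained */
                free(dev->chunk[i]);
            return -1;
        }
    }
    dev->have_bufs = 1;
    return 0;
}

/* "ifup": performs no allocation, so it cannot fail for lack of memory. */
int fake_open(struct fake_dev *dev)
{
    if (!dev->have_bufs)
        return -1;
    dev->is_up = 1;
    return 0;
}

/* "ifdown": the chunks are deliberately kept for the next open. */
void fake_close(struct fake_dev *dev)
{
    dev->is_up = 0;
}

/* Module unload: the only place the chunks are released. */
void fake_remove(struct fake_dev *dev)
{
    unsigned i;

    for (i = 0; i < BOUNCEBUF_CHUNKS; i++) {
        free(dev->chunk[i]);
        dev->chunk[i] = NULL;
    }
    dev->have_bufs = 0;
    dev->is_up = 0;
}
```

Calling fake_probe() once and then fake_open()/fake_close() any number of times never touches the allocator again, which is why this only helps when the module is loaded early and stays loaded.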
My other attempts (to use single-page or two-page allocations, even
with 5 buffers / 2 pages) always worked fine right after the kernel
compile, but not after a fresh boot.
I'd still prefer a solution that didn't waste 761 kB of DMA memory
on my 256MB laptop; would it be possible to only allocate a single
bounce buffer for each packet that is >1GB and deallocate/reuse it
after transmit?
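
A rough userspace sketch of that per-packet idea follows. It is an illustration only: tx_slot, tx_prepare and tx_complete are hypothetical names, bus_addr is a simulated bus address passed in by the caller, malloc() stands in for an allocation from memory the chip can reach, and the mask value is the 0x3fffffff figure quoted in this thread.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define B44_DMA_MASK 0x3fffffffULL  /* 1GB limit, value as quoted in the thread */

/* Hypothetical per-descriptor bookkeeping. */
struct tx_slot {
    void       *bounce;  /* non-NULL only while a bounce copy is in flight */
    const void *data;    /* buffer the chip would be told to read from */
};

/* Bounce only when the packet's (simulated) bus address is out of reach. */
int tx_prepare(struct tx_slot *slot, const void *pkt,
               uint64_t bus_addr, size_t len)
{
    slot->bounce = NULL;
    slot->data = pkt;

    if (bus_addr + len > B44_DMA_MASK) {
        slot->bounce = malloc(len);  /* stand-in for a low-memory allocation */
        if (slot->bounce == NULL)
            return -1;
        memcpy(slot->bounce, pkt, len);
        slot->data = slot->bounce;
    }
    return 0;
}

/* After transmit completes, the bounce copy (if any) is released. */
void tx_complete(struct tx_slot *slot)
{
    free(slot->bounce);  /* free(NULL) is a no-op */
    slot->bounce = NULL;
    slot->data = NULL;
}
```

With this shape, a machine whose packets all sit below the limit never allocates a bounce buffer at all, which is the memory saving asked about above.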
Blah...
There is always the quick fix of a B44_DMA_MASK of 0xffffffff (see
#118165 for the long discussion that led to this bug)
But of course if you remove the bcm4401 from your mobo with a
soldering iron and jury-rig it into a sparc64 it will totally break,
so this obviously cannot go upstream! ;) ;) ;)
Option b is some kind of awful hack to only set the consistent dma
mask to the real value after the kernel has given something that goes
> 1GB. Totally misuses the kernel PCI DMA api, but...
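
Why the mask value matters so much can be sketched as follows, based on the earlier remark that the generic x86 PCI code can only do < 16M and < 4GB. This is a simplified model under that assumption, not the kernel's code; the enum and function names are hypothetical.

```c
#include <stdint.h>

/* The only two pools the generic x86 code is described as offering. */
enum dma_zone {
    ZONE_BELOW_16MB,  /* legacy ISA-style DMA memory */
    ZONE_BELOW_4GB    /* ordinary memory */
};

/* Any mask narrower than a full 32 bits falls back to the tiny 16MB pool. */
enum dma_zone zone_for_mask(uint64_t coherent_mask)
{
    return coherent_mask < 0xffffffffULL ? ZONE_BELOW_16MB : ZONE_BELOW_4GB;
}
```

Under this model a truthful 1GB mask (0x3fffffff) is served from the 16MB pool, while 0xffffffff is served from ordinary memory, which is why the quick fix makes the allocation failures go away.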
Oddly enough, yesterday when I finished trying various combos of the SHIFT value, I reverted to the original b44.ko (from my 2.6.10_741_FC3 tree) and it loaded just AFTER I had killed off a few apps (eclipse, thunderbird, IM client, etc.)... Do those apps consume memory that triggers this issue? I'll give Miloslav's b44hack2 a try, and since it fits my usage pattern I'm guessing it'll work Well Enough(tm) for me also :-) What does setting the B44_DMA_MASK to 0xffffffff accomplish?

Makes the kernel think that any memory <=4GB is good enough for the hardware/driver. 0x3fffffff means anything <=1GB is ok, which is actually the truth, but the kernel interprets that as "only use memory located in the first 16MB", which is a legacy broken ISA-board thing, but things have changed "slightly" since. Ergo lots'o'problems: I don't think there's even any attempt to reserve that area for broken hardware (vs. drivers asking for any kind of memory getting it), so the kernel quickly runs out of it no matter what you do. The 0xffffffff will work because the kernel will never use anything >= 1GB in the default configuration, but 3rd party patches (like the 4:4 split used previously in Fedora, and some other vendors have done similar things) do make this assumption false.

So far so good with Miloslav's 'b44hack2' for the last two days with my same usage pattern. While it may be inelegant, pre-allocating the buffers at probe time seems to do the trick. Is there a probable long-term solution for this? Or is it an issue of the PCI DMA API improving?

Pekka and/or Miloslav, was the b44hack2 patch proposed upstream? If so, how did it fare?

I have not proposed it upstream... I personally don't consider it upstream-worthy.

Not "upstream-worthy"? Yikes! That's scary. Without that patch (or something of the sort), my laptop is not "linux-worthy". :-) What's the most likely long-term solution here?
(I've asked this before, but can anyone respond to this question?)

I'd like the "preferred solution" described in comment #9, but I don't know the network API well enough to say whether it's even possible.

I discussed this briefly w/ Jeff Garzik, and he leans toward the "preferred solution" from comment 9 as well, fwiw... I'll probably take a peek at that, but if someone beats me to it that would be OK too... :-)

Created attachment 111837 [details]
b44-bounce-bufs.patch
My interpretation of the preferred solution from comment 9...

I have pre-built test kernels available here: http://people.redhat.com/linville/kernels/fc3/ These, of course, include the patch from comment 20. Unfortunately, I don't have a box w/ >1GB of memory. But I did test by setting B44_DMA_MASK to just 16MB... that seems to be working fine -- ttcp has been pounding on it for hours. Please give this a test and let me know the results!

Looks sane to me (but -ENOHARDWARE so not tested). Patch should be sent to netdev and akpm I think. It's certainly better than whatever code currently is in any tree.

Definitely will push it upstream provided I get no negative feedback. I would prefer to hear some positive reports first, so I'll wait a day or two... :-)

The test kernel works fine here, but I don't have >1 GB of RAM either.

As of today, I have submitted the patch upstream (which is the best path into Fedora)...

Cool, John L. I was just going to drop a note here saying that it's been working fine for me for the last two days. Thank you and nice job!

These seem to be the relevant netdev messages:
http://oss.sgi.com/projects/netdev/archive/2005-01/msg00053.html
http://oss.sgi.com/projects/netdev/archive/2005-01/msg01096.html
http://oss.sgi.com/projects/netdev/archive/2005-03/msg00855.html
http://oss.sgi.com/projects/netdev/archive/2005-03/msg00950.html
I'm visiting this bug entry because the Broadcom BCM94306 802.11g WiFi chip seems to have a similar problem with 1G+ physical addresses.
Things are made more interesting because the driver is a 64-bit MS Windows driver + ndiswrapper64. From the netdev messages, I understand that a 1G DMA mask forces allocation below 16M on x86 and that this is way overconstrained. Is this true on x86_64 (AMD variant)? In other words, is the adopted fix best on all architectures?

For b44, x86_64 is irrelevant: the chips can only be found embedded on x86's (and embedded broadcom MIPS platforms, which usually have 32MB of memory max). And on PCI-based cards, but those are reference designs that you can only get directly from broadcom (if I understood correctly). And with the latest "allocate only if necessary" patch the amount of GFP_DMA used is quite small. Probably the best forum to discuss the wifi stuff would be the ndiswrapper mailing list (with possible Cc:s to netdev and/or linux-kernel). Not that they necessarily care on those lists ;)

For arches that take the pci_set_dma_mask value literally, pci_alloc_* could be safely used. However, I think this solution should still work. And it prevents us from having arch-specific versions of b44. Do you have another alternative? If so, please suggest it (and include a patch if possible)...

b44 works again out of the box with kernel-2.6.11-1.14_FC3 (and in FC4t2). Thanks again!

*** Bug 134790 has been marked as a duplicate of this bug. ***