Bug 324191
| Field | Value |
|---|---|
| Summary | Broadcom 43xx driver very frequently fails to associate |
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | 7 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | low |
| Reporter | David Campbell <david> |
| Assignee | John W. Linville <linville> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | cebbert, davej, jarod |
| Fixed In Version | 2.6.23.1-5.fc7 |
| Doc Type | Bug Fix |
| Last Closed | 2007-10-18 17:51:51 UTC |

Description
David Campbell
2007-10-09 03:59:36 UTC
Are you using NetworkManager? Does it behave differently if you stop NetworkManager and use the ifconfig/iwconfig commands by hand?

    service NetworkManager stop
    ifconfig wlan0 up
    iwconfig wlan0 key <wep key>
    iwconfig wlan0 essid <essid for wlan>
    dhclient wlan0

Yes, I am using NetworkManager, but it does not make any difference if I use the above. I get all the same errors, notably the "failed to parse AssocResp" message. If you provide me with a precompiled driver (2.6.22.9-91.fc7), or something I can use to compile the driver with changes that might indicate the issue, I'm happy to install and test and report dmesg content.

Hmmm, looks like the "failed to parse" error comes from /lib/modules/2.6.22.9-91.fc7/kernel/net/mac80211/mac80211.ko in function ieee802_11_parse_elems, where it seems to be caused by an unknown element being received. Code I found on the net seems to have logged a warning when an unknown element was received, but the current code returns complete failure. Downloading official source now.

I've got similar hardware (bcm4306 in a powerbook g4), but a bit different behavior, using the latest rawhide kernels. I had problems associating with F7 kernels up to 2.6.22.<something> and have since switched over to rawhide. No more association problems with the latest rawhide kernel, though I've had the connection go completely belly-up on me, killing off NetworkManager and my wireless base station (happened once, haven't yet tried to reproduce).

I've managed to compile my own mac80211.ko with debug output in the ieee802_11_parse_elems function of ieee80211_sta.c that shows the content of the association response. It turns out that my router is sending the required association response elements, but most of the time it trails them with invalid element ids. Linux is strict about them and fails the association because of it, even though other operating systems don't.

However, the linux code already shows a certain fault tolerance for Apple hardware, as indicated by this comment:

    /* Do not trigger error if left == 1 as Apple Airport base stations
     * send AssocResps that are one spurious byte too long. */

I have emailed the support contact of the router vendor about this, but a practical solution is to make linux tolerant of the router bug: when the parser hits an invalid element id, it should consider the parsing of the association response to be finished instead of failing. This is easily accomplished in the code.

Well, I appreciate the "shoe leather"...but I think your analysis is flawed. All the places calling ieee802_11_parse_elems specifically check the return code for ParseFailed. If ieee802_11_parse_elems encounters unknown IE types, it will return ParseUnknown. So, this does not explain the error you are experiencing.

That function only returns ParseFailed if the next element runs off the end of the frame. So if you are getting that message, it would seem that your AP is generating bad association responses. Might your AP be generating fragmented association responses? I don't know if we can handle that, or if it is actually compliant w/ the spec...

Sorry for the lack of clarity. ParseFailed is being returned, because the code checks for running off the end of the frame before it checks the id. Both the id and the elen returned by my router after the other valid ids and elens are wrong. It seems there's rubbish at the end of the frame, or the frame length is wrong.

    parse_elems len=18 (total frame len)
    WLAN_EID_SUPP_RATES id=1 elen=8 left=16) -> OK
    WLAN_EID_EXT_SUPP_RATES id=50 elen=4 left=6) -> OK
    IEEE 802.11 element parse failed (id=67 elen=207 left=0) -> BAD

So in this instance it is hitting an id of 67 and an elen of 207, both bad. Judging by the valid elements, the frame length should really be (2+8) + (2+4), which is 16, and the router is returning 18. The Apple Airport bug that the driver already works around returns 17 instead of 16.

Created attachment 223561
proposed patch to ieee80211_sta.c
I propose a patch, attached...basically it moves the frame overflow detection
in the ieee802_11_parse_elems function into the switch, but in the default case
of bad id, permits the return of ParseUnknown in the case of a frame overflow.
All the other cases still return ParseFailed.
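
For readers following along, the behavior described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the attached patch or the kernel source: the function name `parse_elems_sketch`, the `tolerate_junk` flag, the `PARSE_*` enum and the two-entry list of known ids are all hypothetical, and the real ieee802_11_parse_elems handles many more element types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ParseOK / ParseUnknown / ParseFailed. */
enum parse_res { PARSE_OK, PARSE_UNKNOWN, PARSE_FAILED };

#define EID_SUPP_RATES      1   /* WLAN_EID_SUPP_RATES */
#define EID_EXT_SUPP_RATES 50   /* WLAN_EID_EXT_SUPP_RATES */

/* Assumption: only the two ids seen in the debug trace count as known. */
static int is_known_id(uint8_t id)
{
	return id == EID_SUPP_RATES || id == EID_EXT_SUPP_RATES;
}

/*
 * Walk the (id, elen, payload) elements in buf.
 * tolerate_junk == 0: any element that overruns the frame is a failure.
 * tolerate_junk != 0: an overrunning element with an unknown id is
 * treated as trailing junk, and parsing stops with PARSE_UNKNOWN.
 */
enum parse_res parse_elems_sketch(const uint8_t *buf, size_t len,
				  int tolerate_junk)
{
	const uint8_t *pos = buf;
	size_t left = len;
	int unknown = 0;

	while (left >= 2) {
		uint8_t id = *pos++;
		uint8_t elen = *pos++;

		left -= 2;
		if (elen > left) {
			if (tolerate_junk && !is_known_id(id))
				return PARSE_UNKNOWN;
			return PARSE_FAILED;
		}
		if (!is_known_id(id))
			unknown++;
		left -= elen;
		pos += elen;
	}

	/* left is 0 or 1 here, so one spurious trailing byte is accepted. */
	return unknown ? PARSE_UNKNOWN : PARSE_OK;
}
```

Fed the 18-byte frame from the trace (id=1 elen=8, id=50 elen=4, then the bytes 67 and 207), the strict mode fails while the tolerant mode reports an unknown element, which callers that only check for failure would accept. The 16-byte and 17-byte (Apple Airport style) variants parse cleanly in both modes.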

Regarding patches, please use "diff -u" to generate them and please follow kernel style guidelines (esp. using tabs for indentation) if at all possible...thanks!

If I understand your patch, you are trying to simply ignore any unknown elements at the end of a frame, even if they run off the end. This presumes they are junk data. This gave me some initial heartburn and I'm still not entirely sure it is the right way to go. But, I'm prepared to consider the approach.

Created attachment 224811
parse-elems-trunk-junk.patch
I like it better coded this way.. :-)

Would you mind testing the above patch in your environment? Thanks!

BTW, what is the make/model of your access point?

Yes, your patch works fine for me. Thanks! The wireless-router-voip-printserver is a http://www.draytek.com.au/products/Vigor2900.php which has one of the richest feature sets around, supports lpr printing and hence linux, and has been good to me until now.

A somewhat different patch was accepted upstream. The 2.6.23.1-5.fc7 kernels have it: http://koji.fedoraproject.org/koji/buildinfo?buildID=21517