Description of problem:
In a very busy network space (i.e., I can see over 90 APs from here), the iwl3945 driver can sometimes fail to associate. According to a member on the networkmanager list, this is because the iwl3945 driver can scan while associating, and this can cause the association to fail. The patch at http://marc.info/?l=linux-wireless&m=119668234926912&w=2 is supposed to correct this.

Version-Release number of selected component (if applicable):
Tested in both kernel-2.6.23.8-34.fc7 and kernel-2.6.23.1-21.fc7

How reproducible:
Somewhat reproducible, but not 100%. It seems to be a race condition: it depends on whether a scan happens while you're trying to associate (or re-associate). It's a little hard to actually test this, but if you can roll me a new kernel in the next two days I'll still be in this environment and can certainly test it for you! But only through 12 noon US/PST on Friday, December 7.

Steps to Reproduce:
1. try to connect to a network
2. watch the driver flail
3. lather, rinse, repeat until you get connected

Actual results:
eth1: Initial auth_alg=0
eth1: authenticate with AP 00:19:a9:45:2f:a1
eth1: Initial auth_alg=0
eth1: authenticate with AP 00:19:a9:45:2f:a1
eth1: Initial auth_alg=0
eth1: authenticate with AP 00:19:a9:45:2f:a1
eth1: authenticate with AP 00:19:a9:45:2f:a1
eth1: authenticate with AP 00:19:a9:45:2f:a1
eth1: authentication with AP 00:19:a9:45:2f:a1 timed out

Expected results:
eth1: authenticate with AP 00:1c:b0:e6:d3:21
eth1: RX authentication from 00:1c:b0:e6:d3:21 (alg=0 transaction=2 status=0)
eth1: authenticated
eth1: associate with AP 00:1c:b0:e6:d3:21
eth1: RX ReassocResp from 00:1c:b0:e6:d3:21 (capab=0x1 status=0 aid=63)
eth1: associated

Additional info:
http://mail.gnome.org/archives/networkmanager-list/2007-December/msg00087.html
http://marc.info/?l=linux-wireless&m=119668234926912&w=2
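For anyone else trying to reproduce this, here is a rough Python sketch that tallies the kernel log lines quoted in the report, so a failed attempt can be told apart from a successful one. The function name `summarize` and the regular expressions are my own and are based only on the message formats shown in this report; other kernel versions may word these messages differently.

```python
import re

# Patterns derived from the log lines quoted in this report (an assumption:
# other kernels or drivers may format these messages differently).
AUTH_TRY = re.compile(r": authenticate with AP [0-9a-f:]{17}$")
AUTH_TIMEOUT = re.compile(r": authentication with AP [0-9a-f:]{17} timed out$")
ASSOCIATED = re.compile(r": associated$")


def summarize(lines):
    """Count authentication attempts and timeouts, and note whether we associated."""
    attempts = 0
    timeouts = 0
    associated = False
    for raw in lines:
        line = raw.strip()
        if AUTH_TRY.search(line):
            attempts += 1
        elif AUTH_TIMEOUT.search(line):
            timeouts += 1
        elif ASSOCIATED.search(line):
            associated = True
    return {"attempts": attempts, "timeouts": timeouts, "associated": associated}
```

Feeding it the output of `dmesg` split into lines should be enough. Several attempts followed by a timeout and no "associated" line matches the failure described here.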
This would be a good patch to get in anyway: there's no telling when some random app will call SIOCSIWSCAN, and a driver (or stack) that allows scans during association or reassociation (or even EAP exchanges) is just plain 100% broken.
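To illustrate the behaviour being asked for, here is a toy Python model (not the linked patch, and not real driver or mac80211 code) in which a scan request is refused while the link is being set up. The class `ToyStation`, the `LinkState` names, and the choice of returning `-EBUSY` are all assumptions made for the sake of the sketch.

```python
import errno
from enum import Enum, auto


class LinkState(Enum):
    IDLE = auto()
    AUTHENTICATING = auto()
    ASSOCIATING = auto()
    EAP_EXCHANGE = auto()
    ASSOCIATED = auto()


# States in which a scan would take the radio off-channel at a bad moment.
# Which states belong here is an assumption of this sketch.
SCAN_BLOCKED = {
    LinkState.AUTHENTICATING,
    LinkState.ASSOCIATING,
    LinkState.EAP_EXCHANGE,
}


class ToyStation:
    """Hypothetical stand-in for the driver/stack state; illustration only."""

    def __init__(self):
        self.state = LinkState.IDLE
        self.scans_started = 0

    def request_scan(self):
        """Model of a scan request such as one arriving via SIOCSIWSCAN."""
        if self.state in SCAN_BLOCKED:
            # Refuse rather than interrupt the association in progress.
            return -errno.EBUSY
        self.scans_started += 1
        return 0
```

With this model, a request made while `state` is `LinkState.ASSOCIATING` returns `-errno.EBUSY` and no scan is started. The caller can simply retry once the link is up.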
The patch in question is available in the rawhide kernels here: http://koji.fedoraproject.org/koji/buildinfo?buildID=26735 I'll probably have an F8 kernel w/ the new stuff soon as well. If you get a chance in the meantime, please test the kernel above.
Any chance you could build an FC7 kernel too? I'm not running 8 or rawhide. Thanks,
Hmmm...well, it may be a while...
Define "a while". If you mean "I can't get the patch in and get a kernel rebuilt until the end of the day", then that's COMPLETELY fine. If, however, it means "I won't get to it at all in the next few days", I'd ask you humbly to change your mind, because I can actually test it in a live, hostile environment only through Friday this week.
You are the itinerant tester, aren't you? :-) I'll see what I can do -- probably tomorrow at the earliest.
Yes, I am. I do a lot of travel into various environments. :-D It also means I find lots of issues because I have access to (and experience with) a multitude of harsh environments. The IETF meeting is the best testbed you can find! Thank you. Tomorrow would be perfect.
http://koji.fedoraproject.org/koji/buildinfo?buildID=26993 Wanna try that?
So far so good. It booted (although the screen didn't come up on the first boot -- but a cold restart later and it came right up). First thing I noticed is that it took a bit longer than usual to get onto the net. I think the reason is that when the device is first started, Fedora tries to ifup the device and it goes and tries to associate on its own. So NM can't perform a scan until the initscripts time out, at which point the device was down and NM couldn't get a read at all. I'm in a meeting right now but I'll see what happens when I move locations in about 15 minutes or so. But the good news is that once NM got ahold of the device, it connected on the first try, whereas many times before I would need three or four attempts to connect. So it's looking promising.
Okay... I just migrated across. I did lose connectivity as I migrated, but NM brought it back all on its own. Unfortunately I got a different IP address, so my TCP connections died. But this is still a better situation than it used to be! My next test will be next week, on an open, non-broadcast (SSID) network! But so far 2.6.23.9-45.fc7 is looking promising!
Oh, I figured out why my IP address changed: when I migrated, NM jumped to a different SSID, and that different SSID hands out a different range of IPs. Oops. But that's probably more of an NM issue than a driver issue, I would guess. I'll be migrating again in about 10 minutes so I'll see what happens in the other direction.
It sounds like this problem is resolved in 2.6.23.9-45.fc7...