Red Hat Bugzilla – Bug 477821
mac80211 gets in association loop spat with wpa_supplicant
Last modified: 2009-12-18 02:23:09 EST
Created attachment 327779 [details]
Set the station state to DISABLED when locally deauthenticating/disassociating
Description of problem:
Once per second, my wireless adapter associates with an AP and is immediately disassociated. This loop continues endlessly unless a timing race is won.
More information is available in bug 472248, comment #15, but the gist of it is that NetworkManager sets the ESSID of the iwl4965 (iwlagn driver) to NULL, and sets flag to 1 (disable/any). mac80211 auto-associates with an AP (as there is only one ESSID being broadcast) and emits a WEXT event to let wpa_supplicant know about it. wpa_supplicant doesn't have a config for this network and disassociates. ieee80211_sta_disassociate() doesn't set the station state to DISABLED, so when the de-auth packet comes in from the AP, a timer is set up to fire in IEEE80211_RETRY_AUTH_INTERVAL jiffies (1 second) and we reauth/reassoc and the loop starts again.
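The cycle described above can be modeled as a toy timeline. This is a simplified Python simulation, not mac80211 code: the two-state `State` enum, the `simulate_loop` function, and the one-step-per-second clock are assumptions made for illustration. Only the order of events and the 1-second retry interval come from the description.

```python
from enum import Enum, auto


class State(Enum):
    # Simplified for illustration: the real station state machine has more states.
    DISABLED = auto()
    ASSOCIATED = auto()


RETRY_AUTH_INTERVAL_S = 1  # stands in for IEEE80211_RETRY_AUTH_INTERVAL (1 second)


def simulate_loop(seconds):
    """Return the times (in whole seconds) at which the station associates."""
    association_times = []
    retry_at = 0  # the first auto-association happens immediately
    for now in range(seconds):
        if now != retry_at:
            continue
        # 1. ESSID is "any", so the stack auto-associates with the visible AP.
        state = State.ASSOCIATED
        association_times.append(now)
        # 2. wpa_supplicant has no config for this network and disassociates.
        #    The unpatched local path leaves the state untouched (not DISABLED).
        # 3. The AP's deauth frame arrives; because the state is not DISABLED,
        #    a retry timer is armed and the cycle repeats.
        if state is not State.DISABLED:
            retry_at = now + RETRY_AUTH_INTERVAL_S
    return association_times


print(simulate_loop(5))  # prints [0, 1, 2, 3, 4]
```

In this model the station associates once every simulated second for as long as the simulation runs, which matches the once-per-second loop seen in dmesg.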
Version-Release number of selected component (if applicable):
Easily reproducible on boot. Even if it manages to win the race right off, just executing 'iwconfig wlan0 essid any' will cause it to start again.
Steps to Reproduce:
1. Be in an environment with multiple APs that use Cisco's 'Guest SSID mode': the APs broadcast an open SSID, while the SSID you want to connect to is WPA-secured and hidden.
2. Start NetworkManager and watch dmesg for the loop. If the loop does not start, running 'iwconfig wlan0 essid any' will usually kick it off.
Looping requests once per second.
There are logs of this in action in bug 472248, attachment 327604 [details].
This looks to be fixed in 2.6.28-rc1 and beyond due to commit aa458 by Tomas Winkler. It may have been fixed earlier, but fighting git gui blame to show earlier commits has been unproductive.
The attached patch seems to fix the problem. It sets the station state to DISABLED when we locally decide to disassociate or deauthenticate. The second hunk is enough to fix my specific problem, but the first hunk makes sense as well. The stack should probably be checked to see if other areas need the same attention.
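The idea behind the patch can be illustrated with a small model. This is a hedged Python sketch, not the patch itself: the `Station` class, its method names, and the `patched` flag are hypothetical, and the real kernel code handles more states and more paths than the two shown here.

```python
from enum import Enum, auto


class StaState(Enum):
    # Simplified for illustration: only the two states relevant to this bug.
    DISABLED = auto()
    ASSOCIATED = auto()


class Station:
    """Toy station that optionally applies the 'set DISABLED locally' idea."""

    def __init__(self, patched):
        self.patched = patched
        self.state = StaState.ASSOCIATED

    def local_deauthenticate(self):
        # With the patch idea applied, a locally requested deauth disables the station.
        if self.patched:
            self.state = StaState.DISABLED

    def local_disassociate(self):
        # Likewise for a locally requested disassociation.
        if self.patched:
            self.state = StaState.DISABLED

    def on_deauth_from_ap(self):
        """Return True if the retry timer would be armed after the AP's deauth frame."""
        return self.state is not StaState.DISABLED


unpatched = Station(patched=False)
unpatched.local_disassociate()
print(unpatched.on_deauth_from_ap())  # prints True: the retry timer is armed, so the loop continues

patched = Station(patched=True)
patched.local_disassociate()
print(patched.on_deauth_from_ap())  # prints False: the station stays down
```

Under these assumptions, once either local path marks the station DISABLED, the AP's deauth frame no longer schedules a retry.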
Part of the problem here is wpa_supplicant not telling the kernel to disassociate completely; can you try out the package in:
those should solve the looping association problem; the supplicant will *really* try to stop the driver associating.
Ugh; I lied; that update doesn't have the patch because I was going to let it sit in Rawhide for a bit. Do you mind rebuilding a test supplicant RPM?
That will contain the disassociation patch.
Created attachment 332248 [details]
wpa_supplicant log of loop
Nulling out the base station and setting the SSID to a random string seems like using a tank to kill a fly... but I suppose that defence in depth is a good practice, and you need to work with old kernels as well.
In any event, the patch doesn't help, as the kernel/wpa_supplicant gets into the loop before wpa_supplicant ever thinks it is associated, so it never calls wpa_driver_wext_disassociate, per the attached debug log.
So, I think we're perhaps avoiding the disassociate part from wpa_supplicant, and it seems the kernel is quite capable of getting itself into a loop on its own. Fortunately, I hear a 2.6.28 kernel is on the way for F10, so that particular problem should be fixed soon.
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '10'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 10's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 10 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
So if the driver was already in that loop when the supplicant started, then it won't be able to break out of it. But installing that RPM and doing an rmmod/modprobe or a reboot should fix the issue, since the supplicant will forever after clobber the driver when disconnecting, and the driver won't get into this loop-spam.
In any case, I think this is fixed in F11 and later, and hopefully with F10's wpa_supplicant-0.6.4-5?
If I recall correctly, the driver was _not_ in the loop before wpa_supplicant started -- wpa_supplicant was the initiator of the loop. I've avoided using wireless on this laptop at work due to the issue, so it's been a while.
I'll try this out once I upgrade to F12, but this bug can be closed -- I'll reopen if it recurs.
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.