Bug 477821 - mac80211 gets in association loop spat with wpa_supplicant
Summary: mac80211 gets in association loop spat with wpa_supplicant
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-12-23 23:08 UTC by David Dillow
Modified: 2009-12-18 07:23 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-18 07:23:09 UTC
Type: ---
Embargoed:


Attachments
Set the station state to DISABLED when locally deauthenticating/disassociating (590 bytes, patch)
2008-12-23 23:08 UTC, David Dillow
wpa_supplicant log of loop (34.76 KB, text/plain)
2009-02-17 16:49 UTC, David Dillow

Description David Dillow 2008-12-23 23:08:43 UTC
Created attachment 327779
Set the station state to DISABLED when locally deauthenticating/disassociating

Description of problem:
Once per second, my wireless adapter associates with an AP and is immediately disassociated. This loop continues endlessly unless a timing race is won.

More information is available in bug 472248, comment #15, but the gist of it is that NetworkManager sets the ESSID of the iwl4965 (iwlagn driver) to NULL, and sets the flag to 1 (disable/any). mac80211 auto-associates with an AP (as there is only one ESSID being broadcast) and emits a WEXT event to let wpa_supplicant know about it. wpa_supplicant doesn't have a config for this network and disassociates. ieee80211_sta_disassociate() doesn't set the station state to DISABLED, so when the de-auth packet comes in from the AP, a timer is set up to fire in IEEE80211_RETRY_AUTH_INTERVAL jiffies (1 second). When the timer fires we reauth/reassoc, and the loop starts again.
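
For illustration only, the sequence above can be written as a toy timeline in Python. This is not mac80211 or wpa_supplicant code: the function name, the event strings, and the constant RETRY_AUTH_INTERVAL_S (standing in for IEEE80211_RETRY_AUTH_INTERVAL) are made up for the sketch, and it assumes the race is never won.

```python
# Toy timeline of the association loop; all names here are hypothetical.
RETRY_AUTH_INTERVAL_S = 1  # stands in for IEEE80211_RETRY_AUTH_INTERVAL (1 second)


def simulate_loop(seconds):
    """Return (time, event) tuples for the unpatched behaviour over `seconds`."""
    trace = []
    now = 0
    while now < seconds:
        # mac80211 auto-associates with the only ESSID being broadcast
        # and emits a WEXT event for wpa_supplicant.
        trace.append((now, "associated"))
        # wpa_supplicant has no config for this network and disassociates;
        # the station state is NOT set to DISABLED at this point.
        trace.append((now, "local disassociate"))
        # The AP's de-auth packet arrives and, because the station is not
        # DISABLED, the retry timer is armed.
        trace.append((now, "retry timer armed"))
        # The timer fires one interval later and the cycle repeats.
        now += RETRY_AUTH_INTERVAL_S
    return trace
```

Calling simulate_loop(5) gives five "associated" events, at t = 0 through 4, which is the once-per-second rate described above.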


Version-Release number of selected component (if applicable):
kernel-2.6.27.7-134.fc10.x86_64
wpa_supplicant-0.6.4-2.fc10.x86_64


How reproducible:
Easily reproducible on boot. Even if it manages to win the race right off, just executing 'iwconfig wlan0 essid any' will cause it to start again.


Steps to Reproduce:
1. Be in an environment with multiple APs using Cisco's 'Guest SSID mode': an open SSID is broadcast, and the SSID you want to connect to is WPA-secured and hidden.
2. Start NetworkManager and watch dmesg for the loop. If it fails to start looping, running 'iwconfig wlan0 essid any' will usually kick the loop off.
Actual results:
Looping requests once per second.


Expected results:
No looping.


Additional info:
There are logs of this in action in bug 472248, attachment 327604.
https://bugzilla.redhat.com/attachment.cgi?id=327604

This looks to be fixed in 2.6.28-rc1 and beyond due to commit aa458 by Tomas Winkler. It may have been fixed earlier, but fighting git gui blame to show earlier commits has been unproductive.

The attached patch seems to fix the problem. It sets the station state to DISABLED when we locally decide to disassociate or deauthenticate. The second hunk is enough to fix my specific problem, but the first hunk makes sense as well. The stack should probably be checked to see if other areas need the same attention.
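
As a rough sketch of the idea behind the patch (a simplified Python model, not the patch itself and not real mac80211 code), the de-auth handler only arms the retry timer when the station is not DISABLED. The class, function, and state names below are invented for the example, and the `patched` switch is just a way to compare the two behaviours side by side.

```python
from dataclasses import dataclass
from enum import Enum, auto


class StaState(Enum):
    # Simplified, hypothetical stand-ins for the station states.
    DISABLED = auto()
    AUTHENTICATE = auto()
    ASSOCIATED = auto()


@dataclass
class Station:
    state: StaState = StaState.ASSOCIATED
    retry_timer_armed: bool = False


def local_disassociate(sta, patched):
    """Model a locally requested disassociate/deauthenticate."""
    if patched:
        # What the patch is described as doing: mark the station DISABLED.
        sta.state = StaState.DISABLED
    # Unpatched: the state is left as it was.


def rx_deauth(sta):
    """Model the de-auth packet arriving from the AP."""
    if sta.state is StaState.DISABLED:
        return  # nothing to retry, so the loop cannot restart
    sta.state = StaState.AUTHENTICATE
    sta.retry_timer_armed = True  # would fire after the retry interval
```

With patched=False, a local_disassociate() followed by rx_deauth() leaves the retry timer armed; with patched=True the station stays DISABLED and no timer is armed.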

Comment 1 Dan Williams 2009-02-13 12:52:01 UTC
Part of the problem here is wpa_supplicant not telling the kernel to disassociate completely; can you try out the package in:

https://admin.fedoraproject.org/updates/F10/FEDORA-2009-1333

Those should solve the looping association problem; the supplicant will *really* try to stop the driver associating.

Comment 2 Dan Williams 2009-02-13 12:57:31 UTC
Ugh; I lied; that update doesn't have the patch because I was going to let it sit in Rawhide for a bit.  Do you mind rebuilding a test supplicant RPM?

http://bigw.org/~dan/wpa_supplicant-0.6.4-3.fc10.src.rpm

That will contain the disassociation patch.

Comment 3 David Dillow 2009-02-17 16:49:06 UTC
Created attachment 332248
wpa_supplicant log of loop

Nulling out the base station and setting the SSID to a random string seems like using a tank to kill a fly... but I suppose that defence in depth is a good practice, and you need to work with old kernels as well.

In any event, the patch doesn't help, as the kernel/wpa_supplicant gets into the loop before wpa_supplicant ever thinks it is associated, so it never calls wpa_driver_wext_disassociate, per the attached debug log.

So, I think we're perhaps avoiding the disassociate part from wpa_supplicant, and it seems the kernel is quite capable of getting itself into a loop on its own. Fortunately, I hear a 2.6.28 kernel is on the way for F10, so that particular problem should be fixed soon.

Comment 4 Bug Zapper 2009-11-18 10:33:27 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 5 Dan Williams 2009-11-18 20:44:27 UTC
So if the driver was already in that loop when the supplicant started then it won't be able to break out of it, but installing that RPM and doing an rmmod/modprobe or a reboot should fix the issue, since the supplicant will forever after clobber the driver when disconnecting, and the driver won't get into this loop-spam.

In any case, I think this is fixed in F11 and later, and hopefully with F10's wpa_supplicant-0.6.4-5?

Comment 6 David Dillow 2009-11-19 05:29:53 UTC
If I recall correctly, the driver was _not_ in the loop before wpa_supplicant started -- wpa_supplicant was the initiator of the loop. I've avoided using wireless on this laptop at work due to the issue, so it's been a while.
I'll try this out once I upgrade to F12, but this bug can be closed -- I'll reopen if it recurs.

Comment 7 Bug Zapper 2009-12-18 07:23:09 UTC
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

