Bug 487183 - NM iwl3945 wpa2-psk, doesn't always connect (rawhide)
NM iwl3945 wpa2-psk, doesn't always connect (rawhide)
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: iwl3945-firmware (Show other bugs)
rawhide
All Linux
low Severity medium
: ---
: ---
Assigned To: Dan Williams
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2009-02-24 12:37 EST by Nigel Jones
Modified: 2009-03-06 05:39 EST (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-03-06 05:39:37 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
syslogd log (23.46 KB, text/plain)
2009-02-24 12:37 EST, Nigel Jones
no flags Details
wpa supplicant log from same time period (53.49 KB, application/octet-stream)
2009-02-24 13:03 EST, Nigel Jones
no flags Details

  None (edit)
Description Nigel Jones 2009-02-24 12:37:44 EST
Created attachment 333065 [details]
syslogd log

Description of problem:
NetworkManager doesn't always connect to Network

Version-Release number of selected component (if applicable):
NetworkManager-0.7.0.97-4.git20090219.fc11.i586
wpa_supplicant-0.6.7-3.fc11.i386
kernel-PAE-2.6.29-0.137.rc5.git4.fc11.i686


How reproducible:
Intermittent

Steps to Reproduce:

Either

1. Resume suspended laptop
2. wait for NM to connect to previously selected SSID
3. observe that NM fails & pops up request for secrets
or

1. Start with NM already connected to an AP
2. Select an alternate (but correctly configured, and available) AP
3. observe that NM fails &^ pops up secrets request
  
Actual results:
NM fails to connect

Expected results:
NM connects

Additional info:

Basically sometimes this works, sometimes it doesn't. Association appears to time out. Other non-linux clients (S60/Nokia N95 8Gb/Nokia N96/Windows 7/Windows Vista) don't *appear* to have a problem

In the log file attached note the "jonesn: " messages in the log. These were created with the "logger" command to aid in documenting the scenario

The Ap that I had been connected to ok was "planetf1c". The one that failed was "planetf1f". Ignore "planetf1e" -- this one is not active and I clicked on it in the GUI by mistake.

Error seems to hinge around
Activation (wlan0/wireless): association took too long.



Security in use is WPA2-PSK AES
wifi driver is iwl3945

Both AP are "fonera" routers running openwrt with atheros wireless. In this config the router broadcasts 2 SSIDs.

Same problem has been observed with ddwrt (single SSID)

Log file is attached (kernel debug log, -dddt )

Not entirely clear if the issue is supplicant, NM or driver.....

Final note:
I am also seeing issues with WPA Enterprise (LEAP and EAP-TLS) but figure it makes sense to address the simpler PSK case first (less variables, more control home vs enterprise...)
Comment 1 Nigel Jones 2009-02-24 12:45:02 EST
I tried restarting NM, and the problem persisted... ie once we get into this state the "bad" AP cannot be connected to.

I then stopped NM and unloaded the iwl3945 module, followed by reloading/restarting

NM started and connected to the "good" AP just fine (c). I then clicked on the "bad" AP in the GUI (f) and again it connected perfectly.

So is this some form of driver blacklisting/bad state info in iwl3945 driver?

From syslog during driver load:
iwl3945 0000:03:00.0: PCI INT A disabled
iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26kds
iwl3945: Copyright(c) 2003-2008 Intel Corporation
iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwl3945 0000:03:00.0: setting latency timer to 64
iwl3945: Tunable channels: 13 802.11bg, 23 802.11a channels
iwl3945: Detected Intel Wireless WiFi Link 3945ABG
wmaster0 (iwl3945): not using net_device_ops yet
phy1: Selected rate control algorithm 'iwl-3945-rs'
wlan0 (iwl3945): not using net_device_ops yet
iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwl3945 0000:03:00.0: irq 31 for MSI/MSI-X
iwl3945 0000:03:00.0: firmware: requesting iwlwifi-3945-2.ucode
iwl3945 loaded firmware version 15.28.2.8
Registered led device: iwl-phy1:radio
Registered led device: iwl-phy1:assoc
Registered led device: iwl-phy1:RX
Registered led device: iwl-phy1:TX
Comment 2 Nigel Jones 2009-02-24 13:03:25 EST
Created attachment 333072 [details]
wpa supplicant log from same time period 

Added wpa supplicant log -- hopefully from correct times!

Also adding
options iwl3945 debug=0x0000bc88
for next time (43fff was too verbose)
Comment 3 Nigel Jones 2009-02-24 14:11:14 EST
From full wpa_supplicant log it appears that Association timeout is the issue -- then causing blacklisting.

1235497219.167758: Selecting BSS from priority group 0
1235497219.167805: Try to find WPA-enabled AP
1235497219.167852: 0: 00:18:84:2a:74:67 ssid='planetf1f' wpa_ie_len=0 rsn_ie_len=20 caps=0x11
1235497219.167921:    skip - disabled
1235497219.167973:    selected based on RSN IE
1235497219.168020:    selected WPA AP 00:18:84:2a:74:67 ssid='planetf1f'
1235497219.168079: Trying to associate with 00:18:84:2a:74:67 (SSID='planetf1f' freq=2412 MHz)
1235497219.168131: Cancelling scan request
1235497219.168184: WPA: clearing own WPA/RSN IE
1235497219.168244: Automatic auth_alg selection: 0x1
1235497219.168304: RSN: using IEEE 802.11i/D9.0
1235497219.168352: WPA: Selected cipher suites: group 16 pairwise 16 key_mgmt 2 proto 2
1235497219.168403: WPA: clearing AP WPA IE
1235497219.168449: WPA: set AP RSN IE - hexdump(len=22): 30 14 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f ac 02 00 00
1235497219.168509: WPA: using GTK CCMP
1235497219.168557: WPA: using PTK CCMP
1235497219.168605: WPA: using KEY_MGMT WPA-PSK
1235497219.168652: WPA: Set own WPA IE default - hexdump(len=22): 30 14 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f ac 02 00 00
1235497219.168727: No keys have been configured - skip key clearing
1235497219.168772: wpa_driver_wext_set_drop_unencrypted
1235497219.168822: State: SCANNING -> ASSOCIATING
1235497219.168998: wpa_driver_wext_set_operstate: operstate 0->0 (DORMANT)
1235497219.169088: WEXT: Operstate: linkmode=-1, operstate=5
1235497219.169164: wpa_driver_wext_associate
1235497219.169249: wpa_driver_wext_set_psk
1235497219.184949: Setting authentication timeout: 20 sec 0 usec
1235497219.186846: EAPOL: External notification - EAP success=0
1235497219.186871: EAPOL: External notification - EAP fail=0
1235497219.186894: EAPOL: External notification - portControl=Auto
1235497219.186933: RSN: Ignored PMKID candidate without preauth flag
1235497219.186962: RTM_NEWLINK: operstate=0 ifi_flags=0x1003 ([UP])
1235497219.186986: RTM_NEWLINK, IFLA_IFNAME: Interface 'wlan0' added
1235497219.187009: Wireless event: cmd=0x8b04 len=12
1235497219.785878: RTM_NEWLINK: operstate=0 ifi_flags=0x1003 ([UP])
1235497219.786026: RTM_NEWLINK, IFLA_IFNAME: Interface 'wlan0' added
1235497219.786082: Wireless event: cmd=0x8b15 len=20
1235497219.786137: Wireless event: new AP: 00:00:00:00:00:00
1235497219.786196: Added BSSID 00:18:84:2a:74:67 into blacklist
1235497219.786256: CTRL-EVENT-DISCONNECTED - Disconnect event - remove keys
1235497219.786309: wpa_driver_wext_set_key: alg=0 key_idx=0 set_tx=0 seq_len=0 key_len=0
1235497219.786382: wpa_driver_wext_set_key: alg=0 key_idx=1 set_tx=0 seq_len=0 key_len=0
1235497219.786448: wpa_driver_wext_set_key: alg=0 key_idx=2 set_tx=0 seq_len=0 key_len=0
1235497219.786512: wpa_driver_wext_set_key: alg=0 key_idx=3 set_tx=0 seq_len=0 key_len=0
1235497219.786576: wpa_driver_wext_set_key: alg=0 key_idx=0 set_tx=0 seq_len=0 key_len=0
1235497219.786721: Driver did not support SIOCSIWENCODEEXT
1235497219.786791: State: ASSOCIATING -> DISCONNECTED
1235497219.787000: wpa_driver_wext_set_operstate: operstate 0->0 (DORMANT)
Comment 4 Nigel Jones 2009-02-25 01:24:28 EST
Tried with disable_hw_scan=1, but didn't appear to prevent the problem. Occurred again this am. Other clients ok. Recycle (of iwl3945+NM) fixed. Kernel at 2.6.29-0.145.rc6.fc11.i686.PAE

Interesting section in debug log
Feb 25 06:12:59 snowdon NetworkManager: <info>  (wlan0): supplicant connection state:  scanning -> associating
Feb 25 06:12:59 snowdon kernel: iwl3945: U iwl3945_add_station Add STA ID 24: ff:ff:ff:ff:ff:ff
Feb 25 06:12:59 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 try 1
Feb 25 06:13:00 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 try 2
Feb 25 06:13:00 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 try 3
Feb 25 06:13:00 snowdon NetworkManager: <info>  (wlan0): supplicant connection state:  associating -> disconnected
Feb 25 06:13:00 snowdon kernel: wlan0: direct probe to AP 00:18:84:2a:74:67 timed out
Feb 25 06:13:08 snowdon NetworkManager: <info>  wlan0: link timed out.
Feb 25 06:13:08 snowdon NetworkManager: <info>  (wlan0): supplicant connection state:  disconnected -> scanning

So "direct probe to AP XXX timed out"?
Comment 5 Nigel Jones 2009-02-25 01:25:42 EST
Moving to iwl3945 -- don't believe this to be a NM issue?
Comment 6 Nigel Jones 2009-03-02 15:51:53 EST
With

kernel-PAE-2.6.29-0.176.rc6.git5.fc11.i686
& a change to /etc/modprobe.d/iwl3945
options iwl3945 disable_hw_scan=1

don't appear to be getting issues.

Will remove the option to disable hw scan at next reboot to see which change fixed it.
Comment 7 Nigel Jones 2009-03-02 15:52:56 EST
My error. HW scan is already reenabled - currently have
options iwl3945 debug=0x0000bc88

so kernel fix may have addressed this. will confirm in a few more days.
Comment 8 Nigel Jones 2009-03-06 05:39:37 EST
closing. no reoccurances

Note You need to log in before you can comment on or make changes to this bug.