916203 – WiFi performance tanks every 120 seconds with ath9k and kernel 3.7.6-102 and later

Bug 916203 - WiFi performance tanks every 120 seconds with ath9k and kernel 3.7.6-102 and later

Summary: WiFi performance tanks every 120 seconds with ath9k and kernel 3.7.6-102 and ...

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	17
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2013-02-27 14:21 UTC by Bill Gianopoulos
Modified:	2013-07-16 13:55 UTC (History)
CC List:	10 users (show)
Fixed In Version:	kernel-3.9.8-300.fc19.x86_64
Clone Of:
Environment:
Last Closed:	2013-07-16 13:55:13 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
dmesg log (70.21 KB, text/plain) 2013-03-15 11:30 UTC, Bill Gianopoulos	no flags	Details
wpa-supplicant.log file (36.08 KB, text/plain) 2013-04-16 15:23 UTC, Bill Gianopoulos	no flags	Details
New wpa_supplicant.log (116.37 KB, text/plain) 2013-04-18 23:55 UTC, Bill Gianopoulos	no flags	Details
wpa_supplicant.log with BSSID specified in Network Manager (25.38 KB, text/plain) 2013-04-24 10:52 UTC, Bill Gianopoulos	no flags	Details
wwpa_supplicant.log while doing "iwlist scanning" (30.80 KB, text/plain) 2013-04-24 10:54 UTC, Bill Gianopoulos	no flags	Details
View All

Description Bill Gianopoulos 2013-02-27 14:21:35 UTC

After updating to kernel version 3.7.6-102, I began experiencing spotty WiFi perfromance on one of my laptops that has an Atheros AR5009.  Pinging my wireless router it starts out fine like this:

PING 192.168.1.28 (192.168.1.28) 56(84) bytes of data.
64 bytes from 192.168.1.28: icmp_req=1 ttl=64 time=1.96 ms
64 bytes from 192.168.1.28: icmp_req=2 ttl=64 time=1.57 ms
64 bytes from 192.168.1.28: icmp_req=3 ttl=64 time=1.58 ms
64 bytes from 192.168.1.28: icmp_req=4 ttl=64 time=2.87 ms
64 bytes from 192.168.1.28: icmp_req=5 ttl=64 time=1.61 ms
64 bytes from 192.168.1.28: icmp_req=6 ttl=64 time=1.59 ms
64 bytes from 192.168.1.28: icmp_req=7 ttl=64 time=1.90 ms
64 bytes from 192.168.1.28: icmp_req=8 ttl=64 time=1.88 ms
64 bytes from 192.168.1.28: icmp_req=9 ttl=64 time=1.93 ms

But then you see things like this:

64 bytes from 192.168.1.28: icmp_req=328 ttl=64 time=1.80 ms
64 bytes from 192.168.1.28: icmp_req=329 ttl=64 time=1.59 ms
64 bytes from 192.168.1.28: icmp_req=330 ttl=64 time=10107 ms
64 bytes from 192.168.1.28: icmp_req=331 ttl=64 time=9107 ms
64 bytes from 192.168.1.28: icmp_req=332 ttl=64 time=8107 ms
64 bytes from 192.168.1.28: icmp_req=333 ttl=64 time=7107 ms
64 bytes from 192.168.1.28: icmp_req=334 ttl=64 time=6107 ms
64 bytes from 192.168.1.28: icmp_req=335 ttl=64 time=5107 ms
64 bytes from 192.168.1.28: icmp_req=336 ttl=64 time=4107 ms
64 bytes from 192.168.1.28: icmp_req=337 ttl=64 time=3111 ms
64 bytes from 192.168.1.28: icmp_req=338 ttl=64 time=2114 ms
64 bytes from 192.168.1.28: icmp_req=339 ttl=64 time=1114 ms
64 bytes from 192.168.1.28: icmp_req=340 ttl=64 time=114 ms
64 bytes from 192.168.1.28: icmp_req=341 ttl=64 time=1.63 ms
64 bytes from 192.168.1.28: icmp_req=342 ttl=64 time=1.64 ms

And this:

64 bytes from 192.168.1.28: icmp_req=447 ttl=64 time=2.82 ms
64 bytes from 192.168.1.28: icmp_req=448 ttl=64 time=1.88 ms
64 bytes from 192.168.1.28: icmp_req=450 ttl=64 time=9953 ms
64 bytes from 192.168.1.28: icmp_req=451 ttl=64 time=8953 ms
64 bytes from 192.168.1.28: icmp_req=452 ttl=64 time=7963 ms
64 bytes from 192.168.1.28: icmp_req=453 ttl=64 time=6963 ms
64 bytes from 192.168.1.28: icmp_req=454 ttl=64 time=5963 ms
64 bytes from 192.168.1.28: icmp_req=455 ttl=64 time=4963 ms
64 bytes from 192.168.1.28: icmp_req=456 ttl=64 time=3964 ms
64 bytes from 192.168.1.28: icmp_req=457 ttl=64 time=2965 ms
64 bytes from 192.168.1.28: icmp_req=458 ttl=64 time=1965 ms
64 bytes from 192.168.1.28: icmp_req=459 ttl=64 time=965 ms
64 bytes from 192.168.1.28: icmp_req=460 ttl=64 time=1.44 ms
64 bytes from 192.168.1.28: icmp_req=461 ttl=64 time=1.88 ms

Notice that the ~10 second responses occur exactly 120 seconds apart.

I see this problem also on another laptop with an Atheros AR9285, but it is not quite a severe as this.  Ping response only goes to about 3 seconds and recovers quicker.

I do not see this issue on 2 other laptops that use different WiFi drivers.

I see this issue with both kernel version 3.7.6-102 and 3.7.9-101.

I do not see this issue with kernel version 3.7.3-101.

Comment 1 Bill Gianopoulos 2013-02-27 19:34:11 UTC

Because my previous issues with this driver had all been related to my use of WEP, I took the extra step of switching to WPA/WPA-2 and also encryption disabled and the problem persisted regardless of the type of encryption being used (or not used).

Comment 2 Peter Kwarts 2013-03-02 14:12:22 UTC

I can confirm that this issue is still present in kernel-3.7.9-205.fc18.x86_64 , but not yet present in kernel-3.7.5-201.fc18.x86_64 (although connectivity with this kernel seems to be a lot more erratic, but I didn't do a proper comparison).

Comment 3 Peter Kwarts 2013-03-04 10:02:51 UTC

Still present in the latest kernel 3.8.1-201.fc18.x86_64 , every 120 seconds like clockwork.

64 bytes from 192.168.1.1: icmp_seq=111 ttl=64 time=2.45 ms
64 bytes from 192.168.1.1: icmp_seq=112 ttl=64 time=10.6 ms
64 bytes from 192.168.1.1: icmp_seq=124 ttl=64 time=536 ms

64 bytes from 192.168.1.1: icmp_seq=232 ttl=64 time=20.1 ms
64 bytes from 192.168.1.1: icmp_seq=244 ttl=64 time=365 ms
64 bytes from 192.168.1.1: icmp_seq=245 ttl=64 time=5.13 ms

Comment 4 Erick 2013-03-08 12:10:20 UTC

I'm not sure if this is related, but when pinging I see the following:

64 bytes from 192.168.178.1: icmp_seq=136 ttl=64 time=5.55 ms
64 bytes from 192.168.178.1: icmp_seq=137 ttl=64 time=4.30 ms
64 bytes from 192.168.178.1: icmp_seq=138 ttl=64 time=4.42 ms
64 bytes from 192.168.178.1: icmp_seq=139 ttl=64 time=4.30 ms
64 bytes from 192.168.178.1: icmp_seq=140 ttl=64 time=5.54 ms
64 bytes from 192.168.178.1: icmp_seq=146 ttl=64 time=4.38 ms
64 bytes from 192.168.178.1: icmp_seq=147 ttl=64 time=4.68 ms
64 bytes from 192.168.178.1: icmp_seq=148 ttl=64 time=5.92 ms
64 bytes from 192.168.178.1: icmp_seq=149 ttl=64 time=6.17 ms
64 bytes from 192.168.178.1: icmp_seq=150 ttl=64 time=4.18 ms
64 bytes from 192.168.178.1: icmp_seq=152 ttl=64 time=402 ms
64 bytes from 192.168.178.1: icmp_seq=158 ttl=64 time=5.25 ms

basically there are gaps in the icmp sequence of 5 packages every 5 seconds. This behaviour is quite noticeable when loading a website or viewing movies.

Comment 5 Erick 2013-03-08 12:12:23 UTC

Additional info:

kernel version: 3.8.1-201.fc18.i686.PAE
wifi driver: rt2800usb.

Comment 6 John Greene 2013-03-14 13:15:13 UTC

On these 120sec gaps, interesting.  Can someone please post the dmesg log?  Interesting its with WPA/WPA2 or WEP.  Perhaps something in the logs will help isolate.

Comment 7 Bill Gianopoulos 2013-03-15 11:30:32 UTC

Created attachment 710599 [details]
dmesg log

Attaching dmesg log as requested.  Nothing was logged during my ping test.  This is basically log messages from before i started the ping command and represents messages form boot up and log in.

Comment 8 John Greene 2013-03-15 12:35:46 UTC

Hmm..Nothing really obvious there.  Are you using the bluetooth at the same time?  Try to disable it and retest if so.  BT and wifi share antenna, lets eliminate that if so.

Comment 9 Bill Gianopoulos 2013-03-15 12:42:19 UTC

My hardware does not support Bluetooth.  I use a USB Bluetooth adapter when I need to do Bluetooth, so no shared antenna.  I did not have the Bluetooth dongle inserted during this.

Comment 10 Bill Gianopoulos 2013-03-23 16:17:13 UTC

I did some more troubleshooting of this.  I realized that if instead of pinging the gateway, I instead pinged my server, I could run tcpdump there to see if the delay was on the transmit or receive side of things.

I also did pings with an interval of 2 seconds instead of one to see if this was a 120 pings or 120 seconds issue.

The results were:

1.  It is definitely a 120 second interval issue.

2.  It is a problem with data being received.  tcpdumps on the server showed the echo requests being received at 1 second intervals (after i switched back to that interval) and the echo replies were transmitted at exactly 1 second intervals.

3.  a tcpdump on the affected laptop showed the same delay in the tcpdump data as in the ping output, so whatever the issue is it happens before tcpdump sees the data also.

I should point out that this is obviously not restricted to pings.  The original issue was that it became impossible to use the laptop to ssh to my server and edit files using vi.

I should probably also state that this still fails with linux-kernel-3.8.3-103.fc17.x86_64.

Comment 11 Bill Gianopoulos 2013-03-23 18:02:42 UTC

Oh, and it seems that if you are unfortunate in timing and run into this delay during an arp request, you can end up with commands failing with "no route to host" error.

Comment 12 John Greene 2013-04-15 13:00:55 UTC

I see this in the log:
ath9k 0000:09:00.0 wlan0: disabling HT/VHT due to WEP/TKIP use

Which in itself isn't too weird. But made me wonder: can you post output of:
modinfo ath9k

If hardware encryption is on, let's eliminate that part with s/w encryption and see if that helps.

Try this:

try the test loading ath9k with  hardware encrytption off 
I think you need
modprobe ath9k nohwcrypt=1 

parm:           nohwcrypt:Disable hardware encryption (int)

Can you test this?

Comment 13 Bill Gianopoulos 2013-04-15 16:11:32 UTC

(In reply to comment #12)
> I see this in the log:
> ath9k 0000:09:00.0 wlan0: disabling HT/VHT due to WEP/TKIP use
> 
> Which in itself isn't too weird. But made me wonder: can you post output of:
> modinfo ath9k
Here is the output of the command:

modinfo ath9k
filename:       /lib/modules/3.8.4-102.fc17.x86_64/kernel/drivers/net/wireless/ath/ath9k/ath9k.ko
license:        Dual BSD/GPL
description:    Support for Atheros 802.11n wireless LAN cards.
author:         Atheros Communications
alias:          platform:qca955x_wmac
alias:          platform:ar934x_wmac
alias:          platform:ar933x_wmac
alias:          platform:ath9k
alias:          pci:v0000168Cd00000036sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000037sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000034sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000033sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000032sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000030sv*sd*bc*sc*i*
alias:          pci:v0000168Cd0000002Esv*sd*bc*sc*i*
alias:          pci:v0000168Cd0000002Dsv*sd*bc*sc*i*
alias:          pci:v0000168Cd0000002Csv*sd*bc*sc*i*
alias:          pci:v0000168Cd0000002Bsv*sd*bc*sc*i*
alias:          pci:v0000168Cd0000002Asv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000029sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000027sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000024sv*sd*bc*sc*i*
alias:          pci:v0000168Cd00000023sv*sd*bc*sc*i*
depends:        ath9k_hw,ath9k_common,mac80211,ath,cfg80211
intree:         Y
vermagic:       3.8.4-102.fc17.x86_64 SMP mod_unload 
parm:           debug:Debugging mask (uint)
parm:           nohwcrypt:Disable hardware encryption (int)
parm:           blink:Enable LED blink on activity (int)
parm:           btcoex_enable:Enable wifi-BT coexistence (int)
parm:           enable_diversity:Enable Antenna diversity for AR9565 (int)

As you can see hardware encryption is already disabled, and that could not be an issue anyway because if you read comment #1 I said I tried turning off encryption on my WiFi and that did not help either.

Comment 14 Bill Gianopoulos 2013-04-15 17:48:44 UTC

Oh I should mention that I have other systems with non Atheros WiFi and this problem is ONLY in the ath9k driver as far as I can determine.

Comment 15 John Greene 2013-04-16 14:28:25 UTC

So not hardware encrypt issue it would seem.
This device is a/g/n: do you know if you are connecting to 802.11a or 802.11g?  Not obvious from dmesg. I'm wondering if the scan for APs (a lot of channels on 802.11a and can take a long time in certain cases). Might be off-channel too long.

So a couple more ideas for now:
I know you said your hardware doesn't support BT but it appears the driver still has it enabled, i.e.
parm:           btcoex_enable:Enable wifi-BT coexistence (int)

Maybe the driver is getting long timeout waiting for no hardware.  It should always be off if you have no hardware.  Can you try btcoex_enable=0?

Also, a supplicant log would likely help here if it's a scan issue.  /var/log/wpa_supplicant.log.  May need to add a -d to command line to get helpful level of detail.  Can you please add this file?

I will mention I have seen something of this sort on RHEL with another driver using WEP, on my list to solve still.  May not be relevant, but interesting.

If nothing pops out of this, next step is likely a look at Peter's comment:
>I can confirm that this issue is still present in kernel-3.7.9-205.fc18.x86_64 , >but not yet present in kernel-3.7.5-201.fc18.x86_64 (although connectivity with >this kernel seems to be a lot more erratic, but I didn't do a proper comparison).

Comment 16 Bill Gianopoulos 2013-04-16 15:21:09 UTC

So far all i have is the wpa_supplicant.log file.  This seems to be the relevant thing.  I was doing a ping in one window and tailing the log in a second one,  the long ping response delay corresponds to when the supplicant was doing a periodic scan for available networks.

I will restrict my access point for wirelss modes to see if the mode a/b/g/n makes any difference.

As far as the bluetooth parameter goes, I am clearly doing something wrong.

I tried doing "modprobe ath9k btcoex_enable=0" and modinfo still showed it enabled.

So, I made up an ath9k file in /etc/modprobe.d which contained the line:

options ath9k btcoex_enable=0

and rebooted.  Still no joy.  Can you tell me what I am doing wrong here?

Comment 17 Bill Gianopoulos 2013-04-16 15:23:07 UTC

Created attachment 736400 [details]
wpa-supplicant.log file

Comment 18 Bill Gianopoulos 2013-04-16 15:40:36 UTC

My access point only does b g and n modes, it does not do A.  I locked the access point to each mode only and retried the ping test.  The mode at which I was associated made no difference.

Comment 19 John Greene 2013-04-17 13:11:49 UTC

The log file helps, thanks for that.  Your are on wpa_supplicant 1.0_rc3

All through the log I see this stuff:

dbus: flush_object_timeout_handler: Timeout - sending changed properties of object /fi/w1/wpa_supplicant1/Interfaces/1/BSSs/11
nl80211: Event message available

Which gets more invasive as the log goes on, crazy at the end..The log doesn't have timestamps, but all this stuff may be a big part of your 120 delays I'm thinking.  BTW: (can add timestamps to supplicant -t option, might circle back to this.) 

There is an issue on the web about this problem from a while back.
Background if you care:
https://mail.gnome.org/archives/networkmanager-list/2011-February/msg00027.html

which mentions a fix, I don't find it in latest supplicant 1.1_rc3, I need to do some checking to get updated state of this.  

Till then, would you be willing to update your supplicant package to that or later version?

Comment 20 John Greene 2013-04-17 14:38:28 UTC

Quick update: got some info that will help us decide if issue is in driver or above (supplicant/NM).

In your wireless profile editor, set the BSSID to the one of the AP you are currently connected to.  In your supplicant file its:

Current BSS: 10:6f:3f:63:81:b4 level=-52

This should eliminate much of the roaming scanning it the background likely the cause of issue you see.  Can you check that and let me know if the 120 delay improves?

Comment 21 Bill Gianopoulos 2013-04-18 23:55:06 UTC

Created attachment 737462 [details]
New wpa_supplicant.log

Information you requested.  This time it includes timestamps and the line you asked me to add to wpa-supplicant.conf.  I also changed form -d to -dd to get more detail.

HTis is a log from the entire run of the ping -c 1000 192.168.1.28 command

Note that the times that the issue occurs corresponds to these 2 lines appearing in the log file.

1366327957.788401: nl80211: Scan SSID - hexdump_ascii(len=0): [NULL]
1366327957.796279: Scan requested (ret=0) - scan timeout 30 seconds

I don't think the other errors relate to this issue.  It just seems that the periodic scanning for new SSID's by Network manger triggers this issue.

Comment 22 Bill Gianopoulos 2013-04-18 23:56:19 UTC

Oh and just in case you missed earlier comments.  I have several other configurations using different wireless devices and it is only dives using the ath9k driver that experience this issue.

Comment 23 John Greene 2013-04-19 12:33:11 UTC

(In reply to comment #21)
> Created attachment 737462 [details]
> New wpa_supplicant.log
> 
> Information you requested.  This time it includes timestamps and the line
> you asked me to add to wpa-supplicant.conf.  I also changed form -d to -dd
> to get more detail.
> 
> HTis is a log from the entire run of the ping -c 1000 192.168.1.28 command
> 
> Note that the times that the issue occurs corresponds to these 2 lines
> appearing in the log file.
> 
> 1366327957.788401: nl80211: Scan SSID - hexdump_ascii(len=0): [NULL]
> 1366327957.796279: Scan requested (ret=0) - scan timeout 30 seconds
> 
> I don't think the other errors relate to this issue.  It just seems that the
> periodic scanning for new SSID's by Network manger triggers this issue.

Thanks for the log.  You are correct about scan timeout 30 seconds.  That is just the timer doing the periodic scan starting the scan.  Scans are taking about 4 seconds in this log, way too little to cause much pain.

The most concerning thing is the stuff at the very end of the log (flurry of nl80211 message apparently not handled or recognized).  The log terminates but takes 94 seconds for what is there.  This is in the realm of the delay you see.

Have you been able to try the test in comment 20? 

It doesn't have to be that exact AP, any single AP will do.  Set the BSSID to any single one in the profile, then run your ping test.
This eliminates the scanning part of this issue, one way or other.

Can you do that please?

Comment 24 Bill Gianopoulos 2013-04-19 13:36:59 UTC

(In reply to comment #23)
> (In reply to comment #21)
> > Created attachment 737462 [details]
> > New wpa_supplicant.log
> > 
> > Information you requested.  This time it includes timestamps and the line
> > you asked me to add to wpa-supplicant.conf.  I also changed form -d to -dd
> > to get more detail.
> > 
> > HTis is a log from the entire run of the ping -c 1000 192.168.1.28 command
> > 
> > Note that the times that the issue occurs corresponds to these 2 lines
> > appearing in the log file.
> > 
> > 1366327957.788401: nl80211: Scan SSID - hexdump_ascii(len=0): [NULL]
> > 1366327957.796279: Scan requested (ret=0) - scan timeout 30 seconds
> > 
> > I don't think the other errors relate to this issue.  It just seems that the
> > periodic scanning for new SSID's by Network manger triggers this issue.
> 
> Thanks for the log.  You are correct about scan timeout 30 seconds.  That is
> just the timer doing the periodic scan starting the scan.  Scans are taking
> about 4 seconds in this log, way too little to cause much pain.
> 
> The most concerning thing is the stuff at the very end of the log (flurry of
> nl80211 message apparently not handled or recognized).  The log terminates
> but takes 94 seconds for what is there.  This is in the realm of the delay
> you see.
> 
> Have you been able to try the test in comment 20? 
> 
> It doesn't have to be that exact AP, any single AP will do.  Set the BSSID
> to any single one in the profile, then run your ping test.
> This eliminates the scanning part of this issue, one way or other.
> 
> Can you do that please?

This is part of what I don't understand.  I had that line in the wpa_supplicant.conf file during that last log collection, and it is still doing the scan.  I am not sure why.  I added that line then rebooted with the affected kernel and captured that log.

Comment 25 Bill Gianopoulos 2013-04-19 14:21:56 UTC

I just double checked by trying again and with this configuration in /etc/wpa_supplicant/wpa_supplicant.conf:

ctrl_interface=/var/run/wpa_supplicant
ctrl_interface_group=wheel
Current BSS: 10:6f:3f:63:81:b4 level=-52

It still scans every 120 seconds.

Comment 26 Bill Gianopoulos 2013-04-19 14:27:41 UTC

Is there something I also need to put in the NetowrkManager.conf file to stop it from initiating WIFI scans?

Comment 27 John Greene 2013-04-23 14:27:12 UTC

For our purposes you you should be able to do it where you create the connection in the GUI. (Assuming you are using one.)

In KDE for instance, click network manager icon, select wireless.  There should be several profiles in there (which NM will use).  Edit the one of interest, you should see a BSSID field.  Just add the BSSID of the AP you are connected to in this field.  You may need to reconnect afterward.

Gnome has similar setup I believe.  Does this help?

Comment 28 Bill Gianopoulos 2013-04-24 10:52:33 UTC

Created attachment 739388 [details]
wpa_supplicant.log with BSSID specified in Network Manager

With the BSSID specified in Network Manger, it does not do the scans and the performance issue disappears.

Here are the statistics for 1000 pings.  As you can see the longest time was ~17 milliseconds as opposed to the previous ~10 seconds.

--- 192.168.1.28 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 1000276ms
rtt min/avg/max/mdev = 1.369/1.984/17.282/0.771 ms

Comment 29 Bill Gianopoulos 2013-04-24 10:54:45 UTC

Created attachment 739389 [details]
wwpa_supplicant.log while doing "iwlist scanning"

Here is a log file for a period during which I was using the "iwlist scanning" command to force scans, which caused the ping issue every time I forced a scan.

Comment 30 John Greene 2013-04-24 12:08:44 UTC

As to the stopping scans you really can't stop them or you will never find or connect to anything, the trick is the balance of how many.

From the log it appears the scan occur about every 23 seconds.  If you aren't concerned about being mobile at this location this solution would work fine it appears.  

Those ping times are reasonable for most thing: the question is does that fit the need of your real work, not just pings now?  If so, we can close your issue.

I'll check around for new fixes at this area when I get time.

Comment 31 Bill Gianopoulos 2013-04-24 12:17:17 UTC

I am running fine using kernel 3.8.4-102.fc17.x86_64 with this workaround.

This works for me now.

Comment 32 John Greene 2013-04-25 17:30:49 UTC

Good deal. Would like a cleaner solution that would focus on less invasive scanning by NM, etc.  That said, I'll look at the flurry of nl80211 msgs I found at the end of logs more from kernel perspective.. Not sure driver/kernel  is totally innocent either.

Bill: is the workaround good enough that this issue can be closed for you?

Comment 33 Bill Gianopoulos 2013-04-30 00:36:48 UTC

Well you are asking 2 different questions here.  This does fix the issue for me because the 2 devices i have the  issue with are not really mobile so they are at 2 different locations where they only ever have to connect to a single AP.  The problem is for the general case where people using this driver actually depend on being able to roam to varyi9ng public WiFi hotspots.  In that scenario this is entirely unworkable.

Comment 34 John Greene 2013-05-03 14:40:52 UTC

Yes, it is two different questions. I intended that ;)
I agree it's only a workaround for mobile users and an irritating one at that.  Mostly wanted to get you workable and shrink the problem domain.  Which it did..

Will leave this open a place holder for the possible kernel issue issue for me.  

Re-read this today: you might consider updating the wpa_supplicant, NetworkManager, crda packages: they are all outside the kernel and may give you some better relief.  Did you try this and I just missed it?

Comment 35 Bill Gianopoulos 2013-05-03 15:34:01 UTC

Problem is if it is not a package I will then get updates for is maybe something i can figure out how to do once, but if no one is maintaining packages that auto update for fedora 17 I might be stuck with things breaking later on.  I am very busy at this point on my real job and really don't have a lot of time to spend on figuring out how to keep this updated versus other updates that get auto-updated.

Comment 36 Bill Gianopoulos 2013-05-04 15:40:17 UTC

There seem to be new packages in testing for fedora 18
i will try upgrading to fc18 and then loading the ones from testing

Comment 37 John Greene 2013-05-06 18:21:58 UTC

I did look at Network manager config file, no much jumped out to help you there either.
As to figuring out what is out there, it's pretty easy for these guys:
 wanna take a shot at it? here you go. 

I'm on F17 still, latest:

yum list crda

Installed Packages
crda.x86_64                             1.1.3_2013.02.13-2.fc17  

yum list NetworkManager

NetworkManager.x86_64                           1:0.9.6.4-3.fc17


$yum list wpa_supplicant

wpa_supplicant.x86_64                             1:1.0-1.fc17             

What is available for you?
yum list available NetworkManager wpa_supplicant crda

and what ever suits you
yum update <package>

Again, this might help: still plan work to see if driver might have issues "soon".

Comment 38 Fedora End Of Life 2013-07-04 06:01:32 UTC

This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter:  Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 17 is end of life. If you 
would still like  to see this bug fixed and are able to reproduce it 
against a later version  of Fedora, you are encouraged  change the 
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 39 John Greene 2013-07-08 18:01:59 UTC

Anything new Bill?  Have you tried new packages?   It's been a couple months, will close this if it's fixed.

Comment 40 Bill Gianopoulos 2013-07-09 01:05:27 UTC

(In reply to John Greene from comment #39)
> Anything new Bill?  Have you tried new packages?   It's been a couple
> months, will close this if it's fixed.

I am unable to reproduce this issue under fedora 19 with kernel 3.9.9-301.fc19.x86_64.

Comment 41 John Greene 2013-07-09 12:41:53 UTC

That's great news.. I'll close the bug as fixed in 3.9.9-301.fc19.  Can you let me know for the record whether you believe any user side packages above were updated and had some bearing on the solution or was it just a kernel update?

There are 213 changes in ath9k from 3.7 to 3.9 alone, 105 fromn 3.8 forward.. 
It's hard to guess what might have addressed this kernel side without more testing.

Can you let me know for the record whether you believe any user side packages above were updated and had some bearing on the solution or was it just a kernel update?  Thanks Bill.

Comment 42 Bill Gianopoulos 2013-07-14 17:01:18 UTC

(In reply to John Greene from comment #41)
> That's great news.. I'll close the bug as fixed in 3.9.9-301.fc19.  Can you
> let me know for the record whether you believe any user side packages above
> were updated and had some bearing on the solution or was it just a kernel
> update?
> 
> There are 213 changes in ath9k from 3.7 to 3.9 alone, 105 fromn 3.8
> forward.. 
> It's hard to guess what might have addressed this kernel side without more
> testing.
> 
> Can you let me know for the record whether you believe any user side
> packages above were updated and had some bearing on the solution or was it
> just a kernel update?  Thanks Bill.

OK more information.  It is definitely just a difference in kernel versions.  With the same tools I tested with the 3 kernels I still had available on the most affected system and narrowed this down a bit further.  All testing was using the same tools.  The only change was the kernel version.

kernel-3.9.6-200.fc18.x86_64 - affected
kernel-3.9.8-300.fc19.x86_64 - not affected

Comment 43 John Greene 2013-07-16 13:55:13 UTC

Bill, 
That's great work! Thanks for the info.  I'll certainly look into this.  I'm curious.

Closing Fixed->next release.
Changing the fixed version to kernel-3.9.8-300.fc19.x86_64, which is earlier than the 3.9.9-301 originally thought, should help isolate issues for future interests.

Note You need to log in before you can comment on or make changes to this bug.