Bug 758543 - Wireless disconnects under load on Acer Aspire One 150Aw (Atheros AR242x / AR542x using ath5k)
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 18
Hardware: i686
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-30 01:50 UTC by Alexander Ploumistos
Modified: 2013-03-04 18:17 UTC (History)
CC List: 17 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-24 13:20:04 UTC
Type: ---
Embargoed:


Attachments
Related /var/log/messages segment (51.30 KB, text/plain)
2011-11-30 01:52 UTC, Alexander Ploumistos
dmesg output (15.71 KB, text/plain)
2011-11-30 01:59 UTC, Alexander Ploumistos
ath_info info (6.16 KB, text/plain)
2011-11-30 16:45 UTC, Alexander Ploumistos
dmesg output 2 (60.57 KB, text/plain)
2011-11-30 16:46 UTC, Alexander Ploumistos
dmesg output 3 (48.60 KB, text/plain)
2011-12-14 18:53 UTC, Alexander Ploumistos
dmesg output 4 (23.29 KB, text/plain)
2011-12-15 01:54 UTC, Alexander Ploumistos
Dmesg (105.95 KB, application/x-bzip)
2011-12-16 14:37 UTC, Tommi Tervo
kernel-3.2.5-2.bz758543.1.fc16.i686 test & comparison (23.73 KB, text/plain)
2012-02-08 02:39 UTC, Alexander Ploumistos

Description Alexander Ploumistos 2011-11-30 01:50:31 UTC
User-Agent:       Mozilla/5.0 (X11; Linux i686; rv:8.0) Gecko/20100101 Firefox/8.0

Whenever the wireless connection is under heavy load (e.g. streaming video, downloading many packages with yum, etc.) it gets disconnected. Running a clean install of Fedora 16. There was no such problem with versions up to Fedora 14. I'm not sure if it is the ath5k module or something else at fault. A Backtrack 4 installation on another partition with kernel 2.6.30 does not suffer from this. The system is an Acer Aspire One 150Aw netbook.

Reproducible: Always

Steps to Reproduce:
1. Connect over wireless.
2. Download something big, e.g. wget http://download.fedoraproject.org/pub/fedora/linux/releases/16/Live/i686/Fedora-16-i686-Live-Desktop.iso
Actual Results:  
After a while, the connection is dropped.

Two different security dialogs appear, one in a black-themed pop-up, the other in a gray one, saying that a key is needed to connect to the access point. Their order of appearance is random and even if I do not click on "connect" on either of them, the connection will be reestablished on its own, until I stress it again.

Expected Results:  
The connection should remain established regardless of traffic.

I've been getting this problem with every kernel version that shipped with Fedora 16. No such problem up to Fedora 14.

I couldn't get anything useful out of ath_info, just this (maybe I'm using it wrong):

# ath_info -v wlan0
#DBG main: sleep_ctl reg a5a4a5a5   reset_ctl reg 00000000
MAC revision 0x5a5a is not supported!

I'll attach /var/log/messages and dmesg output from the moment that the connection was lost.

Comment 1 Alexander Ploumistos 2011-11-30 01:52:49 UTC
Created attachment 538357 [details]
Related /var/log/messages segment

Comment 2 Alexander Ploumistos 2011-11-30 01:59:13 UTC
Created attachment 538360 [details]
dmesg output

Comment 3 John W. Linville 2011-11-30 15:24:25 UTC
That ath_info output looks really suspicious -- maybe it just doesn't understand this hardware?  Not sure...

Those code = 12 lines map to WLAN_STATUS_ASSOC_DENIED_UNSPEC, so not a lot of info coming from the AP.

I'll copy the ath5k guys, in case they have some insight...?

Comment 4 Alexander Ploumistos 2011-11-30 15:32:10 UTC
The AP mentioned in the logs is hidden, but I've got the same problem with every AP I've tried so far, whether open access or WPA/WPA2 (no WEP). I don't think the AP is to blame, though: moments before the installation I downloaded the Fedora iso on my AAO running Fedora 14.

I'll try another 3.x series kernel from some other distro and see how that goes.

Comment 5 Nick Kossifidis 2011-11-30 15:51:26 UTC
Hello ;-)

a) From ath_info's README:

First compile...

gcc ath_info.c -o ath_info

then find card's physical address

lspci -v

02:02.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)
        Subsystem: Fujitsu Limited. Unknown device 1234
        Flags: bus master, medium devsel, latency 168, IRQ 23
        Memory at c2000000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [44] Power Management version 2

address here is 0xc2000000

load madwifi-ng/madwifi-old/ath5k if not already loaded (be sure the
interface is down!)

OR

call:
setpci -s 02:02.0 command=0x41f cache_line_size=0x10

to enable access to the PCI device.

and we run the thing...

./ath_info 0xc2000000


b) Have you tried loading the module with the nohwcrypt parameter ?

c) Can you please re-send your dmesg output including the part when ath5k loads (to see srev values etc) ?

Comment 6 Alexander Ploumistos 2011-11-30 16:45:03 UTC
Created attachment 538645 [details]
ath_info info

Comment 7 Alexander Ploumistos 2011-11-30 16:46:20 UTC
Created attachment 538651 [details]
dmesg output 2

No stressing this time, dmesg contains everything from boot to connection establishment.

Comment 8 Alexander Ploumistos 2011-11-30 17:03:24 UTC
This is embarrassing, but at the time it didn't cross my mind that by device address it meant memory address...

I had read about the nohwcrypt=1 tweak in F15 common bugs, but I never installed Fedora 15 on my AAO and I thought it had to do with the ath9k driver and slow transfer speeds. I'll try later tonight or tomorrow another 3.x kernel and then the nohwcrypt thingy.

If I remember correctly, since Fedora 11, the only problem I've ever had with the wireless was the status light that wasn't blinking in the first versions of the driver.

Comment 9 Alexander Ploumistos 2011-12-01 11:42:41 UTC
No problems with Ubuntu's kernel 3.0.0-12. I checked ath5k parameters and they were identical to those used by my current kernel (all_channels, fastchanswitch and nohwcrypt all disabled). The problem has been encountered in every F16 kernel so far.

I tried the nohwcrypt option and that didn't go well; CPU usage hit the ceiling, download speed dropped from ~1.5MB/sec to ~150KB/sec and within seconds I got hundreds of these: https://bugzilla.redhat.com/show_bug.cgi?id=759063
I also got a few of the errors mentioned in bugs 759066 & 759068, which I suspect are identical to the one above, but abrt couldn't keep up with their rate of occurrence.

Next I'll try to get NetworkManager out of the equation and see how that goes.

Comment 10 Alexander Ploumistos 2011-12-02 01:41:34 UTC
Nope, NetworkManager has nothing to do with the issue. On the upside, forgoing NetworkManager did speed up the reconnection.

Comment 11 Tommi Tervo 2011-12-12 21:26:15 UTC
I have the same problem on an AAONE 110. WAN traffic over wlan works, but rsync over the LAN dies almost immediately.

ath_info -v wlan0
#DBG main: sleep_ctl reg a5a5a5a5   reset_ctl reg 5a5a5a5a
#DBG main: waking up the chip
#DBG main: removing resets
MAC revision 0x5a5a is not supported!

Comment 12 Alexander Ploumistos 2011-12-12 21:39:30 UTC
As Nick pointed out you need to provide ath_info with the memory address of the device. Run

lspci -v

and where it says "Memory at" you'll find the memory address. Use that as parameter in ath_info and you'll get your device information.
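
For anyone repeating this, the lookup can be scripted. A minimal sketch, assuming ath_info is on the PATH; bar_address is a made-up helper name (not part of ath_info), and the slot 03:00.0 in the usage comment is only an example, so check yours with lspci first.

```shell
# bar_address: read `lspci -v` output on stdin and print the first
# "Memory at" address, prefixed with 0x (hypothetical helper name).
bar_address() {
    sed -n 's/^[[:space:]]*Memory at \([0-9a-fA-F]*\) .*/0x\1/p' | head -n 1
}

# Usage (as root; the slot number is an assumption):
#   addr=$(lspci -v -s 03:00.0 | bar_address)
#   ath_info "$addr"
```

Only the first "Memory at" line is used because the cards shown in this report expose a single memory region.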

Comment 13 Tommi Tervo 2011-12-14 17:34:48 UTC
03:00.0 Ethernet controller: Atheros Communications Inc. AR242x / AR542x Wireless Network Adapter (PCI-Express) (rev 01)
        Subsystem: Foxconn International, Inc. Device e008
        Flags: bus master, fast devsel, latency 0, IRQ 18
        Memory at 75200000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [40] Power Management version 2
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
        Capabilities: [60] Express Legacy Endpoint, MSI 00
        Capabilities: [90] MSI-X: Enable- Count=1 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Kernel driver in use: ath5k
        Kernel modules: ath5k


[root@baron teve]# ath_info 0x75200000
sleep_ctl reg 00000000   reset_ctl reg 00000000
 -==Device Information==-
MAC Revision: 2425  (0xe2)
Warning: Invalid EEPROM Magic number!
Device type:  3

/============== EEPROM Information =============\
| EEPROM Version:   5.3 | EEPROM Size:   4 kbit |
| EEMAP:              2 | Reg. Domain:     0x65 |
|================= Capabilities ================|
| 802.11a Support:  no  | Turbo-A disabled: yes |
| 802.11b Support:  no  | Turbo-G disabled: yes |
| 802.11g Support:  yes | 2GHz XR disabled: yes |
| RFKill  Support:  yes | 5GHz XR disabled: yes |
| 32kHz   Crystal:  no  |                       |
\===============================================/

/=========================================================\
|          Calibration data common for all modes          |
|=========================================================|
|          CCK/OFDM gain delta:             1             |
|          CCK/OFDM power delta:            5             |
|          Scaled CCK delta:                5             |
|          2GHz Antenna gain:               0             |
|          5GHz Antenna gain:               0             |
|          Turbo 2W maximum dBm:           38             |
|          Target power start:          0x16e             |
|          EAR Start:                   0x1b8             |
\=========================================================/

/=========================================================\
|          Calibration data for 802.11g operation         |
|=========================================================|
| I power:              0x00 | Q power:              0x10 |
| Use fixed bias:       0x01 | Max turbo power:      0x26 |
| Max XR power:         0x24 | Switch Settling Time: 0x28 |
| Tx/Rx attenuation:    0x19 | TX end to XLNA On:    0x00 |
| TX end to XPA Off:    0x00 | TX end to XPA On:     0x0e |
| 62db Threshold:       0x1c | XLNA gain:            0x00 |
| XPD:                  0x01 | XPD gain:             0x0a |
| I gain:               0x00 | Tx/Rx margin:         0x01 |
| False detect backoff: 0x00 | Noise Floor Threshold:  -1 |
| ADC desired size:      -38 | PGA desired size:      -80 |
|=========================================================|
| Antenna control   0:  0x00 | Antenna control   1:  0x02 |
| Antenna control   2:  0x21 | Antenna control   3:  0x21 |
| Antenna control   4:  0x00 | Antenna control   5:  0x00 |
| Antenna control   6:  0x01 | Antenna control   7:  0x22 |
| Antenna control   8:  0x22 | Antenna control   9:  0x00 |
| Antenna control  10:  0x00 | Antenna control  11:  0x02 |
|=========================================================|
| Octave Band 0:           3 | db 0:                    3 |
| Octave Band 1:           4 | db 1:                    4 |
| Octave Band 2:           0 | db 2:                    0 |
| Octave Band 3:           0 | db 3:                    0 |
\=========================================================/
/==================== Turbo mode infos ===================\
| Switch Settling time: 0x28 | Tx/Rx margin:         0x01 |
| Tx/Rx attenuation:    0x19 | ADC desired size:      -32 |
| PGA desired size:      -80 |                            |
\=========================================================/
/============== Per rate power calibration ===========\
| Freq | 6-24Mbit/s | 36Mbit/s |  48Mbit/s | 54Mbit/s |
|======|============|==========|===========|==========|
| 2412 |    18.00   |  17.00   |   15.01   |  13.01   |
|======|============|==========|===========|==========|
| 2437 |    18.00   |  17.00   |   15.01   |  13.01   |
|======|============|==========|===========|==========|
| 2472 |    18.00   |  17.00   |   15.01   |  13.01   |
\=====================================================/
/====================== Per channel power calibration ===================\
| Freq |  pwr_i  |    pwr_0    |    pwr_1    |    pwr_2    |    pwr_3    |
|      | pddac_i |   pddac_0   |   pddac_1   |   pddac_2   |   pddac_3   |
|======|=========|=============|=============|=============|=============|
| 2412 |         |             |             |             |             |
|------|---------|-------------|-------------|-------------|-------------|
|      |     0   |     4.50    |     9.00    |    13.50    |     0.00    |
|      |    10   |       15    |       31    |       60    |        0    |
|------|---------|-------------|-------------|-------------|-------------|
|      |    10   |    14.00    |    16.00    |    19.50    |    22.00    |
|      |     9   |       16    |       27    |       44    |       56    |
|======|=========|=============|=============|=============|=============|
| 2442 |         |             |             |             |             |
|------|---------|-------------|-------------|-------------|-------------|
|      |     0   |     4.00    |     8.50    |    13.00    |     0.00    |
|      |    10   |       15    |       29    |       61    |        0    |
|------|---------|-------------|-------------|-------------|-------------|
|      |    10   |    14.00    |    16.50    |    19.00    |    22.50    |
|      |     7   |       19    |       32    |       43    |       61    |
|======|=========|=============|=============|=============|=============|
| 2472 |         |             |             |             |             |
|------|---------|-------------|-------------|-------------|-------------|
|      |     0   |     4.00    |     8.50    |    13.00    |     0.00    |
|      |    10   |       15    |       31    |       67    |        0    |
|------|---------|-------------|-------------|-------------|-------------|
|      |    10   |    14.00    |    17.00    |    19.50    |    22.50    |
|      |     9   |       20    |       39    |       57    |       74    |
\========================================================================/

GPIO registers: CR 0x000080c0, DO 0x00000009, DI 0x0000000b
STA_ID0: 00:23:4d:00:0c:0d
STA_ID1: 0x10000d0c, AP: 0, IBSS: 0, KeyCache Disable: 0
TIMER0: 0x00000030, TBTT:    48, TU: 0x50240030
TIMER1: 0x0007ffff, DMAb: 65535, TU: 0x5023ffff (-49)
TIMER2: 0x01ffffff, SWBA: 65535, TU: 0x503fffff (+1834959)
TIMER3: 0x00000031, ATIM:    49, TU: 0x50240031 (+1)
TSF: 0x000001408e9b4c68, TSFTU: 42707, TU: 0x5023a6d3
BEACON: 0x00000000
LAST_TSTP: 0x8e58f1e4

Comment 14 Alexander Ploumistos 2011-12-14 17:51:09 UTC
No luck with kernel 3.1.5-1 either.

Comment 15 Alexander Ploumistos 2011-12-14 18:07:40 UTC
Same thing with 3.1.5-2 from testing.

Comment 16 Nick Kossifidis 2011-12-14 18:21:47 UTC
O.K. can you please load the module with debug=0x20 and post dmesg ? (do a dmesg -c > /dev/null first to clean it up)

Comment 17 Nick Kossifidis 2011-12-14 18:22:28 UTC
Hmmm maybe 0x23 just in case...
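
The difference between the two values is easier to see bit by bit. A small sketch; mask_bits is a hypothetical helper, and since the thread does not say which debug category each bit enables, none are named here.

```shell
# mask_bits: print every bit set in a mask as a hex value, lowest bit first.
# Accepts decimal or 0x-prefixed hex input.
mask_bits() {
    mask=$(( $1 ))
    bit=1
    out=""
    while [ "$bit" -le "$mask" ]; do
        if [ $(( mask & bit )) -ne 0 ]; then
            out="$out$(printf '0x%02x' "$bit") "
        fi
        bit=$(( bit * 2 ))
    done
    # Drop the trailing space before printing.
    echo "${out% }"
}

# mask_bits 0x20   prints: 0x20
# mask_bits 0x23   prints: 0x01 0x02 0x20
```

So debug=0x23 keeps the 0x20 category and switches on the two lowest bits as well.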

Comment 18 Alexander Ploumistos 2011-12-14 18:53:33 UTC
Created attachment 546854 [details]
dmesg output 3

NetworkManager and network services stopped, connection brought down, ath5k module removed, kernel ring buffer cleared, then:

modprobe ath5k debug=0x23

NetworkManager and network services are restarted and my user begins downloading:

wget ftp://ftp.ntua.gr/pub/linux/fedora/linux/releases/16/Fedora/x86_64/iso/Fedora-16-x86_64-DVD.iso

About half a minute later, the connection is lost.

Comment 19 Nick Kossifidis 2011-12-15 00:03:48 UTC
Something weird is going on and it's not on the driver side...

So during initialization you only get 1 data queue. Keep in mind that your chip was identified correctly and that the driver registers more queues on the protocol stack for your chip by default (I've tested this with my AR2425); in fact, the number of queues for your chip is hardcoded in the driver. For some reason the protocol stack only uses one queue, probably because your AP doesn't support WME (multiple queues), or for some other reason I'm not aware of.

This is what you get...
[  412.404579] ath5k phy0: (ath5k_conf_tx:590): Configure tx [queue 0],  aifs: 2, cw_min: 7, cw_max: 15, txop: 102

And this is what you'd get with multiple queues (ignore the line number)...
ath5k phy0: (ath5k_conf_tx:612): Configure tx [queue 0],  aifs: 2,
cw_min: 3, cw_max: 7, txop: 47
ath5k phy0: (ath5k_conf_tx:612): Configure tx [queue 1],  aifs: 2,
cw_min: 7, cw_max: 15, txop: 94
ath5k phy0: (ath5k_conf_tx:612): Configure tx [queue 2],  aifs: 3,
cw_min: 15, cw_max: 1023, txop: 0
ath5k phy0: (ath5k_conf_tx:612): Configure tx [queue 3],  aifs: 7,
cw_min: 15, cw_max: 1023, txop: 0

So, after initialization you get a connection working...
[  415.504805] wlan0: authenticate with <SSID MAC Address> (try 1)
[  415.506558] wlan0: authenticated
[  415.514401] wlan0: associate with <SSID MAC Address> (try 1)
[  415.516920] wlan0: RX AssocResp from <SSID MAC Address> (capab=0x411 status=0 aid=1)
[  415.516932] wlan0: associated

Then traffic goes on...
[  417.488321] ath5k phy0: (ath5k_intr:2172): status 0x1/0x800814b5
[  417.524499] ath5k phy0: (ath5k_intr:2172): status 0x400/0x800814b5
[  417.524684] ath5k phy0: (ath5k_intr:2172): status 0x80/0x800814b5
[  417.527439] ath5k phy0: (ath5k_intr:2172): status 0x1/0x800814b5
[  417.527503] ath5k phy0: (ath5k_intr:2172): status 0x400/0x800814b5
[  417.527710] ath5k phy0: (ath5k_intr:2172): status 0x80/0x800814b5
[  417.529609] ath5k phy0: (ath5k_intr:2172): status 0x1/0x800814b5
[  417.573920] ath5k phy0: (ath5k_intr:2172): status 0x400/0x800814b5
[  417.574077] ath5k phy0: (ath5k_intr:2172): status 0x80/0x800814b5
[  417.576410] ath5k phy0: (ath5k_intr:2172): status 0x1/0x800814b5
...

And then weird things start to happen, first you get this...
[  423.510679] ath5k phy0: (ath5k_conf_tx:590): Configure tx [queue 0],  aifs: 2, cw_min: 3, cw_max: 7, txop: 47
[  423.510803] wlan0: deauthenticating from <SSID MAC Address> by local choice (reason=3)

So someone from above, or your AP, changed the tx queue properties on data queue 0 (our first -and in your case only- data queue), switching it from cwmin 7/cwmax 15 to cwmin 3/cwmax 7. This results in a smaller contention window for your frames than the default (cwmin 4, cwmax 15), and then you deauthenticate with reason code 3 (deauthenticationLeaving).

This doesn't seem to be a problem because you reconnect and everything seems to go on fine.

Until you get this...
[  541.073060] wlan0: deauthenticated from <SSID MAC Address> (Reason: 14)

And then this repeats again and again...
[  593.524910] wlan0: <SSID MAC Address> denied association (code=12)
[  593.525175] wlan0: deauthenticating from <SSID MAC Address> by local choice (reason=3)

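This is probably the reason your connection dies. Code 12 is the association status code "association denied for unspecified reason". Reason 14 arrives in a deauthentication frame, so it is a reason code: in the 802.11 reason-code table 14 is "MIC failure", while "unknown auth transaction" is what 14 means in the status-code table.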
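
The numbers in these log lines come from two different 802.11 tables: "Reason" values belong to deauthentication/disassociation frames, while "code"/"status" values belong to authentication and association responses. A small lookup sketch covering only the values seen in this report; the function names are made up, and the constant names are the ones from the Linux header include/linux/ieee80211.h as far as I can tell, so double-check against your kernel tree.

```shell
# reason_name: reason codes, as logged in "(reason=3)" / "(Reason: 14)".
reason_name() {
    case "$1" in
        3)  echo "WLAN_REASON_DEAUTH_LEAVING" ;;
        14) echo "WLAN_REASON_MIC_FAILURE" ;;
        *)  echo "not listed in this sketch" ;;
    esac
}

# status_name: status codes, as logged in "status=0" / "(code=12)".
status_name() {
    case "$1" in
        0)  echo "WLAN_STATUS_SUCCESS" ;;
        12) echo "WLAN_STATUS_ASSOC_DENIED_UNSPEC" ;;
        14) echo "WLAN_STATUS_UNKNOWN_AUTH_TRANSACTION" ;;
        *)  echo "not listed in this sketch" ;;
    esac
}

# reason_name 14   -> WLAN_REASON_MIC_FAILURE
# status_name 12   -> WLAN_STATUS_ASSOC_DENIED_UNSPEC
```

The same number can mean different things depending on the table, which is why 14 appears in both functions.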

So what happens is that your Access Point is throwing you out ! Or at least some AP is throwing you out...

Let's make things cleaner and get rid of NetworkManager and whatever it does. Try disabling NetworkManager and connecting to an open access point using the iw command, or to a WPA/WPA2 AP using wpa_supplicant directly. We don't want NM or anyone else to trigger scans and possibly try to connect to other APs. It'll help a lot in debugging this.
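
For the open-AP case, the sequence could look roughly like the sketch below. It assumes the interface is wlan0 and that iw and dhclient are installed; run and connect_open_ap are hypothetical helper names, and DRY_RUN=1 only prints the commands so the sequence can be reviewed before anything touches the network.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# connect_open_ap IFACE SSID: bring the link up without NetworkManager.
connect_open_ap() {
    iface=$1
    ssid=$2
    run systemctl stop NetworkManager.service
    run ip link set "$iface" up
    run iw dev "$iface" connect "$ssid"
    # Shows the current association; it does not wait for it to complete.
    run iw dev "$iface" link
    run dhclient -v "$iface"
}

# Usage (as root):
#   DRY_RUN=1; connect_open_ap wlan0 "some-open-ssid"
```

Association is asynchronous, so on a real run it may be worth repeating the "link" command until it reports a connection before starting dhclient.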

Comment 20 Alexander Ploumistos 2011-12-15 01:54:59 UTC
Created attachment 547037 [details]
dmesg output 4

# systemctl stop NetworkManager.service
# systemctl stop network.service
# killall -TERM wpa_supplicant
# killall -TERM dhclient
# ip link set wlan0 down
# modprobe -r ath5k

# dmesg -C

# modprobe ath5k debug=0x23
# ip link set wlan0 up
# wpa_supplicant -B -Dwext -i wlan0 -c /root/my_wpa_supplicant.conf
# dhclient -v wlan0

On another terminal my user tries again to download the F16 iso with wget from the same mirror. After a while, wget starts reporting decreasing download speeds, until it becomes obvious that the connection has died.

my_wpa_supplicant.conf:

ctrl_interface=/var/run/wpa_supplicant
ctrl_interface_group=wheel

network={
         scan_ssid=1
         ssid="<My SSID>"
         proto=RSN
         key_mgmt=CCMP TKIP
         group=CCMP TKIP
         #psk="<my passphrase>"
         psk=<hexadecimal string provided by wpa_passphrase>
}
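
For comparison, a sketch of how such a network block is usually written: in wpa_supplicant.conf, key_mgmt normally takes a key-management name such as WPA-PSK, and the CCMP/TKIP cipher lists go in pairwise and group. The helper name wpa2_psk_conf and its two arguments are placeholders; the SSID and PSK are deliberately not filled in.

```shell
# wpa2_psk_conf SSID PSK_HEX: print a wpa_supplicant config for a
# WPA2-PSK network. PSK_HEX is the hex string printed by wpa_passphrase.
wpa2_psk_conf() {
    cat <<EOF
ctrl_interface=/var/run/wpa_supplicant
ctrl_interface_group=wheel

network={
    scan_ssid=1
    ssid="$1"
    proto=RSN
    key_mgmt=WPA-PSK
    pairwise=CCMP TKIP
    group=CCMP TKIP
    psk=$2
}
EOF
}

# Usage:
#   wpa2_psk_conf "<My SSID>" "<hex psk>" > /root/my_wpa_supplicant.conf
```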


Would it be easier to pinpoint the problem if we concentrate on what has changed between F14 and F16 or what is different between the 3.0.x kernel provided by Ubuntu and the 3.1.x Fedora ones?

Comment 21 Tommi Tervo 2011-12-15 20:47:17 UTC
I tried F16 with F15 kernels: the original F15 kernel, 2.6.38.6-26.rc1.fc15, works fine,
but 2.6.40.8-4.fc15 fails in the same way as the F16 kernel. I'll continue bisecting tomorrow.

Comment 22 Tommi Tervo 2011-12-16 14:37:22 UTC
Created attachment 547826 [details]
Dmesg

I tried with the latest f17 kernel and got the following "poison overwritten" warnings when the connection was reset.

Comment 23 Bob Copeland 2011-12-16 15:25:30 UTC
(In reply to comment #22)
> Created attachment 547826 [details]
> Dmesg
> 
> I tried with latest f17 kernel and got following poison overwritten warnings
> when connection was reset.

Dec 16 11:57:31 baron kernel: [ 1246.280013] Object ee9753b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Dec 16 11:57:31 baron kernel: [ 1246.280021] Object ee9753c0: 08 42 2c 00 00 23 4d 00 0c 0d 00 13 64 33 5a 91  .B,..#M.....d3Z.
Dec 16 11:57:31 baron kernel: [ 1246.280029] Object ee9753d0: 00 13 64 33 5a 8f a0 83 78 4c 2d 00 aa aa 03 00  ..d3Z...xL-.....


Hmm, do you recognize the mac address 00:23:4d:00:0c:0d?

This looks like a DMA-after-free bug.  Is it reproducible or did you just get unlucky?
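
A quick way to see which parts of such a dump were written after the free: freed slab objects are filled with the poison byte 0x6b when slab debugging is on, so any "Object" line containing other bytes was touched afterwards. A rough sketch; overwritten_lines is a hypothetical helper, and it assumes the syslog layout quoted above, with 16 hex bytes per line.

```shell
# overwritten_lines: read "Object <addr>: <16 hex bytes> <ascii>" lines on
# stdin and print the address of every line whose bytes are not all 6b.
overwritten_lines() {
    awk '{
        # Locate the "Object" keyword; skip lines that do not have it.
        for (i = 1; i <= NF; i++) if ($i == "Object") break
        if (i > NF) next
        bad = 0
        # The 16 byte fields start two fields after "Object".
        for (j = i + 2; j <= i + 17 && j <= NF; j++) if ($j != "6b") bad = 1
        if (bad) { addr = $(i + 1); sub(/:$/, "", addr); print addr }
    }'
}

# Usage:
#   dmesg | overwritten_lines
```

On the three lines quoted above this would skip ee9753b0 (still fully poisoned) and report ee9753c0 and ee9753d0, the two lines that contain the MAC addresses.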

Comment 24 Tommi Tervo 2011-12-16 15:54:07 UTC
Mac address is local wlan0 mac. This is reproducible, I rebooted my machine and after 50MB dl I got similar warning.

ifconfig wlan0
wlan0     Link encap:Ethernet  HWaddr 00:23:4D:00:0C:0D  
          inet addr:10.0.0.4  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::223:4dff:fe00:c0d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:36837 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19039 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:54903724 (52.3 MiB)  TX bytes:1618881 (1.5 MiB)

Comment 25 Alexander Ploumistos 2011-12-18 02:17:30 UTC
No change with recently released 3.1.5-6.

Comment 26 Alexander Ploumistos 2011-12-26 08:02:02 UTC
I've just installed kernel 3.1.6-1 and I noticed that the whole disconnect-reconnect loop happens a lot faster. Apps like yum and wget don't have the time to register a connection time-out.

Comment 27 Alexander Ploumistos 2012-01-10 20:28:38 UTC
Tommi, would you mind giving kernel 3.1.7-1 a shot? I've been using it for the past hour and I haven't had any problems yet.

Comment 28 Tommi Tervo 2012-01-10 21:58:31 UTC
At least 3.1.8-2.fc16 doesn't work any better.

Comment 29 moonshine 2012-01-12 10:54:40 UTC
	Kernel modules: atl1c

03:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
	Subsystem: Lenovo Device 30a1
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at 90000000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [60] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Capabilities: [160] Device Serial Number 00-15-17-ff-ff-24-14-12
	Capabilities: [170] Power Budgeting <?>
	Kernel driver in use: ath9k
	Kernel modules: ath9k

[root@localhost ~]# uname -a
Linux localhost.localdomain 3.1.8-2.fc16.x86_64 #1 SMP Sat Jan 7 13:35:24 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]#

Comment 30 Alexander Ploumistos 2012-01-27 21:06:46 UTC
I hadn't used my Aspire One for quite some time. Today I upgraded to kernel 3.2.1-3 and I am still getting disconnected every now and then, but less frequently than before.

Comment 31 getnaked 2012-01-31 07:53:08 UTC
I have the same problem on an Acer Aspire One aoa150.
3.2.2-1.fc16.i686

03:00.0 Ethernet controller: Atheros Communications Inc. AR242x / AR542x Wireless Network Adapter (PCI-Express) (rev 01)
	Subsystem: Foxconn International, Inc. Device e008
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at 55200000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 2
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [60] Express Legacy Endpoint, MSI 00
	Capabilities: [90] MSI-X: Enable- Count=1 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Kernel driver in use: ath5k
	Kernel modules: ath5k

Comment 32 Alexander Ploumistos 2012-02-06 19:56:05 UTC
Currently using kernel 3.2.3-2 and after a number of restarts and cold boots there have been no disconnections.

However, my download speed now seems to peak at ~600 KBytes/s while the average speed fluctuates between 100 and 250 Kbytes/s, even when I'm getting a file hosted on my own LAN and the netbook is placed within 50cm from the AP.

I am also seeing a line in my syslog that wasn't there before:

ath5k_hw_get_isr: ISR: 0x00000001 IMR: 0x00000000

Comment 33 Nick Kossifidis 2012-02-06 22:05:56 UTC
Hello and sorry for the delay, I'm totally out of time these days :-(

So this seems like a bug from switching to write-to-clear operation for interrupt handling, I just noticed that we keep checking against ah->ah_imr snapshot.

Try this:

Go to drivers/net/wireless/ath/ath5k/dma.c in function get_isr, find this line

                /*
                 * Filter out the non-common bits from the interrupt
                 * status.
                 */
                *interrupt_mask = (pisr & AR5K_INT_COMMON) & ah->ah_imr;

and change it to
*interrupt_mask = (pisr & AR5K_INT_COMMON);

(you'll find this twice in this function, the first one is for AR5210 chips -not yours- and the second one is for the rest)

And see how it goes...
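
To see why the snapshot matters, here is the same filter done with shell arithmetic on the values from the syslog line in comment 32 (ISR 0x00000001, IMR 0x00000000). This is only an illustration: the logged ISR value stands in for pisr, 0xffffffff is a placeholder for AR5K_INT_COMMON (whose real value is not quoted in this thread), and filter_old/filter_new are made-up names.

```shell
# filter_old PISR COMMON IMR: the current expression,
#   (pisr & AR5K_INT_COMMON) & ah->ah_imr
filter_old() {
    printf '0x%08x\n' $(( $1 & $2 & $3 ))
}

# filter_new PISR COMMON: the proposed expression,
#   (pisr & AR5K_INT_COMMON)
filter_new() {
    printf '0x%08x\n' $(( $1 & $2 ))
}

# filter_old 0x00000001 0xffffffff 0x00000000   -> 0x00000000
# filter_new 0x00000001 0xffffffff              -> 0x00000001
```

With a zero IMR snapshot the old expression filters everything out, which is the case where the "ISR: ... IMR: ..." message gets printed; the proposed expression keeps the bit.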

You may also remove the check

        /*
         * In case we didn't handle anything,
         * print the register value.
         */
        if (unlikely(*interrupt_mask == 0 && net_ratelimit()))
                ATH5K_PRINTF("ISR: 0x%08x IMR: 0x%08x\n", data, ah->ah_imr);

I need to clean this ah_imr/imask etc thing up soon..

Comment 34 Alexander Ploumistos 2012-02-07 02:13:09 UTC
I have a lot of reading to do, as the last time I compiled a kernel or a module for Fedora was when Fedora Core 3 or 4 was released and it seems that quite a lot has changed since then. kernel-devel and kernel-headers for my current kernel are installed, but there were no source files for ath5k. I downloaded the src.rpm and there were two different ath5k versions inside that and I got confused. To add to the frustration, the relevant fedoraproject wiki section is outdated and points back to the modules.txt file from kernel-doc.

On my Gentoo boxes this process is really straightforward, the ebuild takes care of patches and backports and you're left with a clean source tree to mess around with.

Oh well, I'll get back on this tomorrow.

By the way, does this have to do with the speed issue or just the message about the registers?

Comment 35 John W. Linville 2012-02-07 20:50:20 UTC
Test kernels with a patch based on Nick's suggestions in comment 33 are building here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3770251

When that build finishes, please give it a try and post the results here -- thanks!

Comment 36 Alexander Ploumistos 2012-02-08 02:39:23 UTC
Created attachment 560111 [details]
kernel-3.2.5-2.bz758543.1.fc16.i686 test & comparison

Kernel 3.2.5-2.bz758543.1 for i686 was the last in line... Got it, installed it and went on to check.

My ADSL connection is currently synchronized at 12.2Mbps. I timed a download from my provider to see how things would go (see attachment). Again, my peak speed was 604 KB/s while my average speed was 224 KB/s. With 5-6 MB left before completion I got disconnected, but the connection came back within a couple of seconds. The attachment includes the relevant dmesg and /var/log/messages segments (I accidentally deleted the /var/log/messages part from before the disconnection).

Digging through my logs, I noticed that with kernel 3.2.3-2 I did get disconnections, but they were too brief to notice.

I had an Ubuntu 11.10 USB stick handy, so I decided to run a comparative test. Peak speed was 1.22 MB/s and the average speed 967 KB/s. No disconnections.
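
To put these figures side by side with the line rate, a back-of-the-envelope sketch. It assumes decimal units (1 Mbit = 1,000,000 bits, 1 KB = 1000 bytes) and ignores ADSL and protocol overhead, so the percentages are only rough; the helper names are made up.

```shell
# mbps_to_kBps RATE: convert Mbit/s to KB/s (decimal units).
mbps_to_kBps() {
    awk -v r="$1" 'BEGIN { printf "%.0f\n", r * 1000 / 8 }'
}

# percent_of A B: A as a percentage of B, one decimal place.
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# mbps_to_kBps 12.2      -> 1525   (line rate in KB/s)
# percent_of 224 1525    -> 14.7   (Fedora test kernel, average)
# percent_of 967 1525    -> 63.4   (Ubuntu 11.10, average)
```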

If you need me to run tests with debugging turned on, just ask.

The patch got rid of the message about register values and there is a new message about station states (what do these numbers mean by the way?).

All I can guess so far is that something broke between kernel versions 3.0 and 3.1 and after 3.1.7 reconnections are a lot faster. I am currently downloading the daily build of pangolin which has a 3.2 kernel to test my theory.

Comment 37 Alexander Ploumistos 2012-02-08 03:10:11 UTC
Ubuntu's i386 ISO runs a pae kernel, 3.2.0-14-generic-pae.

I downloaded the same file from the same mirror. I didn't get disconnected and my download speed did climb near 1MB/s 3-4 times, but it was mostly under 600 KB/s, averaging at 496 KB/s and the download took 4m 37s to complete.

I'm calling it a night.

Comment 38 Alexander Ploumistos 2012-02-08 23:31:56 UTC
I tested some ISOs I had from Fedora, SUSE and Ubuntu. My hypothesis seemed to stand. Then I found aptosid (www.aptosid.com), which is based on Debian sid. I thought that since Ubuntu is based on Debian there wouldn't be any difference, but I was wrong.

Their latest LiveCD has a 3.1 kernel, "3.1-6.slh.1-aptosid-686" and 

modinfo ath5k

reports

srcversion: 7BC437582869AD97FC05942

Using that kernel (on the KDE-lite version) I downloaded about 30GB of data from various locations and my download speed averaged consistently around 1MB/s. No disconnections.

I couldn't find what aptosid or sid have done with that kernel and I don't know what to make of all this.

Comment 39 Christian Ide 2012-02-19 12:09:26 UTC
I'm trying to install Fedora 16 on my Acer AOA110 with Ath5K WLAN chipset right now and experience the same problem during the installation (via DVD image and download).

I have one important thing to add: Not only the connection on the Acer breaks constantly on heavy downloads but also the whole WLAN network goes down! The event log of my router (AVM Fritz!Box 7050) always tells me that all devices are disconnecting at the same second and reconnecting about 30 seconds later.

I had the same problem with Ubuntu 9.04 three years ago and solved it by using the discontinued MadWifi drivers. Afterwards I moved to Meego and ChromiumOS, where the problem doesn't exist any more. (Sorry, no technical details about the Ath5K modules.)

Comment 40 Alexander Ploumistos 2012-03-18 22:01:30 UTC
Still the same with kernel 3.2.10-3. I had been trying to copy a 170MB file across computers with scp and the connection was dropped every 25-40MB.
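
To put a number on "dropped every 25-40MB", the disconnect events can be counted from the kernel log after a transfer. A minimal sketch, assuming the mac80211 messages look like the ones quoted in comment 19; count_drops is a hypothetical helper name.

```shell
# count_drops: count deauthentication and denied-association lines on stdin.
count_drops() {
    grep -c -E 'deauthenticat(ed|ing) from|denied association'
}

# Usage:
#   dmesg -c > /dev/null       # clear the ring buffer first (as root)
#   scp bigfile otherhost:     # or any other heavy transfer
#   dmesg | count_drops
```

Note that each failed cycle logs both a "denied association" line and a local deauthentication line, so the count is an upper bound on the number of actual drops.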

Comment 41 Daniel 2012-03-20 20:01:03 UTC
I too suffer from this bug. Fedora 16 up-to-date on an AAO 110L.

Has anyone been able to implement a workaround at least?




03:00.0 Ethernet controller: Atheros Communications Inc. AR242x / AR542x Wireless Network Adapter (PCI-Express) (rev 01)
	Subsystem: Foxconn International, Inc. Device e008
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at 55200000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 2
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [60] Express Legacy Endpoint, MSI 00
	Capabilities: [90] MSI-X: Enable- Count=1 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Kernel driver in use: ath5k
	Kernel modules: ath5k

Comment 42 Dave Jones 2012-03-22 16:50:42 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 43 Dave Jones 2012-03-22 16:54:59 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 44 Dave Jones 2012-03-22 17:05:41 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 45 Tommi Tervo 2012-03-22 19:23:16 UTC
kernel-3.3.0-4.fc16 still has the same problem.

Comment 46 Alexander Ploumistos 2012-03-23 20:03:06 UTC
I'd say it's a bit worse with 3.3.0-4. Also, I've been getting some "Corrupted MAC on input" messages when I transfer files with scp from other machines to my AAO, just around the time I lose the connection. Until now, it just stalled.

Comment 47 Alexander Ploumistos 2012-05-11 08:57:41 UTC
Just tried with 3.3.2-1 and 3.3.4-3, no improvement.

Comment 48 Alexander Ploumistos 2012-07-09 10:08:44 UTC
Well, I upgraded to Fedora 17 about a month ago and I haven't had any problems yet, even with kernels of the same version as in F16.

Comment 49 Tommi Tervo 2012-07-12 20:11:44 UTC
I just installed F17 and the stock kernel had similar problems, but the latest 3.4.4 seems to work fine.

Comment 50 Kieran Clancy 2012-07-12 20:14:20 UTC
Still on F16 with the 3.4.2 kernel, still have the same problem...

Comment 51 Dave Jones 2012-10-23 15:37:25 UTC
# Mass update to all open bugs.

Kernel 3.6.2-1.fc16 has just been pushed to updates.
This update is a significant rebase from the previous version.

Please retest with this kernel, and let us know if your problem has been fixed.

In the event that you have upgraded to a newer release and the bug you reported
is still present, please change the version field to the newest release you have
encountered the issue with.  Before doing so, please ensure you are testing the
latest kernel update in that release and attach any new and relevant information
you may have gathered.

If you are not the original bug reporter and you still experience this bug,
please file a new report, as it is possible that you may be seeing a
different problem. 
(Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).

Comment 52 Alexander Ploumistos 2012-10-23 22:16:28 UTC
As I stated earlier, I've been using Fedora 17 for quite some time now and since the upgrade I have not encountered the issue again. I just ran some tests on 3.6.2-4.fc17 and everything is fine.

Comment 53 Alexander Ploumistos 2013-03-04 18:17:45 UTC
I upgraded to F18 today, latest kernel and the problem reappeared...

