Bug 506659 - ath5k: phy0 periodically uses considerable cpu and ssh connections lock up for several seconds
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: kernel
Version: 11
Hardware: i686 Linux
Priority: low, Severity: low
Assigned To: John W. Linville
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-06-18 04:10 EDT by Bidwell Ducanh
Modified: 2009-11-28 22:23 EST
Last Closed: 2009-06-30 09:59:54 EDT
Description Bidwell Ducanh 2009-06-18 04:10:07 EDT
Description of problem:
SSH connections stall for several seconds while the phy0 process's CPU usage becomes high. This happens randomly, usually every few minutes.

Version-Release number of selected component (if applicable):
kernel-2.6.29.4-167.fc11.i586
openssh-5.2p1-2.fc11.i586

How reproducible:
Not sure. I am using an Atheros AR5414 chip, with the SSH session running over a WPA/TKIP wireless connection.

Steps to Reproduce:
1. Connect to an SSH server
2. Have top or a CPU monitor program running
3. Wait a few minutes: the SSH connection stalls as the phy0 process consumes far more CPU cycles than usual.
  
Actual results:
SSH stalls while the phy0 process's CPU usage goes up.

Expected results:
Continuous SSH connection availability.

Additional info:

lspci -vvv (results for the atheros chip):
03:00.0 Ethernet controller: Atheros Communications Inc. AR5212 802.11abg NIC (rev 01)
	Subsystem: IBM ThinkPad 11a/b/g Wireless LAN Mini Express Adapter (AR5BXB6)
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 128 bytes
	Interrupt: pin A routed to IRQ 18
	Region 0: Memory at 75200000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: ath5k
	Kernel modules: ath5k

No kernel errors are thrown.
Comment 1 Bidwell Ducanh 2009-06-18 07:12:06 EDT
I realize this is probably related to ath5k's known rate-setting issues and the new minstrel rate algorithm. Please let me know if this is a duplicate or how I can contribute some more information.
Comment 2 John W. Linville 2009-06-18 10:44:04 EDT
Actually, it sounds more like periodic scanning to me.  Are you using NetworkManager?
Comment 3 Bidwell Ducanh 2009-06-18 11:47:13 EDT
Yes, I'm using NetworkManager. I guess I should fall back to wpa_supplicant?
Comment 4 John W. Linville 2009-06-18 13:39:59 EDT
That might be worthwhile, although honestly I'm not sure if NM is triggering scans or if wpa_supplicant does it.  I wonder if you can recreate the issue with an open or WEP network?

Also, it is probably worthwhile to run iwevent and see if there are any events coincident with the delays.
Comment 5 Bidwell Ducanh 2009-06-18 14:55:24 EDT
I've confirmed by watching iwevent and iwconfig wlan0 that you were indeed correct that the periodic scanning is the culprit. On a WPA or open network iwevent will output:
11:41:56.761202   wlan0    Scan request completed
periodically, while iwconfig reports a frequency change. These events coincide with the connection stalls and the phy0 CPU usage.

Furthermore, using wpa_supplicant and disabling NetworkManager, this behavior is not reproducible (perhaps only by manually triggering a scan?)
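
For anyone who wants to repeat this check, the correlation between iwevent output and the stalls could be scripted roughly as follows. This is only a sketch: it assumes iwevent lines have the same layout as the one quoted above, and the helper names (parse_scan_events, stall_matches_scan), the 5-second window, and the stall time in the usage line are all made up for illustration and are not taken from this report.

```python
import re
from datetime import time

# Matches lines shaped like the one quoted in this comment:
#   11:41:56.761202   wlan0    Scan request completed
SCAN_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2})\.(\d{6})\s+(\S+)\s+Scan request completed"
)


def parse_scan_events(lines):
    """Return (time, interface) pairs for each scan-completed line."""
    events = []
    for line in lines:
        m = SCAN_RE.match(line.strip())
        if m:
            h, mi, s, us = (int(g) for g in m.groups()[:4])
            events.append((time(h, mi, s, us), m.group(5)))
    return events


def seconds_of_day(t):
    """Convert a datetime.time to seconds since midnight."""
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


def stall_matches_scan(stall, events, window_s=5.0):
    """True if the stall began at most window_s seconds before a scan completed."""
    start = seconds_of_day(stall)
    return any(0 <= seconds_of_day(t) - start <= window_s for t, _ in events)


log = ["11:41:56.761202   wlan0    Scan request completed"]
events = parse_scan_events(log)
# Hypothetical stall start noted by hand; NOT a value from this report.
print(stall_matches_scan(time(11, 41, 54), events))
```

The window is one-sided on purpose: a stall is expected to start shortly before the "Scan request completed" event, since the event marks the end of the scan.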
Comment 6 John W. Linville 2009-06-30 09:59:54 EDT
OK, at least we understand the issue.  FWIW, there has been some work done in the upstream kernel that will mitigate this issue.  However, it will take some time for this to appear in Fedora -- could be F-12...
Comment 7 Jussi Eloranta 2009-11-28 22:23:57 EST
Any progress on this? I am on FC12 and this problem still persists. I have:
2.6.31.5-127.fc12.i686.PAE

When I ping my base station, I occasionally see:

64 bytes from 192.168.1.1: icmp_seq=57 ttl=64 time=1.33 ms
64 bytes from 192.168.1.1: icmp_seq=58 ttl=64 time=1.39 ms
64 bytes from 192.168.1.1: icmp_seq=59 ttl=64 time=1.39 ms
64 bytes from 192.168.1.1: icmp_seq=60 ttl=64 time=1.29 ms
64 bytes from 192.168.1.1: icmp_seq=61 ttl=64 time=4236 ms
64 bytes from 192.168.1.1: icmp_seq=62 ttl=64 time=3243 ms
64 bytes from 192.168.1.1: icmp_seq=63 ttl=64 time=2244 ms
64 bytes from 192.168.1.1: icmp_seq=64 ttl=64 time=1244 ms
64 bytes from 192.168.1.1: icmp_seq=65 ttl=64 time=237 ms
64 bytes from 192.168.1.1: icmp_seq=66 ttl=64 time=1.34 ms
64 bytes from 192.168.1.1: icmp_seq=67 ttl=64 time=1.24 ms
64 bytes from 192.168.1.1: icmp_seq=68 ttl=64 time=1.28 ms
64 bytes from 192.168.1.1: icmp_seq=69 ttl=64 time=1.83 ms

The system is performing a scan whenever this happens. I am using NetworkManager. This is a really annoying problem, as it makes the system very difficult to use (in particular, ssh connections freeze from time to time, which makes editing files a pain).
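
The ping output above can also be read quantitatively. The sketch below groups consecutive slow replies into one stall and estimates when each delayed reply arrived. It assumes ping's default interval of one second between requests (the interval is not stated in this report), and the function names and the 100 ms threshold are illustrative choices only.

```python
import re

PING_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")

# Lines copied from the ping output in this comment.
SAMPLE = """\
64 bytes from 192.168.1.1: icmp_seq=60 ttl=64 time=1.29 ms
64 bytes from 192.168.1.1: icmp_seq=61 ttl=64 time=4236 ms
64 bytes from 192.168.1.1: icmp_seq=62 ttl=64 time=3243 ms
64 bytes from 192.168.1.1: icmp_seq=63 ttl=64 time=2244 ms
64 bytes from 192.168.1.1: icmp_seq=64 ttl=64 time=1244 ms
64 bytes from 192.168.1.1: icmp_seq=65 ttl=64 time=237 ms
64 bytes from 192.168.1.1: icmp_seq=66 ttl=64 time=1.34 ms
""".splitlines()


def parse_ping(lines):
    """Return (icmp_seq, rtt_ms) pairs from ping output lines."""
    samples = []
    for line in lines:
        m = PING_RE.search(line)
        if m:
            samples.append((int(m.group(1)), float(m.group(2))))
    return samples


def find_stalls(samples, threshold_ms=100.0):
    """Group consecutive replies slower than threshold_ms.

    Returns (first_seq, last_seq, max_rtt_ms) for each group.
    """
    stalls, current = [], []
    for seq, rtt in samples:
        if rtt > threshold_ms:
            current.append((seq, rtt))
        elif current:
            stalls.append(current)
            current = []
    if current:
        stalls.append(current)
    return [(g[0][0], g[-1][0], max(r for _, r in g)) for g in stalls]


def arrival_times(samples, threshold_ms=100.0, interval_s=1.0):
    """Estimated arrival time (seconds) of each slow reply.

    Assumes request N was sent at N * interval_s (ASSUMPTION: default interval).
    """
    return [seq * interval_s + rtt / 1000.0
            for seq, rtt in samples if rtt > threshold_ms]


samples = parse_ping(SAMPLE)
print(find_stalls(samples))
print(arrival_times(samples))
```

Under the one-second assumption, the five delayed replies all arrive within about 10 ms of each other, which suggests the packets were queued for roughly four seconds and then released together once the scan finished.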

(the system where I see this is an up to date FC12 as of Nov-28 2009)

lspci says:

03:00.0 Ethernet controller: Atheros Communications Inc. AR242x 802.11abg Wireless PCI Express Adapter (rev 01)
