Bug 659415

Summary: 802.11n connections drop sporadically
Product: [Fedora] Fedora
Reporter: Joshua Boyd <boydjd>
Component: kernel
Assignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 13
CC: boydjd, dougsland, gansalmon, itamar, jonathan, kernel-maint, kmcmartin, linville, madhu.chinakonda, steinpilz, wey-yi.w.guy
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-01-21 08:05:27 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Joshua Boyd 2010-12-02 18:37:26 UTC
Description of problem:

Using an Intel 5300AGN wireless card, connecting to 802.11n access points is successful, and speed is as expected. However, after random periods of time, the connection is lost, and several attempts are required to re-connect to the access point.


Version-Release number of selected component (if applicable):

Fedora 13
Kernel 2.6.34.7-61.fc13.x86_64
iwlwifi 8.24.2.12


How reproducible:

Happens consistently.


Steps to Reproduce:
1. Connect to an 802.11n network
2. Use the network
  
Actual results:
Network adapter disconnects after random period of time.


Expected results:
No disconnections.


Additional info:

I've tried many things to work around this problem, as can be seen in options.conf and grub.conf below.

This problem only occurs on 802.11n networks; there are no problems on G or B.

This message shows up often in dmesg when a disconnect occurs:
No probe response from AP c0:3f:0e:5e:91:0e after 500ms, disconnecting.

[root@carbine ~]# uname -ar
Linux carbine 2.6.34.7-61.fc13.x86_64 #1 SMP Tue Oct 19 04:06:30 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

[root@carbine ~]# cat /etc/modprobe.d/options.conf 
options iwlagn 11n_disable=0 11n_disable50=0 swcrypto50=1 swcrypto=1

[root@carbine ~]# cat /boot/grub/grub.conf | grep pcie_aspm
	kernel /vmlinuz-2.6.34.7-61.fc13.x86_64 ro root=UUID=334ac71f-bb28-41b2-9168-1d53580d4594 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet pcie_aspm=off


[root@carbine ~]# dmesg | grep iwl
iwlagn 0000:03:00.0: PCI INT A disabled
iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
iwlagn: Copyright(c) 2003-2010 Intel Corporation
iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwlagn 0000:03:00.0: setting latency timer to 64
iwlagn 0000:03:00.0: Detected Intel Wireless WiFi Link 5300AGN REV=0x24
iwlagn 0000:03:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
iwlagn 0000:03:00.0: irq 36 for MSI/MSI-X
iwlagn 0000:03:00.0: firmware: requesting iwlwifi-5000-2.ucode
iwlagn 0000:03:00.0: loaded firmware version 8.24.2.12
phy0: Selected rate control algorithm 'iwl-agn-rs'


[root@carbine ~]# modinfo iwlagn
filename:       /lib/modules/2.6.34.7-61.fc13.x86_64/kernel/drivers/net/wireless/iwlwifi/iwlagn.ko
alias:          iwl4965
license:        GPL
author:         Copyright(c) 2003-2010 Intel Corporation <ilw.com>
version:        in-tree:d
description:    Intel(R) Wireless WiFi Link AGN driver for Linux
firmware:       iwlwifi-4965-2.ucode
firmware:       iwlwifi-5150-2.ucode
firmware:       iwlwifi-5000-2.ucode
firmware:       iwlwifi-6050-4.ucode
firmware:       iwlwifi-6000-4.ucode
firmware:       iwlwifi-1000-3.ucode
srcversion:     673056017A6DD2BC218418D
alias:          pci:v00008086d00000084sv*sd00001316bc*sc*i*
alias:          pci:v00008086d00000084sv*sd00001216bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001326bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001226bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001306bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001206bc*sc*i*
alias:          pci:v00008086d00000084sv*sd00001315bc*sc*i*
alias:          pci:v00008086d00000084sv*sd00001215bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001325bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001225bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001305bc*sc*i*
alias:          pci:v00008086d00000083sv*sd00001205bc*sc*i*
alias:          pci:v00008086d00000089sv*sd00001316bc*sc*i*
alias:          pci:v00008086d00000089sv*sd00001311bc*sc*i*
alias:          pci:v00008086d00000087sv*sd00001326bc*sc*i*
alias:          pci:v00008086d00000087sv*sd00001321bc*sc*i*
alias:          pci:v00008086d00000087sv*sd00001306bc*sc*i*
alias:          pci:v00008086d00000087sv*sd00001301bc*sc*i*
alias:          pci:v00008086d00004239sv*sd00001316bc*sc*i*
alias:          pci:v00008086d00004239sv*sd00001311bc*sc*i*
alias:          pci:v00008086d00004238sv*sd00001111bc*sc*i*
alias:          pci:v00008086d0000422Csv*sd00001326bc*sc*i*
alias:          pci:v00008086d0000422Csv*sd00001321bc*sc*i*
alias:          pci:v00008086d0000422Csv*sd00001307bc*sc*i*
alias:          pci:v00008086d0000422Csv*sd00001306bc*sc*i*
alias:          pci:v00008086d0000422Csv*sd00001301bc*sc*i*
alias:          pci:v00008086d0000422Bsv*sd00001121bc*sc*i*
alias:          pci:v00008086d0000422Bsv*sd00001101bc*sc*i*
alias:          pci:v00008086d0000423Dsv*sd00001316bc*sc*i*
alias:          pci:v00008086d0000423Dsv*sd00001216bc*sc*i*
alias:          pci:v00008086d0000423Dsv*sd00001311bc*sc*i*
alias:          pci:v00008086d0000423Dsv*sd00001211bc*sc*i*
alias:          pci:v00008086d0000423Csv*sd00001321bc*sc*i*
alias:          pci:v00008086d0000423Csv*sd00001221bc*sc*i*
alias:          pci:v00008086d0000423Csv*sd00001306bc*sc*i*
alias:          pci:v00008086d0000423Csv*sd00001206bc*sc*i*
alias:          pci:v00008086d0000423Csv*sd00001301bc*sc*i*
alias:          pci:v00008086d0000423Csv*sd00001201bc*sc*i*
alias:          pci:v00008086d0000423Bsv*sd00001011bc*sc*i*
alias:          pci:v00008086d0000423Asv*sd00001021bc*sc*i*
alias:          pci:v00008086d0000423Asv*sd00001001bc*sc*i*
alias:          pci:v00008086d00004236sv*sd00001114bc*sc*i*
alias:          pci:v00008086d00004236sv*sd00001014bc*sc*i*
alias:          pci:v00008086d00004236sv*sd00001111bc*sc*i*
alias:          pci:v00008086d00004236sv*sd00001011bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001104bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001004bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001101bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001001bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001124bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001024bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001121bc*sc*i*
alias:          pci:v00008086d00004235sv*sd00001021bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001316bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001216bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001315bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001215bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001314bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001214bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001311bc*sc*i*
alias:          pci:v00008086d00004237sv*sd00001211bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001326bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001226bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001325bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001225bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001324bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001224bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001321bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001221bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001306bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001206bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001305bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001205bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001304bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001204bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001301bc*sc*i*
alias:          pci:v00008086d00004232sv*sd00001201bc*sc*i*
alias:          pci:v00008086d00004230sv*sd*bc*sc*i*
alias:          pci:v00008086d00004229sv*sd*bc*sc*i*
depends:        iwlcore,mac80211,cfg80211
vermagic:       2.6.34.7-61.fc13.x86_64 SMP mod_unload 
parm:           swcrypto50:using software crypto engine (default 0 [hardware])
 (bool)
parm:           queues_num50:number of hw queues in 50xx series (int)
parm:           11n_disable50:disable 50XX 11n functionality (int)
parm:           amsdu_size_8K50:enable 8K amsdu size in 50XX series (int)
parm:           fw_restart50:restart firmware in case of error (int)
parm:           antenna:select antenna (1=Main, 2=Aux, default 0 [both]) (int)
parm:           swcrypto:using crypto in software (default 0 [hardware]) (int)
parm:           disable_hw_scan:disable hardware scanning (default 0) (int)
parm:           queues_num:number of hw queues. (int)
parm:           11n_disable:disable 11n functionality (int)
parm:           amsdu_size_8K:enable 8K amsdu size (int)
parm:           fw_restart4965:restart firmware in case of error (int)
parm:           debug50:50XX debug output mask (deprecated) (uint)
parm:           debug:debug output mask (uint)

Comment 1 Joshua Boyd 2010-12-02 19:00:20 UTC
[root@carbine ~]# iw dev wlan1 station dump
Station c0:3f:0e:5e:91:0e (on wlan1)
	inactive time:	1750 ms
	rx bytes:	576228
	rx packets:	931
	tx bytes:	66884
	tx packets:	541
	signal:  	-45 dBm
	tx bitrate:	270.0 MBit/s MCS 15 40Mhz

Comment 2 Joshua Boyd 2010-12-02 19:09:03 UTC
I am also seeing these errors now after doing a factory restore of my router:


iwlagn 0000:03:00.0: PCI INT A disabled
iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
iwlagn: Copyright(c) 2003-2010 Intel Corporation
iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
iwlagn 0000:03:00.0: setting latency timer to 64
iwlagn 0000:03:00.0: Detected Intel Wireless WiFi Link 5300AGN REV=0x24
iwlagn 0000:03:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
iwlagn 0000:03:00.0: irq 36 for MSI/MSI-X
iwlagn 0000:03:00.0: firmware: requesting iwlwifi-5000-2.ucode
iwlagn 0000:03:00.0: loaded firmware version 8.24.2.12
phy0: Selected rate control algorithm 'iwl-agn-rs'
iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = c0:3f:0e:5e:91:0e tid = 0
iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = c0:3f:0e:5e:91:0e tid = 0
iwlagn 0000:03:00.0: low ack count detected, restart firmware
iwlagn 0000:03:00.0: On demand firmware reload
iwlagn 0000:03:00.0: Stopping AGG while state not ON or starting
iwlagn 0000:03:00.0: queue number out of range: 0, must be 10 to 19
iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = c0:3f:0e:5e:91:0e tid = 0
iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = c0:3f:0e:5e:91:0e tid = 0
iwlagn 0000:03:00.0: low ack count detected, restart firmware
iwlagn 0000:03:00.0: On demand firmware reload
iwlagn 0000:03:00.0: Stopping AGG while state not ON or starting
iwlagn 0000:03:00.0: queue number out of range: 0, must be 10 to 19
iwlagn 0000:03:00.0: PCI INT A disabled

Comment 3 Stanislaw Gruszka 2010-12-17 17:00:19 UTC
I think this is a known problem, reported here: https://bugzilla.kernel.org/show_bug.cgi?id=16691 , and here: http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2275

There is a patch you can test: https://bugzilla.kernel.org/attachment.cgi?id=38632
It disables the check for "low ack count", so firmware resets do not happen. However, this is a workaround rather than a proper fix. If you are not familiar with kernel compilation, I can prepare a test kernel for you.

Comment 4 Stanislaw Gruszka 2010-12-17 17:08:36 UTC
Oops, the patch above is the wrong one; this one is correct:
https://bugzilla.kernel.org/attachment.cgi?id=38502

Comment 5 Stanislaw Gruszka 2011-01-03 15:33:05 UTC
Joshua, please test this kernel: http://koji.fedoraproject.org/koji/taskinfo?taskID=2697824 . Does it solve the problem for you without causing any other issues?

Comment 6 Joshua Boyd 2011-01-03 17:39:48 UTC
I will check this evening and report back.

Comment 7 Joshua Boyd 2011-01-03 21:51:53 UTC
I've got the kernel installed and am testing over an 802.11n connection now. No drops so far, but it's only been about 10 minutes.

Comment 8 Joshua Boyd 2011-01-03 22:02:46 UTC
The machine hard locked and was unresponsive after about 5 minutes of testing network throughput over 802.11n against my local NFS server.

I did not install the debug kernel. If you'd like me to do that and attempt to reproduce, let me know what steps I need to take to get you any usable data.

Comment 9 Joshua Boyd 2011-01-03 22:47:14 UTC
I now have a whole bunch of these showing up without stressing the connection:


[ 2993.938818] iwlagn 0000:03:00.0: BA scd_flow 0 does not match txq_id 10

This is on an N only connection.

Comment 10 Joshua Boyd 2011-01-04 01:34:28 UTC
And, the same problem still occurs. So actually, this kernel is worse, because it'll lock up under high throughput.

Comment 11 Stanislaw Gruszka 2011-01-04 09:45:51 UTC
Re comment 8:
I think this could be an NFS client problem. Can you reproduce it using ftp or scp?

Regarding steps to get useful data: I'm planning to write a page on the Fedora wiki describing how to get debug info when the kernel hangs. Unfortunately there is no easy way unless you have a serial cable connected to another computer.

Re comment 9:
I think someone reported that this is fixed in 2.6.36. I will provide packages for you to test.

Comment 12 Joshua Boyd 2011-01-04 14:18:23 UTC
I can reproduce using netperf as well, so it's not an NFS problem specifically.

I wonder if a serial port on a docking station or the SmartBay would work. If so, I could try to find one of those, as both are available for my laptop.

Comment 13 Stanislaw Gruszka 2011-01-04 17:45:33 UTC
A serial console from the docking station should work (if not, perhaps the console is disabled in the BIOS). You have to add "console=ttyS0,115200n8" to the kernel boot parameters in /boot/grub/grub.conf, and configure the same baud rate and parity on the other machine.
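
As a rough sketch of the grub.conf change described above, the following shell function appends the console parameter to every kernel line that does not already have it. The function name add_serial_console and the output path in the example are only illustrative and not part of any Fedora tool; review the result before copying it over the real /boot/grub/grub.conf.

```shell
# Hypothetical helper: reads grub.conf text on stdin and writes it to stdout,
# appending the serial console parameter to each "kernel" line that lacks it.
add_serial_console() {
    sed -E '/^[[:space:]]*kernel[[:space:]]/{
        /console=ttyS0/!s/[[:space:]]*$/ console=ttyS0,115200n8/
    }'
}

# Example (writes a modified copy for review instead of editing in place):
# add_serial_console < /boot/grub/grub.conf > /tmp/grub.conf.serial
```

The "115200n8" part means 115200 baud, no parity, 8 data bits, so the terminal program on the other machine needs the same settings.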

Here are the 2.6.36 wireless drivers/stack plus the test patch, compiled for the 2.6.35.10-74.fc14 kernel: http://koji.fedoraproject.org/koji/taskinfo?taskID=2700598 . Please test. A short description of how to use the package is here: http://people.redhat.com/sgruszka/compact_wireless.html

Comment 14 Stanislaw Gruszka 2011-01-07 09:52:51 UTC
Regarding the hard lock from comments 8 and 12: we probably have a mac80211 problem in the 2.6.35.10 kernel; it was reported in bug 667459. I'll prepare a new test kernel.

Comment 15 Stanislaw Gruszka 2011-01-07 15:55:32 UTC
Here is a new kernel to test:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2706682
Please give it a try.

Comment 16 Stanislaw Gruszka 2011-01-14 13:34:43 UTC
Joshua, any news about the kernel from comment 15?

Comment 17 Christian L 2011-01-20 08:50:07 UTC
Stanislaw, does the new kernel (comment 15) need the drivers from comment 13 installed as well? I have the same wireless card, an Intel 5300, in a Dell Latitude 6400, and I am running into the same problems as the OP when connecting to my 802.11n network. The kernel from comment 15 (64bit) alone does not completely fix it, and I'd have to downgrade the kernel to install the compat-wireless files. 

I am still getting lots of 

  iwlagn 0000:0c:00.0: BA scd_flow 0 does not match txq_id 10

and a few 

  iwlagn 0000:0c:00.0: Fail finding valid aggregation tid: 1

 in /var/log/messages. It looks like the firmware resets 

  iwlagn 0000:0c:00.0: low ack count detected, restart firmware

might be gone, but I'll have to test it a bit more for that.
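
To compare kernels a bit more objectively, the three messages quoted above could be counted with a small filter like the one below. It is only a sketch: the function name tally_iwlagn is made up for this example, and the patterns assume the message text stays exactly as quoted in this report.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints how often
# each of the three iwlagn messages from this report appears.
tally_iwlagn() {
    awk '
        /low ack count detected/             { restart++ }
        /BA scd_flow .* does not match/      { ba++ }
        /Fail finding valid aggregation tid/ { tid++ }
        END {
            printf "firmware restarts: %d\n", restart
            printf "BA flow mismatches: %d\n", ba
            printf "aggregation tid failures: %d\n", tid
        }'
}

# Example usage:
# tally_iwlagn < /var/log/messages
# dmesg | tally_iwlagn
```

Running it before and after switching kernels, over a similar period of use, should show whether the firmware restarts are really gone.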

Comment 18 Stanislaw Gruszka 2011-01-20 09:44:16 UTC
(In reply to comment #17)
> Stanislaw, does the new kernel (comment 15) need the drivers from comment 13
No, the kernel includes the patched drivers. It was mainly intended to fix the kernel crash from bug 667459.

> not completely fix it, and I'd have to downgrade the kernel to install the
> compat-wireless files. 
> 
> I am still getting lots of 
> 
>   iwlagn 0000:0c:00.0: BA scd_flow 0 does not match txq_id 10
> 

Intel wifi definitely has problems with 11n; this seems to be a firmware problem. You can try the experimental firmware, see bug 648732. If that does not help, the only advice is to disable 11n.
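
For reference, a minimal sketch of disabling 11n through the module options. The parameter names 11n_disable and 11n_disable50 are taken from the modinfo output in the description; the helper name disable_11n is hypothetical, and it assumes the options line already contains both parameters set to 0, as in the reporter's /etc/modprobe.d/options.conf.

```shell
# Hypothetical helper: reads a modprobe options file on stdin and writes it to
# stdout with 11n_disable and 11n_disable50 switched from 0 to 1 on the
# "options iwlagn" line. Other lines and other parameters are left untouched.
disable_11n() {
    sed -E '/^options[[:space:]]+iwlagn[[:space:]]/s/(11n_disable(50)?)=0/\1=1/g'
}

# Example (writes a modified copy for review instead of editing in place):
# disable_11n < /etc/modprobe.d/options.conf > /tmp/options.conf.no11n
```

The driver has to be reloaded, or the machine rebooted, before changed module options take effect.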

Comment 19 Stanislaw Gruszka 2011-01-21 08:05:27 UTC
I'm closing this bug as a duplicate of the firmware bug, since the real issue is in the firmware.

*** This bug has been marked as a duplicate of bug 648732 ***