|Summary:||Extremely slow network with Intel "Ultimate N WiFi Link 5300" (iwlagn) after upgrade from Fedora 14|
|Product:||[Fedora] Fedora||Reporter:||Håvard Wigtil <havardw>|
|Component:||kernel||Assignee:||Stanislaw Gruszka <sgruszka>|
|Status:||CLOSED ERRATA||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||15||CC:||adlorenz, awilliam, daryll, donald.h.fry, gansalmon, gdeschner, itamar, johannes, jonathan, jvpgomes, kernel-maint, madhu.chinakonda, mads, pbrobinson, s2jawalt, sgruszka, sune, vbatts, ventero, wey-yi.w.guy, ykaul|
|Fixed In Version:||kernel-188.8.131.52-5.fc15||Doc Type:||Bug Fix|
|:||735721 804259||Environment:|
|Last Closed:||2011-09-09 08:33:23 UTC||Type:||---|
|Bug Blocks:||735721, 804259|
Description Håvard Wigtil 2011-05-29 10:12:26 UTC
Description of problem:
After upgrading from Fedora 14, the wireless network is so slow that it's literally unusable. Even simple web pages time out before they can load, I can't check mail, etc. Wired networking on the upgraded machine works as normal, as does the wireless network from other devices.

Version-Release number of selected component (if applicable):
kernel-184.108.40.206-27.fc15.x86_64

How reproducible:
always

Steps to Reproduce:
1. Connect to wireless network
2. Access any network-based service

Actual results:
Timeouts or extremely slow responses

Expected results:
Internet!

Additional info:
I'd be happy to provide any extra information you may need. I'm connecting to a D-Link DAP-1522.

Output from lspci:
03:00.0 Network controller: Intel Corporation Ultimate N WiFi Link 5300
        Subsystem: Intel Corporation Device 1011
        Physical Slot: 1
        Flags: bus master, fast devsel, latency 0, IRQ 49
        Memory at f4300000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities:  Advanced Error Reporting
        Capabilities:  Device Serial Number 00-16-ea-ff-ff-e5-76-72
        Kernel driver in use: iwlagn
        Kernel modules: iwlagn

Output from iwconfig:
wlan0     IEEE 802.11abgn  ESSID:"<masked>"
          Mode:Managed  Frequency:2.412 GHz  Access Point: <masked>
          Bit Rate=58.5 Mb/s  Tx-Power=15 dBm
          Retry long limit:7  RTS thr:off  Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality=65/70  Signal level=-45 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:254  Invalid misc:35  Missed beacon:0
Comment 1 Håvard Wigtil 2011-06-02 09:12:51 UTC
This only seems to be a problem when connecting to a wireless N network. I don't see this when connecting to several G-based networks.
Comment 2 wey-yi.w.guy 2011-06-06 14:20:11 UTC
Did you see a firmware reload in dmesg? Also, what firmware version are you using?

Thanks
Wey
Comment 3 Håvard Wigtil 2011-06-13 09:59:44 UTC
Sorry for the late reply, I haven't had the computer together with the N network for a while. Here's what I see in dmesg:

[ 154.805031] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
[ 154.805039] iwlagn: Copyright(c) 2003-2010 Intel Corporation
[ 154.805219] iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 154.805236] iwlagn 0000:03:00.0: setting latency timer to 64
[ 154.805336] iwlagn 0000:03:00.0: Detected Intel(R) Ultimate N WiFi Link 5300 AGN, REV=0x24
[ 154.825947] iwlagn 0000:03:00.0: device EEPROM VER=0x11e, CALIB=0x4
[ 154.825954] iwlagn 0000:03:00.0: Device SKU: 0Xb
[ 154.825960] iwlagn 0000:03:00.0: Valid Tx ant: 0X7, Valid Rx ant: 0X7
[ 154.848041] iwlagn 0000:03:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
[ 154.848167] iwlagn 0000:03:00.0: irq 48 for MSI/MSI-X
[ 154.893660] iwlagn 0000:03:00.0: loaded firmware version 220.127.116.11 build 33692
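The firmware question from comment 2 can also be answered by scanning the kernel log for the driver's "loaded firmware version" line. The following is a minimal sketch, assuming the log text has already been captured (for example from `dmesg`) and that the line format matches the one quoted in this comment; the function name `find_firmware_loads` is made up for illustration.

```python
import re

# Matches lines of the form quoted in this report:
#   iwlagn 0000:03:00.0: loaded firmware version <version> build <number>
FIRMWARE_RE = re.compile(r"loaded firmware version (\S+) build (\d+)")


def find_firmware_loads(dmesg_text):
    """Return a list of (version, build) tuples, one per matching log line."""
    return [
        (match.group(1), int(match.group(2)))
        for match in FIRMWARE_RE.finditer(dmesg_text)
    ]
```

Passing the saved output of `dmesg` to `find_firmware_loads` returns one entry for each time the driver logged a firmware load; an empty list means no such line was found in the captured text.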
Comment 4 Daryll 2011-06-13 21:40:32 UTC
I just installed F15 and seem to be having the same problem with my Intel 1000N. If I tell the router to do BG only, it works fine, but if I allow N the performance is so bad as to be unusable.
Comment 5 Stanislaw Gruszka 2011-06-15 11:01:04 UTC
So you are using the latest firmware. Let's try the latest driver version :-) Please test compat-wireless-next from http://people.redhat.com/sgruszka/compat_wireless.html . It contains a fix for a tx power setting bug. This bug can manifest itself as bad tx performance.
Comment 6 Håvard Wigtil 2011-06-20 20:05:40 UTC
Apologies for the late reply, I've been away for a few days. I've installed kmod-compat-wireless-next-2011_06_14-0.fc15.2.x86_64 and rebooted, but the problem appears as before. Here's from dmesg, this includes a change to the problematic network after login:

[ 19.355037] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
[ 19.355045] iwlagn: Copyright(c) 2003-2011 Intel Corporation
[ 19.355231] iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 19.355248] iwlagn 0000:03:00.0: setting latency timer to 64
[ 19.355353] iwlagn 0000:03:00.0: Detected Intel(R) Ultimate N WiFi Link 5300 AGN, REV=0x24
[ 19.375893] iwlagn 0000:03:00.0: device EEPROM VER=0x11e, CALIB=0x4
[ 19.375900] iwlagn 0000:03:00.0: Device SKU: 0Xb
[ 19.375936] iwlagn 0000:03:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
[ 19.376081] iwlagn 0000:03:00.0: irq 49 for MSI/MSI-X
[ 19.398003] iwlagn 0000:03:00.0: loaded firmware version 18.104.22.168 build 33692
[ 23.249178] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 30.111953] wlan0: authenticate with 00:22:07:0a:fa:a2 (try 1)
[ 30.114964] wlan0: authenticated
[ 30.128999] wlan0: associate with 00:22:07:0a:fa:a2 (try 1)
[ 30.131596] wlan0: RX AssocResp from 00:22:07:0a:fa:a2 (capab=0x411 status=0 aid=2)
[ 30.131601] wlan0: associated
[ 30.148143] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 40.626273] wlan0: no IPv6 routers present
[ 73.314721] wlan0: deauthenticating from 00:22:07:0a:fa:a2 by local choice (reason=3)
[ 76.189813] wlan0: authenticate with f0:7d:68:fd:39:94 (try 1)
[ 76.191548] wlan0: authenticated
[ 76.206506] wlan0: associate with f0:7d:68:fd:39:94 (try 1)
[ 76.215434] wlan0: RX AssocResp from f0:7d:68:fd:39:94 (capab=0xc31 status=0 aid=2)
[ 76.215443] wlan0: associated
[ 87.563750] iwlagn 0000:03:00.0: Tx aggregation enabled on ra = f0:7d:68:fd:39:94 tid = 0
Comment 7 Stanislaw Gruszka 2011-06-22 20:16:17 UTC
Hmm, this needs more investigation ... I think using the module parameter 11n_disable=1 should work around the problem here.
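One common way to make a module parameter such as 11n_disable=1 persistent is an "options" line in a file under /etc/modprobe.d/. The sketch below only renders that line; the helper name `modprobe_options_line` is made up for illustration, and the file name you save it under (for example /etc/modprobe.d/iwlagn.conf) is an assumption, not something stated in this report.

```python
def modprobe_options_line(module, params):
    """Render one 'options' line in modprobe.d syntax for the given module."""
    rendered = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {rendered}"


# The module name and parameter are the ones discussed in this bug.
line = modprobe_options_line("iwlagn", {"11n_disable": 1})
print(line)  # options iwlagn 11n_disable=1
```

The line takes effect the next time the module is loaded, so the module has to be reloaded (or the machine rebooted) after the file is written as root.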
Comment 8 Dawid Lorenz 2011-08-05 13:51:37 UTC
I am not sure whether this is related, however after upgrading the kernel to 2.6.40-4.fc15.x86_64, my Intel 1000N wireless adapter has started crashing my home router (TP-Link TL-WR1043ND with dd-wrt firmware) as soon as it gets associated and some heavier network traffic starts. For example, pinging the router's local IP address would work fine, but when I try to load a website, the router immediately freezes to the point that I need to switch its power off. I couldn't find anything useful that could suggest a reason in either /var/log/messages on my laptop or the router's internal syslog. It just silently fails.

The workaround for this is to either use the 11n_disable=1 driver option or put the router into 802.11bg mode. Also, it used to work fine on the 802.11n standard with no noticeable performance problems with the 22.214.171.124-35.fc15.x86_64 kernel.
Comment 9 Dawid Lorenz 2011-08-05 13:53:30 UTC
There are additional reports of this behaviour in this thread: http://www.dd-wrt.com/phpBB2/viewtopic.php?t=140461
Comment 10 wey-yi.w.guy 2011-08-05 15:02:44 UTC
Yes, we believe we can also reproduce the failure in-house here, and we have an engineer looking into this now. This is an important and high-priority bug for us.

Thanks
Wey
Comment 11 Adam Williamson 2011-08-10 05:19:30 UTC
+1! Bought a wndr-3700 today, installed dd-wrt on it, nearly smashed the thing against the wall because of this bug...
Comment 12 Don Fry 2011-08-18 16:12:38 UTC
I have tracked down the cause of the dd-wrt crash. The crash is caused by a commit to 2.6.38-rc1+. To fix:

Reverting the commit by Johannes, 9b7688328422b88a7a15dc0dc123ad9ab1a6e22d, fixes the problem in my testing with a Netgear DGN3500. If you can, comment out the line in iwl-agn.c, in iwl_mac_setup_register(), which says:

hw->max_tx_aggregation_subframes = LINK_QUAL_AGG_FRAME_LIMIT_DEF;

I do not know if this will fix the general slow response, but it is worth testing. Please let me know how this affects the original problem.
Comment 13 Vincent Batts 2011-08-18 20:48:58 UTC
(In reply to comment #12)
> I have tracked down the cause for the dd-wrt crash. The crash is caused by a
> commit to 2.6.38-rc1+ To fix:
>
> Reverting the commit by Johannes 9b7688328422b88a7a15dc0dc123ad9ab1a6e22d will
> fix the problem from my testing with a Netgear DGN3500. If you can comment out
> the line in iwl-agn.c iwl_mac_setup_register() which says:
>
> hw->max_tx_aggregation_subframes = LINK_QUAL_AGG_FRAME_LIMIT_DEF;
>
> I do not know if this will fix the general slow response, but it is worth
> testing. Please let me know how this affects the original problem.

I have tried the revert of this commit on linux 3.0.3, and confirm it allows me to connect to the wireless access point without causing the access point to become unresponsive. However, on the first connection, when I pulled a large tarball as a test (`wget ftp://ftp.kernel.org/pub/linux/kernel/v3.0/linux-3.0.3.tar.gz`), the connection stalled out after 5 MB of progress. I brought down the interface, brought it up again, re-established the WPA authentication, etc., and on the second connection I was able to successfully pull the entire 74M tarball.
Comment 14 wey-yi.w.guy 2011-08-18 21:09:18 UTC
Thank you for testing. We are still trying to understand why this commit causes the problem. Once we root cause the problem, we will submit a patch to fix it.

Wey
Comment 15 Adam Williamson 2011-08-19 16:43:01 UTC
I'm at LinuxCon so I won't be able to test this for a bit, but can it really be a commit made to a 2.6.38 rc? I'm using 126.96.36.199 as a way to work around this problem, so it seemed like this must have been caused by a 2.6.39 or 3.0.0 change (I don't have a 2.6.39 kernel to try). Anyway, I'll test the proposed fix later. Thanks!

--
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 16 Vincent Batts 2011-08-20 01:33:48 UTC
Adam, the setter of "hw->max_tx_aggregation_subframes" is not present in 188.8.131.52, nor in 2.39.6
Comment 17 Vincent Batts 2011-08-20 01:38:56 UTC
err. I did the same slip-up, and pulled 184.108.40.206 It for sure is not in 220.127.116.11, and after pulling 18.104.22.168, I've found that it *is* present there. I am building it to verify the same behavior. Take care, vb
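Checking whether a given kernel tree contains the setter can be done mechanically rather than by eye. The following is a rough sketch, assuming you have the text of the iwl-agn.c file named in comment 12 from the tree you want to inspect; the helper name `find_setters` is made up, and the check is a plain text match, not a real C parser.

```python
import re

# An assignment to the field from comment 12; "==" comparisons are excluded.
SETTER_RE = re.compile(r"\bmax_tx_aggregation_subframes\s*=[^=]")


def find_setters(source_text):
    """Return (line_number, line) pairs where the field is assigned.

    Lines that start with a C comment marker are skipped, so a line that
    has been commented out as a workaround is not reported.
    """
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("//") or stripped.startswith("/*"):
            continue
        if SETTER_RE.search(stripped):
            hits.append((number, stripped))
    return hits
```

An empty result for one tree and a non-empty result for another is the kind of difference described in the comments above; it says nothing about whether the assigned value is actually used elsewhere.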
Comment 18 Adam Williamson 2011-08-20 05:25:56 UTC
ah, I see, so the offending code was committed in 2.6.38-rc1 but nothing called it till 2.6.39...makes sense, I guess.
Comment 19 Stanislaw Gruszka 2011-08-22 15:22:49 UTC
(In reply to comment #14)
> thank you for testing. We still try to understand why this commit cause the
> problem. once we root csause the problem, we will submit patch to fix it.

If commit 9b7688328422b88a7a15dc0dc123ad9ab1a6e22d does not cause the iwlwifi device to do something that breaks the 802.11 specification, it is OK. The bug in *WRT should be fixed, since it crashes. It should be fixed even if iwlwifi does something that does not conform to 802.11, as it is a denial-of-service security issue.

Someone should pass the information about the "bad" commit to the *WRT developers, to help them fix the bug on their side.

Also, this crash problem seems to be slightly related to the performance issues originally reported here in comment 0.
Comment 20 wey-yi.w.guy 2011-08-22 17:53:03 UTC
Agreed, but we still need to root cause the reason on both sides and make sure iwlwifi does the right thing.

Wey
Comment 21 Adam Williamson 2011-08-22 18:36:20 UTC
and, seriously, from a practical standpoint, DoSing probably the single most popular third-party router firmware isn't really a smart thing to do, whether it's an 802.11 compliance issue or not... I'm going to test the fix in a sec.
Comment 22 Adam Williamson 2011-08-22 18:37:20 UTC
note that dd-wrt 'fixes' are problematic because most dd-wrt users do not use the latest code, as it often has instabilities or regressions; there are recommended, 'known good' versions on the dd-wrt site, and indeed it's sometimes hard to get support if you use a newer version than the known-good.
Comment 23 Adam Williamson 2011-08-22 20:36:39 UTC
Confirming Vincent's result, patching the current F15 kernel git (3.0.3) to comment out the specified line seems to resolve the issue. I'm able to transfer large amounts of data over the local network at speeds of 14MB/sec, too.
Comment 24 Dave Jones 2011-08-22 21:11:01 UTC
added the revert for the next f15 build. We should probably add it in f16 too, lacking a better fix, unless the Intel guys have any better ideas ?
Comment 25 Dawid Lorenz 2011-08-22 21:46:45 UTC
(In reply to comment #24)
> added the revert for the next f15 build.

Does that mean the next stock kernel update for F15 should have that patch applied?
Comment 26 Dave Jones 2011-08-22 21:50:18 UTC
Comment 27 Vincent Batts 2011-08-23 02:31:07 UTC
In response to Wey-yi, I too would like to help find a root cause. True enough, reverting that line does keep the connection from dying off.

For testing's sake, I saw that the legacy driver had LINK_QUAL_AGG_FRAME_LIMIT_DEF set to (31), instead of (63) as the iwl-agn driver does. Neither of these values works, but leaving max_tx_aggregation_subframes at its default (0) does work. max_tx_aggregation_subframes is used in only one other place in the kernel, and that code does not assign anything to it; it only reads its value.

Also, as a further note, I have upgraded my router to the latest available firmware from its manufacturer (ActionTec). It did not help. I sent this report to them. They responded that they would notify their developers, *BUT* that the ownership of any fixes to the firmware would have to come from the OEM of the device, which is Verizon, and that the report should go to them instead. I can find no place to submit a report to Verizon's OEM router firmware team, and judging by personal experience I think it would be a cold day before they would take action on such a report.

Take care,
vb
Comment 28 Adam Williamson 2011-08-23 02:38:55 UTC
vincent: so, your router isn't actually running dd-wrt? or are actiontec / verizon using dd-wrt?
Comment 29 Vincent Batts 2011-08-23 12:48:55 UTC
Adam: The router I have is the standard issue from Verizon for Fios, an ActionTec MI424-WR. For the past year it has been using its firmware version 22.214.171.124. After having these difficulties, I updated its firmware to 20.19.8, using the built-in utility on its webmin. I have done no sort of custom flashing to this device.
Comment 30 Adam Williamson 2011-08-23 16:29:31 UTC
Comment 31 Adam Williamson 2011-08-23 23:25:31 UTC
Created attachment 519535
Kernel patch to workaround this issue

This patch works around the issue.
Comment 32 Stanislaw Gruszka 2011-08-24 09:25:16 UTC
(In reply to comment #21)
> and, seriously, from a practical standpoint, DoSing probably the single most
> popular third-party router firmware isn't really a smart thing to do, whether
> it's an 802.11 compliance issue or not...

I thought working around the WRT problem could cause other problems with the driver, but it seems to be fully safe to make the iwlwifi change. Actually, it looks like it could really be a problem in the iwlwifi driver, i.e. it sends more subframes in an aggregate frame than it advertises, which can overflow the AP buffer.
Comment 33 Josh Boyer 2011-08-24 12:42:57 UTC
I added this to f16 and rawhide as well. The next builds there will contain this.
Comment 34 Sune Mølgaard 2011-08-31 19:28:11 UTC
Ubuntu user here. My gf has a Buffalo router running dd-wrt, and I upgraded to 11.04 on her connection, seeing the router crash shortly after booting up the upgraded system. What is interesting, however, is that *I ran the same kernel* before and after the upgrade - to the best of my knowledge, it was 2.6.39, possibly rc-something. This leads me to believe that a (then) new version of wpa_supplicant or network-manager is at least partially responsible, possibly triggering the code above, where earlier versions didn't... Just my two cents, Sune Mølgaard
Comment 35 Fedora Update System 2011-09-01 11:06:33 UTC
kernel-126.96.36.199-5.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-188.8.131.52-5.fc15
Comment 36 ventero 2011-09-01 12:02:34 UTC
Even after applying the patch from comment #31 to 3.0.3, I'm still able to crash my TP-Link TL-WR1043ND with stock firmware. The crash happens whenever I create a lot of traffic (e.g. by using iperf) for a longer period of time (about 50-60 seconds, but the exact time varies).
Comment 37 Adam Williamson 2011-09-01 16:43:01 UTC
That sounds like it may be a different bug, ventero - the symptom all of us see for the bug reported here, and apparently fixed by the patch, is the router going down as soon as virtually any traffic is transmitted.
Comment 38 Håvard Wigtil 2011-09-04 21:31:32 UTC
I've tested kernel-184.108.40.206-5.fc15.x86_64, and the issue as *originally* *reported* persists. The "router kill" problem that first appeared in comment #8 is most likely another issue, as I never had any problems with the wireless router, and it still works for other devices at the same time that I see these problems in Fedora 15.
Comment 39 Stanislaw Gruszka 2011-09-05 07:47:43 UTC
I cloned it to 735721, as we started to track the router hang here.
Comment 40 Fedora Update System 2011-09-07 00:00:40 UTC
kernel-220.127.116.11-5.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.
Comment 41 Dawid Lorenz 2011-09-07 13:09:52 UTC
Just upgraded to 18.104.22.168-5.fc15.x86_64, rebooted machine without iwlagn 11n_disable=1 option and it seems to work fine so far, at least my router didn't freeze yet (it used to freeze within just seconds after connecting and starting some traffic over wifi). I'll report here if I spot any other issues.
Comment 42 Stanislaw Gruszka 2011-09-08 15:04:53 UTC
*** Bug 736374 has been marked as a duplicate of this bug. ***
Comment 43 bung 2011-09-09 21:32:54 UTC
(In reply to comment #41)
> Just upgraded to 22.214.171.124-5.fc15.x86_64, rebooted machine without iwlagn
> 11n_disable=1 option and it seems to work fine so far, at least my router
> didn't freeze yet (it used to freeze within just seconds after connecting and
> starting some traffic over wifi).
>
> I'll report here if I spot any other issues.

What does iwconfig say regarding Tx excessive retries and Invalid misc, respectively?
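The two counters asked about here appear in the last lines of `iwconfig` output, as in comment 0. A minimal sketch for pulling them out of captured output follows, assuming the field names match those shown in comment 0; the function name `parse_iwconfig_counters` is made up, and accepting either ":" or "=" as the separator is an assumption about tool variants rather than something shown in this report.

```python
import re

# Counter names as they appear in the iwconfig output quoted in comment 0.
COUNTER_RE = re.compile(
    r"(Rx invalid nwid|Rx invalid crypt|Rx invalid frag|"
    r"Tx excessive retries|Invalid misc|Missed beacon)[:=](\d+)"
)


def parse_iwconfig_counters(iwconfig_text):
    """Return a dict mapping each counter name found to its integer value."""
    return {name: int(value) for name, value in COUNTER_RE.findall(iwconfig_text)}
```

Running this on output captured before and after a transfer makes it easy to see whether "Tx excessive retries" or "Invalid misc" grows during the slow periods.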
Comment 44 Dawid Lorenz 2011-09-21 17:12:37 UTC
(In reply to comment #41)
> Just upgraded to 126.96.36.199-5.fc15.x86_64, rebooted machine without iwlagn
> 11n_disable=1 option and it seems to work fine so far, at least my router
> didn't freeze yet (it used to freeze within just seconds after connecting and
> starting some traffic over wifi).
>
> I'll report here if I spot any other issues.

OK, so after a couple of weeks I can say that there's still something wrong with wireless "n" mode. Not sure if I should report it here or in #735721, but anyway: since I've re-enabled "n" mode in the driver, my WLAN router no longer freezes as described previously. However, I am experiencing intermittent performance issues where wireless gets slow as hell and virtually unusable, to the point where the router just crashes and reboots by itself. Pinging the WLAN router results in massive packet loss and long response times:

adl@v3350 ~$ ping tplink.adlnet
PING tplink.adlnet (192.168.0.254) 56(84) bytes of data.
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=1 ttl=64 time=1535 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=2 ttl=64 time=2502 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=3 ttl=64 time=4922 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=4 ttl=64 time=5420 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=5 ttl=64 time=6123 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=6 ttl=64 time=6176 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=7 ttl=64 time=6302 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=8 ttl=64 time=6187 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=9 ttl=64 time=5833 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=11 ttl=64 time=10492 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=13 ttl=64 time=11581 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=17 ttl=64 time=10931 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=20 ttl=64 time=10296 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=21 ttl=64 time=9498 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=22 ttl=64 time=9349 ms
64 bytes from tplink.adlnet (192.168.0.254): icmp_req=24 ttl=64 time=7604 ms
^C
--- tplink.adlnet ping statistics ---
32 packets transmitted, 16 received, 50% packet loss, time 33179ms
rtt min/avg/max/mdev = 1535.332/7172.330/11581.424/2872.532 ms, pipe 12

This problem is intermittent: it happens totally randomly at various times of the day (or night). Sometimes just forcing a reconnect via NetworkManager seems to work around the issue and things get back to normal, but sometimes I wait until the router surrenders and reboots by itself, so that the subsequent connection is stable again. Nonetheless, I've switched "n" mode off again for a few days and no such issue occurred, so it's still somehow related to "n" mode.

/var/log/messages doesn't say anything interesting, maybe apart from things like:

iwlagn 0000:09:00.0: Aggregation not enabled for tid 6 because load = 3

But I'm not certain if that's related.
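For comparing runs with "n" mode on and off, the ping output above can be reduced to a few numbers. This is only a sketch, assuming the Linux `ping` output format shown in this comment (it also accepts the `icmp_seq=` spelling used by other versions); the function name `summarize_ping` and the dictionary keys are made up for illustration.

```python
import re

SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+(?:\.\d+)?)% packet loss"
)
SEQ_RE = re.compile(r"icmp_(?:req|seq)=(\d+)")


def summarize_ping(ping_text):
    """Summarize captured ping output: totals, loss, and skipped sequence numbers."""
    match = SUMMARY_RE.search(ping_text)
    if match is None:
        raise ValueError("no ping summary line found")
    seen = {int(seq) for seq in SEQ_RE.findall(ping_text)}
    # Sequence numbers with no reply, up to the last reply that did arrive.
    missing = sorted(set(range(1, max(seen) + 1)) - seen) if seen else []
    return {
        "sent": int(match.group(1)),
        "received": int(match.group(2)),
        "loss_percent": float(match.group(3)),
        "missing_before_last_reply": missing,
    }
```

Saving the output of a fixed-length ping run in each mode and feeding both through `summarize_ping` gives comparable loss figures without reading the raw lines.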