Bug 671366
Summary: | Wifi connection speed is very slow (Intel PRO/Wireless 3945ABG) | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Stanislaw Gruszka <sgruszka>
Component: | kernel | Assignee: | Stanislaw Gruszka <sgruszka>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | medium | Docs Contact: |
Priority: | low | |
Version: | 14 | CC: | aquini, brian, dougsland, erappleman, gansalmon, itamar, jjardon, jonathan, kernel-maint, madhu.chinakonda, radoslaw.piliszek, sgruszka, silvioto, wey-yi.w.guy
Hardware: | i686 | |
OS: | Linux | |
Fixed In Version: | | Doc Type: | Bug Fix
Clone Of: | 654599 | Environment: |
Last Closed: | 2011-05-25 14:17:13 UTC | Type: | ---
Bug Depends On: | 654599, 679002 | |
Description
Stanislaw Gruszka
2011-01-21 12:29:45 UTC
My bug seems to be unrelated to the kernel. I compiled 2.6.32 and still experience the bug. However, when I pkill dbus and use command line tools to connect (wpa_supplicant, dhclient, wget) I don't experience problems.

Disabling NetworkManager completely (using the icon) and then connecting using command line tools (wpa_supplicant, dhclient): the connection in Firefox is fine even with a stock kernel.

So far I cannot imagine how NetworkManager could affect performance. Perhaps its DHCP implementation does not set the proper default gateway or some other network parameter. Or perhaps NM does something that breaks the AP, so that it is not able to send us frames fast. But maybe this is still a kernel issue that is triggered only because NM does something that wpa_supplicant + dhclient don't do. When you download a file and run iwconfig many times, with and without NM, do the values of Bit Rate differ?

With NM:
Start: 11
Download: 2 - 5.5 (actual speed is at first better than this and then much worse)

Without NM:
Start: 54
Download: 48 - 54

I think I could also be affected by PLCP: after a longer download the bit rate seems fine (54) but it is hard to even start a connection.

Now with NM: constant 54 but speed is poor.

(In reply to comment #5)
> I think I could be also affected by PLCP

Rather not, if the problem happens on 2.6.32.

(In reply to comment #6)
> Now with NM:
> constant 54 but speed is poor.

So that means we are sending at 54 Mbit/s but receiving from the AP at a low rate. Please first find out whether new firmware for your AP is available, and update if so. Also test these patches (together) attached to bug 654599:

0002-iwl3945-remove-check_plcp_health.patch
0001-iwl3945-do-not-use-agn-specific-IWL_RATE_MASK.patch
iwlwifi-mac80211-revert-QoS-changes.patch

If the above fails, I will think about what to do next. I still have some ideas ;-)

Didn't they get "applied" when I switched to .32? My AP has already been upgraded (Gentoo and Windows - OK).

Patches are for 2.6.35.
I know, but if 2.6.34 (which shouldn't need them) doesn't work, then there seems to be no point (but I'll try because you want it so much :-) ).

---

I found out that when I don't use NM before setting up the connection manually, the speed is slow both in bit rate and in reality. But when I set up a connection with NM, disable NM and set it up again manually, it's all fine.

0001-iwl3945-do-not-use-agn-specific-IWL_RATE_MASK.patch potentially fixes a problem that affects .34 as well. The other patches are needed to make sure that you do not hit a different bug.

I applied all 3 of these patches, compiled, installed (it named it 2.6.35.10+ - that plus is really handy) and booted. The same speed drop.

Created attachment 474964 [details]
iwl3945-debug-print-frames.patch

Radosław,

This patch adds a driver debug option to print all sent/received frames and their rates.

Please configure syslog as described here:
https://fedoraproject.org/wiki/DebugWireless#Configure_syslog_to_log_kernel_debug_messages

then log messages as in this example:

$ rmmod iwl3945 iwlcore
$ echo > /var/log/kernel
$ modprobe iwl3945 debug=0x40
Connect with wpa_supplicant, download some data
$ rmmod iwl3945 iwlcore
$ cp /var/log/kernel ~/kernel.good

Repeat the same with NetworkManager, and attach the kernel.good and kernel.bad files.

The debug patch is for 2.6.35.

Also, could you check whether the problem is still reproducible with an upstream kernel? You can compile an upstream kernel yourself or use compat-wireless http://wireless.kernel.org/en/users/Download (on Fedora you can use: http://people.redhat.com/sgruszka/compact_wireless.html)

(In reply to comment #15)
> Also could you check, if problem is still reproducible with upstream kernel?
> You can compile upstream kernel by yourself or use compat-wireless
> http://wireless.kernel.org/en/users/Download (on fedora you can use:
> http://people.redhat.com/sgruszka/compact_wireless.html)

You mean compiling 2.6.37?

Anyway, the only Linux OS that works properly is Mandriva.
I have never used this laptop with Linux before, and after tests I can tell that all relatively new (maybe all) Ubuntus (9.04-10.10), Fedoras (12-14), and openSUSEs (11.2-11.3) are affected :/

(In reply to comment #13)
> Created attachment 474964 [details]
> iwl3945-debug-print-frames.patch
>
> Radosław,
>
> This patch adds a driver debug option to print all sent/received frames and
> their rates.
>
> Please configure syslog as described here:
> https://fedoraproject.org/wiki/DebugWireless#Configure_syslog_to_log_kernel_debug_messages
>
> then log messages as in this example:
>
> $ rmmod iwl3945 iwlcore
> $ echo > /var/log/kernel
> $ modprobe iwl3945 debug=0x40
> Connect with wpa_supplicant, download some data
> $ rmmod iwl3945 iwlcore
> $ cp /var/log/kernel ~/kernel.good
>
> Repeat the same with NetworkManager, and attach the kernel.good and kernel.bad
> files.

Ok, I will try.

(In reply to comment #16)
> You mean compiling 2.6.37?

2.6.37 is ok; you can compile the whole kernel or use compat-wireless.

> Anyway, the only Linux OS that works properly is Mandriva.

So that confirms the problem happens only with NetworkManager, right?

You can also test 2.6.38-rcX; it has patches that defer some actions while a scan is performed. If the problem is caused by NM frequently requesting scans, they may help.

While trying to get the debug info you asked for, I have thoroughly researched the strange behavior:

Slow speed happens ALWAYS if I use NetworkManager to connect.
Slow speed happens the FIRST time I create a connection (even with wpa_supplicant). The next connection created works fine. (time = since load of module)

Due to the second fact, kernel.good is a little bit dirty, and the clean part (after the 2nd connection with wpa_supplicant) consists almost entirely of the same lines with rate 36-54.

Created attachment 475243 [details]
Bad behavior.
You forgot to attach the good case. But wait, I will prepare the next debug patch.
> Jan 25 20:05:06 fedora kernel: [ 241.979056] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=70:5a:b6:0f:f0:b3 len=1550 rate=54
> Jan 25 20:05:06 fedora kernel: [ 241.979820] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=70:5a:b6:0f:f0:b3 len=1550 rate=54
> Jan 25 20:05:06 fedora kernel: [ 241.979914] TX: Frame fc=0x4188 DA=70:5a:b6:0f:f0:b3 SA=00:1c:bf:3e:3c:c7 len=102 rate=0
> Jan 25 20:05:08 fedora kernel: [ 244.684852] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=127 rate=0
> Jan 25 20:05:08 fedora kernel: [ 244.684891] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=127 rate=0
> Jan 25 20:05:08 fedora kernel: [ 244.693185] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=335 rate=2
> Jan 25 20:05:08 fedora kernel: [ 244.699925] RX: Retry fc=0x4a88 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=201 rate=2
> Jan 25 20:05:08 fedora kernel: [ 244.700247] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=110 rate=0
> Jan 25 20:05:08 fedora kernel: [ 244.704336] TX: Frame fc=0x0040 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=50 rate=0
> Jan 25 20:05:08 fedora kernel: [ 244.707951] RX: PrbRsp fc=0x0050 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=251 rate=1
> Jan 25 20:05:08 fedora kernel: [ 244.723636] RX: Beacon fc=0x0080 DA=ff:ff:ff:ff:ff:ff SA=00:27:19:fc:8e:aa len=257 rate=1
> Jan 25 20:05:08 fedora kernel: [ 244.759975] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=110 rate=2
> Jan 25 20:05:08 fedora kernel: [ 244.760089] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=102 rate=0
> Jan 25 20:05:08 fedora kernel: [ 244.760275] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=967 rate=0
> Jan 25 20:05:09 fedora kernel: [ 244.815788] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=102 rate=5
> Jan 25 20:05:09 fedora kernel: [ 244.827584] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.
> Jan 25 20:05:09 fedora kernel: [ 244.829979] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.
> Jan 25 20:05:09 fedora kernel: [ 244.834271] RX: Beacon fc=0x0080 DA=ff:ff:ff:ff:ff:ff SA=00:27:19:fc:8e:aa len=257 rate=1
> Jan 25 20:05:09 fedora kernel: [ 244.841902] RX: Retry fc=0x4a88 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=1455 rate=2
> Jan 25 20:05:09 fedora kernel: [ 244.842076] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=102 rate=0
> Jan 25 20:05:09 fedora kernel: [ 244.843100] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.
> Jan 25 20:05:09 fedora kernel: [ 244.844845] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.
> Jan 25 20:05:09 fedora kernel: [ 244.848962] RX: Retry fc=0x4a88 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=752 rate=2
> Jan 25 20:05:09 fedora kernel: [ 244.849064] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=102 rate=0
> Jan 25 20:05:09 fedora kernel: [ 244.850866] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=125 rate=0
> Jan 25 20:05:09 fedora kernel: [ 244.850916] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=125 rate=0
> Jan 25 20:05:09 fedora kernel: [ 244.865873] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.
> Jan 25 20:05:09 fedora kernel: [ 244.866807] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.
> Jan 25 20:05:09 fedora kernel: [ 244.869322] RX: Retry fc=0x4a88 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=344 rate=2
> Jan 25 20:05:09 fedora kernel: [ 244.869926] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=210 rate=5
The AP stops sending us frames at rate 54 after the ProbeRequest/ProbeResponse exchange. Those frames are sent when an active scan is performed, which explains why the problem happens only with NetworkManager.
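To check this pattern over a whole log instead of by eye, the receive rates can be tallied per sender. The following is only a sketch in Python: it assumes the line format seen in the excerpt above (`RX:`/`TX:`, `fc=`, `DA=`, `SA=`, `len=`, `rate=`), and it assumes `fc` is the 802.11 frame control field printed as a 16-bit value, so that beacons and probe responses can be skipped. The name `tally_rx_rates` is made up for illustration and is not part of the debug patch.

```python
import re
from collections import Counter, defaultdict

# Frame lines as printed in the log excerpt, e.g.
# "... RX: Frame fc=0x4288 DA=... SA=... len=1550 rate=54"
FRAME_RE = re.compile(
    r"(?P<dir>RX|TX): (?P<kind>\w+) fc=0x(?P<fc>[0-9a-f]{4}) "
    r"DA=(?P<da>[0-9a-f:]{17}) SA=(?P<sa>[0-9a-f:]{17}) "
    r"len=(?P<len>\d+) rate=(?P<rate>\d+)"
)


def tally_rx_rates(lines):
    """Return {sender MAC: Counter of rates} for received data frames."""
    rates = defaultdict(Counter)
    for line in lines:
        m = FRAME_RE.search(line)
        if m is None or m["dir"] != "RX":
            continue  # not a frame line (e.g. "Bad CRC or FIFO") or a TX line
        fc = int(m["fc"], 16)
        frame_type = (fc >> 2) & 0x3  # 0 = management, 1 = control, 2 = data
        if frame_type != 2:
            continue  # skip beacons, probe responses, etc.
        rates[m["sa"]][int(m["rate"])] += 1
    return rates


# A few lines copied from the excerpt above (timestamps shortened).
SAMPLE = [
    "[ 241.979056] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=70:5a:b6:0f:f0:b3 len=1550 rate=54",
    "[ 241.979820] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=70:5a:b6:0f:f0:b3 len=1550 rate=54",
    "[ 244.684852] TX: Frame fc=0x4188 DA=00:27:19:fc:8e:aa SA=00:1c:bf:3e:3c:c7 len=127 rate=0",
    "[ 244.693185] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=335 rate=2",
    "[ 244.699925] RX: Retry fc=0x4a88 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=201 rate=2",
    "[ 244.707951] RX: PrbRsp fc=0x0050 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=251 rate=1",
    "[ 244.815788] RX: Frame fc=0x4288 DA=00:1c:bf:3e:3c:c7 SA=00:27:19:fc:8e:aa len=102 rate=5",
    "[ 244.827584] ieee80211 phy1: I iwl3945_rx_reply_rx Bad CRC or FIFO: 0x00000702.",
]

for sender, counter in tally_rx_rates(SAMPLE).items():
    print(sender, dict(counter))
```

Running the same function over kernel.good and kernel.bad (for example with `open(path)` as the `lines` argument) would give one rate histogram per sender for each case, which makes a drop from 54 to the low rates easy to compare.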
Created attachment 475352 [details]
iwl3945-debug-print-frames-and-probe-req.patch
Please repeat the experiment with that patch. It prints the Probe Request supported rates, to see whether we send proper information to the AP, and prints correct TX rates as well. Attach logs for both the good case (even if it's erroneous at the beginning) and the bad case.
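For reading such supported-rates dumps: in 802.11, each byte of the Supported Rates element carries a rate in units of 500 kbit/s in its low seven bits, and the top bit marks a basic rate. The Python sketch below decodes such bytes and checks whether only 802.11b rates are advertised. The helper names and the example bytes are illustrative assumptions; they are not taken from the patch or from the attached logs, and the sketch ignores special selector values defined by later amendments.

```python
# 802.11b rates in Mbit/s (DSSS/CCK).
B_RATES = {1.0, 2.0, 5.5, 11.0}


def decode_supported_rates(payload: bytes):
    """Decode a Supported Rates element body into (Mbit/s, is_basic) pairs."""
    return [((b & 0x7F) * 0.5, bool(b & 0x80)) for b in payload]


def advertises_only_b_rates(payload: bytes) -> bool:
    """True if every advertised rate is an 802.11b rate."""
    return all(rate in B_RATES for rate, _ in decode_supported_rates(payload))


# Hypothetical example: a typical 802.11b/g rate set.
example = bytes([0x82, 0x84, 0x8B, 0x96, 0x0C, 0x12, 0x18, 0x24])
for rate, basic in decode_supported_rates(example):
    print(f"{rate:g} Mbit/s{' (basic)' if basic else ''}")
print("802.11b only:", advertises_only_b_rates(example))
```

If the firmware-generated Probe Request listed only the four 802.11b rates, `advertises_only_b_rates` would return True for its rate bytes, which is the kind of glitch this debug output is meant to rule out.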
Created attachment 475440 [details]
#2 BAD
Created attachment 475441 [details]
#2 GOOD
I have applied your latest patch. Oh, and it is worth noting that I got a proper connection the first time I connected with wpa_supplicant. I am confused now :S

I have compiled Linux from a clone of Linus's tree (>2.6.38-rc2) and the problem persists.

Gosh, I thought we had software scanning on iwl3945 by default (I was confused because we have software crypto on 3945), but we do scanning in hardware. Could you check whether the disable_hw_scan=1 module option helps?

$ rmmod iwl3945 iwlcore
$ modprobe iwl3945 disable_hw_scan=1

Please check on the 2.6.38 kernel.

Created attachment 475613 [details]
iwlwifi-fix-tx-power-when-channel-change-2.6.38.patch
Before testing disable_hw_scan=1, please apply this patch. It fixes a software scan bug I just found.
I didn't apply your patch. This option seems to help. However, this parameter is deprecated. Do you still want me to try your patch? (I started writing my message just before it appeared)

(In reply to comment #30)
> Do you still want me to try your patch?

No, I just wanted to know if sw scan helps. I think we cannot deprecate sw scanning for older adapters like 3945 if we find no other solution. I think I will need some more verbose logs; I will ask you about that shortly.

All right. I am determined to finally fix this bug. Anyway, I can tell you that disable_hw_scan=1 does not help with Fedora's 2.6.35 kernel.

> All right. I am determined to finally fix this bug.
I see :-) good.
Please provide logs of the bad case from 2.6.35 with the last debug patch (iwl3945-debug-print-frames-and-probe-req.patch), but with a different module option: debug=0x0843
Created attachment 475865 [details]
#3 BAD
As required, bad log is here.
I think the problem is with transmission rates. In your logs the TX rate is 2, 5 or 11. On my system it's 48 and 54. However, everything seems to be fine when we do not scan, since the AP sends frames to us at 54. But while we are scanning, we are unable to receive frames; the AP thinks we have moved away and starts to send at a low rate (slower rates are more robust against noise and can be transmitted over longer distances). When the scan finishes, the AP does not go back to high rates. I guess the reason is that we are transmitting at low rates.

Created attachment 476197 [details]
iwl3945-rate-debug.patch
This patch contains the previous debug prints, plus some new prints to check whether we correctly receive the supported rates info from the AP. It also contains a change that makes the driver use the common mac80211 rate scaling algorithm instead of the custom implementation. There is some hope that this will help. If so, good; if not, attach logs with the same option as before: debug=0x0843
Created attachment 476229 [details]
#4 BAD
Patch didn't help.
At least TX rates improved:

[ Downloads]$ grep "TX.*rate=48" kernel.bad_4 | wc -l
187
[ Downloads]$ grep "TX.*rate=54" kernel.bad_4 | wc -l
7119
[ Downloads]$ grep "TX.*rate=48" kernel.bad_3 | wc -l
0
[ Downloads]$ grep "TX.*rate=54" kernel.bad_3 | wc -l
0

But we receive at a low rate from the very beginning. It seems the AP stays in low rate mode from the last time. Does the problem happen if you reset the AP, then turn on the computer with the patched driver?

1, 2, 5.5 and 11 Mbit/s are CCK modulation (802.11b) rates. There must be some correlation between our sending at those rates with the unpatched driver and receiving at them from the AP at some point. Perhaps in the hardware active scanning ProbeRequest the device introduces itself as 802.11b. But in that case we would see a slowdown after each scan, and that happens only sometimes.

Radosław, is there a chance you can borrow or buy a wifi USB dongle that works in monitor mode (the index is here: http://wireless.kernel.org/en/users/Drivers, ar9170 and rt73usb have good monitor support)? We do not have the full picture here, since some frames are sent/received only by the firmware and cannot be seen in the logs. If you could capture what happens on the air, that would be great.

I'm not sure if this is an AP-only problem or if the driver/firmware does something wrong. Undoubtedly the AP starts transferring at low rates after a scan (but only sometimes) and never goes back to high rates. If we do not scan after we associate, the problem does not happen. It also seems to help when we do not offload scanning to hardware. However, this works only upstream, because the kernel software scanning procedure was improved (when I use disable_hw_scan=1 on 2.6.35, I can achieve only about 200 KBytes/s download speed).

I can test it with another AP. It should be buried somewhere in the house. I will also try resetting the AP. More info soon.

I have a notebook with AR5B93. Could it be of any use?

Yes. This should use the ath5k or ath9k driver; both have monitoring support.
In user space, it's enough to have wireless-tools and wireshark (console tshark or wireshark-gnome). To capture the proper traffic, make sure you have the correct channel setting and that the interface runs in monitor mode (if not, frames from other channels will be captured, or only packets from higher network layers like IP, ARP, etc.).

Ok. I will try it with BackTrack. Just tell me what I should be looking for.

Resetting the AP didn't help.

Well, I would like to see what the ProbeRequest frame sent by the firmware looks like, and whether we send a broken frame (with bad supported rates information or some other glitch), but as I wrote before, this is dubious. In general I want to find out why the AP puts itself in low rate mode after hardware scanning, and whether the driver can fix something to stop that. I don't know exactly what I'm looking for, but when I find it I'll tell you :-) I want you to do the same experiment as before, capture the radio traffic on the other laptop, compress it and attach it here.

Created attachment 476614 [details]
wireshark data
As requested.
Could you tell me why sometimes source field is empty?
Thanks for data, I'm analysing right now.
> Could you tell me why sometimes source field is empty?
Some control frames, mainly ACK or CTS, do not contain a source address. It's not needed because an ACK or CTS must be sent right (9 microseconds) after the previous frame to confirm it was received. The source address is skipped because it is known who sent the frame: the receiver of the previous frame. I'm not quite sure why ACK frames have a destination address; it could be skipped as well, but I guess having a destination address makes implementing hardware filters easier.
It's pretty clear from the capture that we are unable to receive frames that are transmitted at a rate higher than 5.5 Mbit/s (we do not send an ACK). The AP tries to increase the speed, but since we do not ACK at the increased rate, it stays at the low rate. I'm not sure why we are unable to receive frames at a high rate. I guess the firmware or hardware stays badly configured after a hw scan. Generally only Intel can provide a fix for that (Wey, are you reading? :-). BTW: RH requested iwl3945 hw documentation, but Intel refused. So the only solution, if Intel does not provide a fix, is to switch to software scan. I'm going to post a patch that un-deprecates sw scan, and also a TX rate scaling patch.

Ok, I got it. I think the only thing left for me to do is to try with another AP. I doubt it would help but it won't hurt either. To be continued... :o

Slow speed as well.

To not confuse readers: the above comment is about checking another AP.

(In reply to comment #37)
> It also contains change, that make driver use common mac80211 rate scaling
> algorithm instead of custom implementation.

(In reply to comment #39)
> At least TX rates improve:

I get worse performance results when using the default mac80211 rate control algorithm (test of downloading a 100MB file 5 times):

iwl-3945-rs: 2.41 MB/s, 1.98 MB/s, 2.32 MB/s, 2.32 MB/s, 2.37 MB/s
minstrel_ht: 2.02 MB/s, 1.97 MB/s, 1.97 MB/s, 1.92 MB/s, 2.01 MB/s

So I'm not quite convinced that we should replace the rate control code.

As it didn't help me either, I fully agree with you.

Downstream
ndiswrapper: 23 Mbps
compat-wireless iwlwifi: 4 Mbps with rare spikes up to 14 Mbps
2.6.38 iwlwifi: 4 Mbps

Upstream
All: 5 Mbps

btw, my connection is 30/5

During speed tests, the speed keeps throttling up and down: 11 Mbps to 4 to 8 to 10. It's like the driver can't decide what to do.

Eric, are you using iwconfig to see the speed?

No. I'll give it a try to see if the reported connection rate jumps around.
No, don't use it. iwconfig shows the rate of the last sent frame, which confuses some users who think it's an average rate or something like that. So how do you measure speed?

Speedtest.net, but I think it's a moot point since c91d01556f52255a31575be0cb1981c92a2a5028 seems to fix this bug.

compat-wireless iwlwifi contains commit c91d01556f52255a31575be0cb1981c92a2a5028 (I double checked) and you had 4 Mbit/s in comment 55. I'm not sure speedtest.net is a good method to measure wifi performance; what happens on the internet between you and the speedtest servers influences the results. However, ndiswrapper has consistently good results and iwl3945 randomly bad ones, which simply means there is still something wrong. Perhaps it's because of the signal strength problems you reported elsewhere. I'm going to look at that and let you know.

I probably relied too much on the speed tests and didn't do enough stress testing in the form of long-term throughput. What I have noticed is that PLCP-less drivers have no problem maxing out my connection speeds if given a well-seeded torrent, but they don't burst as quickly as ndiswrapper in speed tests. To be more precise, on certain speed tests, the speed is still climbing as the test ends. With ndiswrapper, the speed is at its max within a second of the test starting. I'm also glad that the signal strength bug I reported is getting to the right people.

Also, just for the record, 10% was an understatement.

kernel-2.6.38.4-20.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.38.4-20.fc15

Patches that disable hw scan by default, plus some accompanying fixes that make sw scan work, are already in the F-15 2.6.38 kernel. They should soon be available in stable 2.6.38.5.

Radosław, thank you very much for your hard work on debugging this problem!

Eric, I remember your weak signal bug, but I have other, higher-priority problems to work on. I will look at your problem, but I'm not sure when.
kernel-2.6.38.4-20.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.