Bug 504316

Summary: Wireless speed drops
Product: [Fedora] Fedora Reporter: Georgi Hristozov <hristozov>
Component: kernelAssignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 11CC: itamar, jieshengz, kernel-maint, linville, marc, reinette.chatre
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-04-06 11:11:36 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
I'm using this patch debug frames traffic
none
iwl3945-rs-use-default.patch
none
iwl3945-use-default-rs.patch none

Description Georgi Hristozov 2009-06-05 14:57:37 UTC
Description of problem:
I'm experiencing problems with iwl3945 on my Toshiba Satellite A200-1ZW (3945ABG card). In the most time it works well at 16-20Mbps, but sometimes (2-3 times a day) the speed just "drops" to ~1Mbps. It happens much more often when the network connection is under heavy usage. The ping to the AP becomes huge (~1000ms), even if I stop all the applications with network activity. The only way to fix this is to "remind" the driver which the AP is, either by restarting the network manager, or manually. The wireless configuration (e.g. bit rate) and the IP configuration (e.g. MTU) remain the same when the bug triggers. Nothing strange appears in logs, tried Wireshark and didn't find anything suspicious. I have tried changing the network manager and the result is the same. I have tested other notebooks on the same AP and they work well. Tried Vista on the problematic machine and the speed didn't drop for 24 hours of heavy bandwidth usage.

A user from the Bulgarian LUG confirmed that he has exactly the same problem with iwl3945. Some time ago I started a thread on the Fedora list (http://osdir.com/ml/fedora-list/2009-05/msg00323.html), but the people there couldn't help with debugging this.

Version-Release number of selected component (if applicable):
2.6.29.4-167.fc11.i686.PAE

How reproducible:
Not reproducible.

Steps to Reproduce:
Not reproducible.
  
Actual results:


Expected results:


Additional info:
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 03)
00:1c.2 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 3 (rev 03)
00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 (rev 03)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
00:1f.0 ISA bridge: Intel Corporation 82801HEM (ICH8M) LPC Interface Controller (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) SATA AHCI Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 03)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 01)
05:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)
0c:04.0 CardBus bridge: Texas Instruments PCIxx12 Cardbus Controller
0c:04.1 FireWire (IEEE 1394): Texas Instruments PCIxx12 OHCI Compliant IEEE 1394 Host Controller
0c:04.2 Mass storage controller: Texas Instruments 5-in-1 Multimedia Card Reader (SD/MMC/MS/MS PRO/xD)
0c:04.3 SD Host controller: Texas Instruments PCIxx12 SDA Standard Compliant SD Host Controller

Module                  Size  Used by
nls_utf8                1500  1 
cifs                  214584  2 
aes_i586                7492  1 
aes_generic            26940  1 aes_i586
fuse                   49780  2 
mmc_block               9148  0 
omnibook               53208  0 
rfcomm                 30088  10 
sco                     9324  2 
bridge                 41488  0 
stp                     1944  1 bridge
llc                     4780  2 bridge,stp
bnep                   10888  2 
l2cap                  18884  20 rfcomm,bnep
ppdev                   6304  0 
parport_pc             22216  0 
parport                28496  2 ppdev,parport_pc
coretemp                5260  0 
hwmon                   2148  1 coretemp
sunrpc                152352  1 
ip6t_REJECT             3160  2 
nf_conntrack_ipv6      11368  3 
ip6table_filter         3132  1 
ip6_tables             10712  1 ip6table_filter
ipv6                  232288  28 ip6t_REJECT,nf_conntrack_ipv6
cpufreq_ondemand        6056  2 
acpi_cpufreq            8224  0 
dm_multipath           13512  0 
uinput                  6496  0 
snd_hda_codec_realtek   189364  1 
snd_hda_intel          24028  2 
arc4                    1580  2 
ecb                     2452  2 
snd_hda_codec          54280  2 snd_hda_codec_realtek,snd_hda_intel
uvcvideo               49784  0 
iwl3945               125016  0 
videodev               32200  1 uvcvideo
v4l1_compat            11560  2 uvcvideo,videodev
snd_hwdep               6584  1 snd_hda_codec
sdhci_pci               6412  0 
sdhci                  14212  1 sdhci_pci
btusb                  10360  2 
snd_pcm                62632  2 snd_hda_intel,snd_hda_codec
firewire_ohci          19456  0 
snd_timer              17888  1 snd_pcm
snd                    49056  10 snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_timer
pcspkr                  2156  0 
r8169                  28472  0 
soundcore               5404  1 snd
yenta_socket           21228  1 
mac80211              164704  1 iwl3945
joydev                  8964  0 
input_polldev           3076  0 
lib80211                5064  1 iwl3945
rsrc_nonstatic          9784  1 yenta_socket
bluetooth              44768  9 rfcomm,sco,bnep,l2cap,btusb
cfg80211               30560  2 iwl3945,mac80211
i2c_i801                8060  0 
firewire_core          36556  1 firewire_ohci
mii                     4028  1 r8169
mmc_core               41528  2 mmc_block,sdhci
iTCO_wdt               10464  0 
iTCO_vendor_support     2736  1 iTCO_wdt
tifm_7xx1               4996  0 
tifm_core               6304  1 tifm_7xx1
crc_itu_t               1580  1 firewire_core
snd_page_alloc          7644  2 snd_hda_intel,snd_pcm
i915                  143984  2 
drm                   165948  2 i915
i2c_algo_bit            4804  1 i915
i2c_core               18016  4 i2c_i801,i915,drm,i2c_algo_bit
video                  17336  1 i915
output                  2324  1 video

filename:       /lib/modules/2.6.29.4-167.fc11.i686.PAE/kernel/drivers/net/wireless/iwlwifi/iwl3945.ko
firmware:       iwlwifi-3945-2.ucode
license:        GPL
author:         Copyright(c) 2003-2008 Intel Corporation <ilw.com>
version:        1.2.26kds
description:    Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux
srcversion:     358AFE753F1C86BDB074776
alias:          pci:v00008086d00004227sv*sd*bc*sc*i*
alias:          pci:v00008086d00004222sv*sd*bc*sc*i*
alias:          pci:v00008086d00004227sv*sd00001014bc*sc*i*
alias:          pci:v00008086d00004222sv*sd00001044bc*sc*i*
alias:          pci:v00008086d00004222sv*sd00001034bc*sc*i*
alias:          pci:v00008086d00004222sv*sd00001005bc*sc*i*
depends:        mac80211,cfg80211,lib80211
vermagic:       2.6.29.4-167.fc11.i686.PAE SMP mod_unload 686 
parm:           antenna:select antenna (1=Main, 2=Aux, default 0 [both]) (int)
parm:           disable:manually disable the radio (default 0 [radio on]) (int)
parm:           hwcrypto:using hardware crypto engine (default 0 [software])
 (int)
parm:           debug:debug output mask (uint)
parm:           disable_hw_scan:disable hardware scanning (default 0) (int)
parm:           queues_num:number of hw queues. (int)

wlan0     IEEE 802.11abg  ESSID:"gh"  
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:1D:7E:C6:DB:03   
          Bit Rate=54 Mb/s   Tx-Power=15 dBm   
          Retry min limit:7   RTS thr:off   Fragment thr=2352 B   
          Power Management:off
          Link Quality=81/100  Signal level:-53 dBm  Noise level=-127 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Comment 1 John W. Linville 2009-06-11 13:24:53 UTC
Sounds like there might be a problem with the iwl3945 rate control algorithm...

Comment 2 reinette chatre 2009-06-12 17:31:56 UTC
Please try the following upstream patches:

commit a5f847d75e50f89d9c524274780ee5c0b99d96b7
Author: Abhijeet Kolekar <abhijeet.kolekar>
Date:   Wed Mar 11 11:17:56 2009 -0700

    iwl3945 : fix rate scaling
    


commit 82127493a656f6293ffb1566410b5753f29991ef
Author: Abhijeet Kolekar <abhijeet.kolekar>
Date:   Tue Mar 17 21:51:51 2009 -0700

    iwl3945: control rate decrease

Comment 3 Georgi Hristozov 2009-06-13 23:38:56 UTC
As far as I can see, the patches are included in 2.6.30. I compared the diffs roughly and they seem to be in the kernel source. I compiled it, with the config from the f11 kernel, and the bug persists. I intentionally overloaded the network link and the speed dropped again.

The wlan settings seemed ok after the crash, the bit rate was still 54Mb/s, but the actual speed was terrible.

wlan0     IEEE 802.11abg  ESSID:"gh"  
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:1D:7E:C6:DB:03   
          Bit Rate=54 Mb/s   Tx-Power=15 dBm   
          Retry min limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=57/70  Signal level=-53 dBm  Noise level=-127 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Here's the result of pinging the gateway:

7 packets transmitted, 6 received, 14% packet loss, time 6093ms
rtt min/avg/max/mdev = 684.297/736.163/822.311/51.182 ms

I don't know whether it's connected with the problem, but Firefox (and, as it seems, crond too) crashed at the exactly the same moment when the speed dropped. The system hanged for nearly a whole minute.

crond invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
crond cpuset=/ mems_allowed=0
Pid: 3098, comm: crond Not tainted 2.6.30devtest #1
Call Trace:
 [<c0484e09>] oom_kill_process+0x6e/0x1f2
 [<c04853b8>] ? select_bad_process+0x87/0xce
 [<c0485472>] __out_of_memory+0x73/0x82
 [<c04854e4>] out_of_memory+0x63/0x88
 [<c04877d7>] __alloc_pages_internal+0x2e8/0x38a
 [<c0489365>] __do_page_cache_readahead+0xb7/0x16e
 [<c0489462>] do_page_cache_readahead+0x46/0x50
 [<c04832ae>] filemap_fault+0x14d/0x320
 [<c0495262>] __do_fault+0x52/0x385
 [<c04261a5>] ? kmap_atomic_prot+0xfc/0x11b
 [<c04958c2>] handle_mm_fault+0x32d/0x7a3
 [<c0727abf>] do_page_fault+0x210/0x21f
 [<c07278af>] ? do_page_fault+0x0/0x21f
 [<c0726102>] error_code+0x72/0x78
 [<c07278af>] ? do_page_fault+0x0/0x21f
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 163
CPU    1: hi:  186, btch:  31 usd: 154
HighMem per-cpu:
CPU    0: hi:  186, btch:  31 usd: 185
CPU    1: hi:  186, btch:  31 usd:  96
Active_anon:305401 active_file:36 inactive_anon:161383
 inactive_file:1944 unevictable:2 dirty:0 writeback:1638 unstable:0
 free:12205 slab:7851 mapped:40 pagetables:3298 bounce:0
DMA free:8088kB min:64kB low:80kB high:96kB active_anon:640kB inactive_anon:684kB active_file:4kB inactive_file:340kB unevictable:0kB present:15868kB pages_scanned:960 all_unreclaimable? no
lowmem_reserve[]: 0 865 2007 2007
Normal free:40248kB min:3728kB low:4660kB high:5592kB active_anon:356560kB inactive_anon:356680kB active_file:80kB inactive_file:2084kB unevictable:0kB present:885944kB pages_scanned:35938 all_unreclaimable? no
lowmem_reserve[]: 0 0 9134 9134
HighMem free:484kB min:512kB low:1740kB high:2972kB active_anon:864404kB inactive_anon:288168kB active_file:60kB inactive_file:5352kB unevictable:8kB present:1169232kB pages_scanned:5408 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 2*4kB 0*8kB 1*16kB 2*32kB 1*64kB 2*128kB 2*256kB 2*512kB 2*1024kB 2*2048kB 0*4096kB = 8088kB
Normal: 478*4kB 332*8kB 164*16kB 124*32kB 74*64kB 40*128kB 19*256kB 6*512kB 5*1024kB 1*2048kB 1*4096kB = 40216kB
HighMem: 20*4kB 6*8kB 1*16kB 1*32kB 1*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 496kB
40308 total pagecache pages
154 pages in swap cache
Swap cache stats: add 141839, delete 141685, find 3052/4137
Free swap  = 0kB
Total swap = 524280kB
521936 pages RAM
294610 pages HighMem
24900 pages reserved
39711 pages shared
444189 pages non-shared
Out of memory: kill process 5535 (run-mozilla.sh) score 4542560 or a child
Killed process 5547 (firefox)

There weren't other applications running on the notebook, except Skype and a few consoles. Seems pretty strange on a machine with 2GB RAM.

Comment 4 reinette chatre 2009-06-15 13:55:27 UTC
(In reply to comment #3)
> 
> Here's the result of pinging the gateway:
> 
> 7 packets transmitted, 6 received, 14% packet loss, time 6093ms
> rtt min/avg/max/mdev = 684.297/736.163/822.311/51.182 ms

This looks really bad. Did you take this capture before or after the speed dropped? 

> 
> I don't know whether it's connected with the problem, but Firefox (and, as it
> seems, crond too) crashed at the exactly the same moment when the speed
> dropped. The system hanged for nearly a whole minute.
> 
> crond invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0

If your system is running out of memory it will affect everything running on that system.
 
> There weren't other applications running on the notebook, except Skype and a
> few consoles. Seems pretty strange on a machine with 2GB RAM.  

Potentially a memory leak on something running on that machine.

Comment 5 Georgi Hristozov 2009-06-15 15:12:44 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > 
> > Here's the result of pinging the gateway:
> > 
> > 7 packets transmitted, 6 received, 14% packet loss, time 6093ms
> > rtt min/avg/max/mdev = 684.297/736.163/822.311/51.182 ms
> 
> This looks really bad. Did you take this capture before or after the speed
> dropped? 
> 

Yes, this is after the speed dropped. Normally it is something like:

37 packets transmitted, 36 received, 2% packet loss, time 36058ms
rtt min/avg/max/mdev = 0.995/1.228/3.335/0.423 ms

> Potentially a memory leak on something running on that machine.  

Well, I hope it is a coincidence. :) Normally this bug doesn't affect the performance of the system, only the network.

Comment 6 John W. Linville 2009-08-18 15:40:10 UTC
Are you still seeing this issue with the latest F11 kernels?  What is the latest kernel you have tried?

Comment 7 Georgi Hristozov 2009-09-07 05:44:11 UTC
The problem persists with 2.6.30.5-43.fc11.x86_64.

Comment 8 Georgi Hristozov 2009-09-11 12:30:52 UTC
Today I compiled and tested the new 2.6.31 i686 kernel on the same hardware, this time running Slackware (I know it's not Red Hat related, but the problem seems to be reproducible in Fedora and you are upstream developers as well). The speed dropped again, in exactly the same way. I decided to turn on IWLWIFI_DEBUG in order to collect debugging info. I've checked IWLWIFI_SPECTRUM_MEASUREMENT, IWLWIFI_LEDS, IWL3945_SPECTRUM_MEASUREMENT as well, although they seem completely irrelevant to the rate scaling. After recompiling, the issue seems to have disappeared. In the last few hours I've transferred more than 20GB with stable speed of ~20-24MBit. Tonight I'm going to further play with the options, hoping to reproduce the problem.

By the way, can you please recommend me a debug mask which can produce useful info about the problem? I've tried just setting all the bits, but it's producing a huge amount of information and syslogd and klogd are under big pressure (60% CPU usage). I'm sticking to 0x00FF0000 for now (I think that IWL_DL_RF_KILL and IWL_DL_RATE are especially interesting), but I quite don't know what I'm doing, so I will appreciate some advice. :)

Comment 9 Georgi Hristozov 2009-09-11 13:58:57 UTC
I've just managed to reproduce the problem without touching the configuration. Seems that the additional config options do not matter. The log can be found at http://forkbomb.nl/pub/temp/iwl_msg.txt . Please note that the speed drop happened around 16:48 and after that the driver stopped outputting anything at all in syslog. It became responsive again after reconnecting to the AP.

Comment 10 Stanislaw Gruszka 2009-09-14 12:08:06 UTC
Georgi,

Just to clarify: problem is fixed in vanilla 2.6.31, right? 

Using just IWL_DL_RATE should be fine to debug, is problem is really in rate control algorithm. Turning on all flags except IWL_DL_TX_REPLY, IWL_DL_RX, IWL_DL_TX, IWL_DL_INFO should give good results. But if still some flags generate too much output and mess up things instead of giving any useful information, you can just disable this flag and try again.

File you provide in http://forkbomb.nl/pub/temp/iwl_msg.txt is somehow corrupted (?), I can not see any interesting things there.

To exclude that problem is related with memory leakage/fragmentation you could check output of /proc/buddyinfo and /proc/slabinfo . If output of these files are much different when speed drops happens and after driver restart to normal functionality, that would suggest problem is somewhere there.

Comment 11 Georgi Hristozov 2009-09-14 12:58:15 UTC
No, the problem is reproducible on vanilla 2.6.31, in exactly the same way as I've written in the beginning of my bug report. I didn't make this clear enough, sorry. :)

Anyway, I triggered the bug again. This time I've collected the log from the right place and using debug_level 0xBE7FFFFE (turned off the bits you told me). It's pretty big because of the somewhat "liberal" mask, but I hope it can be useful to you. The last message in the file is just ~10 seconds after the problem occurred.

I've monitored slabinfo too, but nothing suspicious changed. The system has 2GB RAM, has been up for just a few hours when the speed dropped, and only KDE with two text editors and two shells were running. So I think it's probably not a memory-related issue.

The log can be found at http://forkbomb.nl/pub/temp/log.tar.bz2 (843K compressed, over 100MB uncompressed)
I've uploaded the kernel config too - http://forkbomb.nl/pub/temp/kernel_config

Comment 12 Stanislaw Gruszka 2009-09-14 13:41:39 UTC
(In reply to comment #11)
> Anyway, I triggered the bug again. This time I've collected the log from the
> right place and using debug_level 0xBE7FFFFE (turned off the bits you told me).
> It's pretty big because of the somewhat "liberal" mask, but I hope it can be
> useful to you. The last message in the file is just ~10 seconds after the
> problem occurred.

I forgot to mention to also remove IWL_DL_IO flag :-/ , but it's fine,
I can filter out not necessary things using sed (and by the way refresh my sed skill :)

Thanks!

Comment 13 Stanislaw Gruszka 2009-09-22 12:52:23 UTC
Unfortunately I can not find any direct bug cause from provided logs.

We can see that driver activity is decreased in two times: one at about 34:29 and second time 34:58 . 

$ for min in 34 35 ; do for sec in `seq -f "%02g" 0 59` ; do echo -ne "$min:$sec\t" ;  grep "^Sep 14 15:$min:$sec Firebat" syslog | wc -l ; done ; 

[snip]
34:14   0
34:15   0
34:16   18084
34:17   33737
34:18   35589
34:19   34845
34:20   34517
34:21   32174
34:22   33491
34:23   34031
34:24   32465
34:25   35098
34:26   34863
34:27   35377
34:28   34739
34:29   33262
34:30   35008
34:31   33031
34:32   35023
34:33   33605
34:34   32579
34:35   34061
34:36   34377
34:37   34373
34:38   33179
34:39   31291
34:40   33543
34:41   16692
34:42   7843
34:43   8152
34:44   7872
34:45   9305
34:46   7599
34:47   8057
34:48   8709
34:49   8004
34:50   8786
34:51   9579
34:52   9119
34:53   7564
34:54   7568
34:55   8155
34:56   9475
34:57   10100
34:58   3066
34:59   333
35:00   811
35:01   1222
35:02   511
35:03   255
35:04   459
35:05   293
35:06   196
35:07   53
35:08   0
35:09   0
[snip]

$ grep ratelimit syslog
Sep 14 15:34:16 Firebat kernel: __ratelimit: 9715 callbacks suppressed
Sep 14 15:34:21 Firebat kernel: __ratelimit: 9712 callbacks suppressed
Sep 14 15:34:36 Firebat kernel: __ratelimit: 9303 callbacks suppressed
Sep 14 15:34:41 Firebat kernel: __ratelimit: 8848 callbacks suppressed
Sep 14 15:34:46 Firebat kernel: __ratelimit: 1186 callbacks suppressed
Sep 14 15:34:51 Firebat kernel: __ratelimit: 1341 callbacks suppressed
Sep 14 15:34:56 Firebat kernel: __ratelimit: 1246 callbacks suppressed
Sep 14 15:35:01 Firebat kernel: __ratelimit: 492 callbacks suppressed
Sep 14 15:35:07 Firebat kernel: __ratelimit: 50 callbacks suppressed

Number of MISSED_BEACONS_NOTIFICATION is increased after 34:29 but still we have some of them when driver works at high speed.

[stasiu@dhcp-lab-109 504316-iwlagn-speed-drops]$ grep MISSED_BEACONS syslog
Sep 14 15:34:17 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 75 i 74 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:19 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 133 i 132 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:21 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 136 i 135 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 199 i 198 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:26 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 200 i 199 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:32 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 72 i 71 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:34 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 31 i 30 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:38 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 11 i 10 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:40 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 223 i 222 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:42 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 140 i 139 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:44 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 81 i 80 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:46 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 36 i 35 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:48 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 27 i 26 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:50 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 44 i 43 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:52 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 64 i 63 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:54 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 14 i 13 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:56 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 20 i 19 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:34:58 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 188 i 187 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:35:00 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 250 i 249 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:35:02 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 81 i 80 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:35:05 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 131 i 130 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2
Sep 14 15:35:07 Firebat kernel: ieee80211 phy0: I iwl3945_rx_handle r 174 i 173 No handler needed for MISSED_BEACONS_NOTIFICATION, 0xa2

Comment 14 Stanislaw Gruszka 2009-09-22 13:11:56 UTC
Georgi, 

In current linus tree we have some new fixes related with rate scaling, could you give them a chance. You can directly check linus tree
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
or even better newest iwlwifi tree:
git://git.kernel.org/pub/scm/linux/kernel/git/rchatre/iwlwifi-2.6.git

Could you also try older firmware from:
http://intellinuxwireless.org/?n=downloads&f=ucodes_3945

Could you provide information about your network, type of network (a,b,g), used encryption, other AP or stations in radio scope, any other thing you think is relevant. Perhaps I will try to reproduce the bug by my own.

Comment 15 Stanislaw Gruszka 2009-09-25 09:09:20 UTC
I missed that in logs:

Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 1, success_counter: 1, expected_tpt is not NULL
Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I rs_get_rate leave: 11

We have few such messages, all earlier than 15:34:43 when adapter slow down:

Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 1, success_counter: 1, expected_tpt is not NULL
Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 5, success_counter: 4, expected_tpt is not NULL
Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 7, success_counter: 6, expected_tpt is not NULL
Sep 14 15:34:23 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 9, success_counter: 7, expected_tpt is not NULL
Sep 14 15:34:28 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 1, success_counter: 1, expected_tpt is not NULL
Sep 14 15:34:28 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 4, success_counter: 4, expected_tpt is not NULL
Sep 14 15:34:28 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 8, success_counter: 7, expected_tpt is not NULL
Sep 14 15:34:41 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 1, success_counter: 1, expected_tpt is not NULL
Sep 14 15:34:41 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 2, success_counter: 2, expected_tpt is not NULL
Sep 14 15:34:42 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 3, success_counter: 3, expected_tpt is not NULL
Sep 14 15:34:42 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 4, success_counter: 4, expected_tpt is not NULL
Sep 14 15:34:42 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 5, success_counter: 5, expected_tpt is not NULL
Sep 14 15:34:42 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 6, success_counter: 6, expected_tpt is not NULL
Sep 14 15:34:42 Firebat kernel: ieee80211 phy0: I rs_get_rate Invalid average_tpt on rate 11: counter: 7, success_counter: 7, expected_tpt is not NULL

Comment 16 Georgi Hristozov 2009-09-25 09:33:01 UTC
Hello,

It's a busy week for me and I will be able to test the git trees in a few days. :\ The problem appeared in different 802.11g environments - WRT54GL AP with WPA2 and ZyXEL AP with WEP. I will probably try different configurations soon.

About the log messages - I spotted them in the beginning, but thought they are not related to the issue. Can you please translate them into human language? :) Do they indicate misconfiguration, hardware problem or something else?

Comment 17 Stanislaw Gruszka 2009-09-25 11:12:10 UTC
> About the log messages - I spotted them in the beginning, but thought they are
> not related to the issue. Can you please translate them into human language? :)

I don't understand them too, at least not at level I would like to understand. As far as I can tell: 
- MISSED BEACONS notification means that adapter do not get beacon frame (http://en.wikipedia.org/wiki/Beacon_frame) from AP. As long number of lost frames are small, this should be not any problem.
- "Invalid average_tpt on rate" mean driver has not enough data (don't know exactly what data) to make decision about selecting new rate, instead it use previously selected rate.

> Do they indicate misconfiguration, hardware problem or something else?  

I thought they are suspicious but they seems to be pretty normal, hence not indicate nothing wrong.

Comment 18 Georgi Hristozov 2009-09-26 10:17:46 UTC
Unfortunately, the problem persists with the latest iwlwifi kernel tree (0e2dc29804cdfd84e2ffc23e8f65d32427129a6a). Additionally, I get some warnings (unaligned IP payload?!) when changing AP and I have to reload the module to use the wireless at all, but they seem irrelevant (they happen when the speed hasn't dropped aswell):

Sep 26 11:21:51 Firebat kernel: ------------[ cut here ]------------
Sep 26 11:21:51 Firebat kernel: WARNING: at net/mac80211/rx.c:409 __ieee80211_rx_handle_packet+0x5fe/0x6c0 [mac80211]()
Sep 26 11:21:51 Firebat kernel: Hardware name: Satellite A200
Sep 26 11:21:51 Firebat kernel: unaligned IP payload at 0xf5c94036
Sep 26 11:21:51 Firebat kernel: Modules linked in: i915 drm coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ipv6 cpufreq_ondemand speedstep_lib acpi_cpufreq freq_table ppdev parport_pc parport fuse arc4 ecb snd_hda_codec_realtek snd_hda_intel snd_hda_codec iwl3945 snd_hwdep snd_pcm iwlcore snd_timer snd mac80211 sdhci_pci sdhci soundcore mmc_core psmouse processor r8169 tifm_7xx1 intel_agp video thermal mii thermal_sys agpgart hwmon uhci_hcd cfg80211 sg tifm_core button ac i2c_i801 snd_page_alloc output serio_raw led_class battery rfkill evdev
Sep 26 11:21:51 Firebat kernel: Pid: 0, comm: swapper Not tainted 2.6.31-wl-git #2
Sep 26 11:21:51 Firebat kernel: Call Trace:
Sep 26 11:21:51 Firebat kernel:  [<f8493dde>] ? __ieee80211_rx_handle_packet+0x5fe/0x6c0 [mac80211]
Sep 26 11:21:51 Firebat kernel:  [<f8493dde>] ? __ieee80211_rx_handle_packet+0x5fe/0x6c0 [mac80211]
Sep 26 11:21:51 Firebat kernel:  [<c102ce26>] ? warn_slowpath_common+0x76/0xd0
Sep 26 11:21:51 Firebat kernel:  [<f8493dde>] ? __ieee80211_rx_handle_packet+0x5fe/0x6c0 [mac80211]
Sep 26 11:21:51 Firebat kernel:  [<c102cecb>] ? warn_slowpath_fmt+0x2b/0x30
Sep 26 11:21:51 Firebat kernel:  [<f8493dde>] ? __ieee80211_rx_handle_packet+0x5fe/0x6c0 [mac80211]
Sep 26 11:21:51 Firebat kernel:  [<f85e94be>] ? iwl3945_irq_tasklet+0x4ae/0x15a0 [iwl3945]
Sep 26 11:21:51 Firebat kernel:  [<f8482e17>] ? ieee80211_tasklet_handler+0xe7/0x110 [mac80211]
Sep 26 11:21:51 Firebat kernel:  [<c1035ed5>] ? run_timer_softirq+0x35/0x1c0
Sep 26 11:21:51 Firebat kernel:  [<c1031608>] ? tasklet_action+0x48/0xa0
Sep 26 11:21:51 Firebat kernel:  [<c1031cdb>] ? __do_softirq+0x8b/0x110
Sep 26 11:21:51 Firebat kernel:  [<c105abcd>] ? handle_IRQ_event+0x2d/0xc0
Sep 26 11:21:51 Firebat kernel:  [<c105d261>] ? move_native_irq+0x11/0x50
Sep 26 11:21:51 Firebat kernel:  [<c1031d8d>] ? do_softirq+0x2d/0x40
Sep 26 11:21:51 Firebat kernel:  [<c1004dc0>] ? do_IRQ+0x50/0xc0
Sep 26 11:21:51 Firebat kernel:  [<c10034e9>] ? common_interrupt+0x29/0x30
Sep 26 11:21:51 Firebat kernel:  [<f828812e>] ? acpi_idle_enter_bm+0x270/0x2a5 [processor]
Sep 26 11:21:51 Firebat kernel:  [<c130860f>] ? cpuidle_idle_call+0x6f/0xc0
Sep 26 11:21:51 Firebat kernel:  [<c1001f04>] ? cpu_idle+0x54/0x70
Sep 26 11:21:51 Firebat kernel:  [<c1515865>] ? start_kernel+0x2d1/0x334
Sep 26 11:21:51 Firebat kernel:  [<c151535d>] ? unknown_bootoption+0x0/0x1e9
Sep 26 11:21:51 Firebat kernel: ---[ end trace c9245dfa3a0496aa ]---

Comment 19 reinette chatre 2009-09-29 18:23:34 UTC
(In reply to comment #18)
> Additionally, I get some warnings
> (unaligned IP payload?!) when changing AP and I have to reload the module to
> use the wireless at all, but they seem irrelevant (they happen when the speed
> hasn't dropped aswell):
> 
> Sep 26 11:21:51 Firebat kernel: ------------[ cut here ]------------
> Sep 26 11:21:51 Firebat kernel: WARNING: at net/mac80211/rx.c:409
> __ieee80211_rx_handle_packet+0x5fe/0x6c0 [mac80211]()
> Sep 26 11:21:51 Firebat kernel: Hardware name: Satellite A200
> Sep 26 11:21:51 Firebat kernel: unaligned IP payload at 0xf5c94036

There is no need to set CONFIG_MAC80211_DEBUG_PACKET_ALIGNMENT since you are running on platform that can handle unaligned payloads.

Comment 20 Georgi Hristozov 2009-09-29 18:48:30 UTC
Yes, I've already found that (through the kernel bugzilla). Seems that I have accidentally switched on this option and this problem is unrelated to the rate scaling issues.

Comment 21 Stanislaw Gruszka 2009-10-02 08:01:22 UTC
Good news is I'm able to reproduce the issue, bad is I can not figure out what is going wrong as far.

The way I can reproduce is: downloading big file with speed 20Mbit/s, then go away from AP - signal strength become very week and speed decrease to something like 1.2Mbit/s . Then come back near AP, signal strength become high, but speed not. After dissociate and associate again with AP effective download rate come back to 20Mbit/s. Note I have to go quite far from AP to make signal very poor, otherwise after coming back speed become high automatically.

Comment 22 Stanislaw Gruszka 2009-10-02 08:08:32 UTC
Created attachment 363428 [details]
I'm using this patch debug frames traffic

Comment 23 Stanislaw Gruszka 2009-10-02 08:21:23 UTC
Log with applied patch when dowlad file and walking with laptop, looks like this: 

*) At the begining when signal is high:

 Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 3 (plcp) 54 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=49, tim=19734808 usec, phy=0x11, chnl=1 rate 54
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=49, tim=19735122 usec, phy=0x11, chnl=1 rate 54
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 3 (plcp) 54 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=49, tim=19735436 usec, phy=0x11, chnl=1 rate 54
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=49, tim=19735678 usec, phy=0x11, chnl=1 rate 54
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 3 (plcp) 54 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=39, tim=19736444 usec, phy=0x11, chnl=1 rate 54
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=50, tim=19736758 usec, phy=0x11, chnl=1 rate 54
Oct  2 08:53:29 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=49, tim=19737072 usec, phy=0x11, chnl=1 rate 54

*) Then when signal become low:

Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 110 (plcp) 11 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=19, tim=4419464 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=19, tim=4426260 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 110 (plcp) 11 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=19, tim=4447594 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=19, tim=4456286 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 110 (plcp) 11 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=19, tim=4463010 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=19, tim=4480436 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 110 (plcp) 11 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=18, tim=4487205 usec, phy=0x13, chnl=1 rate 2
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Beacon: 0x0080, dst=0xff, src=0x3e, rssi=19, tim=4502989 usec, phy=0x13, chnl=1 rate 1
Oct  2 08:53:56 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=19, tim=4515701 usec, phy=0x13, chnl=1 rate 2
 
*) Then when signal again become high:

Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Tx sta id: 0, rate: 3 (plcp) 54 (ieee), flags: 0x   8 cck/ofdm mask: 0xf/0xff
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=67, tim=35143592 usec, phy=0x17, chnl=1 rate 11
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=67, tim=35145399 usec, phy=0x17, chnl=1 rate 11
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=67, tim=35151952 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=67, tim=35158280 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=67, tim=35164805 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=68, tim=35172467 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=67, tim=35174208 usec, phy=0x17, chnl=1 rate 11
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=68, tim=35175859 usec, phy=0x17, chnl=1 rate 11
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=67, tim=35177897 usec, phy=0x17, chnl=1 rate 11
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=68, tim=35184930 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=68, tim=35192088 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=68, tim=35199220 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Retry: 0x4a88, dst=0xcc, src=0x3c, rssi=68, tim=35206280 usec, phy=0x17, chnl=1 rate 2
Oct  2 08:54:27 dhcp-lab-161 kernel: iwl3945 0000:03:00.0: Frame: 0x4288, dst=0xcc, src=0x3c, rssi=68, tim=35208183 usec, phy=0x17, chnl=1 rate 11

As we can se we have high signal RSSI=68 but lots of replays. We are sending data at 54 Mbit/s but AP to us at 2 or 11 Mbit/s.

Comment 24 Stanislaw Gruszka 2009-10-06 15:05:43 UTC
Taking into account that AP send to Station at low rate, may suggest this can be problem with AP. However Georgi wrote that when he use Windows everything was fine. In my case I do test (walk far away from AP) with wifi usb dongle (Ralink chipset) and everything was fine to - after coming back near AP transfer become high again.

Comment 25 Stanislaw Gruszka 2009-10-06 15:13:58 UTC
Reinette, John,

Do You have any hints where the bug could be or how to debug this?

As far I'm sniffing over radio using wireshark, I'm comparing Ralink and Intel transfers, but can not found anything than can cause this issue.

Comment 26 Stanislaw Gruszka 2009-10-06 15:51:09 UTC
Note: problem is not fixed upstream, I reproduced it on iwlwifi Linus and fedora kernels trees.

Comment 27 Georgi Hristozov 2009-10-06 18:59:40 UTC
I'm glad that you have managed to reproduce the issue :)

It can't be a problem with the AP - as you have drawn attention to, everything is OK under Windows. Moreover I have tested this with 3 different APs from different vendors.

By the way, I can reproduce the problem without moving my notebook. Maybe the interference from other networks on closer channels (the environment is the centre of Sofia) is helping the rate scaling issue.

Comment 28 John W. Linville 2009-10-06 19:22:07 UTC
Created attachment 363883 [details]
iwl3945-rs-use-default.patch

Dunno if this will even work, but you might try using Minstrel instead of iwl3945's custom rate control algorithm...?

Comment 29 Stanislaw Gruszka 2009-10-07 11:04:41 UTC
Created attachment 363954 [details]
iwl3945-use-default-rs.patch

We have to remove explicit calls to iwl3945_rate_scale_init() to use "minstrel" rate scaling algorithm. 

However patch does not help. Bug must be somewhere else. Note, some rate scaling related functions (iwl3945_hw_build_tx_cmd_rate, iwl3945_init_hw_rate_table) still are used after applied this patch.

Comment 30 jason 2009-10-29 23:37:42 UTC
This information may be relevant. 
My wiress card is Broadcom Corporation BCM4322 802.11a/b/g/n Wireless LAN Controller (rev 01). My Ap is trednet TEW-632BRP.  The driver is from here:http://www.broadcom.com/support/802.11/linux_sta.php.
Kernel is 2.6.30.8-64.fc11.x86_64. 

My wireless network becomes slow after a while. Then I can not connect to my wireless AP after 5-6 hour. 

802.11 G AP seems ok. 

At the same time, I observed this in /var/log/messages:
Oct 29 16:34:56 f11 kernel: __ratelimit: 5 callbacks suppressed
Oct 29 16:34:56 f11 kernel: hpet1: lost 1 rtc interrupts
Oct 29 16:35:03 f11 kernel: hpet1: lost 1 rtc interrupts
Oct 29 16:35:04 f11 kernel: hpet1: lost 1 rtc interrupts
Oct 29 16:35:15 f11 kernel: hpet1: lost 1 rtc interrupts
Oct 29 16:35:21 f11 kernel: hpet1: lost 1 rtc interrupts

If I disable wireless, I do not have those messages.

-jason

Comment 31 John W. Linville 2009-10-30 13:12:18 UTC
Jason, that driver is not supported by Fedora.  Feel free to report problems to Broadcom, but please don't pollute this bug with reports about that driver.

Comment 32 Marc 2009-11-16 19:00:12 UTC
I have the exact same problem. My setup is as follow:

- Thinkpad T61 with iwl3945 wifi
- The Wifi router is Linksys WRT54GSV4
- Currently using latest 2.6.31 kernel

I've seen this problem for a couple years now, and it sometimes go away with kernel/firmware updates, and eventually comes back after a distro upgrade.

The symptoms are the same: the link is working great, and then after a heavy transfer, the speed drops to ~100KB/s. I'm realtively far away from the router (two stories UP, with couple concrete walls in between, but I can reproduce the bug getting closer to the wifi router. Can't seem to reproduce when in the same room as wifi router.


--
Marc

Comment 33 Stanislaw Gruszka 2010-01-14 08:03:45 UTC
Hi, sorry for delay I was busy working on RHEL.

I updated AP firmware (I have WRT61ON) and problem disappears for me. I can still reproduce with other router (WRT160NL with original firmware), but not in the same way as before, speed after getting back good signal increase to a half of previous. Now I think this is AP bug, that is not encountered on windows driver, only happens with linux stations. 

Georgi, Marc,

could you try upgrade firmware in your AP and see if that help? If you are using newest firmware could you try to install OpenWRT or dd-wrt?  I installed OpenWRT on WRT160NL and speeds drops gone.

Comment 34 Marc 2010-01-16 14:39:28 UTC
I have the most up2date Firmware on my router, and I was still getting the problem.

There must have been a recent change in the kernel (or drivers for the iw3945 card) because I have not seen the problem for a while. As I said in my previous post, the problem has come and gone in the past as well... so this might as well come back with another update (kernel or drivers).

Thanks.

Comment 35 Georgi Hristozov 2010-04-03 09:36:15 UTC
It's seems that the problem is now fixed - I haven't experienced this issue for a few months, using the newest stable kernels.

Comment 36 Georgi Hristozov 2010-04-03 09:37:05 UTC
It seems that the problem is now fixed - I haven't experienced this issue for a few months, using the newest stable kernels.