Description of problem:
After using the computer for a while, the wireless speed drops to something like 1-2k. I could not fix it in any way other than a system restart (bringing the network connection or NetworkManager down and up does not help).
The device model is "RaLink RT2860". It is a pcix wireless card, so I guess there is a problem in the kernel module for this hardware. I am sure there is no problem with the network connection or Firefox: I monitor the speed using the netspeed applet, and my laptop, which also runs Fedora 14, works fine while the PC has this bandwidth.
(This is the healthy state)
# iwconfig wlan0
wlan0 IEEE 802.11bgn ESSID:"CC251"
Mode:Managed Frequency:2.462 GHz Access Point: 00:05:B4:06:C6:20
Bit Rate=1 Mb/s Tx-Power=9 dBm
Retry long limit:7 RTS thr:off Fragment thr:off
Encryption key:off
Power Management:off
Link Quality=70/70 Signal level=15 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0
Version-Release number of selected component (if applicable):
Hardware is: RaLink RT2860
kernel-2.6.35.6-48.fc14.x86_64
NetworkManager-0.8.1-10.git20100831.fc14.x86_64
How reproducible:
1. Turn on the computer
2. Use it for a while
Steps to Reproduce:
Nothing
Actual results:
Wireless speed reaches almost zero
Expected results:
Additional info:
We have a few bug reports about slow wireless on F-14 with the 2.6.35 kernel, mostly on iwl3945 and iwl4965 but also with rt73. So I guess there is something wrong with mac80211, probably with the rate scaling code.
Today it happened again. Here is something I found that might be useful. This is the output of dmesg:
[ 622.711358] wlan0: associated
[ 622.717196] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 622.717377] cfg80211: Calling CRDA for country: JP
[ 623.018100] Intel AES-NI instructions are not detected.
[ 623.044561] padlock: VIA PadLock not detected.
[ 632.850029] wlan0: no IPv6 routers present
[ 2052.114334] irq 17: nobody cared (try booting with the "irqpoll" option)
[ 2052.114339] Pid: 0, comm: swapper Tainted: P 2.6.35.6-48.fc14.x86_64 #1
[ 2052.114341] Call Trace:
[ 2052.114342] <IRQ> [<ffffffff810a6e2b>] __report_bad_irq.clone.1+0x3d/0x8b
[ 2052.114349] [<ffffffff810a6f93>] note_interrupt+0x11a/0x17f
[ 2052.114352] [<ffffffff810a7a73>] handle_fasteoi_irq+0xa8/0xce
[ 2052.114355] [<ffffffff8100c2ea>] handle_irq+0x88/0x90
[ 2052.114357] [<ffffffff8146f034>] do_IRQ+0x5c/0xb4
[ 2052.114360] [<ffffffff81469593>] ret_from_intr+0x0/0x11
[ 2052.114361] <EOI> [<ffffffff8102b7f9>] ? native_safe_halt+0xb/0xd
[ 2052.114366] [<ffffffff81010f03>] ? need_resched+0x23/0x2d
[ 2052.114367] [<ffffffff8101102a>] default_idle+0x34/0x4f
[ 2052.114370] [<ffffffff81008325>] cpu_idle+0xaa/0xcc
[ 2052.114373] [<ffffffff81461f2a>] start_secondary+0x24d/0x28e
[ 2052.114374] handlers:
[ 2052.114375] [<ffffffff81332944>] (usb_hcd_irq+0x0/0x7c)
[ 2052.114378] [<ffffffffa00697da>] (rt2800pci_interrupt+0x0/0x18d [rt2800pci])
[ 2052.114384] Disabling IRQ #17
I removed the module from the kernel and loaded it again, and the wireless worked fine. No restart is needed that way.
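For anyone who wants to check their own logs for the same symptom, here is a minimal sketch that scans saved dmesg text for an "irq N: nobody cared" report followed by "Disabling IRQ #N" and lists the handlers registered on that line. The function name `find_disabled_irqs` and the regular expressions are illustrative assumptions based only on the excerpt above; other kernel versions may format these messages differently.

```python
import re

# Patterns modelled on the dmesg excerpt in this report (an assumption,
# not a stable kernel interface).
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")
HANDLER = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[\w+\])?\)")


def find_disabled_irqs(dmesg_text):
    """Return {irq_number: [handler names]} for every disabled IRQ line."""
    disabled = {}
    handlers = []
    collecting = False
    for line in dmesg_text.splitlines():
        if NOBODY_CARED.search(line):
            # A new report starts: forget handlers from any earlier report.
            handlers = []
            collecting = False
            continue
        if line.rstrip().endswith("handlers:"):
            collecting = True
            continue
        match = DISABLED.search(line)
        if match:
            disabled[int(match.group(1))] = handlers
            handlers = []
            collecting = False
            continue
        if collecting:
            handler = HANDLER.search(line)
            if handler:
                handlers.append(handler.group(1))
    return disabled
```

Fed the excerpt above, this would report IRQ 17 with the handlers `usb_hcd_irq` and `rt2800pci_interrupt`, which shows that the wireless card shares its interrupt line with a USB host controller.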
Anything better with current upstream drivers? http://people.redhat.com/sgruszka/compat_wireless.html
The problem was much more severe in FC15. The network is usually too slow, something like 32k-100k instead of 1M-2M (I mean the internet speed). After a while I found out that kmod-2860 (for my model) and similar packages exist on rpmfusion, and they more or less do the job. I wonder what the issue with them is that keeps them from being packaged in Fedora?
The problem still exists on Fedora 16: again, after "Disabling IRQ #17" the wireless goes extremely slow. The rt2860 drivers on FC15 partially solved the problem, I mean they were OK, but they are not available for FC16.
Created attachment 530778
rt2800pci_irq_none_workaround.patch
Can you check if this simple workaround makes the issue go away? Probably the best way to test the patch is to compile compat-wireless (http://linuxwireless.org/download/compat-wireless-2.6/compat-wireless-2.6.tar.bz2) from source with the patch on top; however, I'm not sure if the current compat-wireless compiles on F-16. Otherwise the kernel needs to be rebuilt. If you are not able to do this, let me know and I will prepare a kernel build in koji.
After selecting the driver model, the package compiled (the rt28xx driver compiles, but others may not). After modprobe there was a crash:
[16639.312400] rt2800pci 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[16639.312424] rt2800pci 0000:02:00.0: setting latency timer to 64
[16639.322186] ------------[ cut here ]------------
[16639.322215] WARNING: at net/wireless/core.c:562 wiphy_register+0x5f/0x3d8 [cfg80211]()
[16639.322222] Hardware name: MS-7599
[16639.322226] Modules linked in: rt2800pci(+) rt2800lib rt2x00pci rt2x00lib tcp_lp ppdev parport_pc lp parport fuse 8021q garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack snd_hda_codec_hdmi arc4 snd_hda_codec_via crc_ccitt mac80211 snd_hda_intel snd_hda_codec snd_seq sp5100_tco virtio_net cfg80211 snd_usb_audio rfkill snd_hwdep kvm_amd kvm snd_pcm uvcvideo videodev snd_usbmidi_lib media microcode atl1c eeprom_93cx6 v4l2_compat_ioctl32 i2c_piix4 edac_core snd_rawmidi snd_seq_device snd_timer snd snd_page_alloc k10temp edac_mce_amd soundcore serio_raw binfmt_misc uinput joydev pata_acpi ata_generic pata_atiixp wmi radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: rt2x00lib]
[16639.322336] Pid: 17223, comm: work_for_cpu Not tainted 3.1.0-1.fc16.x86_64 #1
[16639.322342] Call Trace:
[16639.322359] [<ffffffff81057a56>] warn_slowpath_common+0x83/0x9b
[16639.322369] [<ffffffff81057a88>] warn_slowpath_null+0x1a/0x1c
[16639.322390] [<ffffffffa02b3e15>] wiphy_register+0x5f/0x3d8 [cfg80211]
[16639.322402] [<ffffffff81119370>] ? __kmalloc+0xf0/0x102
[16639.322425] [<ffffffffa0365239>] ? ieee80211_register_hw+0xd4/0x55e [mac80211]
[16639.322447] [<ffffffffa0365495>] ieee80211_register_hw+0x330/0x55e [mac80211]
[16639.322462] [<ffffffffa018809b>] rt2x00lib_probe_dev+0x4b0/0x581 [rt2x00lib]
[16639.322474] [<ffffffffa0049892>] rt2x00pci_probe+0x236/0x27c [rt2x00pci]
[16639.322484] [<ffffffff8106d313>] ? move_linked_works+0x6e/0x6e
[16639.322497] [<ffffffffa0288311>] rt2800pci_probe+0x15/0x17 [rt2800pci]
[16639.322508] [<ffffffff8123f6e7>] local_pci_probe+0x44/0x75
[16639.322516] [<ffffffff8106d329>] do_work_for_cpu+0x16/0x28
[16639.322525] [<ffffffff81072d1f>] kthread+0x84/0x8c
[16639.322536] [<ffffffff814be234>] kernel_thread_helper+0x4/0x10
[16639.322546] [<ffffffff81072c9b>] ? kthread_worker_fn+0x148/0x148
[16639.322555] [<ffffffff814be230>] ? gs_change+0x13/0x13
[16639.322561] ---[ end trace f083997c3eb512dd ]---
[16639.322569] (null) -> rt2x00lib_probe_dev: Error - Failed to initialize hw.
[16639.322622] rt2800pci 0000:02:00.0: PCI INT A disabled
[16639.322678] rt2800pci: probe of 0000:02:00.0 failed with error -22
[16641.632375] audit_printk_skb: 15 callbacks suppressed
I was using compat-wireless-2011-10-29.
Did you install the compat-wireless modules with "make install" and restart the system? The above crash looks like the new rt2800pci driver is being used with the kernel's mac80211/cfg80211 modules, whereas the modules from compat-wireless should be used. If you do
modprobe -r rt2800pci
modprobe -v -n rt2800pci
it will show which modules are loaded. All wireless modules, e.g. mac80211, should be taken from /lib/modules/`uname -r`/updates/ instead of /lib/modules/`uname -r`/kernel/.
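The check described above can be sketched as a small script. This is only an illustration: it assumes the dry-run output consists of `insmod <path>` lines, and the helper name `modules_outside_updates` is made up for this example. Any path it returns is a module that would be loaded from the stock kernel tree rather than from compat-wireless.

```python
def modules_outside_updates(dry_run_output, kernel_release):
    """Return module paths from `modprobe -v -n` output not under updates/.

    dry_run_output: text printed by `modprobe -v -n <module>`
    kernel_release: the string printed by `uname -r`
    """
    prefix = "/lib/modules/{}/updates/".format(kernel_release)
    stray = []
    for line in dry_run_output.splitlines():
        parts = line.split()
        # Only "insmod <path> [options]" lines are of interest here.
        if len(parts) >= 2 and parts[0] == "insmod":
            if not parts[1].startswith(prefix):
                stray.append(parts[1])
    return stray
```

An empty result means every module in the dependency chain comes from the updates/ directory; a non-empty result points at the mix of old and new modules suspected in the comment above.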
After removing the modules, I tried inserting them as a regular user, and the error indicated that the modules were being installed from the updates folder. I managed to compile rt-2860 from FC15. It works much better, but it is not an ideal driver: problems occur on the order of once a week. I would prefer using open-source drivers rather than this one.
Ok, please try this kernel (when it finishes compiling); it includes the patch from comment 6: http://koji.fedoraproject.org/koji/taskinfo?taskID=3475656
Err, it failed to compile, I'll fix that tomorrow.
Hopefully this one will compile: http://koji.fedoraproject.org/koji/taskinfo?taskID=3477950
Let us know how the above kernel works. The patch included in the kernel is not a fix but a workaround, but if it works, we will more or less know how to fix the problem.
Sorry for the delay! I was busy for a few days. How do I download or compile that kernel? I even tried installing koji on my system, but it did not help!
The kernel build I did is now deleted; these scratch builds are kept only a week or so. I'll rebuild it.
Here is the new build: http://koji.fedoraproject.org/koji/taskinfo?taskID=3507412 It's 2.6.41-rc1 kernel, but I hope it will work fine.
Thanks, this kernel is working great, so far better than the non-open-source driver. I guess I need to wait a few more days to see if it is really stable. BTW, is there any place I can vote for this?
(In reply to comment #17)
> Thanks this kernel is working great, so far better than non open source
> driver.
This kernel contains a patch which is only a workaround. We cannot apply it, as it would break other devices that could possibly share an interrupt line with the rt2800pci device. We need to find out how to read the interrupt status on your hardware. I tried to read the driver sources from the Ralink site and could not find such information. Ivo, Gertjan, do you have any hints?
> BTW is there any place I can vote for this?
Not sure what you mean?
> > BTW is there any place I can vote for this?
> Not sure what you mean?
When there is an update for Fedora, it is first pushed to updates-testing and people usually vote if they are happy with the update. I was asking if there is anything similar for patches. Now I guess not!
Well, this does not completely fix the issue, but it reduces the problem significantly. With the ra-driver the issue happens once every 2-3 days, with the normal kernel every 10 minutes, and with this patch once after 3 days of heavy usage. That is satisfactory for me!
(In reply to comment #18)
> This kernel contains a patch, which is only a workaround. We can not apply it
> as it would break other devices that could possibly share interrupt line with
> rt2800pci device.
Actually we can apply it. I thought the patch could break shared interrupts, but that's not true: we can return IRQ_HANDLED, and the interrupt routines of other devices that share the IRQ line will still be called.
> We need to find out how to read interrupt status on your
> hardware. I tried to read sources of driver from ralink site, and could not
> find such information. Ivo, Gertjan, do you have any hints?
I found that the vendor driver just returns IRQ_HANDLED, so we should too.
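To make the shared-interrupt argument concrete, here is a toy Python model of one shared IRQ line. It is not kernel code, and the class `IrqLine` is invented for this sketch. The default thresholds (check every 100000 interrupts, disable when more than 99900 of them were unhandled) are what I recall from the kernel's spurious-interrupt detector and should be treated as an assumption; the real detector also takes timing into account, which is left out here.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1


class IrqLine:
    """Toy model of a shared interrupt line with a spurious-IRQ detector."""

    def __init__(self, handlers, window=100_000, max_unhandled=99_900):
        self.handlers = handlers          # one callable per device on the line
        self.window = window              # interrupts per detection window
        self.max_unhandled = max_unhandled
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def fire(self):
        """Deliver one interrupt to every handler on the line."""
        if self.disabled:
            return
        # All handlers run, regardless of what the others return.
        results = [handler() for handler in self.handlers]
        self.count += 1
        if IRQ_HANDLED not in results:
            self.unhandled += 1           # "nobody cared" about this one
        if self.count >= self.window:
            if self.unhandled > self.max_unhandled:
                self.disabled = True      # corresponds to "Disabling IRQ #17"
            self.count = 0
            self.unhandled = 0
```

In this model, a wireless handler that answers IRQ_NONE to interrupts it cannot explain gets the whole line disabled once the window fills up, while one that answers IRQ_HANDLED keeps the line alive, and the other handler on the line is still invoked on every interrupt either way.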
The patch was rejected as an incorrect fix, so moving back to assigned.
Can you please build that kernel once more? I am getting the feeling that something else in that version (not this patch), or maybe a combination of this patch and the previous kernel, fixed the issue. For the last few days I have been using a 3.1 kernel with that patch applied, and the problem is not fixed on this kernel.
I faced some new errors on the 3.14 kernel:
[ 1802.202425] rt2800pci 0000:02:00.0: PCI INT A disabled
[ 1807.421891] rt2800pci 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 1807.421901] rt2800pci 0000:02:00.0: setting latency timer to 64
[ 1807.429593] ieee80211 phy2: Selected rate control algorithm 'minstrel_ht'
[ 1807.429762] Registered led device: rt2800pci-phy2::radio
[ 1807.429775] Registered led device: rt2800pci-phy2::assoc
[ 1807.429793] Registered led device: rt2800pci-phy2::quality
[ 1808.553043] phy2 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy, aborting.
[ 1808.553058] phy2 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
After these errors the device was completely down and could not connect by any means.
Created attachment 550937
helmut.patch
Here is a kernel with Helmut's patch (still compiling): http://koji.fedoraproject.org/koji/taskinfo?taskID=3622159
Amir, does it fix the problem, or does it maybe make things worse?
Maybe worse! Does not fix it at least.
Created attachment 551891
rt2800pci_zero_interrupts.patch
Another patch to test ...
Could you build a kernel on koji with this patch please?
Uhh, forgot that. I launched a build here: http://koji.fedoraproject.org/koji/taskinfo?taskID=3639199
Created attachment 552146
rt2800pci_zero_interrupts_3.1.patch
Backport of the previous patch to the 3.1 kernel.
Amir, while you're testing, please check if there is a warning similar to the one in comment 2.
Again the speed hardly rose above 200k! There is no error message in dmesg. The only output in /var/log/messages is this, which seems to be a load-time message:
Jan 11 21:51:46 amir-client kernel: [ 17.367348] rt2800pci 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
If you could make it verbose, that would be nice. I also noticed that the firmware from the kernel-firmware package differs from RaLink's own firmware (I tested them both). That driver also messes up a lot, and every time that happens this error is generated (the numbers change each time). I thought these might give a hint about what is going on.
Jan 12 01:22:29 amir-client kernel: [ 7164.989117] Rcv Wcid(1) AddBAReq
Jan 12 01:22:29 amir-client kernel: [ 7164.989125] Start Seq = 00000076
Jan 12 01:22:29 amir-client kernel: [ 7165.465124] Rcv Wcid(1) AddBAReq
Jan 12 01:22:29 amir-client kernel: [ 7165.465131] Start Seq = 000005d9
Jan 12 01:22:30 amir-client kernel: [ 7166.141133] Rcv Wcid(1) AddBAReq
Jan 12 01:22:30 amir-client kernel: [ 7166.141140] Start Seq = 000009a4
Jan 12 01:22:30 amir-client kernel: [ 7166.647587] Rcv Wcid(1) AddBAReq
Jan 12 01:22:30 amir-client kernel: [ 7166.647594] Start Seq = 00000ecc
Jan 12 01:23:30 amir-client kernel: [ 7226.303458] Rcv Wcid(1) AddBAReq
Jan 12 01:23:30 amir-client kernel: [ 7226.303466] Start Seq = 000000cc
Jan 12 01:23:30 amir-client kernel: [ 7226.452453] Rcv Wcid(1) AddBAReq
Jan 12 01:23:30 amir-client kernel: [ 7226.452461] Start Seq = 00000111
Jan 12 01:24:28 amir-client kernel: [ 7284.284007] Rcv Wcid(1) AddBAReq
Jan 12 01:24:28 amir-client kernel: [ 7284.284015] Start Seq = 000009e6
Jan 12 01:24:28 amir-client kernel: [ 7284.366924] Rcv Wcid(1) AddBAReq
Jan 12 01:24:28 amir-client kernel: [ 7284.366927] Start Seq = 0000005b
Jan 12 01:24:29 amir-client kernel: [ 7285.100099] ===>rt_ioctl_giwscan. 5(5) BSS returned, data->length = 1023
(In reply to comment #32)
> Again speed hardly rose above 200k! There is no error message in dmesg.
Ok, so the patch fixes the spurious interrupt problem.
> Just this is the output from /var/log/messages, it seems to be loading time
> message.
>
> Jan 11 21:51:46 amir-client kernel: [ 17.367348] rt2800pci 0000:02:00.0: PCI
> INT A -> GSI 17 (level, low) -> IRQ 17
Yes, this is a normal message telling which interrupt line is assigned to the device.
> If you could make it verbose that would be nice,
Hmm, I do not understand; what should I do?
> also I noticed the firmware
> from kernel-firmware package differs from the RaLink own firmware (I tested
> them both).
We will need to update the firmware at some point.
> That driver also messes up a lot and every time that happens this
> error is generated (the numbers change each time). I thought these might give a
> hint on what is going on.
>
> Jan 12 01:22:29 amir-client kernel: [ 7164.989117] Rcv Wcid(1) AddBAReq
> Jan 12 01:22:29 amir-client kernel: [ 7164.989125] Start Seq = 00000076
These messages mean that the peer (i.e. the AP) wants to start a Block Ack session: send frames without individual ACKs, and then ACK all of them at once. I do not see any problem here.
Anyway, I assume the patch fixes the problem originally reported here. I will test it on my rt2800pci hardware and post it soon. For any other problems, please open a separate bug report.
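As a rough illustration of the Block Ack idea mentioned above, the sketch below builds the kind of bitmap a receiver could send back to acknowledge a burst of frames in one go. It is a simplified model and not the rt2x00 or mac80211 implementation; the helper names are invented, and the 12-bit sequence space and 64-bit bitmap are the values I understand 802.11n compressed Block Ack to use.

```python
SEQ_MODULO = 4096   # 802.11 sequence numbers are 12 bits wide
BITMAP_BITS = 64    # bitmap size assumed for a compressed Block Ack


def block_ack_bitmap(start_seq, received):
    """Set bit i when frame (start_seq + i) was received."""
    bitmap = 0
    for seq in received:
        offset = (seq - start_seq) % SEQ_MODULO   # handles wrap-around
        if offset < BITMAP_BITS:
            bitmap |= 1 << offset
    return bitmap


def missing_frames(start_seq, count, bitmap):
    """Sequence numbers in the burst that the bitmap does not acknowledge."""
    return [
        (start_seq + i) % SEQ_MODULO
        for i in range(count)
        if not ((bitmap >> i) & 1)
    ]
```

With a starting sequence of 0x076, as in the first log line quoted above, receiving frames 0x076, 0x077 and 0x079 gives a bitmap with bits 0, 1 and 3 set, so the sender only has to retransmit 0x078.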
After the upgrade to 3.2.2-1.fc16.x86_64 the module stopped working entirely! Here is the message log:
[ 118.884618] rt2800pci 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 118.884639] rt2800pci 0000:02:00.0: setting latency timer to 64
[ 118.894133] phy1 -> rt2800_init_eeprom: Error - Invalid RF chipset 0x0 detected.
[ 118.894142] phy1 -> rt2x00lib_probe_dev: Error - Failed to allocate device.
[ 118.894190] rt2800pci 0000:02:00.0: PCI INT A disabled
As per message on the linux-wireless mailing list, this is an issue with the fedora kernel build. See http://marc.info/?l=linux-wireless&m=132794937324423&w=2 for details.
Yes, bug 785393. This should be fixed in the latest builds:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3749626
If you get MCU request failures, try this one:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3751893
kernel-3.2.3-2.fc16 includes the spurious interrupt fix and also the fix for bug 785393, so I am closing this bug report.
The issue has been greatly reduced; maybe the remaining problems are not specific to this driver. I see this error while rebooting. It seems to be a firmware bug, so it may be worth a look:
[Firmware Bug] CPU 1 try to use APLC500(LVT offset 0) for vector 0x400, ther register is already in use in vector 0xfg on an other CPU
(the CPU number changes)
Yes, this is not rt2x00 related for sure. According to the message this is a firmware issue, so if it causes trouble for you, ask the BIOS vendor for support, or maybe try updating the BIOS first.