Bug 983738 - iwl4965 needs better rx allocation
Summary: iwl4965 needs better rx allocation
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 22
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-07-11 20:17 UTC by maverick.pt
Modified: 2016-07-19 18:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-19 18:58:26 UTC
Type: Bug
Embargoed:


Attachments
dmesg after issue happened (77.60 KB, text/plain)
2013-07-15 21:16 UTC, maverick.pt
/var/log/kernel (6.30 MB, application/x-bzip)
2013-08-23 10:45 UTC, maverick.pt
kernel error log when card is powered on (even without trying to connect to any networks) (5.34 KB, text/plain)
2013-10-02 08:37 UTC, Hedayat Vatankhah

Description maverick.pt 2013-07-11 20:17:06 UTC
Hi,

I've noticed that I sometimes lose the connection for a few seconds on my laptop when connected over wireless.

I checked the logs and saw these errors:

Jul 11 21:03:51 ghost kernel: [ 1021.077890] iwl4965 0000:07:00.0: Microcode SW error detected.  Restarting 0x2000000.
Jul 11 21:03:51 ghost kernel: [ 1021.077925] iwl4965 0000:07:00.0: Loaded firmware version: 228.61.2.24
Jul 11 21:03:51 ghost kernel: [ 1021.077948] iwl4965 0000:07:00.0: Start IWL Error Log Dump:
Jul 11 21:03:51 ghost kernel: [ 1021.077958] iwl4965 0000:07:00.0: Status: 0x000213E4, count: 5
Jul 11 21:03:51 ghost kernel: [ 1021.078104] iwl4965 0000:07:00.0: Desc                                  Time       data1      data2      line
Jul 11 21:03:51 ghost kernel: [ 1021.078117] iwl4965 0000:07:00.0: FH49_ERROR                   (0x000C) 3639898623 0x00000008 0x02530000 208
Jul 11 21:03:51 ghost kernel: [ 1021.078125] iwl4965 0000:07:00.0: pc      blink1  blink2  ilink1  ilink2  hcmd
Jul 11 21:03:51 ghost kernel: [ 1021.078135] iwl4965 0000:07:00.0: 0x0046C 0x04780 0x004C2 0x006DE 0x04800 0x794001C
Jul 11 21:03:51 ghost kernel: [ 1021.078144] iwl4965 0000:07:00.0: FH register values:
Jul 11 21:03:51 ghost kernel: [ 1021.078164] iwl4965 0000:07:00.0:       FH49_RSCSR_CHNL0_STTS_WPTR_REG: 0X12776a00
Jul 11 21:03:51 ghost kernel: [ 1021.078184] iwl4965 0000:07:00.0:      FH49_RSCSR_CHNL0_RBDCB_BASE_REG: 0X01277690
Jul 11 21:03:51 ghost kernel: [ 1021.078204] iwl4965 0000:07:00.0:                FH49_RSCSR_CHNL0_WPTR: 0X00000000
Jul 11 21:03:51 ghost kernel: [ 1021.078224] iwl4965 0000:07:00.0:       FH49_MEM_RCSR_CHNL0_CONFIG_REG: 0X00819000
Jul 11 21:03:51 ghost kernel: [ 1021.078243] iwl4965 0000:07:00.0:        FH49_MEM_RSSR_SHARED_CTRL_REG: 0X0000003c
Jul 11 21:03:51 ghost kernel: [ 1021.078263] iwl4965 0000:07:00.0:          FH49_MEM_RSSR_RX_STATUS_REG: 0X02530000
Jul 11 21:03:51 ghost kernel: [ 1021.078282] iwl4965 0000:07:00.0:   FH49_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000
Jul 11 21:03:51 ghost kernel: [ 1021.078302] iwl4965 0000:07:00.0:              FH49_TSSR_TX_STATUS_REG: 0X07ff0003
Jul 11 21:03:51 ghost kernel: [ 1021.078321] iwl4965 0000:07:00.0:               FH49_TSSR_TX_ERROR_REG: 0X00000000
Jul 11 21:03:51 ghost kernel: [ 1021.078860] iwl4965 0000:07:00.0: Can't stop Rx DMA.
Jul 11 21:03:51 ghost kernel: [ 1021.079731] ieee80211 phy0: Hardware restart was requested
Jul 11 21:03:51 ghost kernel: [ 1021.347298] iwl4965 0000:07:00.0: Stopping AGG while state not ON or starting
Jul 11 21:03:51 ghost kernel: [ 1021.347308] iwl4965 0000:07:00.0: queue number out of range: 0, must be 7 to 14
Jul 11 21:04:06 ghost kernel: [ 1035.550206] iwl4965 0000:07:00.0: Microcode SW error detected.  Restarting 0x82000000.
Jul 11 21:04:06 ghost kernel: [ 1035.550240] iwl4965 0000:07:00.0: Loaded firmware version: 228.61.2.24
Jul 11 21:04:06 ghost kernel: [ 1035.550263] iwl4965 0000:07:00.0: Start IWL Error Log Dump:
Jul 11 21:04:06 ghost kernel: [ 1035.550273] iwl4965 0000:07:00.0: Status: 0x000212E4, count: 5
Jul 11 21:04:06 ghost kernel: [ 1035.550419] iwl4965 0000:07:00.0: Desc                                  Time       data1      data2      line
Jul 11 21:04:06 ghost kernel: [ 1035.550431] iwl4965 0000:07:00.0: FH49_ERROR                   (0x000C) 3654370921 0x00000008 0x02530000 208
Jul 11 21:04:06 ghost kernel: [ 1035.550440] iwl4965 0000:07:00.0: pc      blink1  blink2  ilink1  ilink2  hcmd
Jul 11 21:04:06 ghost kernel: [ 1035.550450] iwl4965 0000:07:00.0: 0x0046C 0x000F2 0x004C2 0x006DE 0x006DE 0x70D001C
Jul 11 21:04:06 ghost kernel: [ 1035.550459] iwl4965 0000:07:00.0: FH register values:
Jul 11 21:04:06 ghost kernel: [ 1035.550480] iwl4965 0000:07:00.0:       FH49_RSCSR_CHNL0_STTS_WPTR_REG: 0X12776a00
Jul 11 21:04:06 ghost kernel: [ 1035.550500] iwl4965 0000:07:00.0:      FH49_RSCSR_CHNL0_RBDCB_BASE_REG: 0X01277690
Jul 11 21:04:06 ghost kernel: [ 1035.550520] iwl4965 0000:07:00.0:                FH49_RSCSR_CHNL0_WPTR: 0X000000c8
Jul 11 21:04:06 ghost kernel: [ 1035.550540] iwl4965 0000:07:00.0:       FH49_MEM_RCSR_CHNL0_CONFIG_REG: 0X00819000
Jul 11 21:04:06 ghost kernel: [ 1035.550559] iwl4965 0000:07:00.0:        FH49_MEM_RSSR_SHARED_CTRL_REG: 0X0000003c
Jul 11 21:04:06 ghost kernel: [ 1035.550579] iwl4965 0000:07:00.0:          FH49_MEM_RSSR_RX_STATUS_REG: 0X02530000
Jul 11 21:04:06 ghost kernel: [ 1035.550598] iwl4965 0000:07:00.0:   FH49_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000
Jul 11 21:04:06 ghost kernel: [ 1035.550618] iwl4965 0000:07:00.0:              FH49_TSSR_TX_STATUS_REG: 0X07ff0003
Jul 11 21:04:06 ghost kernel: [ 1035.550638] iwl4965 0000:07:00.0:               FH49_TSSR_TX_ERROR_REG: 0X00000000
Jul 11 21:04:06 ghost kernel: [ 1035.551175] iwl4965 0000:07:00.0: Can't stop Rx DMA.
Jul 11 21:04:06 ghost kernel: [ 1035.552149] ieee80211 phy0: Hardware restart was requested
Jul 11 21:04:06 ghost kernel: [ 1035.817639] iwl4965 0000:07:00.0: Stopping AGG while state not ON or starting
Jul 11 21:04:06 ghost kernel: [ 1035.817646] iwl4965 0000:07:00.0: queue number out of range: 0, must be 7 to 14
Jul 11 21:04:11 ghost systemd: Reloading.
Jul 11 21:04:11 ghost LVM: Logical Volume autoactivation enabled.
Jul 11 21:04:11 ghost LVM: Activation generator successfully completed.
Jul 11 21:04:18 ghost kernel: [ 1047.844373] iwl4965 0000:07:00.0: Microcode SW error detected.  Restarting 0x82000000.
Jul 11 21:04:18 ghost kernel: [ 1047.844409] iwl4965 0000:07:00.0: Loaded firmware version: 228.61.2.24
Jul 11 21:04:18 ghost kernel: [ 1047.844432] iwl4965 0000:07:00.0: Start IWL Error Log Dump:
Jul 11 21:04:18 ghost kernel: [ 1047.844441] iwl4965 0000:07:00.0: Status: 0x000212E4, count: 5
Jul 11 21:04:18 ghost kernel: [ 1047.844589] iwl4965 0000:07:00.0: Desc                                  Time       data1      data2      line
Jul 11 21:04:18 ghost kernel: [ 1047.844602] iwl4965 0000:07:00.0: FH49_ERROR                   (0x000C) 3666664990 0x00000008 0x02530000 208
Jul 11 21:04:18 ghost kernel: [ 1047.844610] iwl4965 0000:07:00.0: pc      blink1  blink2  ilink1  ilink2  hcmd
Jul 11 21:04:18 ghost kernel: [ 1047.844621] iwl4965 0000:07:00.0: 0x0046C 0x0A8D0 0x004C2 0x006DE 0x0A94A 0x261001C
Jul 11 21:04:18 ghost kernel: [ 1047.844630] iwl4965 0000:07:00.0: FH register values:
Jul 11 21:04:18 ghost kernel: [ 1047.844652] iwl4965 0000:07:00.0:       FH49_RSCSR_CHNL0_STTS_WPTR_REG: 0X12776a00
Jul 11 21:04:18 ghost kernel: [ 1047.844672] iwl4965 0000:07:00.0:      FH49_RSCSR_CHNL0_RBDCB_BASE_REG: 0X01277690
Jul 11 21:04:18 ghost kernel: [ 1047.844692] iwl4965 0000:07:00.0:                FH49_RSCSR_CHNL0_WPTR: 0X00000000
Jul 11 21:04:18 ghost kernel: [ 1047.844712] iwl4965 0000:07:00.0:       FH49_MEM_RCSR_CHNL0_CONFIG_REG: 0X00819000
Jul 11 21:04:18 ghost kernel: [ 1047.844732] iwl4965 0000:07:00.0:        FH49_MEM_RSSR_SHARED_CTRL_REG: 0X0000003c
Jul 11 21:04:18 ghost kernel: [ 1047.844751] iwl4965 0000:07:00.0:          FH49_MEM_RSSR_RX_STATUS_REG: 0X02530000
Jul 11 21:04:18 ghost kernel: [ 1047.844771] iwl4965 0000:07:00.0:   FH49_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000
Jul 11 21:04:18 ghost kernel: [ 1047.844790] iwl4965 0000:07:00.0:              FH49_TSSR_TX_STATUS_REG: 0X07ff0002
Jul 11 21:04:18 ghost kernel: [ 1047.844809] iwl4965 0000:07:00.0:               FH49_TSSR_TX_ERROR_REG: 0X00000000
Jul 11 21:04:18 ghost kernel: [ 1047.846010] iwl4965 0000:07:00.0: Can't stop Rx DMA.
Jul 11 21:04:18 ghost kernel: [ 1047.847338] ieee80211 phy0: Hardware restart was requested

----------

Fedora 19 (x86_64)
Kernel 3.9.9-301.fc19.x86_64
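
To see how often the firmware restarts, a log like the excerpt above can be summarized with two small shell helpers. This is only a sketch: the function names count_fw_restarts and list_fw_restart_times are made up for illustration, and the patterns assume the exact message text shown in this report.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any tool): both read a kernel log on stdin.

# Print the number of firmware restarts reported by iwl4965.
count_fw_restarts() {
    # grep -c prints the match count but exits 1 when the count is 0,
    # so the status is masked to keep callers running under `set -e` alive.
    grep -c 'Microcode SW error detected' || true
}

# Print the kernel timestamp (seconds since boot) of each restart, one per line.
list_fw_restart_times() {
    sed -n 's/.*\[ *\([0-9.]*\)\] iwl4965 .*Microcode SW error detected.*/\1/p'
}
```

For example, `dmesg | count_fw_restarts` counts the restarts since boot, and `list_fw_restart_times < dmesg.txt` shows when they happened.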

Comment 1 Stanislaw Gruszka 2013-07-15 12:04:36 UTC
When the issue happens, please run "dmesg > dmesg.txt" and attach the dmesg.txt file here.

Comment 2 maverick.pt 2013-07-15 21:16:07 UTC
Created attachment 773906
dmesg after issue happened

I was running a speed test using: http://www.speedtest.net/

Comment 3 Stanislaw Gruszka 2013-07-26 10:08:13 UTC
Unfortunately, I cannot find the cause of the problem from dmesg alone. Please provide a log with verbose debugging enabled, as follows:

* Add the line below to /etc/rsyslog.conf:
kern.*                                                  /var/log/kernel

* Restart the syslog daemon:
systemctl restart rsyslog.service

* Reload the module with debugging enabled:
modprobe -r iwl4965
echo > /var/log/kernel
modprobe iwl4965 debug=0x47ffffff

* Reproduce the problem.

* Unload the module:
modprobe -r iwl4965

* Attach the compressed /var/log/kernel file.
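
The reload and unload steps above could be wrapped in a small script along these lines. This is a sketch under assumptions, not an official procedure: the function names and the DRY_RUN switch are illustrative, truncate is used in place of "echo >" to empty the log, bzip2 is just one possible compressor, and the rsyslog line is assumed to be in place already with rsyslog restarted.

```shell
#!/bin/sh
# Hypothetical wrapper around the debug-capture steps; run as root.
LOG=/var/log/kernel

run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_iwl4965_debug() {
    run modprobe -r iwl4965                 # unload the driver
    run truncate -s 0 "$LOG"                # start from an empty log file
    run modprobe iwl4965 debug=0x47ffffff   # reload with verbose debugging
}

stop_iwl4965_debug() {
    run modprobe -r iwl4965                 # unload after reproducing the problem
    run bzip2 -k "$LOG"                     # writes $LOG.bz2 and keeps the original
}
```

Call start_iwl4965_debug, reproduce the problem, then call stop_iwl4965_debug and attach the resulting /var/log/kernel.bz2. Setting DRY_RUN=1 first prints the commands without running them.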

Comment 4 Stanislaw Gruszka 2013-08-23 07:44:35 UTC
Closing due to lack of response ...

Comment 5 maverick.pt 2013-08-23 10:45:20 UTC
Created attachment 789549
/var/log/kernel

Comment 6 maverick.pt 2013-08-23 10:46:44 UTC
Sorry, I've been on vacation.

I've reproduced the problem with debug activated.

I've attached the /var/log/kernel.

Comment 7 Josh Boyer 2013-09-18 20:38:07 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.11.1-200.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 8 maverick.pt 2013-09-23 19:57:58 UTC
The problem persists on 3.11.1-200.fc19.

Comment 9 Hedayat Vatankhah 2013-10-02 08:06:36 UTC
Today I also hit a severe issue with my iwl4965 card: it doesn't connect to any encrypted networks (but does connect to unencrypted ones). However, it generates error messages even in that case.

Comment 10 Hedayat Vatankhah 2013-10-02 08:37:33 UTC
Created attachment 806296
kernel error log when card is powered on (even without trying to connect to any networks)

Comment 11 Hedayat Vatankhah 2013-10-05 06:21:16 UTC
Surprisingly, while I got this error that day and my wireless card didn't work even after reboots (even in other OSes!), I don't get it today. :P

Comment 12 Stanislaw Gruszka 2013-10-15 10:06:04 UTC
Hedayat, it seems that your device (hardware) simply failed.

mvrk, your problem is related to memory allocation for RX packets. The firmware expects buffers to always be available for RX. This can be fixed, but it requires a major rewrite of the RX path in the 4965 driver. I'm going to do this, but it may take some time.

Comment 13 Hedayat Vatankhah 2013-10-15 10:13:42 UTC
Thanks. Fortunately, the device has been working properly since then. Do you have any idea why this happens (and whether I should do something to prevent my WiFi card from being damaged)?

Comment 14 Justin M. Forbes 2014-01-03 22:11:47 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 15 Stanislaw Gruszka 2014-03-05 07:09:47 UTC
mvrk, are you still using F-19 (or maybe F-20) with iwl4965? I have patches for the iwl4965 RX allocation problems and can provide you with a test kernel.

Comment 16 maverick.pt 2014-03-05 09:22:01 UTC
Hi,

Yes, I'm still using F-19.

Comment 17 Stanislaw Gruszka 2014-03-06 13:16:26 UTC
OK, here is a kernel with the patches; please test it:
http://koji.fedoraproject.org/koji/taskinfo?taskID=6600205

Comment 18 maverick.pt 2014-03-10 17:59:19 UTC
Hi,

I've tested this new kernel, but the problem persists.

Comment 19 Stanislaw Gruszka 2014-03-17 13:08:35 UTC
I'm quite sure the microcode errors happen because of some delay, but I'm not sure where the delay is. Perhaps the interrupt line is shared with some other driver and interrupts are disabled for a longer period, so we cannot handle 4965 interrupts in a timely manner? What does /proc/interrupts show?

Comment 20 maverick.pt 2014-03-18 00:07:42 UTC
cat /proc/interrupts 
           CPU0       CPU1       
  0:   51928458    2108307   IO-APIC-edge      timer
  1:      33540       2766   IO-APIC-edge      i8042
  8:         11         16   IO-APIC-edge      rtc0
  9:     683262      64979   IO-APIC-fasteoi   acpi
 12:     323344        796   IO-APIC-edge      i8042
 14:     279294      26227   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:          9          1   IO-APIC-fasteoi   uhci_hcd:usb3, firewire_ohci
 17:          0          0   IO-APIC-fasteoi   r592, mmc0
 18:    1950908      54516   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb6
 21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 23:     419125     469228   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb5
 40:          0          0   PCI-MSI-edge      PCIe PME
 41:          0          0   PCI-MSI-edge      PCIe PME
 42:          0          0   PCI-MSI-edge      PCIe PME
 43:          0          0   PCI-MSI-edge      PCIe PME, pciehp
 44:          0          0   PCI-MSI-edge      PCIe PME, pciehp
 45:          0          0   PCI-MSI-edge      PCIe PME
 46:    3465088     211738   PCI-MSI-edge      ahci
 47:     950123    5074573   PCI-MSI-edge      i915
 48:        858        931   PCI-MSI-edge      snd_hda_intel
 49:   13796876         24   PCI-MSI-edge      iwl4965
 50:          0          0   PCI-MSI-edge      em1
NMI:         12         11   Non-maskable interrupts
LOC:  174902363  199756584   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:         12         11   Performance monitoring interrupts
IWI:    4692117    3272514   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:   13569116   16729210   Rescheduling interrupts
CAL:      64799     187718   Function call interrupts
TLB:    2293582    2468658   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:       1026       1023   Machine check polls
ERR:          0
MIS:          0

Comment 21 Stanislaw Gruszka 2014-03-18 08:41:08 UTC
The 4965 interrupt is not shared, so I still have to figure out where the delay comes from ...
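
For reference, the "not shared" reading can be checked mechanically: in /proc/interrupts, handlers that share a line appear as a comma-separated list at the end of the row (for example "uhci_hcd:usb3, firewire_ohci" on IRQ 16), while iwl4965 sits alone on IRQ 49. A rough sketch follows; the function name irq_sharing is hypothetical, and the comma test is a heuristic based on that layout.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/interrupts-style text on stdin and reports
# whether the IRQ line used by the given handler name is shared.
irq_sharing() {
    awk -v drv="$1" '
        # Only numbered IRQ rows; skips the CPU header and NMI/LOC/... rows.
        $1 ~ /^[0-9]+:$/ {
            hit = 0
            for (i = 2; i <= NF; i++) {
                tok = $i
                sub(/,$/, "", tok)          # drop the list separator, if any
                if (tok == drv) hit = 1
            }
            if (hit) {
                irq = $1
                sub(/:$/, "", irq)
                # Heuristic: a comma in the row means more than one handler.
                state = (index($0, ",") > 0) ? "shared" : "not shared"
                print "IRQ " irq ": " drv " " state
            }
        }'
}
```

Usage: `irq_sharing iwl4965 < /proc/interrupts`.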

Comment 22 Jaroslav Reznik 2015-03-03 16:53:56 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 23 Justin M. Forbes 2015-10-20 19:43:30 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs.

Fedora 22 has now been rebased to 4.2.3-200.fc22.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23.

If you experience different issues, please open a new bug report for those.

Comment 24 maverick.pt 2015-10-21 22:31:03 UTC
Still happens on F22 (4.2.3-200.fc22.x86_64)

Comment 25 Fedora End Of Life 2016-07-19 18:58:26 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

