Bug 666646 - iwlagn Hard-Lock
Summary: iwlagn Hard-Lock
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: x86_64
OS: Linux
Priority: low
Severity: urgent
Target Milestone: ---
Assignee: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 714547
 
Reported: 2011-01-01 20:48 UTC by James Cape
Modified: 2011-08-23 04:37 UTC

Fixed In Version: kernel-2.6.35.14-95.fc14
Clone Of:
: 714547 (view as bug list)
Environment:
Last Closed: 2011-05-19 05:10:57 UTC
Type: ---
Embargoed:



Description James Cape 2011-01-01 20:48:16 UTC
Description of problem:

Roughly every 5-20 minutes, my Dell Adamo 13 (black) will hard lock.

The last messages in the syslog before the reboot (consistently) are:

Jan  1 14:03:33 emma kernel: [  350.187460] iwlagn 0000:04:00.0: BA scd_flow 0 does not match txq_id 10
Jan  1 14:03:35 emma kernel: [  352.917520] iwlagn 0000:04:00.0: low ack count detected, restart firmware
Jan  1 14:03:35 emma kernel: [  352.917531] iwlagn 0000:04:00.0: On demand firmware reload
Jan  1 14:03:35 emma kernel: [  352.972646] iwlagn 0000:04:00.0: Stopping AGG while state not ON or starting
Jan  1 14:03:35 emma kernel: [  352.972656] iwlagn 0000:04:00.0: queue number out of range: 0, must be 10 to 19
Jan  1 14:04:42 emma kernel: imklog 4.6.3, log source = /proc/kmsg started.
[boot continues as normal]

Version-Release number of selected component (if applicable):

2.6.35.10-74.fc14.x86_64

How reproducible:

Inconsistent. I believe, without better evidence, that it's a faulty response to some kind of palpable change in the wifi environment. When it happens, it always has the same or similar log messages (sometimes the "AGG" line is cut off halfway).

I'm willing to run a kdump kernel until I have a legit trace for this (having to randomly reboot my primary machine every 5 minutes makes this a priority for me), but will need instructions.


Steps to Reproduce:
1. Boot System
2. Wait.
3. Randomly lose the code you were working on for the last couple minutes.


Actual results:

Crashy fun time.


Expected results:

Boring work.


Additional info:

I'm a Leo.

Wait, did you mean additional info about the problem? In that case, if audio is playing, it replays the last couple seconds of buffer on a loop until I hold the power button for 7 seconds to force the machine off. Audio is not always playing, however.

Comment 1 Stanislaw Gruszka 2011-01-03 09:43:08 UTC
There are two patches that may help:
https://bugzilla.kernel.org/attachment.cgi?id=38502
http://marc.info/?l=linux-wireless&m=129310430012942&w=2

I will prepare a test kernel with them.

Comment 2 Stanislaw Gruszka 2011-01-03 15:33:57 UTC
Please test both of these kernels and report your results:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2697824
http://koji.fedoraproject.org/koji/taskinfo?taskID=2698228

Comment 3 Kari Hautio 2011-01-05 11:27:44 UTC
I'm affected by the same problem; installing the kernel from http://koji.fedoraproject.org/koji/taskinfo?taskID=2698228 now.

Comment 4 Kari Hautio 2011-01-05 11:42:39 UTC
This kernel fixes the problem for me (didn't test the other build).

[khautio@kha ~]$ uname -a
Linux kha 2.6.35.10-75.irq.fc14.i686 #1 SMP Mon Jan 3 14:49:06 UTC 2011 i686 i686 i386 GNU/Linux

Comment 5 James Cape 2011-01-05 12:17:53 UTC
The irq build didn't fix my issue; I'm trying the low_ack kernel now.

Jan  4 16:01:07 emma kernel: [    0.000000] Linux version 2.6.35.10-75.irq.fc14.x86_64 (mockbuild.fedoraproject.org) (gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) ) #1 SMP Mon Jan 3 14:34:56 UTC 2011
[...]
Jan  4 17:37:37 emma kernel: [ 5810.255040] iwlagn 0000:04:00.0: iwlagn_tx_agg_start on ra = 00:25:9c:d2:4d:a0 tid = 0
Jan  4 17:37:44 emma kernel: [ 5816.994056] iwlagn 0000:04:00.0: BA scd_flow 0 does not match txq_id 10
Jan  4 17:37:45 emma kernel: [ 5818.095818] iwlagn 0000:04:00.0: BA scd_flow 0 does not match txq_id 10
Jan  4 17:37:46 emma kernel: [ 5819.064689] iwlagn 0000:04:00.0: low ack count detected, restart firmware
Jan  4 17:37:46 emma kernel: [ 5819.064701] iwlagn 0000:04:00.0: On demand firmware reload
Jan  4 17:37:46 emma kernel: [ 5819.119974] iwlagn 0000:04:00.0: Stopping AGG while state not ON or starting
Jan  4 17:37:46 emma kernel: [ 5819.119986] iwlagn 0000:04:00.0: queue number out of range: 0, must be 10 to 19
Jan  4 17:38:01 emma kernel: [ 5834.658906] iwlagn 0000:04:00.0: iwlagn_tx_agg_start on ra = 00:25:9c:d2:4d:a0 tid = 0
Jan  4 17:38:04 emma kernel: [ 5837.085846] iwlagn 0000:04:00.0: low ack count detected, restart firmware
Jan  4 17:38:04 emma kernel: [ 5837.085853] iwlagn 0000:04:00.0: On demand firmware reload
Jan  4 17:38:04 emma kernel: [ 5837.136528] iwlagn 0000:04:00.0: Stopping AGG while state not ON or starting
Jan  4 17:38:04 emma kernel: [ 5837.136535] iwlagn 0000:04:00.0: queue number out of range: 0, must be 10 to 19
Jan  4 17:42:51 emma kernel: imklog 4.6.3, log source = /proc/kmsg started.
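
The pattern in the excerpt above (firmware restarts, then a fresh "imklog ... started" line with no uptime stamp, which marks the next boot) can be picked out of a saved syslog mechanically. The following is only a minimal sketch, assuming the line layout shown in this report; the function name `summarize` and the two marker constants are made up for illustration and are not part of any tool mentioned here.

```python
import re

# Assumed layout, taken from the excerpts in this report:
#   Jan  4 17:37:46 emma kernel: [ 5819.064689] iwlagn 0000:04:00.0: <message>
# The "[ uptime ]" part is optional, because the imklog startup line lacks it.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?:\[\s*(?P<uptime>\d+\.\d+)\]\s+)?(?P<msg>.*)$"
)

# Hypothetical marker names; the strings themselves come from the logs above.
RESTART_MARK = "low ack count detected, restart firmware"
BOOT_MARK = "imklog"


def summarize(lines):
    """Return (firmware_restart_count, last_kernel_message_before_each_boot)."""
    restarts = 0
    last_before_boot = []
    previous = None
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # skip elisions such as "[...]" and non-kernel lines
        msg = m.group("msg")
        if RESTART_MARK in msg:
            restarts += 1
        # A kernel-tagged imklog line without an uptime stamp is treated
        # as the start of a new boot.
        if m.group("uptime") is None and msg.startswith(BOOT_MARK):
            if previous is not None:
                last_before_boot.append(previous)
        previous = msg
    return restarts, last_before_boot
```

Fed the lines of a saved /var/log/messages (for example via `summarize(open(path))`, where `path` is wherever the log was copied), this returns how many firmware restarts were logged and the final kernel message seen before each reboot. It does not distinguish a hard lock from a clean reboot, so the result is only a hint about which messages to compare across crashes.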

Comment 6 Stanislaw Gruszka 2011-01-07 09:50:00 UTC
There is a similar report, bug 667459, which indicates that the hard lock problem is in the mac80211 layer. Please test this kernel and report back:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2704610

Comment 7 Stanislaw Gruszka 2011-01-12 12:47:58 UTC
James, any news on comment 6? You can also try the official build with the patch, if you wish:
http://koji.fedoraproject.org/koji/buildinfo?buildID=213595

Comment 8 James Cape 2011-01-12 13:16:29 UTC
I haven't tested the comment 6 build yet (unfortunately, it will be another week before I can), but I did see the same problem with the low_ack kernel.

Comment 9 Stanislaw Gruszka 2011-01-12 13:48:53 UTC
Please save the packages you need from comment 6, as koji can remove these files automatically. With the low_ack kernel, at least the "low ack count detected, restart firmware" message should be gone.

Comment 10 James Cape 2011-01-12 13:55:43 UTC
Got it, thanks.

Comment 11 Kari Hautio 2011-01-12 14:01:32 UTC
I'm going to test the comment 7 build now.

Comment 12 Stanislaw Gruszka 2011-02-11 13:52:49 UTC
Kari and/or James, could you test the upstream driver with some of my patches:
https://bugzilla.redhat.com/show_bug.cgi?id=648732#c21

Comment 13 Stanislaw Gruszka 2011-05-09 19:28:53 UTC
Posted to fedora and stable.

http://lists.fedoraproject.org/pipermail/kernel/2011-May/003091.html

Comment 14 Fedora Update System 2011-05-16 04:46:11 UTC
kernel-2.6.38.6-27.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.38.6-27.fc15

Comment 15 Fedora Update System 2011-05-17 05:36:54 UTC
Package kernel-2.6.38.6-27.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-2.6.38.6-27.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/kernel-2.6.38.6-27.fc15
then log in and leave karma (feedback).

Comment 16 Fedora Update System 2011-05-19 05:10:45 UTC
kernel-2.6.38.6-27.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2011-08-17 17:38:29 UTC
kernel-2.6.35.14-95.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.14-95.fc14

Comment 18 Fedora Update System 2011-08-23 04:36:47 UTC
kernel-2.6.35.14-95.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

