Bug 648732

Summary: Intel wireless broken on 11n for many users
Product: [Fedora] Fedora Reporter: Pekka Pietikäinen <pp>
Component: kernel Assignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 14 CC: acc-bugz-redhat, alex, arnaud.kleinveld, awilliam, benny+bugzilla, bos, boydjd, cbm, christopher.meiklejohn, chunnayya, dkovalsk, dsikora, extras-orphan, gansalmon, gholms, ilw, itamar, jguenther, jhenner, jisakiel, johannbg, joh, jonathan, joshua.bakerlepain, jvcelak, kernel-maint, madhu.chinakonda, mattdm, mdavis, mglantz, mishu, mkranz, mwringe, niklas.laxstrom+bro, pierre-bugzilla, pmrpla, redhat_bugzilla, sgruszka, the.ridikulus.rat, vendor-redhat, wey-yi.w.guy, woodard, wvega
Target Milestone: --- Keywords: CommonBugs
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: https://fedoraproject.org/wiki/Common_F14_bugs#intel_80211n
Fixed In Version: kernel-2.6.35.14-95.fc14 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 714590 (view as bug list) Environment:
Last Closed: 2011-08-23 00:36:54 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Bug Depends On:    
Bug Blocks: 714590    
Attachments:
Description                  Flags
relevant syslog bits         none
The new horkage.             none
"aggregation" log entries    none

Description Pekka Pietikäinen 2010-11-01 19:41:40 EDT
Many iwlagn users need 11n_disable=1. This is a dupe, there's sooo many f12-14 kernel bugs around this that are currentversion/cantfix/whatnot (#571753, but many many others). Adding yet another bug to get a CommonBugs entry for this in the right place.

Rationale:
http://marc.info/?l=linux-wireless&m=128335670111847&w=2

My x200s has been affected forever. f12 and early f13 were okish (machine usable, but eventually ran into problems); from .34ish or so it started occurring within 15 mins vs. 5 days. Same root cause: uCode is broken, it just gets hidden by older drivers. Some kernel updates have tried to plaster around this, including increasing timeouts, with little success.

So, basically, CANTFIX, but upstream should have a fix soonish, and many open kernel bugs can be made a duplicate of this one, not too bad for an extraneous bug report, eh?
Comment 1 Adam Williamson 2010-11-01 21:00:29 EDT

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 2 Pekka Pietikäinen 2011-01-12 07:38:30 EST
*** Bug 655450 has been marked as a duplicate of this bug. ***
Comment 3 Pekka Pietikäinen 2011-01-12 07:46:57 EST
Later kernels (.36 and .37-2 from rawhide) seem to make the situation a bit better, but I still hit this one (#655450 has some info). 

Symptoms are a bit different, instead of

iwlagn 0000:03:00.0: BA scd_flow 0 does not match txq_id 10 

it's now

Jan 10 21:33:27 laptop kernel: [13998.230626] iwlagn 0000:03:00.0: Received BA when not expected
Jan 10 21:33:34 laptop kernel: [14005.230191] iwlagn 0000:03:00.0: Received BA when not expected
Jan 10 21:33:39 laptop kernel: [14009.534892] iwlagn 0000:03:00.0: queue 10 stuck 3 time. Fw reload.
Jan 10 21:33:39 laptop kernel: [14009.534897] iwlagn 0000:03:00.0: On demand firmware reload
Jan 10 21:33:39 laptop kernel: [14009.676204] iwlagn 0000:03:00.0: Stopping AGG while state not ON or starting
Jan 10 21:33:39 laptop kernel: [14009.676216] iwlagn 0000:03:00.0: queue number out of range: 0, must be 10 to 19

Luckily upstream is making progress:

https://bugzilla.kernel.org/show_bug.cgi?id=16691
http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2214

Second one has a workaround patch (#25) that looks safe to merge as an interim fix, but looks like a real fix might be coming soon anyway.
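
For anyone comparing their own logs against the symptoms quoted in this comment, here is a minimal sketch that tallies them from a kernel log on standard input. The function name is hypothetical; the message strings are copied from the excerpts above, and other kernel versions may word them differently.

```shell
#!/bin/sh
# Tally the iwlagn symptom lines quoted in this report.
# Reads a kernel log on stdin and prints one summary line.
# iwlagn_symptom_counts is a hypothetical helper name.
iwlagn_symptom_counts() {
    awk '
        /iwlagn/ && /Received BA when not expected/ { ba++ }
        /iwlagn/ && /On demand firmware reload/     { reload++ }
        /iwlagn/ && /does not match txq_id/         { scd++ }
        END {
            # Unset counters print as 0 with %d.
            printf "unexpected_ba=%d fw_reload=%d scd_flow_mismatch=%d\n", ba, reload, scd
        }
    '
}
```

Typical use would be something like `dmesg | iwlagn_symptom_counts`, or feeding it a saved copy of /var/log/messages.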
Comment 4 Stanislaw Gruszka 2011-01-12 08:16:40 EST
*** Bug 646082 has been marked as a duplicate of this bug. ***
Comment 5 Stanislaw Gruszka 2011-01-13 04:46:33 EST
Experimental 5000 ucode was just announced
http://marc.info/?l=linux-wireless&m=129485622608672&w=2

Firmware is available here:
http://www.intellinuxwireless.org/?n=experimental

Thanks Wey!
Comment 6 Stanislaw Gruszka 2011-01-13 04:52:46 EST
Note: you probably have to use a newer driver with that firmware. On older kernels an up-to-date iwlagn driver can be installed using compat-wireless:
http://linuxwireless.org/en/users/Download

I'm working on packaging compat-wireless for Fedora; some new packages can be found in koji under my user name:
http://koji.fedoraproject.org/koji/tasks?owner=sgruszka&state=all

Older packages and description are here: 
http://people.redhat.com/sgruszka/compact_wireless.html
Comment 7 Vinod Kutty 2011-01-20 00:41:10 EST
I escalated this with our Intel account team last week, and I'm impressed by and grateful for the quick turnaround (within a couple of days). Thanks folks!

Now, regarding testing this, I'm afraid I've never had to update firmware in this manner before so I'm somewhat ignorant of the options (in FC14 specifically). The Intel instructions suggest a patch + new kernel is necessary to, amongst other things, allow the use of experimental firmware (in this case, iwlwifi-5000-exp.ucode).

However, from reading comment #6, it sounds like I can save time and instead use the appropriate kmod RPM for Fedora. Is this true? If I want to test this on FC14 (2.6.35.10-74.fc14.x86_64), what needs to be done? The packages at http://koji.fedoraproject.org/koji/tasks?owner=sgruszka&state=all seem to be for slightly newer kernels.

I tried http://people.redhat.com/sgruszka/compat-wireless/kmod-2.6.35.10-74.fc14.x86_64-compat-wireless-2.6.37-4.fc14.1.x86_64.rpm

and the new firmware with no luck, so I assume I'm on the wrong track.

Appreciate some pointers in the right direction.
Comment 8 wey-yi.w.guy 2011-01-20 10:16:37 EST
I am not familiar with Fedora packages. I can give it a try, but based on a number of feedback reports, the experimental uCode did help with this 11n issue and has much better performance.

Thanks
Wey
Comment 9 Stanislaw Gruszka 2011-01-20 10:32:36 EST
Sounds good! What is the minimal upstream kernel version needed to run the experimental firmware?
Comment 10 wey-yi.w.guy 2011-01-20 10:40:28 EST
It's described in the experimental release notes. I believe 2.6.35 will do the job.

Thanks
Wey
Comment 11 Stanislaw Gruszka 2011-01-20 11:04:15 EST
2.6.35 and 2.6.36 need additional patches, which are provided in the experimental firmware tarball; 2.6.37 already has them. But the patches are not enough: we also need to turn on the compilation option CONFIG_IWLWIFI_DEBUG_EXPERIMENTAL_UCODE. I will provide a compat-wireless package that works with the experimental firmware soon.
Comment 12 Stanislaw Gruszka 2011-01-21 03:01:16 EST
Here are compat-wireless (2.6.37-4) packages with experimental firmware support enabled:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2733294

After copying the firmware to /lib/firmware and reloading the drivers, I can see the driver load the new microcode:
> [  864.487779] iwlagn 0000:80:00.0: Detected Intel(R) Ultimate N WiFi Link 5300 AGN, REV=0x24
> [  864.519516] iwlagn 0000:80:00.0: device EEPROM VER=0x11f, CALIB=0x4
> [  864.519533] iwlagn 0000:80:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
> [  864.519590] iwlagn 0000:80:00.0: irq 97 for MSI/MSI-X
> [  864.522206] iwlagn 0000:80:00.0: loaded firmware version 8.83.5.1 build 33692 (EXP)

However, the new firmware does not help with the 5GHz N-only network issue I reported here:
http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2275

After a minute the device stops working with these messages:

> [ 1239.763082] wlan4: associated
> [ 1292.365941] iwlagn 0000:80:00.0: iwlagn_tx_agg_start on ra = 00:23:69:35:d1:3f tid = 0
> [ 1309.689780] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1309.694747] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1309.763600] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1309.769565] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1314.382955] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1321.721932] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1321.735569] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1321.740970] iwlagn 0000:80:00.0: Received BA when not expected
> [ 1321.807432] iwlagn 0000:80:00.0: low ack count detected, restart firmware
> [ 1321.807436] iwlagn 0000:80:00.0: On demand firmware reload
> [ 1321.839553] iwlagn 0000:80:00.0: Stopping AGG while state not ON or starting
> [ 1321.839558] iwlagn 0000:80:00.0: queue number out of range: 0, must be 10 to 19
> [ 1325.801903] iwlagn 0000:80:00.0: Aggregation not enabled for tid 0 because load = 3
> [ 1338.908465] iwlagn 0000:80:00.0: Aggregation not enabled for tid 0 because load = 6
> [ 1357.132363] iwlagn 0000:80:00.0: Aggregation not enabled for tid 0 because load = 2
> [ 1369.349116] iwlagn 0000:80:00.0: Aggregation not enabled for tid 0 because load = 3
> [ 1404.019019] iwlagn 0000:80:00.0: Aggregation not enabled for tid 0 because load = 1

I hope other users will have more luck.
Comment 13 Stanislaw Gruszka 2011-01-21 03:05:28 EST
*** Bug 659415 has been marked as a duplicate of this bug. ***
Comment 14 Vinod Kutty 2011-01-24 14:18:33 EST
Thanks for the updated compat-wireless kmod. I wanted to burn this in for a few days of periodic usage before reporting results. I'm also connected to a 5GHz N only access point with my Intel 5300 client chipset/laptop.

There seem to be fewer errors with the new firmware, but I do still have issues with firmware reloads interrupting network connectivity.

> Jan 23 20:42:28 myhost kernel: [101592.562324] iwlagn 0000:0c:00.0: Aggregation not enabled for tid 6 because load = 1
> Jan 23 20:48:55 myhost kernel: [101979.788431] iwlagn 0000:0c:00.0: iwlagn_tx_agg_start on ra = c0:c1:c0:xx:yy:zz tid = 6
> Jan 23 21:09:49 myhost kernel: [103233.444425] iwlagn 0000:0c:00.0: low ack count detected, restart firmware
> Jan 23 21:09:49 myhost kernel: [103233.444436] iwlagn 0000:0c:00.0: On demand firmware reload
> Jan 23 21:09:49 myhost kernel: [103233.490809] iwlagn 0000:0c:00.0: Stopping AGG while state not ON or starting
> Jan 23 21:09:49 myhost kernel: [103233.490822] iwlagn 0000:0c:00.0: queue number out of range: 0, must be 10 to 19
> Jan 23 21:09:49 myhost kernel: [103233.490953] iwlagn 0000:0c:00.0: Stopping AGG while state not ON or starting
> Jan 23 21:09:49 myhost kernel: [103233.490962] iwlagn 0000:0c:00.0: queue number out of range: 0, must be 10 to 19

So, bottom line: this is still not stable.
Comment 15 Joshua Baker-LePain 2011-01-27 17:18:25 EST
To add a dissenting voice, the new microcode and compat-wireless kmod have helped me out immensely.  Stock 2.6.35.10-74 crashed on me all the time with what seemed like bug 667459.  Using kernel-2.6.35.10-77 from that bug cured the crashes, but led to horrible wireless performance, with rather frequent network dropouts.  I had to turn off 802.11n mode to make the machine usable.

I've been using the new microcode and the compat-wireless driver with 2.6.35.10-74 for a few days.  Performance is good with no network dropouts, and I've not had a single firmware reload.  The only kernel messages I see are the "Aggregation not enabled" ones.

Hardware: Thinkpad T400s with Intel 5300 connecting to a 5GHz 802.11n network from a Netgear WNDR3700.

In short, thanks.
Comment 16 Vinod Kutty 2011-01-27 22:30:17 EST
RE: Comment #15

I was trying to find something that would help tickle this. Although I can't be sure, I have tried streaming video from Hulu (via huludesktop for Linux) and that seems to significantly increase the probability of a firmware reload within around 15-20 mins or so of the start of a stream. During this time, there is nothing else really meaningful generating or receiving  traffic on my box.

Any chance you can check if this is the case with your setup?
Comment 17 Joshua Baker-LePain 2011-01-28 13:32:23 EST
I've run several tests since installing the test packages and have yet to be able to trigger a firmware reload.  Hulu doesn't generate a huge amount of traffic for me, as it's throttled by my home DSL connection (6 Mb/s tops).  Still I ran it for a while last night with no issues.

As for harsher tests, I've pushed and pulled several tens of GB via NFSv4 to my home server at speeds of around 10+ MB/s with no issues.  My nightly backups (using amanda on said home server) have also been running without causing any firmware reloads.
Comment 18 wey-yi.w.guy 2011-01-28 13:35:23 EST
great, thank you very much :-)

Wey
Comment 19 wey-yi.w.guy 2011-01-28 15:35:34 EST
(In reply to comment #16)
> RE: Comment #15
> I was trying to find something that would help tickle this. Although I can't be
> sure, I have tried streaming video from Hulu (via huludesktop for Linux) and
> that seems to significantly increase the probability of a firmware reload
> within around 15-20 mins or so of the start of a stream. During this time,
> there is nothing else really meaningful generating or receiving  traffic on my
> box.
> Any chance you can check if this is the case with your setup?

It does look like heavy traffic causes the problem. We have not tried Hulu yet, but we can run a similar streaming video test case and see whether we hit similar firmware behavior.

Thanks
Wey
Comment 20 Neil Underwood 2011-02-04 03:18:02 EST
Just wanna throw this out there, as I've been monitoring this bug because it's been affecting me too.  I'm running an Intel 3400 chipset and my wifi has been constantly falling out.  I assigned a static IP address to my laptop yesterday and I haven't had to reset once since.  For what it's worth....
Comment 21 Stanislaw Gruszka 2011-02-11 08:50:30 EST
Could everyone (both those of you who have 11n working and those who do not) test this compat-wireless package:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2832390
with module option plcp_check=1 and with plcp_check=0, and share the results?
This can be tested with both the old and the new firmware.

Description how to install this package is here:
http://people.redhat.com/sgruszka/compact_wireless.html

Thanks in advance.
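
For testers who have not set a module option before, the request in this comment can be sketched as follows. This is only an illustration: the helper name is hypothetical, and it assumes plcp_check is a parameter of the iwlagn module (as the other "options iwlagn ..." lines in this report suggest) that takes 0 or 1.

```shell
#!/bin/sh
# Print a modprobe.d "options" line for a 0/1 module parameter.
# modprobe_option_line is a hypothetical helper; module and parameter
# names are passed in by the caller rather than hard-coded.
modprobe_option_line() {
    module=$1
    param=$2
    value=$3
    case $value in
        0|1) printf 'options %s %s=%s\n' "$module" "$param" "$value" ;;
        *)   echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}
```

As root, the output of `modprobe_option_line iwlagn plcp_check 0` could be redirected into a file under /etc/modprobe.d/ (any name ending in .conf), followed by `modprobe -r iwlagn` and `modprobe iwlagn` to reload the driver with the new setting.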
Comment 22 Vinod Kutty 2011-02-16 14:38:20 EST
RE: Comment #21

I'm one of the folks using the 11n-only setup that's not stable. 

Haven't had a lot of time to test, but so far using this package, and the default setup with no module parameters explicitly supplied (which I assume implies plcp_check=1), it seems to be stable.

Firmware: 8.83.5.1 build 33692 (EXP)

Will burn in longer when I have more cycles.

What's in this new compat wireless build? Does it incorporate the patch from http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2228 comment #13 or is it something else?

Thanks
Comment 23 Stanislaw Gruszka 2011-02-17 07:31:52 EST
(In reply to comment #22)
> What's in this new compat wireless build? Does it incorporate the patch from
> http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2228 comment #13 or is
> it something else?
It includes a similar patch, which disables the "low ack count" check but allows enabling it with the ack_check=1 module option. There are also tons of other patches that are currently in the wireless-testing tree.

Could you be so kind as to also test performance with plcp_check=1 and 0?
Comment 24 Pekka Pietikäinen 2011-02-17 18:36:40 EST
Fixes 11n on my new T410s/6200AGN without any experimental firmware (well, there never was one for this one). plcp_check=1 (or rather no options at all) was rock solid for a few days with good performance, 5-6MB/s or so. Now running with plcp_check=0 and seems to be working just as well.

One thing that may have regressed a bit is latency, I'm seeing a 2-3 ms increase in ping times (8ms to the other side of my vdsl2 vs. 5ms), but really performance is soooo much better with 11n I don't really care, ssh latency when downloading heavily is much better, which is the one that matters.

Probably should try it out on the X200s/5300 I have around, but too lazy to do that right now :)
Comment 25 Vinod Kutty 2011-02-17 19:29:16 EST
RE: Comment #23

If I am testing with no module options, that's the same as plcp_check=1, right? If so, that test case is covered (with the experimental firmware). Over the last few days my SSH sessions have been stable and streaming video as well as a few file downloads have been fine. Hope this lasts 8-)

I can try the 2nd test case (plcp_check=0) next week when I have more time.

RE: Comment #24

I had not thought to check latency, but that's a good point. I'm also just happy to have stability first. 8-)
Comment 26 wey-yi.w.guy 2011-02-17 19:38:04 EST
Thank you very much for testing. We are in the process of making a maintenance release for the 5000 series, once we get acknowledgement from the community and confirm that the experimental firmware is stable and fixes the problem.

Wey
Comment 27 Joseph Pingenot 2011-02-17 21:58:47 EST
I'm seeing some pretty severe iwlagn (intel wireless 5300) firmware horkage.  I'm attaching relevant log messages.  This is up-to-date FC14.

I'll try the new packages when I get time.
Comment 28 Joseph Pingenot 2011-02-17 22:00:06 EST
Created attachment 479445 [details]
relevant syslog bits

Behold the horkage!
Comment 29 wey-yi.w.guy 2011-02-18 11:31:42 EST
Looking at the log, you are using the released 5000 firmware. Could you give the "experimental" firmware a try?

Thanks
Wey
Comment 30 Joseph Pingenot 2011-02-18 21:57:59 EST
Found more horkage.

The previous log was after hibernating and resuming a large number of times, and while attached to the campus network and to my home 802.11g (WRT54G w/ beta DD-WRT).

The log I'm uploading was captured after resuming from hibernation (hibernated while attached to my home network) and attaching to an 802.11n (D-Link DIR-601 Hardware Version: A1 Firmware Version : 1.00NA)
Comment 31 Joseph Pingenot 2011-02-18 21:58:50 EST
Created attachment 479640 [details]
The new horkage.
Comment 32 Joseph Pingenot 2011-02-18 22:01:29 EST
Urgh. sorry. looks like the firmware is the old firmware.
Comment 33 Joseph Pingenot 2011-02-18 22:47:23 EST
Created attachment 479642 [details]
"aggregation" log entries
Comment 34 Joseph Pingenot 2011-02-18 22:47:50 EST
There's some stuff about aggregation in the logs; so far so good, though.
Comment 35 Joseph Pingenot 2011-02-18 23:40:57 EST
FWIW, I'm also seeing a lot like:
Feb 18 22:33:10 ruth kernel: [ 3114.385442] iwlagn 0000:07:00.0: Received BA when not expected
Comment 36 wey-yi.w.guy 2011-02-19 15:40:41 EST
Does the "experimental" uCode work better?

Wey
Comment 37 Joseph Pingenot 2011-02-19 16:50:43 EST
Yes, I think so.  This works much better than it has in the past on this network.  We'll see more when I can get to the university to test.
Comment 38 Joseph Pingenot 2011-02-23 18:21:09 EST
Is there any documentation on how to modify the firmware?
Comment 39 Stanislaw Gruszka 2011-02-24 07:21:56 EST
What do you mean?
Comment 40 Joseph Pingenot 2011-02-24 10:04:16 EST
Documentation like Intel's provided on their graphics chips on how to program it. In this case, I'm interested in what's going on inside the box.
Comment 41 Joseph Pingenot 2011-02-24 10:07:38 EST
(background: I'm a theoretical physicist, engineer, and amateur radio operator; this stuff is what I love to work with. ;)
Comment 42 Joseph Pingenot 2011-02-24 20:00:04 EST
I've attached one; the opregions are identical before and after:
653b48084ddf6127e266c65b1aaf5ecc  i915_opregion_after.dat
653b48084ddf6127e266c65b1aaf5ecc  i915_opregion_before.dat
Comment 43 Joseph Pingenot 2011-02-24 20:00:56 EST
(sorry; that was to the wrong bug. Is there a way to delete a comment that I'm missing?)
Comment 44 Joseph Pingenot 2011-03-02 22:44:56 EST
I am unable to connect (over the wifi-n connection) to my new D-Link DAP-1350.
I can get a link (get an IP address from the router) and see some traffic (ARP who-has 192.168.0.50 and 192.168.0.50 is-at responses from the router) but anything else seems to go missing somewhere.  I can see my computer requesting things, but no response.  The router sees my notebook as connected, from the Status tab.  AP or router mode makes no difference; they both exhibit this behavior.

When I switch the DAP-1350 to only do 802.11b or g, things work perfectly again.
Comment 45 Joseph Pingenot 2011-03-02 22:46:05 EST
(I "see" things via tcpdump -ni wlan0)
Comment 46 Stanislaw Gruszka 2011-03-03 10:15:08 EST
(In reply to comment #44)
That looks like a different bug. Is the new router's firmware up to date? If so, please open a new bug report for that problem.
Comment 47 wey-yi.w.guy 2011-03-03 10:43:01 EST
Joe, are you still using the 5300 with the new firmware? A few questions:
1) What band and channel?
2) Open or secured?
3) How do you know the requests go out and get no response? From a sniffer trace?

Thanks
Wey
Comment 48 Joseph Pingenot 2011-03-03 11:35:24 EST
Wey: yes, as far as I know
Channel is any (4 or 11 so far)
Either security setting, iirc (I know for certain that it doesn't work with open security)
Yes, from tcpdump.
Comment 49 Joseph Pingenot 2011-03-03 11:42:29 EST
Confirmed: this is with experimental firmware:
Mar  2 21:50:05 ruth kernel: [   14.045022] iwlagn 0000:07:00.0: loaded firmware version 8.83.5.1 build 33692 (EXP)
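
Since several reports in this bug turned out to be running the old microcode, here is a small sketch for pulling the loaded version out of a kernel log. It assumes the "loaded firmware version" wording shown in the line above; the function name is hypothetical.

```shell
#!/bin/sh
# Print the most recent "loaded firmware version" reported by iwlagn.
# Reads a kernel log on stdin; prints nothing if no such line exists.
# iwlagn_loaded_firmware is a hypothetical helper name.
iwlagn_loaded_firmware() {
    sed -n 's/.*iwlagn .*loaded firmware version //p' | tail -n 1
}
```

For example, `dmesg | iwlagn_loaded_firmware` should print the version string, including the "(EXP)" suffix when the experimental build is the one loaded.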
Comment 50 Joshua Baker-LePain 2011-04-20 12:20:17 EDT
Firmware version 8.83.5.1 was released to Fedora on 2/28.  However, the driver in all released kernels (up to and including 2.6.35.12-88) is still loading the old version (8.24.2.12).  The compat-wireless driver (which hasn't yet been updated for 2.6.35.12-88) doesn't seem to load it either.  I still need to have iwlwifi-5000-exp.ucode in /lib/firmware for compat-wireless to load anything but the old firmware.

So, how do I use the released firmware?
Comment 51 wey-yi.w.guy 2011-04-20 12:39:59 EDT
Apply commit 41504cce240f791f1e16561db95728c5537fbad9:
Author: Fry, Donald H <donald.h.fry@intel.com>
Date:   Wed Feb 16 11:49:34 2011 -0800

    iwlagn: Support new 5000 microcode.
    
    New iwlwifi-5000 microcode requires driver support for API version 5.
    
    Signed-off-by: Don Fry <donald.h.fry@intel.com>
    Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>

This patch should already have been backported to stable. Has it not?

Wey
Comment 52 Stanislaw Gruszka 2011-04-22 08:20:30 EDT
The patch was Cc-ed to stable, but the "Cc: stable@kernel.org" tag is missing from the sign-off area of the patch log, hence the patch was not applied in -stable, as nobody noticed it was committed. I just posted it again.
Comment 53 Stanislaw Gruszka 2011-04-22 08:26:28 EDT
I updated the compat-wireless packages. I did not check, but since the packages are 2.6.38 or newer, they should load the new 8.83.5.1 firmware if the file /lib/firmware/iwlwifi-5000-5.ucode is accessible.
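
To check which 5000-series microcode files are actually readable on a given machine, something like the sketch below may help. The function name is hypothetical; the file naming pattern is taken from the two names mentioned in this report (iwlwifi-5000-5.ucode and iwlwifi-5000-exp.ucode), and the directory defaults to /lib/firmware.

```shell
#!/bin/sh
# List readable iwlwifi-5000 microcode files in a firmware directory,
# one file name per line. Defaults to /lib/firmware.
# iwl5000_ucode_files is a hypothetical helper name.
iwl5000_ucode_files() {
    dir=${1:-/lib/firmware}
    for f in "$dir"/iwlwifi-5000-*.ucode; do
        # An unmatched glob stays literal and fails the -r test.
        [ -r "$f" ] && basename "$f"
    done
    return 0
}
```

If `iwl5000_ucode_files` prints nothing, or only the "-exp" file, that would be consistent with the driver falling back to older firmware as described in comment 50.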
Comment 54 Stanislaw Gruszka 2011-04-22 08:28:56 EDT
Note that "Received BA when not expected" is a driver bug, see
https://bugzilla.kernel.org/show_bug.cgi?id=16691
Comment 55 Pekka Pietikäinen 2011-05-03 10:58:20 EDT
Probably worth noting that F15 seems to have a kernel new enough to not have this issue (T410s with a 6200, YMMV). Not that F14 shouldn't get a fix if at all feasible...
Comment 56 Bryan O'Sullivan 2011-05-08 01:10:34 EDT
I upgraded my X200 (Intel 5300 wifi) to F15 beta 2 tonight, and wifi performance has gone from almost okay to disastrous. My system used to lose its association with my 11n AP after a few minutes to an hour, but now I get "low ack count" messages every few seconds, lots of packet loss, and terrible throughput.
Comment 57 wey-yi.w.guy 2011-05-08 12:44:36 EDT
could you try the following patch from Stanislaw? 

commit b7977ffaab5187ad75edaf04ac854615cea93828
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Mon Feb 28 14:33:15 2011 +0100

    iwlwifi: add {ack,plpc}_check module parameters
    
    Add module ack_check, and plcp_check parameters. Ack_check is disabled
    by default since is proved that check ack health can cause troubles.
    Plcp_check is enabled by default.
    
    Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>

Thanks
Wey
Comment 58 Stanislaw Gruszka 2011-05-09 07:00:10 EDT
I will post that patch to stable/fedora today.  

Bryan, what kernel version did you use before?
Comment 59 Bryan O'Sullivan 2011-05-09 14:03:09 EDT
Stanislaw, I'm using the current F-15 updates-testing kernel. I don't have the laptop handy to identify its exact version.
Comment 60 Stanislaw Gruszka 2011-05-09 14:23:50 EDT
OK, but I'm more interested in the working kernel version that you had before.
Comment 61 Pierre Ossman 2011-05-21 13:44:07 EDT
Is this bug entry just for 5000 series or for all iwlagn? I have a 4965 and I've been getting more and more of these:

[  370.966521] iwlagn 0000:03:00.0: Received BA when not expected
Comment 62 Stanislaw Gruszka 2011-05-24 03:13:38 EDT
This should be fixed by:

commit 16b345d89686ca0482a9ca741a1167def1abdd7f
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Fri Apr 29 17:51:56 2011 +0200

    iwl4965: fix "Received BA when not expected"

The patch is already applied in 2.6.38 and 2.6.39, but *not* in 2.6.35. On F-14 you can try compat-wireless: http://people.redhat.com/sgruszka/compact_wireless.html
Comment 63 Pierre Ossman 2011-05-24 13:40:49 EDT
No dice:

~
[drzeus@mjolnir]$ dmesg | grep "Received BA"
[  225.239476] iwlagn 0000:03:00.0: Received BA when not expected
...

~
[drzeus@mjolnir]$ modinfo iwlagn
filename:       /lib/modules/2.6.35.13-91.fc14.x86_64/extra/compat-wireless/drivers/net/wireless/iwlwifi/iwlagn.ko
...
srcversion:     C83063819713AE3FAAA6A6C
...


~
[drzeus@mjolnir]$ rpm -qf /lib/modules/2.6.35.13-91.fc14.x86_64/extra/compat-wireless/drivers/net/wireless/iwlwifi/iwlagn.ko
kmod-compat-wireless-2.6.38-3.fc14.1.x86_64

~
[drzeus@mjolnir]$ cat /sys/module/iwlagn/srcversion 
C83063819713AE3FAAA6A6C
Comment 64 Stanislaw Gruszka 2011-05-25 07:24:22 EDT
(In reply to comment #63)
> No dice:
> kmod-compat-wireless-2.6.38-3.fc14.1.x86_64
I'm sorry, this version does not contain the fix. I've updated the kmod-compat-wireless package to version 2.6.39-1, please try that. If it does not help, please open a new bug report and assign it to me.
Comment 65 Stanislaw Gruszka 2011-05-25 08:05:51 EDT
Assigning this to kernel since, besides the firmware update, we need driver patches to fix the 11n problems.
Comment 66 Stanislaw Gruszka 2011-05-25 08:06:21 EDT
*** Bug 702472 has been marked as a duplicate of this bug. ***
Comment 67 Magnus Glantz 2011-05-29 05:12:47 EDT
I hit this on F15 on Lenovo T400 running:

2.6.38.6-26.rc1.fc15.x86_64 #1 SMP Mon May 9 20:45:15 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

iwlagn 0000:03:00.0: loaded firmware version 8.83.5.1 build 33692

03:00.0 Network controller: Intel Corporation PRO/Wireless 5100 AGN [Shiloh] Network Connection
	Subsystem: Intel Corporation WiFi Link 5100 AGN
	Physical Slot: 1
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 50
	Region 0: Memory at f4300000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [c8] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0100c  Data: 41c9
	Capabilities: [e0] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <32us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Capabilities: [140 v1] Device Serial Number 00-21-6b-ff-ff-c9-96-1c
	Kernel driver in use: iwlagn
	Kernel modules: iwlagn
Comment 68 Magnus Glantz 2011-05-29 05:16:18 EDT
Please note, I did not have this issue on F14. It was introduced in F15 for me. Solved by disabling 802.11n..

# echo "options iwlagn 11n_disable=1" >/etc/modprobe.d/intel-80211n.conf
# chcon --reference /etc/modprobe.d/anaconda.conf /etc/modprobe.d/intel-80211n.conf
# reboot
Comment 69 Pierre Ossman 2011-05-29 05:36:54 EDT
I have been running the updated compat module for my 4965 for a few days now, and so far I haven't seen any problem. Many thanks.
Comment 70 Stanislaw Gruszka 2011-06-01 10:06:37 EDT
Magnus, you hit a different bug; please open another bug report for it. This one is for tracking the 11n regression introduced in the 2.6.35 kernel.
Comment 71 Stanislaw Gruszka 2011-06-20 04:05:12 EDT
We need to backport 3 commits to satisfy this bug:

commit 42b70a5f6d18165a075d189d1bee82fad7cdbf29
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Thu May 26 17:14:22 2011 +0200

    iwlagn: use cts-to-self protection on 5000 adapters series

commit bfd36103ec26599557c2bd3225a1f1c9267f8fcb
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Fri Apr 29 17:51:06 2011 +0200

    iwlagn: fix "Received BA when not expected"

commit b7977ffaab5187ad75edaf04ac854615cea93828
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Mon Feb 28 14:33:15 2011 +0100

    iwlwifi: add {ack,plpc}_check module parameter
Comment 74 Doug SIkora 2011-06-21 18:58:27 EDT
F14 x86_64, lenovo X201, hotspot is a verizon MIFI 4GLTE
dmesg reported: "iwlagn: Received BA when not expected"

Connections were dropping often. One odd thing was that the VPN and network would look like they were still attached, but networking was just dead. Restarting the network service sometimes helped, as did rebooting the box, but only for a short time.

the tip in bugzilla 646082 seemed to help:
"
If this cause some functional problem, this workaround should help:
echo "options iwlagn 11n_disable50=1 11n_disable=1" >>
/etc/modprobe.d/iwlwifi.conf
"


Connection has been rock stable with vpn connected for about an hour -- more than I ever had with this device.

iwlagn info:


[root@blueclaw2 ~]# modinfo iwlagn
filename:       /lib/modules/2.6.35.13-92.fc14.x86_64/kernel/drivers/net/wireless/iwlwifi/iwlagn.ko
alias:          iwl4965
license:        GPL
author:         Copyright(c) 2003-2010 Intel Corporation <ilw@linux.intel.com>
version:        in-tree:d
description:    Intel(R) Wireless WiFi Link AGN driver for Linux
firmware:       iwlwifi-4965-2.ucode
firmware:       iwlwifi-5150-2.ucode
firmware:       iwlwifi-5000-5.ucode
firmware:       iwlwifi-6000g2a-4.ucode
firmware:       iwlwifi-6050-4.ucode
firmware:       iwlwifi-6000-4.ucode
firmware:       iwlwifi-1000-3.ucode
srcversion:     06F8064A1860DB736174519
<snip> removed alias lines</snip>
depends:        iwlcore,cfg80211,mac80211
vermagic:       2.6.35.13-92.fc14.x86_64 SMP mod_unload 
parm:           debug50:50XX debug output mask (deprecated) (uint)
parm:           debug:debug output mask (uint)
parm:           swcrypto50:using crypto in software (default 0 [hardware]) (deprecated) (bool)
parm:           swcrypto:using crypto in software (default 0 [hardware]) (int)
parm:           queues_num50:number of hw queues in 50xx series (deprecated) (int)
parm:           queues_num:number of hw queues. (int)
parm:           11n_disable50:disable 50XX 11n functionality (deprecated) (int)
parm:           11n_disable:disable 11n functionality (int)
parm:           amsdu_size_8K50:enable 8K amsdu size in 50XX series (deprecated) (int)
parm:           amsdu_size_8K:enable 8K amsdu size (int)
parm:           fw_restart50:restart firmware in case of error (deprecated) (int)
parm:           fw_restart:restart firmware in case of error (int)
parm:           disable_hw_scan:disable hardware scanning (default 0) (int)
parm:           ucode_alternative:specify ucode alternative to use from ucode file (int)
Comment 75 Stanislaw Gruszka 2011-06-22 02:20:01 EDT
(In reply to comment #74)
> /lib/modules/2.6.35.13-92.fc14.x86_64/kernel/drivers/net/wireless/iwlwifi
Bug should be fixed in 2.6.35.13-93.
Comment 76 Stanislaw Gruszka 2011-06-22 02:27:56 EDT
Note that I consider this bug (11n problems introduced in the 2.6.35 kernel) fixed; 11n on iwlwifi should more or less work now. If you still have trouble with the driver, check other bug reports and CC yourself, or open a new bug report.
Comment 77 Matt Wringe 2011-07-11 11:13:37 EDT
(In reply to comment #75)
> (In reply to comment #74)
> > /lib/modules/2.6.35.13-92.fc14.x86_64/kernel/drivers/net/wireless/iwlwifi
> Bug should be fixed in 2.6.35.13-93.

kernel-2.6.35.13-93.fc14 is essentially unusable when using wifi (extremely slow speeds, in terms of bytes/second). Tested on a Fedora 14 x86_64 laptop with Intel 4965 wifi. kernel-2.6.35.13-92 works fine until the module crashes and the system needs to be rebooted.

I don't think this issue is resolved yet.
Comment 78 Stanislaw Gruszka 2011-07-11 11:29:01 EDT
Comment 76 still applies; please open a separate bug for that problem.
Comment 79 Matt Wringe 2011-07-11 11:55:38 EDT
(In reply to comment #78)
> Comment 76 still applies; please open a separate bug for that problem.

It's not a separate bug. The only difference between kernel-2.6.35.13-92 and kernel-2.6.35.13-93 is the patch for this issue. The 'fix' for this issue causes a major regression (wifi working for hours until a kernel crash, versus wifi not working at all).

I shouldn't have to create a new bug which states the patch applied for this bug is bad and needs to be reverted.
Comment 80 Matt Wringe 2011-07-11 12:13:21 EDT
I can't confirm whether this fixes the problem for me, since it basically disables wifi on my system. Can anyone else confirm that the -93 update actually fixes the problem for them?
Comment 81 Stanislaw Gruszka 2011-07-12 08:43:08 EDT
FYI: I opened bug 720662 to track the regression with Matt's wireless connection.
Comment 82 tuxor 2011-08-08 08:23:19 EDT
WLAN is really buggy for me, unless I disable 11n for iwlagn.

My Intel chip is: Intel Corporation Ultimate N WiFi Link 5300
(on Thinkpad T400s with x86_64)

Last messages before my wireless connection dies:

Aug  8 13:44:56 fedora kernel: [ 2632.719413] iwlagn 0000:03:00.0: Stopping AGG while state not ON or starting
Aug  8 13:44:56 fedora kernel: [ 2632.733584] cfg80211: Calling CRDA to update world regulatory domain
Aug  8 13:44:56 fedora NetworkManager[935]: <info> (wlan0): supplicant interface state: completed -> disconnected
Aug  8 13:44:56 fedora kernel: [ 2632.759633] cfg80211: World regulatory domain updated:
Aug  8 13:44:56 fedora kernel: [ 2632.759636] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Aug  8 13:44:56 fedora kernel: [ 2632.759641] cfg80211:     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759644] cfg80211:     (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759646] cfg80211:     (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759649] cfg80211:     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759651] cfg80211:     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759663] cfg80211: Calling CRDA for country: DE
Aug  8 13:44:56 fedora kernel: [ 2632.766360] cfg80211: Regulatory domain changed to country: DE
Aug  8 13:44:56 fedora kernel: [ 2632.766362] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Aug  8 13:44:56 fedora kernel: [ 2632.766365] cfg80211:     (2400000 KHz - 2483500 KHz @ 40000 KHz), (N/A, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.766368] cfg80211:     (5150000 KHz - 5250000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.766370] cfg80211:     (5250000 KHz - 5350000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.766372] cfg80211:     (5470000 KHz - 5725000 KHz @ 40000 KHz), (N/A, 2698 mBm)
Aug  8 13:44:56 fedora NetworkManager[935]: <info> (wlan0): supplicant interface state: disconnected -> scanning
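
One rough way to check whether another machine is hitting the same symptom is to count the two aggregation-related iwlagn messages quoted in this report ("Received BA when not expected" from comment 74 and "Stopping AGG while state not ON or starting" from the log above). This is a minimal sketch: the function name count_agg_messages is made up, and /var/log/messages is only assumed to be where kernel messages end up on your system.

```shell
# Hypothetical helper: count the lines in a log file that contain either of
# the two iwlagn aggregation messages quoted in this bug report.
# $1 = path to the log file
count_agg_messages() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # nothing matches, so "|| true" keeps the helper's own status at 0.
    grep -c -E 'iwlagn.*(Received BA when not expected|Stopping AGG while state not ON or starting)' "$1" || true
}

# Example (as root):
#   count_agg_messages /var/log/messages
```

A non-zero count does not prove the same root cause. It only shows that the log contains the messages reported here.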
Comment 83 tuxor 2011-08-08 08:24:22 EDT
This is with Kernel 2.6.40-4.fc15.x86_64 on Fedora 15
Comment 84 Fedora Update System 2011-08-17 13:38:13 EDT
kernel-2.6.35.14-95.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.14-95.fc14
Comment 85 Fedora Update System 2011-08-17 22:36:25 EDT
Package kernel-2.6.35.14-95.fc14:
* should fix your issue,
* was pushed to the Fedora 14 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-2.6.35.14-95.fc14'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/kernel-2.6.35.14-95.fc14
then log in and leave karma (feedback).
Comment 86 Johannes H. Jensen 2011-08-19 20:47:27 EDT
Any chance the fix can be applied to 2.6.40.3 on Fedora 15 as well? I'm seeing the same issue there with the iwl4965 driver. The 11n_disable workaround also applies.
Comment 87 tuxor 2011-08-20 10:03:40 EDT
*PUSH* for Johannes' question, since I'm in the same situation (Fedora 15 with 2.6.40.3, using the 11n_disable workaround). Or is there already a bug report concerning this for Fedora 15?
Comment 88 Fedora Update System 2011-08-23 00:36:38 EDT
kernel-2.6.35.14-95.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 89 Stanislaw Gruszka 2011-08-25 06:17:40 EDT
tuxor, Johannes, please open a separate bug report for your problem.
Comment 90 tuxor 2011-08-25 06:31:06 EDT
Filed the respective bug here: https://bugzilla.redhat.com/show_bug.cgi?id=733269 Please contribute if you are a Fedora 15 user. Thanks!