Bug 733269

Summary: [iwl5300] wireless broken on 11n (F15)
Product: [Fedora] Fedora
Reporter: tuxor <acc-bugz-redhat>
Component: kernel
Assignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 15
CC: aquini, gansalmon, itamar, joh, jonathan, kernel-maint, madhu.chinakonda, mishu, sgruszka, tomspur
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: 2.6.40.4-5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-09-01 11:06:39 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Debug-Output in /var/log/kernel none
debug-output as per comment #2 none
dmesg output when 11n is enabled on iwl4965 none

Description tuxor 2011-08-25 10:29:57 UTC
Description of problem:
The problem reported for Fedora 14 in bug 648732 (https://bugzilla.redhat.com/show_bug.cgi?id=648732) still occurs in Fedora 15, even though it has been fixed in that F14 bug.

Version-Release number of selected component (if applicable):
2.6.40-4.fc15

In my case, the "Intel Corporation Ultimate N WiFi Link 5300" has major connection problems (the connection drops abruptly after a few seconds) unless I add the line "options iwlagn 11n_disable=1" to /etc/modprobe.d/modprobe.conf.
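
A minimal sketch of applying that workaround line without duplicating it. The function name add_11n_workaround is illustrative and not part of this report; the file path and the option line are the ones quoted above. Writing under /etc needs root, and a module option only takes effect the next time iwlagn is loaded.

```shell
# Sketch: append the workaround line once. The function name is illustrative.
add_11n_workaround() {
    conf=${1:-/etc/modprobe.d/modprobe.conf}
    line='options iwlagn 11n_disable=1'
    # -x: whole-line match, -F: fixed string; append only if the line is missing
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}
```

Calling add_11n_workaround with no argument targets the file named above; passing a different path makes it easy to try on a scratch file first.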

Last messages before my wireless connection dies:

Aug  8 13:44:56 fedora kernel: [ 2632.719413] iwlagn 0000:03:00.0: Stopping AGG while state not ON or starting
Aug  8 13:44:56 fedora kernel: [ 2632.733584] cfg80211: Calling CRDA to update world regulatory domain
Aug  8 13:44:56 fedora NetworkManager[935]: <info> (wlan0): supplicant interface state: completed -> disconnected
Aug  8 13:44:56 fedora kernel: [ 2632.759633] cfg80211: World regulatory domain updated:
Aug  8 13:44:56 fedora kernel: [ 2632.759636] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Aug  8 13:44:56 fedora kernel: [ 2632.759641] cfg80211:     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759644] cfg80211:     (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759646] cfg80211:     (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759649] cfg80211:     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759651] cfg80211:     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.759663] cfg80211: Calling CRDA for country: DE
Aug  8 13:44:56 fedora kernel: [ 2632.766360] cfg80211: Regulatory domain changed to country: DE
Aug  8 13:44:56 fedora kernel: [ 2632.766362] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Aug  8 13:44:56 fedora kernel: [ 2632.766365] cfg80211:     (2400000 KHz - 2483500 KHz @ 40000 KHz), (N/A, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.766368] cfg80211:     (5150000 KHz - 5250000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.766370] cfg80211:     (5250000 KHz - 5350000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Aug  8 13:44:56 fedora kernel: [ 2632.766372] cfg80211:     (5470000 KHz - 5725000 KHz @ 40000 KHz), (N/A, 2698 mBm)
Aug  8 13:44:56 fedora NetworkManager[935]: <info> (wlan0): supplicant interface state: disconnected -> scanning

Comment 1 Stanislaw Gruszka 2011-08-26 08:37:54 UTC
All fixes from bug 648732 are applied in F-15, so this is a different problem, and I don't think it is a common one. I was able to reproduce the problem from bug 648732, and all my 11n APs now work with iwl5300 and 5100.

Please tell me the vendor and model of your AP. I will see whether I can get the same AP and reproduce the problem locally; if not, I will ask you for more verbose debug messages.

Comment 2 tuxor 2011-08-26 08:55:00 UTC
I just tested it once more and hit the same problem again: at first NM connects fine, but after a few seconds the connection dies and /var/log/messages shows the following log: http://pastebin.com/fCLn5xFs

This doesn't occur with 11n_disable=1 for iwlagn. I live in a household with three other people who use the same AP without problems. Only my "Intel Corporation Ultimate N WiFi Link 5300" is misbehaving.

We are using a Telekom Speedport W504V WLAN-Router/DSL-Modem with WPA-PSK. Unfortunately, there is no other AP or another OS to check at the moment.

Comment 3 Stanislaw Gruszka 2011-08-26 10:05:09 UTC
This Telekom router is quite an exotic device; I don't think any of my friends have one.

OK, please configure rsyslog as described in "Configure syslog to log kernel debug messages" at https://fedoraproject.org/wiki/DebugWireless .

Then do the following:

modprobe -r iwlagn
echo > /var/log/kernel
modprobe iwlagn debug=0x47ffffff

Wait until the connection fails, then run:

modprobe -r iwlagn

Then attach /var/log/kernel here. Thanks.
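
The steps in this comment can be collected into one small script. This is only a sketch: the run helper, the DRY_RUN variable and the capture_iwlagn_debug function name are illustrative and not part of the instructions above. It assumes rsyslog already writes kernel debug messages to /var/log/kernel as described on the wiki page, and it must be run as root.

```shell
# Sketch only; run as root. With DRY_RUN=1 the commands are printed, not executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

capture_iwlagn_debug() {
    log=${1:-/var/log/kernel}
    run modprobe -r iwlagn                  # unload the driver
    if [ -n "${DRY_RUN:-}" ]; then
        echo "echo > $log"
    else
        echo > "$log"                       # reset the old log contents
    fi
    run modprobe iwlagn debug=0x47ffffff    # reload with verbose debugging
    if [ -z "${DRY_RUN:-}" ]; then
        printf 'Wait until the connection fails, then press Enter: '
        read -r reply
    fi
    run modprobe -r iwlagn                  # unload again, as in the steps above
    echo "Now attach $log to the bug report."
}
```

Setting DRY_RUN=1 before calling capture_iwlagn_debug only prints the command sequence, which is a safe way to check it before running it for real.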

Comment 4 tuxor 2011-08-26 10:33:22 UTC
Created attachment 520060 [details]
Debug-Output in /var/log/kernel

As Stanislaw Gruszka requested, here is the debug output from when the error occurs.

Comment 5 Thomas Spura 2011-08-31 12:06:53 UTC
Is your access point still alive when you are disconnected?

I have a W503V and this kernel kills the access point... After I restart the AP, it works for maybe a minute (depending on the wifi load) and then it is killed again.

Will try the 11n_disable=1 workaround.

Comment 6 tuxor 2011-08-31 12:15:50 UTC
Well, it doesn't _always_ kill my AP, but in most cases. And yes, after a restart, it works again for a minute and so on, like you describe.

We seem to have exactly the same problem, me having a W504V and you having the W503V. So this bug seems to take shape now.

Comment 7 Stanislaw Gruszka 2011-08-31 12:38:27 UTC
Try a newer kernel, e.g. this one:
http://koji.fedoraproject.org/koji/buildinfo?buildID=261186
It includes the fix/workaround discussed in bug 708747.

Comment 8 Thomas Spura 2011-08-31 12:38:51 UTC
Created attachment 520811 [details]
debug-output as per comment #2

(In reply to comment #6)
> Well, it doesn't _always_ kill my AP, but in most cases. And yes, after a
> restart, it works again for a minute and so on, like you describe.
> 
> We seem to have exactly the same problem, me having a W504V and you having the
> W503V. So this bug seems to take shape now.

Yes, hopefully it's possible to resolve it...


(In reply to comment #5)
> Will try the 11n_disable=1 workaround.

Works here too!

I attached the debug output from my case; hope it helps.

Comment 9 Thomas Spura 2011-08-31 13:02:05 UTC
(In reply to comment #7)
> Try some new kernel i.e that one:
> http://koji.fedoraproject.org/koji/buildinfo?buildID=261186
> It include fix/workaround discussed in bug 708747

Works as it did before 2.6.40.

Thanks!

Comment 10 tuxor 2011-08-31 19:31:53 UTC
2.6.40.4-5 works for me, too!

Thanks!

Comment 11 Johannes H. Jensen 2011-09-01 20:09:34 UTC
Tried 2.6.40.4-5 and unfortunately the problem persists. I'm on a different card though (4965AGN) which uses a different driver (iwl4965). Attaching dmesg output when 11n is enabled and performance is severely degraded. Disabling 11n with 11n_disable=1 results in good performance.

The only difference between the two dmesg outputs is these lines:

[  802.507084] iwl4965 0000:03:00.0: Aggregation not enabled for tid 0 because load = 6
[  815.870043] iwl4965 0000:03:00.0: iwl4965_tx_agg_start on ra = 00:25:9c:ca:7b:bc tid = 0

which appear only when 11n is enabled. Let me know if you think this is a different issue and should be reported as a new bug.
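
A comparison like this can be sketched in shell by stripping the leading timestamps before comparing. The function names and the idea of saving each dmesg capture to its own file are assumptions for illustration; the timestamp pattern is based on the two lines quoted above.

```shell
# Sketch: print lines that occur in the first dmesg capture but not in the second,
# ignoring leading timestamps such as "[  802.507084] ".
strip_dmesg_ts() {
    sed 's/^\[ *[0-9]*\.[0-9]*] //' "$1"
}

dmesg_only_in_first() {
    tmp=$(mktemp) || return 1
    strip_dmesg_ts "$2" > "$tmp"
    # -v: invert, -x: whole line, -F: fixed strings, -f: read patterns from file
    strip_dmesg_ts "$1" | grep -vxF -f "$tmp"
    rm -f "$tmp"
}
```

For example, dmesg_only_in_first 11n-on.txt 11n-off.txt (hypothetical file names) lists the messages seen only with 11n enabled. Messages that differ only in variable details, such as addresses or counters, will still show up as differences.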

Comment 12 Johannes H. Jensen 2011-09-01 20:10:23 UTC
Created attachment 521094 [details]
dmesg output when 11n is enabled on iwl4965

Comment 13 Thomas Spura 2011-09-02 08:54:59 UTC
(In reply to comment #11)
> Tried 2.6.40.4-5 and unfortunately the problem persists. I'm on a different
> card though (4965AGN) which uses a different driver (iwl4965). Attaching dmesg
> output when 11n is enabled and performance is severely degraded. Disabling 11n
> with 11n_disable=1 results in good performance.
> 
> The only difference in dmesg output between the two are the lines:
> 
> [  802.507084] iwl4965 0000:03:00.0: Aggregation not enabled for tid 0 because
> load = 6
> [  815.870043] iwl4965 0000:03:00.0: iwl4965_tx_agg_start on ra =
> 00:25:9c:ca:7b:bc tid = 0
> 
> which appear only when 11n is enabled. Let me know if you think this is a
> different issue and should be reported as a new bug.

Sounds like another issue.

Could be bug #732592; that reporter has a "Dell Latitude D830 with an Intel Wireless 4965 chipset".

Comment 14 Stanislaw Gruszka 2011-09-02 09:00:40 UTC
(In reply to comment #13)
> Sounds like another issue.
Yes, we do not set the max_tx_aggregation_subframes parameter in the 4965 driver, which was the issue here.

> Could be bug #732592, he has a "Dell Latitude D830 with an Intel Wireless 4965
> chipset".
Probably not: #732592 fails in G (and N) mode, whereas Johannes only has problems in N mode. I'd prefer a separate bug report for Johannes's issue.