Bug 981445 - WiFi fails after some time
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64 Unspecified
Priority: unspecified, Severity: high
Assigned To: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-07-04 13:56 EDT by Steven Stern
Modified: 2013-08-19 15:08 EDT
Fixed In Version: kernel-3.10.4-100.fc18
Doc Type: Bug Fix
Last Closed: 2013-08-09 13:12:26 EDT
Type: Bug


Attachments
Snippet from /var/log/messages when problem starts (11.08 KB, text/plain)
2013-07-04 13:56 EDT, Steven Stern
mac80211_print_change_bandwidth.patch (2.50 KB, text/plain)
2013-07-08 04:43 EDT, Stanislaw Gruszka
dmesg output following Wifi disconnection (22.11 KB, application/gzip)
2013-07-11 08:56 EDT, Steven Stern
dmesg output following Wifi disconnection (22.19 KB, application/gzip)
2013-07-11 19:13 EDT, Steven Stern
mac80211_print_change_bandwidth_v2.patch (3.42 KB, text/plain)
2013-07-16 10:47 EDT, Stanislaw Gruszka
dmesg output following wifi confusion (20.61 KB, application/gzip)
2013-07-18 14:24 EDT, Steven Stern
dmesg output - disconnect following channel change from 3 to 3,+1 (21.72 KB, application/gzip)
2013-07-22 18:00 EDT, Steven Stern

Description Steven Stern 2013-07-04 13:56:14 EDT
Created attachment 768934 [details]
Snippet from /var/log/messages when problem starts

Description of problem:

After some time in operation, the WiFi network is dropped. Network Manager attempts to connect, but reconnection is unsuccessful.  Reboot is required to rejoin the network.


Version-Release number of selected component (if applicable):


How reproducible:
Use computer and wait


Additional info:

Router is a DLink DIR-825 at firmware 2.06NA
Comment 1 Steven Stern 2013-07-04 13:58:46 EDT
$ uname -a
Linux sds-desk-2.sterndata.local 3.9.8-300.fc19.x86_64 #1 SMP Thu Jun 27 19:24:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Comment 2 Steven Stern 2013-07-05 09:38:09 EDT
Network controller: Ralink corp. RT3090 Wireless 802.11n 1T/1R PCIe
Comment 3 Stanislaw Gruszka 2013-07-08 04:39:49 EDT
There is a problem with changing bandwidth, i.e. from 40 MHz to 20 MHz:

Jul  4 12:40:54 sds-desk-2 kernel: [50180.113727] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 2 (2432/0 MHz)
Jul  4 12:41:00 sds-desk-2 kernel: [50185.860848] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 1 (2422/0 MHz)
Jul  4 12:41:00 sds-desk-2 kernel: [50185.860854] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth in a way we can't support - disconnect

But I'm not sure why. I'll provide a patch that prints more information and will hopefully help find a fix.
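
For anyone scanning their own logs for the same pattern, here is a minimal Python sketch that pulls the "changed bandwidth" messages out of /var/log/messages or dmesg output. The function name `parse_bandwidth_changes` and the `WIDTH_NAMES` table are illustrative assumptions, not part of any tool mentioned in this bug; the width codes are assumed to follow the nl80211 channel-width enum (1 = 20 MHz, 2 = 40 MHz), which is consistent with the centre frequencies printed in the lines above.

```python
import re

# Width codes printed by mac80211. Assumption: they follow the nl80211
# channel-width enum as of the 3.9/3.10 kernels.
WIDTH_NAMES = {0: "20 MHz (no HT)", 1: "20 MHz", 2: "40 MHz"}

# Matches only the "new config" variant of the message, not the
# "in a way we can't support" variant.
BW_CHANGE = re.compile(
    r"(?P<iface>\S+): AP (?P<bssid>[0-9a-f:]{17}) changed bandwidth, "
    r"new config is (?P<freq>\d+) MHz, width (?P<width>\d+) "
    r"\((?P<cf1>\d+)/(?P<cf2>\d+) MHz\)"
)


def parse_bandwidth_changes(lines):
    """Return one dict per 'changed bandwidth, new config' log line."""
    events = []
    for line in lines:
        m = BW_CHANGE.search(line)
        if m is None:
            continue
        width = int(m.group("width"))
        events.append({
            "iface": m.group("iface"),
            "bssid": m.group("bssid"),
            "control_freq": int(m.group("freq")),
            "width": WIDTH_NAMES.get(width, f"code {width}"),
            "center_freq1": int(m.group("cf1")),
        })
    return events


# The three kernel messages quoted in this comment.
SAMPLE = [
    "Jul  4 12:40:54 sds-desk-2 kernel: [50180.113727] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 2 (2432/0 MHz)",
    "Jul  4 12:41:00 sds-desk-2 kernel: [50185.860848] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 1 (2422/0 MHz)",
    "Jul  4 12:41:00 sds-desk-2 kernel: [50185.860854] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth in a way we can't support - disconnect",
]

for event in parse_bandwidth_changes(SAMPLE):
    print(event["control_freq"], event["width"], event["center_freq1"])
```

A 40 MHz entry followed shortly by a 20 MHz entry on the same control frequency is the sequence that precedes the disconnect in this report.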
Comment 4 Stanislaw Gruszka 2013-07-08 04:43:03 EDT
Created attachment 770356 [details]
mac80211_print_change_bandwidth.patch

Debug patch. Please reproduce the problem with this patch applied and attach the output of dmesg.

A kernel build with the patch is here (currently compiling):
http://koji.fedoraproject.org/koji/taskinfo?taskID=5583914
Comment 5 Steven Stern 2013-07-08 09:38:58 EDT
Comment on attachment 770356 [details]
mac80211_print_change_bandwidth.patch

Sorry, I don't know how to apply this patch.
Comment 6 Stanislaw Gruszka 2013-07-08 10:03:20 EDT
Just download the kernel from
http://koji.fedoraproject.org/koji/taskinfo?taskID=5583914
and install it with "rpm -ivh kernel-VERSION.rpm". It includes the patch.
Comment 7 Steven Stern 2013-07-08 10:09:54 EDT
I get an error: 

$ sudo rpm -ivh kernel-3.9.9-201.bz981445_debug.fc18.x86_64.rpm 
Preparing...                          ################################# [100%]
	package kernel-3.9.9-301.fc19.x86_64 (which is newer than kernel-3.9.9-201.bz981445_debug.fc18.x86_64) is already installed



I can force it, but should I be installing an F18 kernel?
Comment 8 Stanislaw Gruszka 2013-07-08 13:06:25 EDT
Oh, I missed that this is an F19 bug. You can force-install the F18 kernel and try to boot it; if that fails, I'll prepare an F19 version.
Comment 9 Steven Stern 2013-07-08 13:37:47 EDT
OK, it loaded:  
$ uname -a
Linux sds-desk-2.sterndata.local 3.9.9-201.bz981445_debug.fc18.x86_64 #1 SMP Mon Jul 8 08:39:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

It usually takes several hours to fail, so I'll post a comment when it does. I assume you want /var/log/messages. Anything else?
Comment 10 Stanislaw Gruszka 2013-07-09 05:48:27 EDT
Attaching either the dmesg output or a snippet of /var/log/messages is fine.
Comment 11 Steven Stern 2013-07-09 15:59:16 EDT
No problems over 24 hours. Because I didn't have this problem with F18 and it started with F19, could you compile an F19 kernel for me when you have the chance? Thanks.
Comment 12 Stanislaw Gruszka 2013-07-10 09:15:06 EDT
Ok, I started F19 build here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=5590914
Comment 13 Steven Stern 2013-07-11 08:56:25 EDT
Created attachment 772211 [details]
dmesg output following Wifi disconnection

Wifi disconnected during the night.
Comment 14 Steven Stern 2013-07-11 19:13:17 EDT
Created attachment 772472 [details]
dmesg output following Wifi disconnection

Another disconnect. This time I could not reconnect without a power cycle.
Comment 15 Steven Stern 2013-07-13 10:51:07 EDT
On the WiFi access point, disabling the option "Channel Width Auto" and restricting it to a 20MHz channel width seems to keep the connection alive. Previously, it was set to "auto - 20/40MHz".  This option worked well with previous versions of the F18 kernel.
Comment 16 Stanislaw Gruszka 2013-07-16 10:21:06 EDT
Yes, there is something wrong when the AP wants to switch bandwidth. Unfortunately I still don't know what, so I'll prepare a more verbose debug patch.
Comment 17 Stanislaw Gruszka 2013-07-16 10:47:08 EDT
Created attachment 774357 [details]
mac80211_print_change_bandwidth_v2.patch

This patch adds some more prints. Please configure the AP to switch between 20/40 MHz bandwidth, install the kernel below, and provide the dmesg output.

http://koji.fedoraproject.org/koji/taskinfo?taskID=5613084
Comment 18 Steven Stern 2013-07-16 16:43:05 EDT
After enabling auto bandwidth, I changed the channel from 2 to 3. At that point, I could no longer access the access point, even after several 30 second power cycles using the debug kernel. I had to reboot to  3.9.9-302.fc19.x86_64 to get back online.
Comment 19 Stanislaw Gruszka 2013-07-17 02:33:38 EDT
The debug kernel does not differ from the standard one except that it adds some more print messages, so I don't know why it does not associate. This could be a new bug or another manifestation of the bug under investigation. Anyway, I would also like to see logs from that failed attempt. Also, why did you change the channel? I thought you had previously changed only the "Channel Width Auto" option.
Comment 20 Steven Stern 2013-07-17 10:00:19 EDT
Back on Channel 3 in auto bandwidth mode with the debug kernel.  I'll upload a log next time it loses the network.
Comment 21 Steven Stern 2013-07-18 14:24:12 EDT
Created attachment 775436 [details]
dmesg output following wifi confusion

It looked like I lost the network, then NetworkManager seemed to recover it. I had an IP address, etc. I was, however, unable to ping the router or any other address on the network. I restarted both NetworkManager and wpa_supplicant. Again, NM thought it was connected, but the network was effectively dead. I had to reboot to recover. dmesg output attached.
Comment 22 Steven Stern 2013-07-22 18:00:01 EDT
Created attachment 777091 [details]
dmesg output - disconnect following channel change from 3 to 3,+1

Here's /var/log/messages at that moment.

Jul 22 16:00:05 sds-desk-2 kernel: [355441.537592] wlp3s0: ieee80211_determine_chantype ht_oper ffff8801806f414e sband->ht_cap.ht_supported 1
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537594] wlp3s0: channel->center_freq 2422, ht_cfreq 2422
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537596] wlp3s0: 1 flags: 00000800
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537597] wlp3s0: 2 flags: 00000800
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537597] wlp3s0: chan: ffff8802113ba870 , ffff8802113ba870
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537598] wlp3s0: chan->center_freq: 2422 , 2422
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537599] wlp3s0: width: 2 , 1
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537599] wlp3s0: center_freq1: 2432 , 2422
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537600] wlp3s0: center_freq2: 0 , 0
Jul 22 16:00:05 sds-desk-2 kernel: [355441.537601] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 2 (2432/0 MHz)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787387] wlp3s0: ieee80211_determine_chantype ht_oper ffff8802106f614f sband->ht_cap.ht_supported 1
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787390] wlp3s0: channel->center_freq 2422, ht_cfreq 2442
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787390] wlp3s0: Wrong control channel: center-freq: 2422 ht-cfreq: 2442 ht->primary_chan: 7 band: 0 - Disabling HT
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787392] wlp3s0: 1 flags: 00000810
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787393] wlp3s0: 2 flags: 00000810
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787393] wlp3s0: chan: ffff8802113ba870 , ffff8802113ba870
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787394] wlp3s0: chan->center_freq: 2422 , 2422
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787395] wlp3s0: width: 1 , 2
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787395] wlp3s0: center_freq1: 2422 , 2432
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787396] wlp3s0: center_freq2: 0 , 0
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787397] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 1 (2422/0 MHz)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787398] wlp3s0: flags 00000810 , 00000800
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787398] wlp3s0: chandef_valid 1
Jul 22 16:00:08 sds-desk-2 kernel: [355443.787399] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth in a way we can't support - disconnect
Jul 22 16:00:08 sds-desk-2 NetworkManager[730]: <warn> Connection disconnected (reason -3)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.829095] cfg80211: Calling CRDA to update world regulatory domain
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967079] cfg80211: World regulatory domain updated:
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967081] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967082] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967083] cfg80211:   (2457000 KHz - 2482000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967083] cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967084] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967085] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.967179] cfg80211: Calling CRDA for country: US
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968363] cfg80211: Regulatory domain changed to country: US
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968364] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968365] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2700 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968366] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 1700 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968366] cfg80211:   (5250000 KHz - 5330000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968367] cfg80211:   (5490000 KHz - 5600000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968368] cfg80211:   (5650000 KHz - 5710000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968368] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 3000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355443.968369] cfg80211:   (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 4000 mBm)
Jul 22 16:00:08 sds-desk-2 NetworkManager[730]: <info> (wlp3s0): supplicant interface state: completed -> disconnected
Jul 22 16:00:08 sds-desk-2 NetworkManager[730]: <info> (wlp3s0): supplicant interface state: disconnected -> scanning
Jul 22 16:00:08 sds-desk-2 kernel: [355444.671691] wlp3s0: authenticate with 00:18:e7:f7:50:2a
Jul 22 16:00:08 sds-desk-2 kernel: [355444.671696] wlp3s0: ieee80211_determine_chantype ht_oper ffff8801f77968e4 sband->ht_cap.ht_supported 1
Jul 22 16:00:08 sds-desk-2 kernel: [355444.671697] wlp3s0: channel->center_freq 2422, ht_cfreq 2422
Jul 22 16:00:08 sds-desk-2 kernel: [355444.688940] wlp3s0: send auth to 00:18:e7:f7:50:2a (try 1/3)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.698196] wlp3s0: authenticated
Jul 22 16:00:08 sds-desk-2 kernel: [355444.699189] wlp3s0: associate with 00:18:e7:f7:50:2a (try 1/3)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.703250] wlp3s0: RX AssocResp from 00:18:e7:f7:50:2a (capab=0x431 status=0 aid=2)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.703410] wlp3s0: associated
Jul 22 16:00:08 sds-desk-2 NetworkManager[730]: <info> (wlp3s0): supplicant interface state: scanning -> authenticating
Jul 22 16:00:08 sds-desk-2 NetworkManager[730]: <info> (wlp3s0): supplicant interface state: authenticating -> associating
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709490] wlp3s0: ieee80211_determine_chantype ht_oper ffff880197c9e14e sband->ht_cap.ht_supported 1
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709492] wlp3s0: channel->center_freq 2422, ht_cfreq 2442
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709493] wlp3s0: Wrong control channel: center-freq: 2422 ht-cfreq: 2442 ht->primary_chan: 7 band: 0 - Disabling HT
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709494] wlp3s0: 1 flags: 00000810
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709495] wlp3s0: 2 flags: 00000810
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709496] wlp3s0: chan: ffff8802113ba870 , ffff8802113ba870
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709497] wlp3s0: chan->center_freq: 2422 , 2422
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709498] wlp3s0: width: 1 , 2
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709499] wlp3s0: center_freq1: 2422 , 2432
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709500] wlp3s0: center_freq2: 0 , 0
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709501] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 1 (2422/0 MHz)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709502] wlp3s0: flags 00000810 , 00000800
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709503] wlp3s0: chandef_valid 1
Jul 22 16:00:08 sds-desk-2 kernel: [355444.709504] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth in a way we can't support - disconnect
Jul 22 16:00:08 sds-desk-2 NetworkManager[730]: <info> (wlp3s0): supplicant interface state: associating -> associated
Jul 22 16:00:08 sds-desk-2 kernel: [355444.729309] cfg80211: Calling CRDA to update world regulatory domain
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730396] cfg80211: World regulatory domain updated:
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730398] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730399] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730400] cfg80211:   (2457000 KHz - 2482000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730401] cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730402] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730402] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.730410] cfg80211: Calling CRDA for country: US
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731388] cfg80211: Regulatory domain changed to country: US
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731390] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731391] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2700 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731391] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 1700 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731392] cfg80211:   (5250000 KHz - 5330000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731393] cfg80211:   (5490000 KHz - 5600000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731394] cfg80211:   (5650000 KHz - 5710000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731394] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 3000 mBm)
Jul 22 16:00:08 sds-desk-2 kernel: [355444.731395] cfg80211:   (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 4000 mBm)
Comment 23 Stanislaw Gruszka 2013-07-26 05:39:29 EDT
Here is the interesting part:

> [357926.090967] wlp3s0: ieee80211_determine_chantype ht_oper ffff8801806566e4 sband->ht_cap.ht_supported 1
> [357926.090969] wlp3s0: channel->center_freq 2422, ht_cfreq 2422
> [357926.109318] wlp3s0: send auth to 00:18:e7:f7:50:2a (try 1/3)
> [357926.110794] wlp3s0: authenticated
> [357926.111321] wlp3s0: associate with 00:18:e7:f7:50:2a (try 1/3)
> [357926.115030] wlp3s0: RX AssocResp from 00:18:e7:f7:50:2a (capab=0x431 status=0 aid=6)
> [357926.115142] wlp3s0: associated
> [357926.134397] wlp3s0: ieee80211_determine_chantype ht_oper ffff88008542614e sband->ht_cap.ht_supported 1
> [357926.134402] wlp3s0: channel->center_freq 2422, ht_cfreq 2442
> [357926.134404] wlp3s0: Wrong control channel: center-freq: 2422 ht-cfreq: 2442 ht->primary_chan: 7 band: 0 - Disabling HT
> [357926.134408] wlp3s0: 1 flags: 00000810
> [357926.134410] wlp3s0: 2 flags: 00000810
> [357926.134412] wlp3s0: chan: ffff8802113ba870 , ffff8802113ba870
> [357926.134415] wlp3s0: chan->center_freq: 2422 , 2422
> [357926.134416] wlp3s0: width: 1 , 2
> [357926.134418] wlp3s0: center_freq1: 2422 , 2432
> [357926.134419] wlp3s0: center_freq2: 0 , 0
> [357926.134422] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth, new config is 2422 MHz, width 1 (2422/0 MHz)
> [357926.134424] wlp3s0: flags 00000810 , 00000800
> [357926.134425] wlp3s0: chandef_valid 1
> [357926.134427] wlp3s0: AP 00:18:e7:f7:50:2a changed bandwidth in a way we can't support - disconnect

So in the HT information element the AP sends us the wrong primary_channel (channel 7 instead of channel 3). The problem is that this worked on the old kernel.

This check was added by:

commit f2d9d270c15ae0139b54a7e7466d738327e97e03
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Thu Nov 22 14:11:39 2012 +0100

    mac80211: support VHT association

and there is a comment noting that some APs send wrong information:


        if (channel->center_freq != ht_cfreq) {
                /*
                 * It's possible that some APs are confused here;
                 * Netgear WNDR3700 sometimes reports 4 higher than
                 * the actual channel in association responses, but
                 * since we look at probe response/beacon data here
                 * it should be OK.
                 */ 
                if (verbose)
                        sdata_info(sdata,
                                   "Wrong control channel: center-freq: %d ht-cfreq: %d ht->primary_chan: %d band: %d - Disabling HT\n",
                                   channel->center_freq, ht_cfreq,
                                   ht_oper->primary_chan, channel->band);
                ret = IEEE80211_STA_DISABLE_HT | IEEE80211_STA_DISABLE_VHT;
                goto out;
        }

Johannes, can we remove that check, i.e. add a warning that ht->primary_chan is wrong and continue?
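
The numbers in the debug output can be checked by hand. Below is a small Python sketch of the comparison the quoted check performs, using the standard 2.4 GHz channel numbering (centre frequency = 2407 + 5 × channel, in MHz). The function names are hypothetical and chosen for illustration only; this is not the kernel's implementation.

```python
def channel_to_freq_2ghz(chan):
    """Centre frequency in MHz of a 2.4 GHz channel number."""
    if chan == 14:
        return 2484  # channel 14 does not follow the 5 MHz spacing
    if 1 <= chan <= 13:
        return 2407 + 5 * chan
    raise ValueError(f"not a 2.4 GHz channel: {chan}")


def control_channel_matches(center_freq, ht_primary_chan):
    """Compare the channel we are tuned to with the HT primary channel.

    Returns (matches, ht_cfreq), where ht_cfreq is the frequency derived
    from the primary channel advertised in the HT information element.
    """
    ht_cfreq = channel_to_freq_2ghz(ht_primary_chan)
    return center_freq == ht_cfreq, ht_cfreq


def ht40_center(control_freq, secondary_above=True):
    """Centre of a 40 MHz channel built from a 20 MHz control channel."""
    return control_freq + 10 if secondary_above else control_freq - 10


# Values taken from the log in this comment: we are on channel 3
# (2422 MHz) while the AP advertises primary channel 7.
print(control_channel_matches(2422, 7))
# With the correct primary channel the check passes.
print(control_channel_matches(2422, 3))
# 40 MHz operation on channel 3 with the secondary channel above it.
print(ht40_center(2422))
```

Channel 3 maps to 2422 MHz and channel 7 to 2442 MHz, which matches the "center-freq: 2422 ht-cfreq: 2442" pair in the "Wrong control channel" message, and a 40 MHz channel with control channel 3 and the secondary channel above is centred on 2432 MHz, matching center_freq1 in the log.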
Comment 24 Johannes Berg 2013-07-26 06:21:42 EDT
I don't think that'd be a good idea. Also, if you look closely at the commit you quoted, it had the same check before, and also disabled HT in that case (and VHT wasn't supported).

I think the problem here is that when the AP switches, it suddenly has an invalid HT IE, that was previously valid. Or maybe it's valid in the probe response, but invalid in the beacon.

Previously we didn't really react properly to bandwidth changes at all, or didn't parse it all, so I think the reason for this surfacing is actually a different commit.
Comment 25 Stanislaw Gruszka 2013-07-26 07:27:09 EDT
Johannes pointed out on chat that the problem is caused by the AP sending different HT information in the PROBE RESPONSE frame than in the BEACON frame. This worked before with mac80211 (STA mode) because we did not validate channel parameters when changing bandwidth. Johannes proposed disabling validation on broken APs to restore the previous kernel behaviour. The patch is here:

http://p.sipsolutions.net/9d1dd0734d2c3a7a.txt

A kernel build with the patch is here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=5659623

Steven, please test it.
Comment 26 Paul Wouters 2013-07-29 13:14:51 EDT
This bug seems to be hitting people hard at IETF87 right now, both Fedora and Debian users. The reports I see are based on 3.10 kernels.
Comment 27 Paul Wouters 2013-07-29 13:16:02 EDT
Additionally, the combination of kernel and NM seems to be causing giant log files to be created, resulting in machines with a load > 25 and gigabytes of log files within a few minutes.
Comment 28 Steven Stern 2013-07-29 14:20:27 EDT
I'm still running the debug kernel. Does anyone know how to force an AP to go from 20MHz to 40MHz?
Comment 29 Hugo Salgado 2013-07-30 16:47:34 EDT
I confirm the bug still persists after the kernel-3.10.3-300.test_bz981445 upgrade.
Also I think there's some relation with the density of APs. I have this same problem reported here https://bugs.archlinux.org/task/15363

% iwlist wlan0 scan
print_scanning_info: Allocation failed
Comment 30 Johannes Berg 2013-07-31 05:51:02 EDT
Let's do this instead: http://p.sipsolutions.net/06427e0e6847cf9b.txt
Comment 31 Stanislaw Gruszka 2013-07-31 07:15:28 EDT
A build with the above patch is here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=5683174
Please test it. If it does not fix the problem, please provide dmesg output.
Comment 32 Stanislaw Gruszka 2013-07-31 07:18:59 EDT
(In reply to Hugo Salgado from comment #29)
> % iwlist wlan0 scan
> print_scanning_info: Allocation failed
This is a different issue. Wireless-tools, which iwlist belongs to, is no longer actively maintained. If the problem happens with iw, i.e. "iw dev wlan0 scan" fails to allocate enough memory, please open a separate bug for the iw component.
Comment 33 Chris Wright 2013-07-31 15:39:57 EDT
Build with 2 additional patches:

http://article.gmane.org/gmane.linux.kernel.wireless.general/111396
http://article.gmane.org/gmane.linux.kernel.wireless.general/111394

Launched here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=5686291

I believe this will fix the issues observed at IETF87 that Paul mentioned in comment #26.
Comment 34 Hugo Salgado 2013-08-01 02:54:50 EDT
(In reply to Chris Wright from comment #33)
> Build with 2 additional patches:
> 
> http://article.gmane.org/gmane.linux.kernel.wireless.general/111396
> http://article.gmane.org/gmane.linux.kernel.wireless.general/111394
> 
> Launched here:
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=5686291
> 
> I believe this is will fix the issues observed IETF87 that Paul mentioned in
> comment #26

Yes, problem solved.

Confirmed with the same AP that fails using the 3.9.11-200.fc18.x86_64 kernel.

Thanks a lot.
Comment 36 Josh Boyer 2013-08-01 08:50:55 EDT
Applied to F18-rawhide.  Thanks!
Comment 37 Steven Stern 2013-08-01 09:18:01 EDT
Thank you all very much!
Comment 38 Fedora Update System 2013-08-02 17:47:13 EDT
kernel-3.10.4-100.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.10.4-100.fc18
Comment 39 Fedora Update System 2013-08-03 19:57:21 EDT
Package kernel-3.10.4-100.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.10.4-100.fc18'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-14177/kernel-3.10.4-100.fc18
then log in and leave karma (feedback).
Comment 40 Hugo Salgado 2013-08-06 10:07:34 EDT
I've tried kernel-3.10.4-100.fc18 and it's working now, but sadly I can't test it with the same setup that failed before (it was at a conference that has now ended).
Thanks.
Comment 41 Fedora Update System 2013-08-07 16:47:56 EDT
kernel-3.10.5-201.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.10.5-201.fc19
Comment 42 Fedora Update System 2013-08-09 13:12:26 EDT
kernel-3.10.5-201.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 43 Fedora Update System 2013-08-10 16:10:01 EDT
kernel-3.10.4-100.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 44 Steven Stern 2013-08-10 16:34:12 EDT
I have changed channels 3 times since installing 3.10.5-201.fc19.x86_64.  The wifi connection is solid.  Thanks.
