Bug 1001041 - NetworkManager disconnects after cfg80211 calls CRDA to update world regulatory domain
NetworkManager disconnects after cfg80211 calls CRDA to update world regulato...
Status: CLOSED WORKSFORME
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
19
x86_64 Linux
unspecified Severity medium
: ---
: ---
Assigned To: Kernel Maintainer List
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-08-26 07:46 EDT by johnny.westerlund
Modified: 2013-11-19 02:18 EST (History)
14 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 811054
Environment:
Last Closed: 2013-11-19 02:18:26 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description johnny.westerlund 2013-08-26 07:46:05 EDT
+++ This bug was initially created as a clone of Bug #811054 +++

Description of problem:

I have one location where I cannot get wireless to work for my laptop. After talk with the network admins for this location, it appears that the disconnection is happening actively on my side.

A fuller log is attached to this bug report, but below this descrption is the part of the log is where the oddness seems to occur. NetworkManager is able to get a full connection, and then suddenly the kernel module cfg80211 calls for "CRDA to update world regulatory domain" and NM immediately drops the connection.

This cycle continues until I give up and go looking for an available wired connection.

This office has three Atheros access points all broadcasting the same SSID. The idea AIUI is the client will choose the AP with the strongest signal and switch to that one when roaming around the office. (I don't yet have the exact model of Atheros AP.) If that is affecting me, I cannot yet see how since I haven't been connected longer than 5 seconds.

I cannot tell if this is happening in NetworkManager or if NM is responding correctly to something odd from the cfg80211 module or rtlwifi driver. If I've falsely accused NM, mea culpa, and please pass me along to the correct component.

I should have the latest packages for this version of Fedora. I plan to do the jump to F17, if that contains an updated rtlwifi driver. Otherwise, if there is a way to test an updated driver, I'm all for it.

Version-Release number of selected component (if applicable):

Lenovo ThinkPad - X220 Tablet (4299-WG1)

kernel-2.6.42.12-1.fc15.x86_64

NetworkManager-0.9.1.90-3.git20110927.fc15.x86_64

03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8188CE 802.11b/g/n WiFi Adapter (rev 01)


How reproducible: Every time.


Steps to Reproduce:
1. Attempt to join 'NextSpaceSC' wifi
2. Connection works but mysteriously drops when cfg80211 starts messing about
3. Rinse, repeat.
  
Actual results:

No network connection.

Expected results:

Network connection maintained until I choose to disconnect (or some other force majeur I cannot control.)

Additional info:

Here is the log snippet I think is most relevant, with the rest attached:

<pre>
Apr  9 14:58:32 terpsichore kernel: [791944.201959] cfg80211: Calling CRDA to update world regulatory domain
Apr  9 14:58:32 terpsichore NetworkManager[902]: <info> (wlan1): supplicant interface state: completed -> disconnected
Apr  9 14:58:32 terpsichore kernel: [791944.221809] cfg80211: World regulatory domain updated:
Apr  9 14:58:32 terpsichore kernel: [791944.221812] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Apr  9 14:58:32 terpsichore kernel: [791944.221813] cfg80211:     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.221815] cfg80211:     (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.221816] cfg80211:     (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.221818] cfg80211:     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.221819] cfg80211:     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.221829] cfg80211: Calling CRDA for country: US
Apr  9 14:58:32 terpsichore kernel: [791944.228413] cfg80211: Regulatory domain changed to country: US
Apr  9 14:58:32 terpsichore kernel: [791944.228417] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Apr  9 14:58:32 terpsichore kernel: [791944.228419] cfg80211:     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2700 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.228422] cfg80211:     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 1700 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.228424] cfg80211:     (5250000 KHz - 5330000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.228426] cfg80211:     (5490000 KHz - 5600000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.228427] cfg80211:     (5650000 KHz - 5710000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr  9 14:58:32 terpsichore kernel: [791944.228429] cfg80211:     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 3000 mBm)
Apr  9 14:58:35 terpsichore NetworkManager[902]: <info> (wlan1): roamed from BSSID 00:18:0A:31:13:54 (NextSpaceSC) to (none) ((none))
Apr  9 14:58:47 terpsichore NetworkManager[902]: <warn> (wlan1): link timed out.
Apr  9 14:58:47 terpsichore NetworkManager[902]: <info> (wlan1): device state change: activated -> disconnected (reason 'supplicant-timeout') [100 30 11]
Apr  9 14:58:47 terpsichore NetworkManager[902]: <info> (wlan1): deactivating device (reason 'supplicant-timeout') [11]
Apr  9 14:58:47 terpsichore NetworkManager[902]: <info> (wlan1): canceled DHCP transaction, DHCP client pid 7547
Apr  9 14:58:47 terpsichore avahi-daemon[885]: Withdrawing address record for 192.168.100.73 on wlan1.
</pre>

--- Additional comment from Rick van Ek on 2012-04-24 09:40:29 EDT ---

Got the same error here, on linksys router I get disconnected every 30 seconds with the same messages appearing in the log file:

Apr 24 15:32:10 dumbledore kernel: [ 2349.259190] cfg80211: Calling CRDA to update world regulatory domain
Apr 24 15:32:10 dumbledore kernel: [ 2349.267240] cfg80211: World regulatory domain updated:
Apr 24 15:32:10 dumbledore kernel: [ 2349.267255] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Apr 24 15:32:10 dumbledore kernel: [ 2349.267258] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.267260] cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.267263] cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.267267] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.267270] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.267289] cfg80211: Calling CRDA for country: NL
Apr 24 15:32:10 dumbledore kernel: [ 2349.271767] cfg80211: Regulatory domain changed to country: NL
Apr 24 15:32:10 dumbledore kernel: [ 2349.271769] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
Apr 24 15:32:10 dumbledore kernel: [ 2349.271771] cfg80211:   (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.271773] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.271775] cfg80211:   (5250000 KHz - 5330000 KHz @ 40000 KHz), (N/A, 2000 mBm)
Apr 24 15:32:10 dumbledore kernel: [ 2349.271776] cfg80211:   (5490000 KHz - 5710000 KHz @ 40000 KHz), (N/A, 2700 mBm)
Apr 24 15:32:10 dumbledore iscsi[13246]: Stopping iscsi: [  OK  ]
Apr 24 15:32:10 dumbledore chronyd[1283]: Source 194.109.64.200 offline
Apr 24 15:32:10 dumbledore chronyd[1283]: Source 176.34.228.173 offline
Apr 24 15:32:10 dumbledore chronyd[1283]: Source 91.198.174.197 offline
Apr 24 15:32:10 dumbledore chronyd[1283]: Source 81.171.44.131 offline
Apr 24 15:32:11 dumbledore dhclient[13294]: Internet Systems Consortium DHCP Client 4.2.3-P2
Apr 24 15:32:11 dumbledore dhclient[13294]: Copyright 2004-2012 Internet Systems Consortium.
Apr 24 15:32:11 dumbledore dhclient[13294]: All rights reserved.
Apr 24 15:32:11 dumbledore dhclient[13294]: For info, please visit https://www.isc.org/software/dhcp/
Apr 24 15:32:11 dumbledore dhclient[13294]: 
Apr 24 15:32:12 dumbledore dhclient[13294]: Listening on LPF/wlan0/74:2f:68:ae:d6:4e
Apr 24 15:32:12 dumbledore dhclient[13294]: Sending on   LPF/wlan0/74:2f:68:ae:d6:4e
Apr 24 15:32:12 dumbledore dhclient[13294]: Sending on   Socket/fallback
Apr 24 15:32:12 dumbledore dhclient[13294]: DHCPREQUEST on wlan0 to 255.255.255.255 port 67
Apr 24 15:32:12 dumbledore dhclient[13294]: DHCPACK from 192.168.178.1
Apr 24 15:32:12 dumbledore dhclient[13294]: bound to 192.168.178.59 -- renewal in 423777 seconds.

On the other router Fritz box I do not have this problem. Still it is om my system as the same configuration was running without problems before. It apeared a few days back and is now getting worse. 
System is up to date with Fedora 16 and has been rebooted.

--- Additional comment from Dan Williams on 2012-04-24 17:41:04 EDT ---

NM is reporting that the supplicant wasn't able to reconnect to the AP after it got disconnected, presumably due to kernel driver behavior when the regulatory domain gets updated.  If somebody can add "-dddt" to the end of the Exec= line in /usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service and then 'killall -TERM wpa_supplicant', reproduce the problem, then attach /var/log/wpa_supplicant.log that'll help a lot.

--- Additional comment from Rick van Ek on 2012-04-25 06:33:45 EDT ---

Added the -dddt flag in the file, rebooted the system ( to run on the latest version of kernel)
Switched to the 192.168.123 network which is having the most problems. 
After a few errors switched back to the 192.168.178 network, which does not have these problems. 
copied the wpa_supplicant.log for attachment.

--- Additional comment from Dan Winship on 2012-05-02 13:11:54 EDT ---

Not convinced the regulatory domain message is actually relevant to the bug; that's *always* the first thing you see in /var/log/messages when NM starts a reconnection attempt.

--- Additional comment from  on 2012-05-10 10:11:58 EDT ---

I'm having similar problems. I'm at a location who also has a couple of AP's with the same SSID. I get a working connection for a while and when it drops i need to restart my machine to get it working again. But after another 5-10 minutes i loose it again.

Computer Lenovo T510

lspci
03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8191SEvB Wireless LAN Controller (rev 10)

uname -r
3.3.4-3.fc16.x86_64

cat /etc/fedora-release 
Fedora release 16 (Verne)

rpm -qa | grep -i networkmanager
NetworkManager-vpnc-0.9.0-1.fc16.x86_64
NetworkManager-openconnect-0.9.0-3.fc16.x86_64
NetworkManager-glib-0.9.4-3.git20120403.fc16.x86_64
NetworkManager-pptp-0.9.0-1.fc16.x86_64
NetworkManager-0.9.4-3.git20120403.fc16.x86_64
evolution-NetworkManager-3.2.3-3.fc16.x86_64
NetworkManager-gtk-0.9.4-3.git20120403.fc16.x86_64
NetworkManager-openvpn-0.9.0-1.fc16.x86_64
NetworkManager-gnome-0.9.4-3.git20120403.fc16.x86_64

Will attach wpa_supplicant / messages / kernellog with rtl8192se kernel module with debug=5

File logs.tar

--- Additional comment from  on 2012-05-10 10:14:07 EDT ---



--- Additional comment from Karsten Wade on 2012-07-09 18:16:05 EDT ---

I just completed a preupgrade to Fedora 17, and have new kernel and firmware on the same Thinkpad X220T. I'm at the location where I experience this problem, and it seems to be exactly the same as before.

03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8188CE 802.11b/g/n WiFi Adapter (rev 01)

3.4.4-3.fc17.x86_64

NetworkManager-gtk-0.9.4.0-9.git20120521.fc17.x86_64
NetworkManager-gnome-0.9.4.0-9.git20120521.fc17.x86_64
NetworkManager-vpnc-0.9.3.997-1.fc17.x86_64
NetworkManager-glib-0.9.4.0-9.git20120521.fc17.x86_64
NetworkManager-openvpn-0.9.3.997-1.fc17.x86_64
NetworkManager-openconnect-0.9.4.0-3.fc17.x86_64
NetworkManager-pptp-0.9.3.997-1.fc17.x86_64
NetworkManager-0.9.4.0-9.git20120521.fc17.x86_64

Attached: wpa_supplicant log, messages log

--- Additional comment from Karsten Wade on 2012-07-09 18:17:05 EDT ---



--- Additional comment from Karsten Wade on 2012-07-09 18:18:24 EDT ---

Covers the time from suspending, attempting to connect, and killing the radio via the manual killswitch.

--- Additional comment from Fedora End Of Life on 2012-08-07 12:06:54 EDT ---

This message is a notice that Fedora 15 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 15. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. At this time, all open bugs with a Fedora 'version'
of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we were unable to fix it before Fedora 15 reached end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to click on
"Clone This Bug" (top right of this page) and open it against that
version of Fedora.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 1 johnny.westerlund 2013-08-26 07:53:10 EDT
Reopening this bug since i can still reproduce it on fedora 19.


https://bugzilla.redhat.com/show_bug.cgi?id=864906
More info on the problem in the above bug report, although i'm only reopening this one.

In trying to debug this in prior fedora versions i encountered a problem with the debug options on the RTL cards. 

Since then debug kernels are supposed to have RTL_WIFI_DEBUG option compiled, but it still doesnt. 

cat config-3.10.9-200.fc19.x86_64.debug | grep RTL
CONFIG_USB_RTL8150=m
CONFIG_USB_RTL8152=m
CONFIG_RTL8180=m
CONFIG_RTL8187=m
CONFIG_RTL8187_LEDS=y
CONFIG_RTLWIFI=m
# CONFIG_RTLWIFI_DEBUG is not set

Acording to this bugzilla entry debug kernels should contain the relevant options to debug.

https://bugzilla.redhat.com/show_bug.cgi?id=889425
Comment 2 Michele Baldessari 2013-11-16 16:34:04 EST
Something like this I guess:
diff --git a/Makefile b/Makefile
index 178f4f2..93dbd66 100644
--- a/Makefile
+++ b/Makefile
@@ -92,6 +92,7 @@ debug:
        @perl -pi -e 's/# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set/CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y/' config-nodebug
        @perl -pi -e 's/# CONFIG_DEBUG_KMEMLEAK is not set/CONFIG_DEBUG_KMEMLEAK=y/' config-nodebug
        @perl -pi -e 's/# CONFIG_X86_DEBUG_STATIC_CPU_HAS is not set/CONFIG_X86_DEBUG_STATIC_CPU_HAS=y/' config-nodebug
+       @perl -pi -e 's/# CONFIG_RTLWIFI_DEBUG is not set/CONFIG_RTLWIFI_DEBUG=y/' config-nodebug
 
        @# just in case we're going from extremedebug -> debug
        @perl -pi -e 's/CONFIG_DEBUG_PAGEALLOC=y/# CONFIG_DEBUG_PAGEALLOC is not set/' config-nodebug


Josh/Kernel folks, I noticed a couple of other discrepancies (i.e. in debug we are missing MAC80211_MESSAGE_TRACING). Would a patch fixing them all + sorting 
them in both Makefile and Makefile.release be ok?
Comment 3 Josh Boyer 2013-11-18 11:00:51 EST
Adding John to CC.

I'll look at the RTLWIFI_DEBUG today.  For MAC80211_MESSAGE_TRACING, I think we left that off on purpose because it can be very spammy and make things even slower on debug kernels than they already are.  I could be misremembering though.  John, any thoughts on this option for debug/rawhide kernels?
Comment 4 John W. Linville 2013-11-18 11:58:43 EST
Josh, I think you have it right re: MAC80211_MESSAGE_TRACING.
Comment 5 Josh Boyer 2013-11-18 13:37:21 EST
OK, I looked at this.  The config options are correct on 3.11 based kernels.  RTLWIFI_DEBUG is set on -debug kernels and not on release kernels.  There's nothing to really fix there.
Comment 6 Michele Baldessari 2013-11-18 14:04:48 EST
Hohum not sure at which branch I looked then. Sorry about that.

Johnny, can you close this BZ please?
Comment 7 johnny.westerlund 2013-11-19 02:18:26 EST
Closing since the debug options are now present in the debug kernel. Thanks all for the help.

Note You need to log in before you can comment on or make changes to this bug.