Bug 996323 - Broadcom WLAN fails to reauthenticate [NEEDINFO]
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
19
x86_64 Linux
unspecified Severity high
Assigned To: fedora-kernel-wireless-brcm80211
Fedora Extras Quality Assurance
Reported: 2013-08-12 20:13 EDT by scd
Modified: 2014-06-23 10:40 EDT (History)
Doc Type: Bug Fix
Last Closed: 2014-06-23 10:40:42 EDT
Type: Bug
jforbes: needinfo?


Attachments:
900X3A dmesg (83.53 KB, text/plain)
2013-08-21 15:54 EDT, scd
debug log from wpa_supplicant (45.82 KB, text/x-log)
2013-09-02 16:34 EDT, scd
Driver trace from kernel 3.13.5-103 with CONFIG_BRCM_TRACING enabled over ca. 15 minutes. (622.36 KB, application/x-xz)
2014-03-14 06:01 EDT, scd

Description scd 2013-08-12 20:13:57 EDT
Description of problem:

After successfully establishing a connection at boot, the kernel is unable to reauthenticate the WLAN interface and loops forever with these error messages (MAC address anonymized):

wlp1s0: authenticate with 00:ff:ff:ff:ff:ff
wlp1s0: direct probe to 00:ff:ff:ff:ff:ff (try 1/3)
wlp1s0: direct probe to 00:ff:ff:ff:ff:ff (try 2/3)
wlp1s0: direct probe to 00:ff:ff:ff:ff:ff (try 3/3)
wlp1s0: authentication with 00:ff:ff:ff:ff:ff timed out
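
To quantify how often this loop occurs, the timeout lines can be counted in a saved kernel log. This is a minimal sketch, assuming the log text is supplied on standard input; the function name count_auth_timeouts is illustrative and not part of any existing tool.

```shell
# Count "authentication with <MAC> timed out" lines in kernel log text on stdin.
# count_auth_timeouts is a hypothetical helper name, not an existing command.
count_auth_timeouts() {
    grep -c -E 'authentication with [0-9A-Fa-f:]+ timed out'
}

# Example usage (not run here): dmesg | count_auth_timeouts
```

Comparing the count across kernel versions over the same time span would show whether an update changes the failure rate at all.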

Version-Release number of selected component (if applicable):

All Fedora 19 kernels up to 3.10.5-201

How reproducible:

Configure WLAN connection, start Fedora.

Actual results:

Reauthentication fails, rendering the WLAN interface useless (therefore I consider this a high priority bug).

Expected results:

Kernel should reauthenticate.
This worked fine on the same laptop with Fedora 15 -- 17

Additional info:

Samsung 900X3A laptop, Broadcom BCM43225 802.11b/g/n WLAN interface.

This bug might be a regression, I think I saw the same behaviour with a former Fedora release.
wpa_supplicant logfile contains error messages:

wlp1s0: No keys have been configured - skip key clearing

so the problem might also come from NetworkManager failing to provide the WPA secret key.
Comment 1 scd 2013-08-21 11:25:33 EDT
Hm... a week over and nobody cared to comment.
Am I the only one experiencing this problem??
If anyone else should be concerned, upgrading to kernel version 3.10.7 didn't change anything.
Comment 2 Arend van Spriel 2013-08-21 15:10:21 EDT
Hmm... my bad. So would you be able to provide the usual info, ie. lspci, dmesg, etc. What kernel versions are F15 and F17 running?
Comment 3 Josh Boyer 2013-08-21 15:13:54 EDT
F15 ended with 2.6.43.8.  F16 ended with 3.6.11.  F17 just recently went EOL with 3.9.10.

F18 and F19 are both now on 3.10.7, moving to 3.10.9 soon.
Comment 4 scd 2013-08-21 15:54:52 EDT
Created attachment 788989 [details]
900X3A dmesg

dmesg output of 900X3A, containing error messages of failed WLAN reauthentication
Comment 5 scd 2013-08-21 15:57:38 EDT
And the output of lspci -vvv for the WLAN NIC below.
The device worked fine with all F17 kernels, and AFAIK showed a similar error with one of the F15 kernels (sorry can't remember exactly which one it was).

01:00.0 Network controller: Broadcom Corporation BCM43225 802.11b/g/n (rev 01)
	Subsystem: Askey Computer Corp. Device 7181
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at c0600000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: bcma-pci-bridge
Comment 6 John Greene 2013-08-23 10:31:58 EDT
Wonder if the recent fix in this BZ might help you too; not too wild a guess.
Can you try this?

https://bugzilla.redhat.com/show_bug.cgi?id=989269
The net of it is:
kernel-3.10.9-200.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.


A lot of BZs..might be a quick data point if not a fix.
Comment 7 scd 2013-08-23 12:21:02 EDT
Sorry, upgrading to 3.10.9-200 didn't change anything, same failure.
Comment 8 scd 2013-09-02 14:04:28 EDT
3.10.10-200, still broken.

Do you need any more information to fix this?
(In reply to scd from comment #7)
> Sorry, upgrading to 3.10.9-200 didn't change anything, same failure.
Comment 9 Arend van Spriel 2013-09-02 14:31:42 EDT
Looking at dmesg more closely it does not get connected at all. Can you make a wpa_supplicant log? Or wireless capture using a sniffer?
Comment 10 scd 2013-09-02 16:34:22 EDT
Created attachment 792977 [details]
debug log from wpa_supplicant

Sure -- here is the wpa_supplicant output.
Comment 11 John Greene 2013-09-03 10:10:25 EDT
From your logs: Wondering if the AP you are having issues with has a MAC filter?  It's not allowing even auth..
Comment 12 scd 2013-09-03 10:32:28 EDT
The AP has a MAC filter, but the 900X3A's WLAN NIC is explicitly white-listed there.

The problem occurs only during reauthentication: on bootup (or NetworkManager service restart) the AP connects fine, but after some minutes the connection breaks down due to the reauthentication failure.
And again -- same AP, same 900X3A worked completely fine together before the notebook was upgraded to F19.
Comment 13 scd 2013-09-03 17:37:16 EDT
After restarting NetworkManager, the following appeared in dmesg

brcmsmac bcma0:0: brcms_ops_bss_info_changed: qos enabled: false (implement)
brcmsmac bcma0:0: brcms_ops_config: change power-save mode: false (implement)
IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready

followed by the usual failed authentication attempts.
Comment 14 scd 2013-09-04 17:44:32 EDT
And another message from the Broadcom driver (also after restarting NetworkManager):

brcmsmac bcma0:0: wl0: brcms_c_d11hdrs_mac80211:  txop exceeded phylen 146/256 dur 1674/1504
brcmsmac bcma0:0: wl0: brcms_c_d11hdrs_mac80211:  txop exceeded phylen 146/256 dur 1674/1504
Comment 15 Josh Boyer 2013-09-18 16:51:31 EDT
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.11.1-200.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.
Comment 16 John Greene 2013-09-20 11:23:05 EDT
I have seen a number of issues with PHY problems that BCM is trying to isolate (They grabbed your data above).  They seem to have begun around the 3.9 time frame, and I've posted some patches upstream to reduce log spam for APs that are 802.11n enabled.  

The issues are PHY related apparently, not sure yet what the resolution will be.  Till that time, I'd suggest pre-3.9 kernel that worked for you may be best bet if possible to keep you functioning.  
Is this confined to the one AP or more than one?

Sorry I can't be of more help as yet..
Comment 17 scd 2013-09-30 17:17:11 EDT
Sorry for the delay.. unfortunately, 3.11.1-200 doesn't fix it.

On further investigation, it turns out to be a range problem -- if the 900X3A is within 1.5 m of the AP, the Broadcom NIC works OK (but very slowly, see below).
From 3-4 m it gets flaky, and further away reauthentication fails (but only reauthentication; the first handshake after the AP is detected is always OK).

This happens with 2 APs (one is a Fritzfon, the other a standard German Telekom AP), and obviously it was not the case with pre fc19 kernels -- the Broadcom NIC used to work fine also in much greater distance from the APs, and with much more obstacles between the notebook and the AP.

Here are a few measurements with iwconfig (look at the ridiculous bit rates):

# 3m from AP 1
wlp1s0    IEEE 802.11bgn  ESSID:"oenet"  
          Mode:Managed  Frequency:2.442 GHz  Access Point: 00:12:BF:7F:39:42   
          Bit Rate=6 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=30/70  Signal level=-80 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:57   Missed beacon:0

# 40cm from AP 1
wlp1s0    IEEE 802.11bgn  ESSID:"oenet"  
          Mode:Managed  Frequency:2.442 GHz  Access Point: 00:12:BF:7F:39:42   
          Bit Rate=18 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=38/70  Signal level=-72 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:57   Missed beacon:0

# 20cm from AP 1
wlp1s0    IEEE 802.11bgn  ESSID:"oenet"  
          Mode:Managed  Frequency:2.442 GHz  Access Point: 00:12:BF:7F:39:42   
          Bit Rate=54 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=46/70  Signal level=-64 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:59   Missed beacon:0
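
To compare such readings more easily, the quality and signal fields can be pulled out of the iwconfig text. This is a rough sketch that assumes the "Link Quality=N/M  Signal level=X dBm" layout shown above; the helper name link_quality_percent is made up for illustration.

```shell
# Read iwconfig output on stdin and print "<quality>% <signal> dBm" per interface.
# Assumes the "Link Quality=N/M  Signal level=X dBm" layout seen in this report.
# link_quality_percent is a hypothetical helper name.
link_quality_percent() {
    sed -n 's#.*Link Quality=\([0-9]*\)/\([0-9]*\) *Signal level=\(-\{0,1\}[0-9]*\) dBm.*#\1 \2 \3#p' |
        awk '{ printf "%d%% %s dBm\n", ($1 * 100) / $2, $3 }'
}

# Example usage (not run here): iwconfig wlp1s0 | link_quality_percent
```

The percentage is truncated to a whole number, so 30/70 is reported as 42%.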
Comment 18 scd 2013-10-05 13:36:55 EDT
Same bug with 3.11.2-201.
Comment 19 scd 2013-10-06 09:28:17 EDT
Same bug with kernel 3.11.3-201.
Comment 20 John Greene 2013-10-15 11:57:25 EDT
I do see a number of changes of upstream dated Aug 20 and later from vendor that address radio power control issues.  Checking to see what you have already tested with the above releases.
Comment 21 John Greene 2013-10-15 13:50:42 EDT
I see a number of fixes that landed for this driver in 3.11.5 that address radio and phy sections, and provide one enabling more debug info in the area if these don't fix your problem. Specifically, they are:

c7515d2 brcmsmac: call bcma_core_pci_power_save() from non-atomic context
aa51e59 brcmsmac: use bcma PCIe up and down functions
20c7d42 brcmsmac: add support for BCM4313 iPA variant
d37c8f0 brcmsmac: reinitialize TSSI power control upon channel switch
118e545 brcmsmac: correct phy registers for TSSI-based power control
02fcc75 brcmsmac: rework switch control table init including iPA BT-combo
3e72ef7 brcmsmac: avoid calling set_txpwr_by_index() twice
d50ec00 brcmsmac: fix TSSI idle estimation
7de6468 brcmsmac: change lcnphy receive i/q calibration routine
ab9a50e brcmsmac: update transmit gain table for lcn phy
67e39c5 brcmsmac: add debug info message providing phy and radio info
d6b81da brcmsmac: use ARRAY_SIZE in phytbl_lcn.c
acf97e9 brcmsmac: change pa_gain for bcm4313 iPA

Can you please try that kernel?
Comment 22 scd 2013-10-23 16:52:29 EDT
Thanks for the hints!

Unfortunately, neither 3.11.5 nor 3.11.6 provide a fix.

But with 3.11.6, additional Broadcom-related messages appear when booting:

Support for cores revisions 0x17 and 0x18 disabled by module param allhwsupport=0. Try b43.allhwsupport=1
b43: probe of bcma0:0 failed with error -524

I tried reloading b43 with the suggested parameter setting, but the bug persists.
Comment 23 Arend van Spriel 2013-10-24 04:07:15 EDT
(In reply to scd from comment #22)
> Thanks for the hints!
> 
> Unfortunately, neither 3.11.5 nor 3.11.6 provide a fix.
> 
> But with 3.11.6, additional Broadcom-related messages appear when booting:
> 
> Support for cores revisions 0x17 and 0x18 disabled by module param
> allhwsupport=0. Try b43.allhwsupport=1
> b43: probe of bcma0:0 failed with error -524
> 
> I tried reloading b43 with the suggested parameter setting, but the bug
> persists.

Now you got me going bonkers. The report is all about brcmsmac and now you switch to b43 driver? Anyway, so regardless the driver you are having the same issue. Probably need to look higher up in the stack, ie. mac80211, cfg80211, wpa_supplicant, or even NetworkManager.
Comment 24 Justin M. Forbes 2014-01-03 17:08:12 EST
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.
Comment 25 scd 2014-01-18 10:05:09 EST
Sorry -- as this bug effectively puts an end to mobile computing with my notebook, I installed all kernel updates for FC19 as soon as they became available, but none of them has fixed the bug so far (up to 3.12.7-200).

Right now I'm connected via WLAN, but this works only because I can place the 900X3A ca. 10" from the router: iwlist reports the connection as Quality=54/70 with a signal level of -54dBm, whereas 7-8m distance brings this down to 16/70 with below -70dBm and constant reauthentication failures resulting in Network Manager dropping the connection.
Comment 26 scd 2014-01-20 17:24:15 EST
3.12.8-200 is no fix either.
Comment 27 scd 2014-02-11 14:56:43 EST
Unfortunately kernel 3.12.9 is no fix either.

Quite the contrary: just now, iwlist wlp1s0 scan reported a quality of only 15/70 with the notebook a mere 20 cm from the WLAN AP -- but reloading the brcmsmac driver brought this up to 51/70, resulting in a stable connection.
Comment 28 scd 2014-02-18 14:59:50 EST
3.12.11 is no fix either.
Comment 29 scd 2014-03-01 19:02:33 EST
3.13.5 is no fix either.

First I thought that the bug was gone, but after 10 minutes reauthentication failures started again.
6 whole months w/o a fix -- this is really getting frustrating...
Comment 30 scd 2014-03-09 18:08:10 EDT
Same failure for 3.13.5-103.
Comment 31 Justin M. Forbes 2014-03-10 10:46:46 EDT
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.13.5-100.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.
Comment 32 scd 2014-03-11 13:18:55 EDT
(In reply to Justin M. Forbes from comment #31)

As written above in comments 29 and 30, unfortunately neither 3.13.5-100.fc19 nor 3.13.5-103.fc19 makes any difference: the buggy behaviour is the same as with the older kernels.
Comment 33 Arend van Spriel 2014-03-11 13:59:25 EDT
Can you do the following:
1. make sure no driver uses your card
    so no 'kernel driver in use:' entry when doing 'lspci -vs 1:0.0'.
2. assure your kernel config has CONFIG_BRCM_TRACING enabled.
3. execute following command:

$ sudo modprobe brcmsmac && sudo trace-cmd record -e 'brcmsmac:*'

4. after a while press ctrl-c and upload the trace.dat file.
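
Steps 1 and 2 above can be checked from text output before starting the trace. The following is only a sketch: the helper names driver_in_use and has_brcm_tracing are invented for illustration, and reading the kernel config from /boot/config-$(uname -r) is an assumption about where the distribution installs it.

```shell
# Print the driver named on the "Kernel driver in use:" line of lspci output (stdin).
# Prints nothing when no driver is bound. Hypothetical helper name.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Succeed if kernel config text on stdin has CONFIG_BRCM_TRACING built in.
# Hypothetical helper name.
has_brcm_tracing() {
    grep -q '^CONFIG_BRCM_TRACING=y$'
}

# Example usage (not run here; the config path is an assumption):
#   lspci -ks 01:00.0 | driver_in_use
#   has_brcm_tracing < "/boot/config-$(uname -r)" && echo "tracing enabled"
```

An empty result from driver_in_use corresponds to step 1 being satisfied.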
Comment 34 Arend van Spriel 2014-03-11 15:50:15 EDT
Another thing to look at. Can you provide the contents of /sys/kernel/debug/brcmsmac/bcma*/hardware?

Also try to add monitor interface:

$ sudo iw dev wlp1s0 interface add mon0 type monitor
$ sudo ifconfig mon0 up

and capture packets using wireshark.
Comment 35 scd 2014-03-14 05:59:44 EDT
(In reply to Arend van Spriel from comment #33)

Here you are -- BRCM driver traces over ca. 15 minutes, should contain dozens of failed reconnects.
Comment 36 scd 2014-03-14 06:01:54 EDT
Created attachment 874304 [details]
Driver trace from kernel 3.13.5-103 with CONFIG_BRCM_TRACING enabled over ca. 15 minutes.
Comment 37 Arend van Spriel 2014-03-14 06:14:47 EDT
(In reply to scd from comment #36)
> Created attachment 874304 [details]
> Driver trace from kernel 3.13.5-103 with CONFIG_BRCM_TRACING enabled over
> ca. 15 minutes.

Well, that seems to have failed somehow. I mean making the trace:

version = 6
CPU 1 is empty
CPU 2 is empty
CPU 3 is empty
cpus=4
     kworker/0:0-17338 [000] 26066.939694: brcms_macintstatus:   [bcma0:0] in_isr=1 macintstatus=0x4004 mask=0xb027a864
 abrt-action-che-17882 [000] 26067.042074: brcms_macintstatus:   [bcma0:0] in_isr=1 macintstatus=0x4004 mask=0xb027a864
     kworker/0:2-16215 [000] 26067.144470: brcms_macintstatus:   [bcma0:0] in_isr=1 macintstatus=0x4004 mask=0xb027a864
 abrt-action-not-17898 [000] 26067.246829: brcms_macintstatus:   [bcma0:0] in_isr=1 macintstatus=0x4004 mask=0xb027a864

The trace should have data from all 4 cpus. What trace-cmd version did you use? Running trace-cmd without any option will tell you.
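
A quick way to see which CPUs actually contributed events is to tally the bracketed CPU field in the text report. This is a sketch that assumes the line layout shown in the excerpt above (task, "[cpu]", timestamp, event); events_per_cpu is a made-up helper name.

```shell
# Tally trace events per CPU from textual trace output on stdin.
# Assumes lines shaped like: "<task>-<pid> [000] 26066.939694: <event>: ...".
# events_per_cpu is a hypothetical helper name.
events_per_cpu() {
    sed -n 's/.*\[\([0-9][0-9]*\)\] [0-9.]*: .*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Example usage (not run here): trace-cmd report | events_per_cpu
```

For the four event lines quoted above this prints a single row for CPU 000, which matches the "CPU n is empty" notices for the other three.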
Comment 38 scd 2014-03-14 06:20:14 EDT
Sorry... just followed your instructions, didn't check tracefile content.
trace-cmd is version 2.1.0-1.fc19, installed via yum.
Which version should I use?
Comment 39 Arend van Spriel 2014-03-16 04:44:18 EDT
I had a similar issue because the kernel requires signed modules and taints when an unsigned module is loaded. Can you check the contents of /proc/sys/kernel/tainted?
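
The value in /proc/sys/kernel/tainted is a bitmask, so individual flags can be tested with shell arithmetic. A small sketch follows, with taint_bit_set as a hypothetical helper name. In the kernel's tainted-kernels documentation, bit 1 (value 2) marks a force-loaded module and bit 13 (value 8192) an unsigned module; which of these a particular kernel version sets for an unsigned module is an assumption to verify against that kernel's own documentation.

```shell
# Succeed if bit number $2 is set in the decimal taint value $1.
# taint_bit_set is a hypothetical helper name.
taint_bit_set() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# Example usage (not run here; bit meanings per the tainted-kernels documentation):
#   t=$(cat /proc/sys/kernel/tainted)
#   taint_bit_set "$t" 13 && echo "unsigned module loaded"
#   taint_bit_set "$t" 1  && echo "module force-loaded"
```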
Comment 40 Justin M. Forbes 2014-05-21 15:30:08 EDT
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.14.4-100.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.
Comment 41 Justin M. Forbes 2014-06-23 10:40:42 EDT
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.
