Bug 1102365
Summary: | Qualcomm Atheros AR93xx disconnects immediately | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Eric Griffith <EGriffith92> |
Component: | NetworkManager | Assignee: | fedora-kernel-wireless-ath |
Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 20 | CC: | dcbw, EGriffith92, gansalmon, itamar, jogreene, jonathan, kernel-maint, madhu.chinakonda, mchehab, sujith |
Target Milestone: | --- | Keywords: | Desktop |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2015-06-29 20:51:14 UTC | Type: | Bug |
Description
Eric Griffith
2014-05-28 20:20:32 UTC
Accidentally hit enter. Let's try this again...

==VERSIONS==
Kernel version: Linux erics-desktop 3.14.4-200.fc20.x86_64 #1 SMP Tue May 13 13:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
NetworkManager version: 0.9.9.0-38.git20131003.fc20
WPA Supplicant version: wpa_supplicant v2.0

==HARDWARE IN QUESTION==
02:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
    Subsystem: Qualcomm Atheros Device 3112
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 36
    Region 0: Memory at fe900000 (64-bit, non-prefetchable) [size=128K]
    Expansion ROM at fe920000 [disabled] [size=64K]
    Capabilities: <access denied>
    Kernel driver in use: ath9k
    Kernel modules: ath9k

==MODULE OPTIONS==
options ath9k nohwcrypt=1
options ath9k nohwcrypt=0
Neither setting changes outcome of problem.

==Problem==
Finally got around to installing Fedora 20 on a spare hard drive for my desktop. I'm currently connected to the network via tethering from my phone via USB. If I unplug my phone, NetworkManager jumps in and tries to connect to wifi.. it eventually does. It will stay connected (visibly) but in the background it's total chaos and dmesg is getting flooded.

Proof of connection:
[egriffith@erics-desktop ~]$ iwconfig
wlp2s0    IEEE 802.11abgn  ESSID:"WhoKnows-5Ghz"
          Mode:Managed  Frequency:5.785 GHz  Access Point: 40:16:7E:59:A7:44
          Bit Rate=6.5 Mb/s   Tx-Power=30 dBm
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=59/70  Signal level=-51 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:25  Invalid misc:1   Missed beacon:0
p3p1      no wireless extensions.
lo        no wireless extensions.
enp0s16f0u1  no wireless extensions.

And a few seconds later, if I re-run it...

Proof of disconnect:
[egriffith@erics-desktop ~]$ iwconfig
wlp2s0    IEEE 802.11abgn  ESSID:off/any
          Mode:Managed  Access Point: Not-Associated   Tx-Power=30 dBm
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
p3p1      no wireless extensions.
lo        no wireless extensions.
enp0s16f0u1  no wireless extensions.

Meanwhile in dmesg...
[ 1402.417110] wlp2s0: deauthenticated from 40:16:7e:59:a7:44 (Reason: 15)
[ 1402.436890] cfg80211: Calling CRDA to update world regulatory domain
[ 1402.441232] cfg80211: World regulatory domain updated:
[ 1402.441238] cfg80211: DFS Master region: unset
[ 1402.441241] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[ 1402.441244] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[ 1402.441247] cfg80211: (2457000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[ 1402.441250] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm)
[ 1402.441252] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz), (N/A, 2000 mBm)
[ 1402.441254] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm)
[ 1402.441257] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm)
[ 1402.441290] cfg80211: Calling CRDA for country: US
[ 1402.443796] cfg80211: Regulatory domain changed to country: US
[ 1402.443800] cfg80211: DFS Master region: unset
[ 1402.443802] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[ 1402.443804] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 3000 mBm)
[ 1402.443806] cfg80211: (5170000 KHz - 5250000 KHz @ 80000 KHz), (N/A, 1700 mBm)
[ 1402.443807] cfg80211: (5250000 KHz - 5330000 KHz @ 80000 KHz), (N/A, 2300 mBm)
[ 1402.443809] cfg80211: (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 3000 mBm)
[ 1402.443810] cfg80211: (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 4000 mBm)
[ 1404.194907] wlp2s0: authenticate with 40:16:7e:59:a7:44
[ 1404.200138] wlp2s0: send auth to 40:16:7e:59:a7:44 (try 1/3)
[ 1405.204415] wlp2s0: authenticated
[ 1405.219087] wlp2s0: associate with 40:16:7e:59:a7:44 (try 1/3)
[ 1406.214360] wlp2s0: associate with 40:16:7e:59:a7:44 (try 2/3)
[ 1407.219249] wlp2s0: associate with 40:16:7e:59:a7:44 (try 3/3)
[ 1407.222431] wlp2s0: RX AssocResp from 40:16:7e:59:a7:44 (capab=0x11 status=0 aid=2)
[ 1407.222487] wlp2s0: associated
[ 1415.409345] wlp2s0: deauthenticated from 40:16:7e:59:a7:44 (Reason: 15)

And it loops like that, over and over. Always "Reason: 15".

Attached is a log file of everything "journalctl -b" gave.

Thoughts? "Use Ethernet" isn't a valid solution.

For the record, the same issues happened on 3.11 (the default kernel for the installation media), so this is not a recent regression unless it got fixed and then re-broken sometime in between. All currently available updates are installed.

Created attachment 900132 [details]
First journal log file.
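To quantify the loop in the dmesg excerpt above, a small script can pair each "associated" line with the next "deauthenticated" line and report how long the link survived. This is only a sketch: the function name `link_lifetimes` and the regular expression are illustrative assumptions about the log format shown in this report, and the reason-code table is a partial subset of the IEEE 802.11 reason codes.

```python
import re

# Partial subset of IEEE 802.11 deauthentication/disassociation reason codes.
REASON_CODES = {
    1: "unspecified",
    2: "previous authentication no longer valid",
    3: "deauthenticated because the station is leaving",
    4: "disassociated due to inactivity",
    14: "MIC failure",
    15: "4-way handshake timeout",
    16: "group key handshake timeout",
}

# Matches "[ 1407.222487] wlp2s0: associated" and
# "[ 1415.409345] wlp2s0: deauthenticated from <bssid> (Reason: 15)".
EVENT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<iface>\S+): "
    r"(?:associated|deauthenticated from \S+ \(Reason: (?P<reason>\d+)\))"
)


def link_lifetimes(dmesg_text):
    """Return (iface, seconds_associated, reason_code, reason_text) per drop."""
    associated_at = {}
    drops = []
    for match in EVENT.finditer(dmesg_text):
        ts = float(match.group("ts"))
        iface = match.group("iface")
        reason = match.group("reason")
        if reason is None:
            # An "associated" line: remember when the link came up.
            associated_at[iface] = ts
        elif iface in associated_at:
            # A deauth that follows a known association on the same interface.
            code = int(reason)
            text = REASON_CODES.get(code, "unknown")
            drops.append((iface, ts - associated_at.pop(iface), code, text))
    return drops
```

Fed the excerpt above, this would report a single drop on wlp2s0 roughly eight seconds after association. A deauth with no preceding "associated" line in the input is skipped rather than guessed at.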
Reason 15 is a 4-way handshake timeout. There are several places and reasons why this could happen. Are you willing to generate a debug log? If so, here's how:

/etc/sysconfig/wpa_supplicant:OTHER_ARGS="-u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid"

Can you add -dd to the wpa_supplicant command line while the issue is occurring? It's usually set in /etc/sysconfig/wpa_supplicant: add -dd to the OTHER_ARGS list, then either reboot, or disable wifi and restart. Then send me a copy of /var/log/wpa_supplicant.log captured while the connection is made and then drops off. Be sure to go back and take the -dd out afterwards, as the log will get huge. We can at least see more of where the issue is.

Sorry, been busy the last couple days. I've got no problem debugging this as much as needed, will post the requested logs late tonight (EST).

I will say: whatever is going on is a driver / firmware issue of some kind that is restricted to Linux. Under Windows 7 this adapter is working perfectly fine and I couldn't be happier. So it's not buggy hardware, a buggy router, or interference.

Created attachment 901030 [details]
"-dd" output of wpa_supplicant
Requested Info.
Thanks for the log, and for the info that the hardware seems fine under Windows. Digesting..

So far it seems to be connecting and then just redoing the group key periodically, which may be normal if you have key rotation enabled on the AP. Without timestamps, it's a bit hard to tell.

And an error on my part: you have good logs here with -dd. We *may* need to redo this with -dd -t to get timestamps in the logs to help decipher them with time. Unless you have nothing better to do... :) Sorry to miss that.

Created attachment 901574 [details]
"-dd -t" output of wpa_supplicant
The requested (redone) log file is attached. I let it run for about 20 seconds, I think. If that's not enough time just give a shout and I can let it run longer.
I care much more about getting this resolved than I do about having to redo log files, so it's not a big deal, John :)
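The debug setup used for these logs comes down to adding -dd (more verbose debugging) and -t (timestamps) to OTHER_ARGS, then removing them again once the capture is done. A minimal sketch of that edit as a pair of helpers follows. The helper names `enable_debug` and `disable_debug` are hypothetical, and the code only transforms the text of a bare `OTHER_ARGS=...` line (without the file-name prefix shown earlier); it does not touch the file or restart wpa_supplicant.

```python
DEBUG_FLAGS = ("-dd", "-t")


def _split_other_args(line):
    """Split an OTHER_ARGS="..." line into its individual arguments."""
    key, sep, value = line.strip().partition("=")
    if key != "OTHER_ARGS" or not sep:
        raise ValueError("not an OTHER_ARGS line: %r" % line)
    # Drop the surrounding quotes, then split on whitespace.
    return value.strip().strip("\"'").split()


def enable_debug(line, flags=DEBUG_FLAGS):
    """Append any missing debug flags; calling it twice changes nothing."""
    args = _split_other_args(line)
    args += [flag for flag in flags if flag not in args]
    return 'OTHER_ARGS="' + " ".join(args) + '"'


def disable_debug(line, flags=DEBUG_FLAGS):
    """Remove the debug flags again so the log does not grow without bound."""
    args = [arg for arg in _split_other_args(line) if arg not in flags]
    return 'OTHER_ARGS="' + " ".join(args) + '"'
```

Keeping the removal step next to the enabling step mirrors the advice in the thread: the verbose log gets huge, so the flags should come back out after the capture.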
Looks like the system gets all the way through the EAP 4-way handshake successfully: 4-WAY-HANDSHAKE -> GROUP-HANDSHAKE -> Completed state. Link is up. Then:

1401761119.836062: wlp2s0: State: GROUP_HANDSHAKE -> COMPLETED
1401761119.836069: wlp2s0: CTRL-EVENT-CONNECTED - Connection to 40:16:7e:59:a7:40 completed [id=0 id_str=]
1401761119.836075: wpa_driver_nl80211_set_operstate: operstate 0->1 (UP)
1401761119.836082: netlink: Operstate: linkmode=-1, operstate=6
1401761119.836136: EAPOL: External notification - portValid=1
1401761119.836144: EAPOL: External notification - EAP success=1
1401761119.836150: EAPOL: SUPP_PAE entering state AUTHENTICATING
1401761119.836157: EAPOL: SUPP_BE entering state SUCCESS
1401761119.836163: EAP: EAP entering state DISABLED
1401761119.836168: EAPOL: SUPP_PAE entering state AUTHENTICATED
1401761119.836174: EAPOL: Supplicant port status: Authorized
1401761119.836202: EAPOL: SUPP_BE entering state IDLE
1401761119.836208: EAPOL authentication completed successfully
1401761119.836234: RTM_NEWLINK: operstate=1 ifi_flags=0x11043 ([UP][RUNNING][LOWER_UP])
1401761119.836241: RTM_NEWLINK, IFLA_IFNAME: Interface 'wlp2s0' added
1401761119.836257: nl80211: if_removed already cleared - ignore event
/// LOOKS HAPPY till here, then...
1401761119.840911: dbus: flush_object_timeout_handler: Timeout - sending changed properties of object /fi/w1/wpa_supplicant1/Interfaces/1
// Unless this is a normal key rotation interval, not sure why we'd go back to
// RX EAPOL frames this quickly: just got done and this is less than a second
// later.
1401761120.837078: wlp2s0: RX EAPOL from 40:16:7e:59:a7:40

The system just keeps redoing the key phase and, as a consequence, a timeout is likely the result. This cycle repeats over and over.

dbus: flush_object_timeout_handler: Timeout ...
Not sure what this is doing, but some sort of timeout is engaging that appears out of sorts. Some sort of NetworkManager -> supplicant issue? Not my best space here.. so I'll reassign to the NetworkManager/wpa_supplicant folks to assess. If I can help further, please LMK.

You can try a WEP AP setup to validate that the hardware/driver isn't the problem (WEP doesn't use the supplicant). But at this point I think the issue is above driver level.

I'm testing out various network settings to see what works and what doesn't.

WPA2-Personal. Key lasts 3600secs: Current situation. Key exchange is messed up.
WPA2-Personal. Key lasts 0secs: About to test.
Open system. ath9k hwcrypt=1: This is interesting. On an open system I stay connected but something is throttling my network connection. My card is getting 6.5 Mb/s. My laptop next to it, on the same network, is getting 144.5 Mb/s. (Speeds come via iwconfig, also tested by trying to load some websites... I'm gonna say 6.5 Mb/s is accurate.)
Open system. ath9k hwcrypt=0: Constantly disconnects, LIKE the current situation.
WEP: Cannot test. Router deprecated WEP.

If my problem with the key exchange is higher in the stack, what are the chances that the throttling is also higher in the stack vs in-driver?

An open system does no keys at all: wpa_supplicant is not involved, nor is hw encryption. This data point helps though. On the AP, try turning off key rotation on WPA/WPA2 if it's an option. Then what do you get?

As to the speed issue, nothing should be throttling it at this point. Let's defer that for now and focus on the connection stability first.

A thousand apologies John, I haven't been home much lately. I will post the necessary info in the morning. Hopefully that will bring us one step closer to resolving this. Once more, very sorry.

(In reply to John Greene from comment #8)
> dbus: flush_object_timeout_handler: Timeout ...
> not sure what this is doing, but some sort of timeout is engaging that
> appears out of sorts.
> Some sort of network mananger -> supplicant issue? Not my best space
> here..so I'll reassign to netmanager/wpa_supplicant folks to assess. If I
> can help further, please LMK.

This is just an internal timeout for coalescing property change notifications and doesn't affect actual operation. It's only a debug message. The issue would most likely be in the kernel or in the supplicant.

Some stuff that looks odd to me:

1401761118.757169: wlp2s0: RX EAPOL from 40:16:7e:59:a7:40
1401761118.757211: RX EAPOL - hexdump(len=121): <stuff>
1401761118.757256: wlp2s0: Not associated - Delay processing of received EAPOL frame (state=ASSOCIATING bssid=00:00:00:00:00:00)
<short time passes>
1401761118.760649: nl80211: MLME event 38
1401761118.760668: nl80211: MLME event frame - hexdump(len=145): <stuff>
1401761118.760722: nl80211: Associate event
1401761118.760741: wlp2s0: Event ASSOC (0) received

The supplicant is missing the first few RX EAPOL frames because mac80211 or the driver isn't passing up the ASSOC event before it allows RX through? Curious, but it ultimately doesn't matter because another RX EAPOL comes through after the ASSOC event at 1401761118.761591.

But it looks like the real problem is that the AP's WPA rekey time is like 2 seconds???? Every two seconds another RX EAPOL frame comes in and the supplicant has to process a rekey. The AP is clearly processing the rekeys correctly, so it clearly can hear the STA, but it still requests a continuous rekey. This continues until 1401761125.859718, when the supplicant sends a rekey response, but something doesn't get through and the AP drops the client because it couldn't complete the rekey within the timeout.

So I think that's the basic problem: something in the supplicant or driver stack isn't fast enough. But the real question is, why is the AP requesting rekeys every 2 seconds? That's the actual problem.

Created attachment 906124 [details]
"-dd -t" Output of wpa_supplicant with a Network Key Rotation Time of 0
Time on this one was a bit longer, near the end you might see some ethernet / tethering events because I plugged my phone in before I killed wifi.
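The "rekey every 2 seconds" observation can be checked mechanically against a timestamped (-dd -t) supplicant log by measuring the gap between consecutive "RX EAPOL from" lines. The sketch below assumes the line format seen in the excerpts quoted in this report; `eapol_gaps`, `short_gaps` and the 5-second default threshold are illustrative choices, not anything wpa_supplicant itself provides.

```python
import re
from decimal import Decimal

# Only interface-level lines such as
# "1401761118.757169: wlp2s0: RX EAPOL from 40:16:7e:59:a7:40" are matched;
# the "RX EAPOL - hexdump(...)" lines that follow them are ignored.
RX_EAPOL = re.compile(r"^(\d+\.\d+): (\S+): RX EAPOL from (\S+)", re.MULTILINE)


def eapol_gaps(log_text):
    """Return (timestamp, gap_since_previous) for each RX EAPOL after the first."""
    # Decimal keeps the microsecond timestamps exact when subtracting.
    stamps = [Decimal(m.group(1)) for m in RX_EAPOL.finditer(log_text)]
    return [(later, later - earlier) for earlier, later in zip(stamps, stamps[1:])]


def short_gaps(log_text, threshold_s=Decimal("5")):
    """Keep only the gaps shorter than threshold_s seconds."""
    return [(ts, gap) for ts, gap in eapol_gaps(log_text) if gap < threshold_s]
```

Running the same function over the log from a working client on the same AP gives a direct comparison: with a 3600-second rotation time, gaps of a second or two would be unexpected after the initial handshake.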
Does the network intentionally use a short key rotation time?

My router lists the adjustable Network Key Rotation Time as 3600 seconds-- one hour-- by default, and it has only had 2 settings applied: 3600 seconds, and 0 seconds (for testing purposes). Both settings result in the above situation. No other clients are having problems with connecting, whether they be iOS, Android, Windows 7, or Fedora (Intel wireless on that machine). It is only that specific machine under Linux-- under Windows it's fine.

When setting it to 3600 seconds, do you reboot the router to make sure it starts using the new interval?

2 points on your question...

1) The answer: As soon as I hit apply, the router pops up a page saying it is detecting a reboot-requiring change and is automatically rebooting to make it take effect. So -personally-, no, I did not reboot the router myself; I've assumed that it actually IS rebooting like it says it is. Though, given that all wireless clients disconnect during the time it says it's rebooting, I'm tempted to believe that it did in fact reboot.

2) It's been set to 3600 seconds from the beginning and I have only momentarily, and by request, set it to 0 seconds. Whether it's set to 3600 seconds or 0 seconds doesn't seem to change anything. The only way I've been able to stay connected (excluding tethering) is by setting my wireless network to open, but then something in the stack is throttling my connection down to 6.5 Mb/s. If the card has to process even a SINGLE key request.. everything's crap.

To clarify: The router is supposed to be sending out rekey requests every hour. For whatever reason the router seems to be sending out multiple rekeys initially, OR the networking stack is getting one rekey request and processing it multiple times. (Those are strictly my thoughts at this point.)

The problem is.. if the router is sending out multiple rekeys, why isn't anyone else on the network affected? Is there a part of the spec that says "If the client receives multiple rekeys within xyz seconds, they should be disregarded" that isn't implemented in the Atheros driver? It seems odd that an equally updated Fedora 20 x64 machine on the same network, but with an Intel wireless card, isn't having problems if this is something router-side.

Due to me having to RMA this particular computer's RAM, I will not be able to provide any further logs or updates until I receive the new modules.

This bug report's been quiet for a few days. If you guys have just been busy that's totally fine, I just don't want to see this forgotten about. Even if the answer is "We don't have a damn clue why this is happening", it is at least AN answer.

Eric,
You make some good points in comment 18. When you get your system restored, can you upload a timestamped supplicant log from the Intel card that is working OK? It would be interesting to see whether the rekey interval is as short there and compare that with the ath one.

I'm wondering a couple of things:

>2) Its been set to 3600 seconds from the beginning and I have only,
>momentarily and by request set it to 0 seconds. Whether its set to 3600
>seconds or 0 seconds doesn't seem to change anything.

Is the AP firmware up to date? Did setting the rekey interval to 3600 generate rekeys at the same 1 or 2 second rate (please confirm I understand that right)? If the setting doesn't change the observed rate in the log, that's on the AP. Might be an easy fix for you.

>The problem is.. if the router is sending out multiple rekeys, why isn't anyone
>else on the network affected? Is there a part of the spec that says "If the
>client receives multiple rekeys within xyz seconds, they should be
>disregarded?" that isn't implemented on the Atheros driver?

A guess: given the delayed processing of the EAPOL frame noted in comment 13, it kinda begs the question of whether there is other delay in the process. It's not likely, as the rekeys come after the station is associated and the port is open. But if Intel works.. something is different with the driver, not sure what.

One further question, if Intel works and Atheros doesn't.. and throughput is slow: have you tried, or can you try, a different card? Might be a bad card.

Just so we're all on the same page to avoid any confusion.

Desktop system: Atheros PCI-E card (http://www.amazon.com/dp/B007GMPZ0A/ref=wl_it_dp_o_pC_nS_ttl?_encoding=UTF8&colid=2TNVXPD6PTYFU&coliid=IAPP9VDETHW07&psc=1)
Laptop system: Network controller: Intel Corporation Centrino Advanced-N 6230 [Rainbow Peak] (rev 34)

Both are / are capable of Fedora 20 x64. The laptop is the one that's working totally fine, so I can throw up the timestamped wpa_supplicant log from that system in a little bit.

The desktop system is dual-booted between Windows 7 x64 and Fedora 20 x64. Under Windows 7 the card is totally fine, with perfect speeds and the best connectivity in the entire house. The card is ONLY buggy as far as connection speed & connectivity when I am booted into Fedora. So the odds of it being buggy hardware.... eeeeeeh. Possible? Sure, if the driver is hitting different codepaths, but it seems unlikely, no?

Will post back in a bit with the answer to the other questions / logs.

Created attachment 910108 [details]
INTEL: "-dd -t" output of wpa_supplicant, with a Network Key Rotation Time of 3600
The attachment above is from the Intel based laptop. Its wifi is working perfectly.
Router is an ASUS-RTN66U with firmware version: 3.0.0.4.374_5517-g302e4dc. According to the router, that firmware is the latest available version.

As far as trying a different card... I have a small USB wireless-N adapter that I can plug into the desktop-- once I get the new RAM. Last time I checked, the adapter was working "fine." The chipset in the mini USB adapter is a Ralink chipset.

Just to recap...

Desktop system:
--->Ethernet: Perfectly working
--->Tethering over USB: Perfectly working
--->Mini-USB Receiver (This is IIRC, I can retest once I get the new RAM)
------> Windows: Doesn't always connect on boot, but ONCE connected: works fine.
------> Fedora: Doesn't always connect on boot, but ONCE connected: works fine.
--->Atheros PCIE Card:
------> Windows 7: Works beyond expectations.
------> Fedora: Can't get/stay connected, and when it can... throttled.

Laptop System:
---> Ethernet: N/A
---> Tethered over USB: Perfectly working
---> Mini USB Receiver:
------> Windows: Doesn't always connect on boot, but ONCE connected: works fine.
------> Fedora: Doesn't always connect on boot, but ONCE connected: works fine.
---> Internal Wireless (Intel) Card:
------> Windows: Works perfectly
------> Fedora: Works perfectly.

Systems are on the same networks, in the same rooms.

I'm back in town from vacation, and the new RAM is installed on the desktop, so I'm back in business to debug this further. Given the comments I posted above and the info I provided in the Intel log file-- any thoughts?

Eric,
Hope vacation was fun. I'm envious.
The Intel log is perfect, as it should be. In review, it appears that the failing device gets associated and then has a 4-way handshake failure. It occurs to me that it isn't really even doing the 4-way handshake: it starts it but shuts it down with an error code before anything in the handshake even occurs.

May 28 16:31:01 erics-desktop NetworkManager[690]: <info> (wlp2s0): supplicant interface state: authenticating -> associating
May 28 16:31:01 erics-desktop kernel: wlp2s0: associate with 40:16:7e:59:a7:44 (try 1/3)
May 28 16:31:02 erics-desktop kernel: wlp2s0: associate with 40:16:7e:59:a7:44 (try 2/3)
May 28 16:31:03 erics-desktop kernel: wlp2s0: associate with 40:16:7e:59:a7:44 (try 3/3)
May 28 16:31:03 erics-desktop kernel: wlp2s0: RX AssocResp from 40:16:7e:59:a7:44 (capab=0x11 status=0 aid=2)
May 28 16:31:03 erics-desktop kernel: wlp2s0: associated
>>>ASSOCIATION to start 4 way.
May 28 16:31:03 erics-desktop NetworkManager[690]: <info> (wlp2s0): supplicant interface state: associating -> 4-way handshake
May 28 16:31:04 erics-desktop NetworkManager[690]: <info> (wlp2s0): supplicant interface state: 4-way handshake -> completed
<<<4 way done, should have 4 parts, but we don't even do that: WHY?
May 28 16:31:04 erics-desktop NetworkManager[690]: <info> (wlp2s0): roamed from BSSID (none) ((none)) to 40:16:7E:59:A7:44 (WhoKnows-5Ghz)
May 28 16:31:11 erics-desktop kernel: wlp2s0: deauthenticated from 40:16:7e:59:a7:44 (Reason: 15)
May 28 16:31:11 erics-desktop NetworkManager[690]: <warn> Connection disconnected (reason 15)

Not sure what occurs with that. Smacks of a frame being delayed, as Dan noted in comment 13. I'll look upstream.
Have you tried different kernels? Is this a regression? Did it work at one point and an upgrade killed something?

I love how prompt Comcast is to fix an area-wide service outage... NOT. -.- (On again / off again over the last week...)

As far as I know, this has "always been." I'd be happy to test a few kernels now that I have internet access restored, but I can't go too far back as my APU is an AMD Kaveri, so too far back and I won't even have a working desktop to test -with.- How far back does Fedora keep kernels in-repo? I can always just snag the earliest available kernel and do a 3.x -> 3.x+1 over and over and see if / where it starts working.

Other than kernels, is there anything else I can help test? Can't use the Intel system as it's currently out of commission, so don't ask for any logs off that.

Alright, so I finally had a day to sit down and test various kernels for a couple hours. I grabbed a few kernels from every major kernel series (3.12.x, 3.13.x, 3.14.x, 3.15.x, 3.16.x) and this is what I got so far...

3.11: Kaveri support shot. Can't hit desktop. Calling it an N/A.
3.12.x: No change, still broken wireless.
3.13.x: No change, still broken wireless.
3.14.x: No change, still broken wireless.
3.15.x: No change, still broken wireless.

None of them are particularly better or worse than any others; they all still fail in seemingly the same ways. I can provide logs if you really need them.

Any luck with looking upstream, John?

(In reply to Eric Griffith from comment #29)
> Alright so I finally had a day to sit down and test various kernels for a
> couple hours. I grabbed a few kernels from every major kernel series
> (3.12.x, 3.13.x, 3.14.x, 3.15.x, 3.16.x)
>
> None of them are particularly better or worse than any others, they all
> still fail in seemingly the same ways. I can provide logs if you really need
> them.

Can you try with the latest kernel? These options need to be enabled:

CONFIG_ATH_DEBUG=y
CONFIG_ATH9K_DEBUGFS=y

Please load the ath9k driver with the module param "debug=0xf01" and post the kernel log.

Also, the output of these files would be good to have:

/sys/kernel/debug/ieee80211/phy*/ath9k/base_eeprom
/sys/kernel/debug/ieee80211/phy*/ath9k/modal_eeprom
/sys/kernel/debug/ieee80211/phy*/ath9k/interrupt
/sys/kernel/debug/ieee80211/phy*/ath9k/recv
/sys/kernel/debug/ieee80211/phy*/ath9k/phy_err
/sys/kernel/debug/ieee80211/phy*/ath9k/xmit
/sys/kernel/debug/ieee80211/phy*/ath9k/reset
/sys/kernel/debug/ieee80211/phy*/ath9k/misc

Apologies for being quiet on this bug. Midterms, Thanksgiving and Finals all hit pretty much back to back for me, so it was kind of a mess and I didn't have time to mess with my desktop for fear of breaking it. Now that I'm on break I can give it a go and we'll see what happens. I'm gonna be switching over to F21, probably this weekend anyway, so that'll get me the latest.

So after quite a bit of fighting with this over the last month and a half, I think I figured out what's going on... a faulty antenna. In moving into my apartment I had to disassemble my desktop, and in the process of doing so I noticed that one of the antennas didn't and would not screw down all the way. I swapped the antennas around, and no matter which of the three I used it would always fail on that slot. I left that antenna unscrewed, rebooted into Fedora... lo and behold, it worked perfectly fine. Now, the question is, why does Windows work with the broken antenna in but Linux does not? Does the Linux stack not have fault tolerance? Can it not test & mark antennas before using them? Why does Windows just work around the problem while Linux just keeps trying something that doesn't work?

> I left that antennae unscrewed, rebooted into Fedora... lo and behold it
> worked perfectly fine. Now, the question is, why does Windows work with the
> broken antennae in but Linux does not? Does the Linux stack not have fault
> tolerance? Can it not test & mark antennae's before using them? Why does
> Windows just work around the problem mean while Linux just keeps trying
> something that doesn't work?
Nice bit of detective work. I don't know the exact details of this implementation, but the ones I've dealt with often support a number of ways to do antenna management. It's often left up to the firmware to control, with no API exposed to the driver to change how that's done. A quick look at your driver's command-line parameters with modinfo ath9k shows these parameters for you:
parm: debug:Debugging mask (uint)
parm: nohwcrypt:Disable hardware encryption (int)
parm: blink:Enable LED blink on activity (int)
parm: btcoex_enable:Enable wifi-BT coexistence (int)
parm: bt_ant_diversity:Enable WLAN/BT RX antenna diversity (int)
parm: ps_enable:Enable WLAN PowerSave (int)
parm: use_chanctx:Enable channel context for concurrency (int)
*If* it's just an antenna diversity problem (it might be something else), changing the bt_ant_diversity parameter may help. Bluetooth, if supported and enabled on your machine, will time-share the antenna with wifi.
If you need/use BT, this doesn't appear to be something doable (see below).
You'd have to turn off Bluetooth coexistence with btcoex_enable=0, then add bt_ant_diversity=1 to enable the driver or firmware to -hopefully- work around a bad antenna without making your range more limited.
I don't think you'll damage anything, but you might also get things into a worse state than they are in now.
Here's a quick code note I saw in regard to this. There are a number of gotchas here: does your machine support this in its antenna design? I dunno.
/*
* Enable WLAN/BT RX Antenna diversity only when:
*
* - BTCOEX is disabled.
* - the user manually requests the feature.
* - the HW cap is set using the platform data.
*/
But you seem to want to play with this stuff. It appears to be a bit of a swamp, but it might give you a workaround too. Good luck.
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to this bug being closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.