Bug 606393
Summary: | [ath9k] Network software dropping connections, terminating downloads | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Paul Lambert <eb30750>
Component: | kernel | Assignee: | John W. Linville <linville>
Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | medium | Docs Contact: |
Priority: | low | |
Version: | 13 | CC: | anton, dcbw, dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mcgrof, michel
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2011-01-18 15:04:12 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Paul Lambert
2010-06-21 14:22:18 UTC
It is highly likely that this network issue is tied to this kernel oops (CPU crash): wlan0 is dropping data frames. Again, this was an issue in previous versions but was resolved in FE-12. Bugs that were fixed have been reintroduced in FE-13. I have used the automatic bug filer to report the kernel oops bug.

WARNING: at net/mac80211/tx.c:553 invoke_tx_handlers+0x59a/0xbfe [mac80211]()
Hardware name: HP Pavilion dv7 Notebook PC
wlan0: Dropped data frame as no usable bitrate found while scanning and associated. Target station: 00:1d:7e:16:e9:16 on 5 GHz band
Modules linked in: bluetooth tun aes_x86_64 aes_generic fuse ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand powernow_k8 freq_table xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_amd kvm uinput snd_hda_codec_atihdmi arc4 ecb snd_hda_codec_idt ath9k ath9k_common mac80211 snd_hda_intel ath9k_hw snd_hda_codec uvcvideo snd_hwdep videodev ath v4l1_compat snd_seq sdhci_pci snd_seq_device snd_pcm hp_wmi sdhci v4l2_compat_ioctl32 cfg80211 snd_timer jmb38x_ms rfkill r8169 memstick mmc_core joydev snd shpchp edac_core microcode mii edac_mce_amd soundcore wmi snd_page_alloc k10temp i2c_piix4 hp_accel lirc_ene0100 lis3lv02d lirc_dev input_polldev ata_generic pata_acpi pata_atiixp video output radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 835, comm: phy0 Not tainted 2.6.33.5-124.fc13.x86_64 #1
Call Trace:
 [<ffffffff8104b54c>] warn_slowpath_common+0x77/0x8f
 [<ffffffff8104b5b1>] warn_slowpath_fmt+0x3c/0x3e
 [<ffffffffa02bac68>] invoke_tx_handlers+0x59a/0xbfe [mac80211]
 [<ffffffffa02aae8f>] ? sta_info_get+0x2f/0x44 [mac80211]
 [<ffffffffa02ba622>] ? ieee80211_tx_prepare+0x2dc/0x323 [mac80211]
 [<ffffffffa02bb49f>] ieee80211_tx+0x6d/0x1d3 [mac80211]
 [<ffffffff81381edc>] ? skb_release_data+0xc4/0xc9
 [<ffffffff813828db>] ? pskb_expand_head+0xed/0x170
 [<ffffffffa02bb7ec>] ieee80211_xmit+0x1e7/0x206 [mac80211]
 [<ffffffff81381fe4>] ? __alloc_skb+0x7b/0x16b
 [<ffffffffa02bb855>] ieee80211_tx_skb+0x4a/0x51 [mac80211]
 [<ffffffffa02b0375>] ieee80211_send_nullfunc+0xfc/0x105 [mac80211]
 [<ffffffffa02b03c9>] ieee80211_dynamic_ps_enable_work+0x4b/0x84 [mac80211]
 [<ffffffff81060d31>] worker_thread+0x1a4/0x232
 [<ffffffffa02b037e>] ? ieee80211_dynamic_ps_enable_work+0x0/0x84 [mac80211]
 [<ffffffff8106480b>] ? autoremove_wake_function+0x0/0x34
 [<ffffffff81060b8d>] ? worker_thread+0x0/0x232
 [<ffffffff810643bb>] kthread+0x7a/0x82
 [<ffffffff8100a924>] kernel_thread_helper+0x4/0x10
 [<ffffffff81064341>] ? kthread+0x0/0x82
 [<ffffffff8100a920>] ? kernel_thread_helper+0x0/0x10

Clearly the driver. If it's not passing traffic properly, it may certainly be responsible for frequently dropping network connections.

Can you recreate this issue _after_ issuing the following command?

    iwconfig wlan0 power timeout 0

Any word on this? Ping?

I am currently running kernel version kernel.x86_64 2.6.33.6-147.2.4.fc13. I did not encounter this issue for several weeks and thought one of the recent upgrades had eliminated it. However, I was performing a substantial batch of upgrades last week and encountered one dropped packet error, so the bug still lives.

Based on the backtrace, I surmise that the device is exiting dynamic power saving mode and attempting to transmit a frame to notify the AP that it is awake. Instead, it is hitting this check in ieee80211_tx_h_rate_ctrl:

    /*
     * Lets not bother rate control if we're associated and cannot
     * talk to the sta. This should not happen.
     */
    if (WARN(test_bit(SCAN_SW_SCANNING, &tx->local->scanning) &&
             (sta_flags & WLAN_STA_ASSOC) &&
             !rate_usable_index_exists(sband, &tx->sta->sta),
             "%s: Dropped data frame as no usable bitrate found while "
             "scanning and associated. Target station: "
             "%pM on %d GHz band\n",
             tx->sdata->name, hdr->addr1,
             tx->channel->band ? 5 : 2))
            return TX_DROP;

Unfortunately, I don't completely understand what this check is doing -- hopefully Johannes has some advice.
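For readers following the quoted check, here is a minimal user-space sketch of the condition it tests. This is not the mac80211 code: the `band`, `supported_band` and `station` types and the `should_drop_frame` helper are hypothetical stand-ins, and the body of `rate_usable_index_exists` is an assumption about what the kernel helper of the same name does (look for at least one rate index on the current band that the peer also supports). Only the three-part condition itself comes from the snippet quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's band and station structures. */
enum band { BAND_2GHZ = 0, BAND_5GHZ = 1, NUM_BANDS };

struct supported_band {
    enum band band;     /* band the frame is about to be sent on */
    int n_bitrates;     /* rates the local radio offers on that band (at most 32 here) */
};

struct station {
    bool associated;                 /* models (sta_flags & WLAN_STA_ASSOC) */
    uint32_t supp_rates[NUM_BANDS];  /* bit i set: peer supports rate index i on that band */
};

/* Assumed behaviour: true if any local rate index on this band is also
 * marked as supported by the peer. */
static bool rate_usable_index_exists(const struct supported_band *sband,
                                     const struct station *sta)
{
    for (int i = 0; i < sband->n_bitrates && i < 32; i++)
        if (sta->supp_rates[sband->band] & (UINT32_C(1) << i))
            return true;
    return false;
}

/* Mirrors the quoted condition: drop only when a software scan is running,
 * the peer is associated, and no rate is usable on the current band. */
static bool should_drop_frame(bool sw_scanning,
                              const struct supported_band *sband,
                              const struct station *sta)
{
    return sw_scanning && sta->associated &&
           !rate_usable_index_exists(sband, sta);
}
```

Under these assumptions, a station whose rate bitmap is populated only for 2.4 GHz would trigger the drop if a frame is addressed to it while a scan has the radio on a 5 GHz channel, which is consistent with the "on 5 GHz band" text in the warning. The same station on its own band, or outside a scan, would not.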
Aside from the warning message, are you experiencing any other problems (e.g. dropped connections)?

Luis added that check; it was something about catching frames trying to be transmitted on the wrong band. I've never seen this, but I guess it's related to scanning.

Indeed. The warning is designed to catch bugs in drivers or the wireless core when a frame is about to be transmitted on an incorrect band, which typically happened while scanning. I remember I added the check, but before that I had fixed the original issue that was causing it; I forget exactly where that issue was happening. The reason for the detailed print is so you can see whether the bit rate is indeed valid, and whether the target peer is on the band that the message says you are trying to communicate on.

I'm experiencing similar problems on a Lenovo Z61t. According to lspci, it has a:

    03:00.0 Network controller: Atheros Communications Inc. AR5008 Wireless Network Adapter (rev 01)

Most of the time when this occurs I see NetworkManager try to reconnect, and after a minute or so it just gives up; it shows wireless as still active but never shows any APs available. Suspending/resuming the machine seems to make it occur more frequently. I also had this problem in F12, where disabling networking and re-enabling it in NetworkManager would usually bring it back. Now with F13 I find I have to stop NetworkManager, unload the ath9k module, reload ath9k, then start NetworkManager, and I'll have wireless again that lasts anywhere from 30 min. to a few hours.

Similar problem here on a Sony Vaio W. With F-13 the problem occurs -- frequent disconnects etc. -- but at a manageable level. Now that the machine has been upgraded to F-14 (starting with alpha and updated with yum since), it's currently at a point where pings from the machine are reliable enough, but downloads stall rather frequently, and pings from the outside world to the machine suffer > 50% packet loss.
I attempted to run SSH on the machine and the response is really sluggish, even for keyboard input -- consistent with the packet loss rate.

    $ lspci | grep Atheros | grep Wireless
    02:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
    $ uname -a
    Linux iris.localdomain 2.6.35.4-28.fc14.x86_64 #1 SMP Wed Sep 15 01:56:54 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

Been a long time...does this problem still happen with current updated Fedora kernels?

I have not lost all wireless ethernet connections for some time. I am now using F-14 and it seems to be very stable. However, after upgrading to kernel 10-72, network performance has taken a dive, and there are times when Firefox freezes for over a minute before the network is handling traffic again. Can you provide more detailed instructions on which error logs would yield the information needed to pinpoint just the bug related to this bug report?

Difficult to say...of course, dmesg output (and/or /var/log/messages contents) is usually a good place to start. :-)

There are many aborts regarding nsplugwrapper in the messages log. I have attached the output of dmesg and /var/log/messages.

Created attachment 473415 [details]
output of dmesg

Created attachment 473418 [details]
output of /var/log/messages
I see a lot of flashplayer crashes in those logs. Do you have problems when you are not using flash?

Since I have not observed this bug recently, I believe we can assume that the degraded network performance when using FF was related to the flashplayer crashes. That being the case, we can close this bug report, since neither network issue has been experienced of late.

Closing on basis of comment 19...