Hide Forgot
Description of problem: With recent kernels, various kworker threads consume lots of CPU, causing various short hangs, audio skips, keyboard input lags, etc. They do not consume CPU continuously, but in several seconds (typically) long bursts during which one or more kworkers essentially monopolize the CPU. This happens immediately after a fresh boot and logging in, no suspend, service restarts, etc. involved. Version-Release number of selected component (if applicable): kernel-2.6.40.4-5.fc15.x86_64 kernel-2.6.40.6-0.fc15.x86_64 possibly others How reproducible: Always. Steps to Reproduce: 1. Boot Fedora 15 with the specified kernels on a MSI PR210 notebook. Actual results: It runs smoothly. Expected results: It hangs for a short while occasionally with kworker/* eating the CPU. Additional info: I have read Bug 638912, Bug 683156 and Bug 729047 that seem to be related but not identical and tried the suggested debugging methods. Printing /proc/THE_OFFENDING_KWORKER/stack does not show anything interesting, it always looks like [<ffffffff8104fbd6>] schedule+0x5a/0x5c [<ffffffff8106cb66>] worker_thread+0x158/0x15d [<ffffffff8106ff5b>] kthread+0x84/0x8c [<ffffffff8148fe24>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff However, the printing is also delayed so cat probably never actually had any chance to run during the hangs (on a two-core CPU). Doing echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event cat /sys/kernel/debug/tracing/trace_pipe >out.txt reveals nothing interesting either. During the hangs perf top shows typically something like PerfTop: 32 irqs/sec kernel:62.5% exact: 0.0% [1000Hz cycles], (all, 2 CPUs) ------------------------------------------------------------------------------- samples pcnt function DSO _______ _____ ______________________________ ___________________ 2469.00 68.1% delay_tsc [kernel.kallsyms] 372.00 10.3% native_read_tsc [kernel.kallsyms] 200.00 5.5% paravirt_read_tsc [kernel.kallsyms] 163.00 4.5% ktime_get [kernel.kallsyms] 30.00 0.8% sched_clock_tick [kernel.kallsyms] 28.00 0.8% clockevents_program_event [kernel.kallsyms] 24.00 0.7% arch_local_irq_restore [kernel.kallsyms] 23.00 0.6% update_ts_time_stats [kernel.kallsyms] 22.00 0.6% tick_broadcast_oneshot_control [kernel.kallsyms] 21.00 0.6% tg_load_down [kernel.kallsyms] The problem appears to be related to wireless because it is correlated to large number of messages that have started appearing in the logs with the recent kernels: Oct 28 22:13:43 smut kernel: [28859.103257] ath5k phy0: gain calibration timeout (2484MHz) Oct 28 22:13:43 smut kernel: [28859.564545] ath5k phy0: gain calibration timeout (2412MHz) Oct 28 22:14:37 smut kernel: [28913.594614] ath5k phy0: gain calibration timeout (2417MHz) Oct 28 22:14:38 smut kernel: [28914.056594] ath5k phy0: gain calibration timeout (2422MHz) Oct 28 22:14:38 smut kernel: [28914.517244] ath5k phy0: gain calibration timeout (2427MHz) Oct 28 22:14:39 smut kernel: [28914.978610] ath5k phy0: gain calibration timeout (2432MHz) Oct 28 22:14:39 smut kernel: [28915.438136] ath5k phy0: gain calibration timeout (2437MHz) Oct 28 22:14:40 smut kernel: [28915.896755] ath5k phy0: gain calibration timeout (2442MHz) Oct 28 22:14:40 smut kernel: [28916.356033] ath5k phy0: gain calibration timeout (2447MHz) Oct 28 22:14:40 smut kernel: [28916.817086] ath5k phy0: gain calibration timeout (2452MHz) Oct 28 22:14:41 smut kernel: [28917.274720] ath5k phy0: gain calibration timeout (2457MHz) Oct 28 22:14:41 smut kernel: [28917.733986] ath5k phy0: gain calibration timeout (2462MHz) Also, switching off the wifi (with a notebook button) seems to eliminate the problem (with wired network still on). lcpci output: 00:00.0 Host bridge: ATI Technologies Inc RS690 Host Bridge 00:01.0 PCI bridge: ATI Technologies Inc RS690 PCI to PCI Bridge (Internal gfx) 00:04.0 PCI bridge: ATI Technologies Inc Device 7914 00:06.0 PCI bridge: ATI Technologies Inc RS690 PCI to PCI Bridge (PCI Express Port 2) 00:07.0 PCI bridge: ATI Technologies Inc RS690 PCI to PCI Bridge (PCI Express Port 3) 00:12.0 SATA controller: ATI Technologies Inc SB600 Non-Raid-5 SATA 00:13.0 USB Controller: ATI Technologies Inc SB600 USB (OHCI0) 00:13.1 USB Controller: ATI Technologies Inc SB600 USB (OHCI1) 00:13.2 USB Controller: ATI Technologies Inc SB600 USB (OHCI2) 00:13.3 USB Controller: ATI Technologies Inc SB600 USB (OHCI3) 00:13.4 USB Controller: ATI Technologies Inc SB600 USB (OHCI4) 00:13.5 USB Controller: ATI Technologies Inc SB600 USB Controller (EHCI) 00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 14) 00:14.1 IDE interface: ATI Technologies Inc SB600 IDE 00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia (Intel HDA) 00:14.3 ISA bridge: ATI Technologies Inc SB600 PCI to LPC Bridge 00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge 00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration 00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map 00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller 00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control 01:05.0 VGA compatible controller: ATI Technologies Inc RS690M [Radeon X1200 Series] 01:05.2 Audio device: ATI Technologies Inc Radeon X1200 Series Audio Controller 02:00.0 Ethernet controller: Atheros Communications Inc. AR242x / AR542x Wireless Network Adapter (PCI-Express) (rev 01) 05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01) 06:04.0 FLASH memory: ENE Technology Inc ENE PCI Memory Stick Card Reader Controller 06:04.1 SD Host controller: ENE Technology Inc ENE PCI SmartMedia / xD Card Reader Controller 06:04.2 FLASH memory: ENE Technology Inc Memory Stick Card Reader Controller 06:04.3 FLASH memory: ENE Technology Inc ENE PCI Secure Digital / MMC Card Reader Controller Please tell me what more information/testing is needed.
> Oct 28 22:14:40 smut kernel: [28916.356033] ath5k phy0: gain calibration timeout (2447MHz) Calibration does a lot of busy loops.. I think we can change some mdelays into sleeps without problems now that reset is called from process context.
Old logs do not contain the calibration timeout messages. Has the ath5k calibration changed and this is a consequence? I can try to find the exact kernel package where the problem appeared (taking older kernels pkgs from koji) if it helps. The only method to test a kernel seems to be to boot it and wait a few minutes to see if anything bad happens so bisection will take time but it's possible.
Can you try using fast channel switching ? There is a module option for that fastchanswitch, set it to true when loading the module and see if you get better results. As for the udelay, I think we should switch to usleep_range for most stuff we use and in general we don't need accuracy e.g. when waiting for analog parts to settle, just a rough estimate. I already have some patches for calibration pending so I'll do the switch and see how it goes...
I added `options ath5k fastchanswitch', enabled wireless, rebooted (to get to a defined state) and it seemed fixed. But then I removed the option again, rebooted and still did not observe any problems - they miraculously disappeared. All with the same 2.6.40.6-0.fc15.x86_64 kernel. So I don't know what's going on here. Can the calibration be influenced by other wifi devices around? If it is so it makes the problem non-detereministic (when I experienced it it persisted through several reboots).
I'm seeing this too. Related to bug 638943?
Is this still an issue on 2.6.43/3.3?
Fedora 15 has reached it's end of life as of June 26, 2012. As a result, we will not be fixing any remaining bugs found in Fedora 15. In the event that you have upgraded to a newer release and the bug you reported is still present, please reopen the bug and set the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered. Thank you for taking the time to file a report. We hope newer versions of Fedora suit your needs.