Created attachment 461934 [details] top.out Description of problem: */ On my Thinkpad T400 [kslowd???] kernel processes take too much CPU (see top.out attached file). This issue make control the desktop session impossible. */ It is reproducible on distribution F14 kernel. Fedora13 kernel on F14 is OK without this issue. Version-Release number of selected component (if applicable): */ kernel-2.6.35.6-48.fc14.x86_64 How reproducible: */ Boot to F14 and wait few minutes. Steps to Reproduce: 1. Boot to F14 2. Login to Gnome (desktop) session 3. Wait few minutes. Actual results: */ desktop control is impossible (mouse is skipping, ...) */ CPU usage is high Expected results: */ smooth control of my desktop system */ CPU idle
Created attachment 461936 [details] top.out
The same problem with Lenovo Thinkpad T400. Unusable desktop for 2 or 3 minutes. Fedora 13. Tail of Xorg.0.log: [ 991.012] (II) intel(0): Manufacturer: LEN Model: 4033 Serial#: 0 [ 991.012] (II) intel(0): Year: 2005 Week: 0 [ 991.012] (II) intel(0): EDID Version: 1.3 [ 991.012] (II) intel(0): Digital Display Input [ 991.012] (II) intel(0): Max Image Size [cm]: horiz.: 30 vert.: 19 [ 991.012] (II) intel(0): Gamma: 2.20 [ 991.012] (II) intel(0): DPMS capabilities: StandBy Suspend Off [ 991.012] (II) intel(0): Supported color encodings: RGB 4:4:4 YCrCb 4:4:4 [ 991.012] (II) intel(0): First detailed timing is preferred mode [ 991.012] (II) intel(0): redX: 0.569 redY: 0.332 greenX: 0.312 greenY: 0.544 [ 991.012] (II) intel(0): blueX: 0.149 blueY: 0.132 whiteX: 0.313 whiteY: 0.329 [ 991.012] (II) intel(0): Manufacturer's mask: 0 [ 991.012] (II) intel(0): Supported detailed timing: [ 991.012] (II) intel(0): clock: 101.6 MHz Image Size: 303 x 190 mm [ 991.012] (II) intel(0): h_active: 1440 h_sync: 1488 h_sync_end 1520 h_blank_end 1792 h_border: 0 [ 991.012] (II) intel(0): v_active: 900 v_sync: 903 v_sync_end 909 v_blanking: 945 v_border: 0 [ 991.012] (II) intel(0): Supported detailed timing: [ 991.012] (II) intel(0): clock: 81.5 MHz Image Size: 303 x 190 mm [ 991.012] (II) intel(0): h_active: 1440 h_sync: 1488 h_sync_end 1520 h_blank_end 1760 h_border: 0 [ 991.012] (II) intel(0): v_active: 900 v_sync: 903 v_sync_end 909 v_blanking: 926 v_border: 0 [ 991.012] (II) intel(0): Unknown vendor-specific block f [ 991.012] (II) intel(0): LTN141WD-L05 [ 991.012] (II) intel(0): EDID (in hex): [ 991.012] (II) intel(0): 00ffffffffffff0030ae334000000000 [ 991.012] (II) intel(0): 000f0103801e1378eacd7591554f8b26 [ 991.012] (II) intel(0): 21505400000001010101010101010101 [ 991.012] (II) intel(0): 010101010101b027a06051842d303020 [ 991.012] (II) intel(0): 36002fbe10000019d51fa04051841a30 [ 991.012] (II) intel(0): 302036002fbe100000190000000f0090 [ 991.012] (II) intel(0): 0a32900a281401004ca34244000000fe [ 991.012] (II) intel(0): 004c544e31343157442d4c30350a0032 [ 991.012] (II) intel(0): EDID vendor "LEN", prod id 16435 [ 991.012] (II) intel(0): Printing DDC gathered Modelines: [ 991.012] (II) intel(0): Modeline "1440x900"x0.0 101.60 1440 1488 1520 1792 900 903 909 945 -hsync -vsync (56.7 kHz) [ 991.012] (II) intel(0): Modeline "1440x900"x0.0 81.49 1440 1488 1520 1760 900 903 909 926 -hsync -vsync (46.3 kHz) [ 991.012] (II) intel(0): Not using default mode "320x240" (doublescan mode not supported) [ 991.012] (II) intel(0): Not using default mode "400x300" (doublescan mode not supported) [ 991.012] (II) intel(0): Not using default mode "400x300" (doublescan mode not supported) [ 991.012] (II) intel(0): Not using default mode "512x384" (doublescan mode not supported) [ 991.012] (II) intel(0): Not using default mode "640x480" (doublescan mode not supported) [ 991.012] (II) intel(0): Not using default mode "640x512" (doublescan mode not supported) 18393,1 99%
Please attach output of something like this "for ((i=0;i<10;i++)) do echo STACK ; cat /proc/$PID/stack ; done", where $PID is kslowd process id.
I have the same here. Kernels tried : 2.6.35.6-48.fc14.i686 and 2.6.35.6-45.fc14.i686. Problem occured in both. Result of the command you asked for, for one of the processes : [root@alfred ~]# PID=149 ; for i in $(seq 1 10) ; do echo STACK ; cat /proc/$PID/stack ; done STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff STACK [<c04a0058>] slow_work_thread+0xf5/0x24a [<c044db61>] kthread+0x64/0x69 [<c04038fe>] kernel_thread_helper+0x6/0x10 [<ffffffff>] 0xffffffff uptime : 10:02:34 up 29 min, 4 users, load average: 1.63, 1.46, 1.21 top : 149 root 15 -5 0 0 0 S 17.6 0.0 4:00.59 kslowd001 1765 root 15 -5 0 0 0 S 16.3 0.0 3:59.39 kslowd002 2275 root 15 -5 0 0 0 S 15.3 0.0 1:22.66 kslowd000 2276 root 15 -5 0 0 0 S 11.3 0.0 1:22.21 kslowd003
my output (each PID, each iteration): [<ffffffff810ca0d8>] slow_work_thread+0x12a/0x2a4 [<ffffffff81065f29>] kthread+0x7f/0x87 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff
Same thing here on X200. Although sometimes the problem goes away, but not fully, I still see kslowd on top way too often. And it only happens in X, which leads me to connect this with the VGA monitor polling code problem. STACK [<ffffffff810ca14c>] slow_work_thread+0x12a/0x2a4 [<ffffffff81065fd5>] kthread+0x7f/0x87 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff
I don't know if there are any back side for this but I tried this in order to slow percentage of CPU slow-work can use. kernel.slow-work.vslow-percentage = 1 initial setting was 50.
Setting "kernel.slow-work.vslow-percentage" to "1", help with system's responses. Desktop is now controllable (but not best), and the the load is still high. top - 13:10:13 up 16 min, 9 users, load average: 1.39, 1.28, 0.85 Tasks: 194 total, 3 running, 191 sleeping, 0 stopped, 0 zombie Cpu(s): 1.7%us, 14.4%sy, 0.0%ni, 82.8%id, 1.2%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 4017224k total, 1798856k used, 2218368k free, 87200k buffers Swap: 4718588k total, 0k used, 4718588k free, 944964k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3588 root 15 -5 0 0 0 R 13.3 0.0 0:03.07 slowd002 3590 root 15 -5 0 0 0 S 12.6 0.0 0:01.52 kslowd000
Hi Jaroslav, Could you try the rawhide 2.6.36/2.6.37-rc5 kernel and see if its fixed there so we can backport it? --Kyle
I rebuild F15 (devel.) kernel 2.6.36.1 from rpm src package (http://koji.fedoraproject.org/koji/buildinfo?buildID=207513) for F14, installed it and used it. The issue seems to "disappear" (there are none kslowd processes). I will send some status later this day after the up-time will be higher.
Unfortunately issue is back :(, uptime ~ 1h 14min. Now it is in [kworker/0:0] and [kworker/1:1] processes. top - 09:29:36 up 1:17, 9 users, load average: 1.26, 1.16, 1.09 Tasks: 193 total, 2 running, 191 sleeping, 0 stopped, 0 zombie Cpu(s): 2.9%us, 29.0%sy, 0.0%ni, 68.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 4017316k total, 2273412k used, 1743904k free, 119660k buffers Swap: 4718588k total, 0k used, 4718588k free, 1111300k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 27 root 20 0 0 0 0 S 33.7 0.0 1:07.69 [kworker/1:1] 4191 root 20 0 0 0 0 R 21.6 0.0 0:56.44 [kworker/0:0] Fortunately stack is of PIDs 27 a 4191 is useful now: # cat /proc/27/stack [<ffffffffa0014088>] i2c_transfer+0x85/0x101 [i2c_core] [<ffffffffa003b8a4>] drm_do_probe_ddc_edid+0x55/0x62 [drm] [<ffffffffa003bf12>] drm_get_edid+0xb7/0x1bd [drm] [<ffffffffa0097669>] intel_hdmi_detect+0x31/0x6e [i915] [<ffffffffa006c730>] output_poll_execute+0x9c/0x120 [drm_kms_helper] [<ffffffff81062ddc>] process_one_work+0x18e/0x289 [<ffffffff81063d27>] worker_thread+0x104/0x19b [<ffffffff810672ee>] kthread+0x82/0x8a [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff [root@cz-r8lgkkw ~]# cat /proc/4191/stack [<ffffffff81059fc2>] msleep+0x1b/0x22 [<ffffffffa0093a67>] intel_crt_detect+0x214/0x30c [i915] [<ffffffffa006c730>] output_poll_execute+0x9c/0x120 [drm_kms_helper] [<ffffffff81062ddc>] process_one_work+0x18e/0x289 [<ffffffff81063d27>] worker_thread+0x104/0x19b [<ffffffff810672ee>] kthread+0x82/0x8a [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10
Same behavior with kernel 2.6.37.0-rc5 (up-time 1h 11min).
stack for 2.6.37.0-rc5 [<ffffffffa009c90b>] gmbus_xfer+0x34e/0x4d2 [i915] [<ffffffffa0011f97>] i2c_transfer+0x88/0xf6 [i2c_core] [<ffffffffa0039795>] drm_do_probe_ddc_edid+0x4f/0x5c [drm] [<ffffffffa0039e0f>] drm_get_edid+0xc3/0x248 [drm] [<ffffffffa0097b99>] intel_hdmi_detect+0x46/0xa5 [i915] [<ffffffffa006a8de>] output_poll_execute+0xa2/0x159 [drm_kms_helper] [<ffffffff81069e96>] process_one_work+0x1f4/0x361 [<ffffffff8106b2af>] worker_thread+0x104/0x1a4 [<ffffffff8106ecee>] kthread+0xa0/0xa8 [<ffffffff8100bb24>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff
Just as Jaroslav, setting the parameter to 1 seemed to help, but problem occurs again.
I switched cables to external monitor => DVI disconnected, VGA connected. Issue is still here, stack is different: Stack from kworker process, kernel 2.6.36.1 # cat /proc/27/stack [<ffffffff81049838>] sys_sched_yield+0x4f/0x55 [<ffffffffa0022568>] try_address.clone.2+0x5a/0x79 [i2c_algo_bit] [<ffffffffa002270c>] bit_xfer+0x185/0x3c9 [i2c_algo_bit] [<ffffffffa0014088>] i2c_transfer+0x85/0x101 [i2c_core] [<ffffffffa003b8a4>] drm_do_probe_ddc_edid+0x55/0x62 [drm] [<ffffffffa003be8d>] drm_get_edid+0x32/0x1bd [drm] [<ffffffffa0097669>] intel_hdmi_detect+0x31/0x6e [i915] [<ffffffffa006c730>] output_poll_execute+0x9c/0x120 [drm_kms_helper] [<ffffffff81062ddc>] process_one_work+0x18e/0x289 [<ffffffff81062ef8>] process_scheduled_works+0x21/0x2f [<ffffffff81063d41>] worker_thread+0x11e/0x19b [<ffffffff810672ee>] kthread+0x82/0x8a [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10 # cat /proc/27/stack [<ffffffff8103c49a>] need_resched+0x23/0x2d [<ffffffff8103c4b2>] should_resched+0xe/0x2e [<ffffffffa002212a>] sclhi+0x55/0x76 [i2c_algo_bit] [<ffffffffa00224b4>] i2c_outb.clone.0+0x4c/0xa6 [i2c_algo_bit] [<ffffffffa0022542>] try_address.clone.2+0x34/0x79 [i2c_algo_bit] [<ffffffffa002270c>] bit_xfer+0x185/0x3c9 [i2c_algo_bit] [<ffffffffa0014088>] i2c_transfer+0x85/0x101 [i2c_core] [<ffffffffa003b8a4>] drm_do_probe_ddc_edid+0x55/0x62 [drm] [<ffffffffa003be8d>] drm_get_edid+0x32/0x1bd [drm] [<ffffffffa0097669>] intel_hdmi_detect+0x31/0x6e [i915] [<ffffffffa006c730>] output_poll_execute+0x9c/0x120 [drm_kms_helper] [<ffffffff81062ddc>] process_one_work+0x18e/0x289 [<ffffffff81062ef8>] process_scheduled_works+0x21/0x2f [<ffffffff81063d41>] worker_thread+0x11e/0x19b [<ffffffff810672ee>] kthread+0x82/0x8a [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff # cat /proc/27/stack [<ffffffff81048ee0>] __cond_resched+0x2a/0x36 [<ffffffffa002212a>] sclhi+0x55/0x76 [i2c_algo_bit] [<ffffffffa00224b4>] i2c_outb.clone.0+0x4c/0xa6 [i2c_algo_bit] [<ffffffffa0022542>] try_address.clone.2+0x34/0x79 [i2c_algo_bit] [<ffffffffa002270c>] bit_xfer+0x185/0x3c9 [i2c_algo_bit] [<ffffffffa0014088>] i2c_transfer+0x85/0x101 [i2c_core] [<ffffffffa003b8a4>] drm_do_probe_ddc_edid+0x55/0x62 [drm] [<ffffffffa003be8d>] drm_get_edid+0x32/0x1bd [drm] [<ffffffffa0097669>] intel_hdmi_detect+0x31/0x6e [i915] [<ffffffffa006c730>] output_poll_execute+0x9c/0x120 [drm_kms_helper] [<ffffffff81062ddc>] process_one_work+0x18e/0x289 [<ffffffff81062ef8>] process_scheduled_works+0x21/0x2f [<ffffffff81063d41>] worker_thread+0x11e/0x19b [<ffffffff810672ee>] kthread+0x82/0x8a [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff
It looks like this issue is less "visible" when I connect my external monitor with both cables (VGA and DVI) after X is started (otherwise X will not start). Sometimes kworker process take CPU and desktop is uncontrollable for a ~1-2 second (repeatedly in few minutes), stack of kworker show i2c_transfer function. I can use my desktop and do work with this workaround, but again is not best experience...
Same symptoms. Setup: Lenovo ThinkPad R400; up-to-date Fedora 14: kernel-2.6.35.9-64.fc14.x86_64, xorg-x11-drv-intel-2.12.0-6.fc14.1.x86_64 00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) However, on Lenovo ThinkPad T510, w/ the same up-to-date Fedora 14, I don't experience this (need to check PCI id of VGA controller to make sure they're the same).
Created attachment 470077 [details] while true; do for pid in `pgrep -f kslowd`; do cat /proc/$pid/stack; echo; done; echo; echo; sleep 1; done
I also notice these messages in /var/log/messages: [drm:drm_kms_helper_poll_enable] *ERROR* delayed enqueue failed -125
Created attachment 470079 [details] Xorg.0.log piece Finally, this sequence of log messages is repeating all over again. Seems like they have something to do with intel_{hdmi,crt}_detect functions calls that can be seen in kslowd stack traces.
If you boot with drm_kms_helper.poll=0 on the command line, does it help at all?
As promised, here's Lenovo T510 PCI id of VGA: 00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02)
(In reply to comment #22) > Created attachment 470079 [details] > Xorg.0.log piece > > Finally, this sequence of log messages is repeating all over again. > > Seems like they have something to do with intel_{hdmi,crt}_detect functions > calls that can be seen in kslowd stack traces. Could we get whole /var/log/Xorg.0.log, /etc/X11/xorg.conf (if one exists), dmesg from the moment when the system is slow, attached to this bug as separate uncompressed attachments to this bug please? Thank you
I will attach it at Monday, I'm out now and without access to my Laptop.
(In reply to comment #23) > If you boot with drm_kms_helper.poll=0 on the command line, does it help at > all? Kyle, it actually does seem to help: several hours uptime and no sign of kslowd ruling the CPUs.
Created attachment 470752 [details] Xorg.0.log
Created attachment 470753 [details] dmesg output
No xorg.conf, kslowd* seen after $ uptime 20:35:03 up 21 min, 3 users, load average: 1.04, 0.41, 0.36
Created attachment 470812 [details] Xorg.0.log (Lenovo T400)
Created attachment 470813 [details] dmesg (Lenovo T400)
Created attachment 470814 [details] lspci (Lenovo T400)
Created attachment 470815 [details] powertop dump (Lenovo T400)
Created attachment 470816 [details] xorg.conf.d tar $ echo; for file in /usr/share/X11/xorg.conf.d/*; do echo -n "File $file from package "; rpm -qf $file; done File /usr/share/X11/xorg.conf.d/10-evdev.conf from package xorg-x11-server-Xorg-1.9.3-3.fc14.x86_64 File /usr/share/X11/xorg.conf.d/10-quirks.conf from package xorg-x11-server-Xorg-1.9.3-3.fc14.x86_64 File /usr/share/X11/xorg.conf.d/50-fpit.conf from package xorg-x11-drv-fpit-1.3.0-11.fc14.x86_64 File /usr/share/X11/xorg.conf.d/50-synaptics.conf from package xorg-x11-drv-synaptics-1.3.0-1.fc14.x86_64 File /usr/share/X11/xorg.conf.d/50-vmmouse.conf from package xorg-x11-drv-vmmouse-12.6.9-5.fc14.x86_64 File /usr/share/X11/xorg.conf.d/50-wacom.conf from package xorg-x11-drv-wacom-0.10.8-2.fc14.x86_64 [root@cz-r8lgkkw log]#
(In reply to comment #23) > If you boot with drm_kms_helper.poll=0 on the command line, does it help at > all? Hi This helps GREATLY Thanks!
Yes, drm_kms_helper.poll=0 kernel parameter helps for me too. I don't know what this parameter do in drm. Is there some documentation about drm_kms_helper.* kernel parameters or about drm parameters in general?
We are seeing a similar issue on RHEL 6.0, should I open a separate ticket for this?
(In reply to comment #38) > We are seeing a similar issue on RHEL 6.0, should I open a separate ticket for > this? You should contact redhat support I think.
(In reply to comment #39) > (In reply to comment #38) > > We are seeing a similar issue on RHEL 6.0, should I open a separate ticket for > > this? > You should contact redhat support I think. Yes, definitively.
This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX. (Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.) Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version. Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping