Red Hat Bugzilla – Bug 462180
radeonhd: kernel-184.108.40.206-29 has high CPU, kernel-220.127.116.11-108 is OK
Last modified: 2009-07-14 13:06:31 EDT
Description of problem:
With kernel-18.104.22.168-108.fc9.i686 I can use kaffeine (DVB etc.) fine, and the CPU usage when watching videos with xine is reasonable:
4634 psavola 20 0 286m 83m 55m S 37.9 11.0 0:28.75 kaffeine
4841 psavola 20 0 367m 117m 89m S 23.5 15.5 0:05.39 xine
(Not sure if this is relevant, but I have had to use '-V xshm' argument if I want to see any video with xine.)
After upgrading to 22.214.171.124-29.fc9.i686, these jumped up dramatically, and video became very choppy:
5522 psavola 20 0 295m 94m 60m S 75.4 12.4 24:18.99 kaffeine
9854 psavola 20 0 335m 120m 94m S 39.1 15.9 0:12.45 xine
Rebooting back to 126.96.36.199-108 worked around the problem.
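A minimal sketch for putting a number on the jump, assuming the lines above are in top's default column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND). The helper names are illustrative and not part of any tool mentioned in this report; the sample lines are copied from the description.

```python
# Sketch: compare %CPU per command between two pasted top snapshots.
# Assumes top's default column order; helper names are hypothetical.

OLD_KERNEL = """\
4634 psavola 20 0 286m 83m 55m S 37.9 11.0 0:28.75 kaffeine
4841 psavola 20 0 367m 117m 89m S 23.5 15.5 0:05.39 xine
"""

NEW_KERNEL = """\
5522 psavola 20 0 295m 94m 60m S 75.4 12.4 24:18.99 kaffeine
9854 psavola 20 0 335m 120m 94m S 39.1 15.9 0:12.45 xine
"""


def parse_top_line(line):
    """Split one process line into the fields used for the comparison."""
    fields = line.split(None, 11)  # 12 columns; COMMAND is the last one
    return {
        "pid": int(fields[0]),
        "cpu": float(fields[8]),
        "mem": float(fields[9]),
        "command": fields[11].strip(),
    }


def cpu_by_command(text):
    """Map command name to %CPU for every non-empty line in a snapshot."""
    rows = (parse_top_line(line) for line in text.splitlines() if line.strip())
    return {row["command"]: row["cpu"] for row in rows}


def cpu_ratio(old_text, new_text):
    """Return new/old %CPU ratio for commands present in both snapshots."""
    old = cpu_by_command(old_text)
    new = cpu_by_command(new_text)
    return {
        cmd: round(new[cmd] / old[cmd], 2)
        for cmd in old
        if cmd in new and old[cmd] > 0
    }


if __name__ == "__main__":
    for cmd, ratio in cpu_ratio(OLD_KERNEL, NEW_KERNEL).items():
        print(f"{cmd}: {ratio}x")
```

Each snapshot is a single sample, so the ratios are only a rough indication; averaging several top iterations would give a steadier figure.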
It seems that with 188.8.131.52-29, the system is less responsive overall, but this is difficult to prove.
I suspect something significant changed in the kernel upgrade that caused this severe performance regression.
I'm using xorg-x11-drv-radeonhd-1.2.1-3.7.20080724git.fc9.i386 on AMD Sempron 3800+ (1GB memory).
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Boot to new kernel, start kaffeine or xine by opening a video.
I'm seeing similar things with current Rawhide kernels (started with kernel-PAE-2.6.27-0.317.rc5.git10.fc10.i686). Did not test video, but firefox is very sluggish. Booting an older kernel fixes things.
Yes, this seems to affect non-graphics as well. E.g. Xorg sometimes seems to use 40%+ CPU even though it's doing nothing. When top is running, top itself is using ~10% of CPU.
I've checked diffs in various logs to see if there is anything that might give a lead. Xorg.0.log is essentially the same. Dmesg log has the following kind of changes which may be relevant:
+x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
-Processor #0 15:15 APIC version 16
+SMP: Allowing 4 CPUs, 3 hotplug CPUs
+PERCPU: Allocating 40872 bytes of per cpu data
+NR_CPUS: 32, nr_cpu_ids: 4
-SLUB: Genslabs=12, HWalign=64, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
+SLUB: Genslabs=12, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
+Initializing cgroup subsys devices
-SMP alternatives: switching to UP code
-Freeing SMP alternatives: 20k freed
In particular, it seems as if the new kernel is preparing for 4 CPUs and is not switching to UP mode. That may cause some differences.
In 'sysctl -a' differences the most prominent are:
-kernel.sched_wakeup_granularity_ns = 5000000
-kernel.sched_batch_wakeup_granularity_ns = 10000000
+kernel.sched_wakeup_granularity_ns = 10000000
-kernel.sched_features = 15
+kernel.sched_features = 895
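A minimal sketch of how such a comparison could be automated, assuming the output of 'sysctl -a' was saved to text on each kernel in the usual "key = value" form. The function names are illustrative; the sample values are the ones quoted above. The meaning of each sched_features bit depends on the kernel source, so the sketch only reports which bit positions differ.

```python
# Sketch: diff two saved 'sysctl -a' dumps and list changed settings.
# Function names are hypothetical; sample values come from this report.

OLD_SYSCTL = """\
kernel.sched_wakeup_granularity_ns = 5000000
kernel.sched_batch_wakeup_granularity_ns = 10000000
kernel.sched_features = 15
"""

NEW_SYSCTL = """\
kernel.sched_wakeup_granularity_ns = 10000000
kernel.sched_features = 895
"""


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, skipping anything else."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def diff_sysctl(old_text, new_text):
    """Return {key: (old, new)} for keys that changed; None means absent."""
    old = parse_sysctl(old_text)
    new = parse_sysctl(new_text)
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


def changed_bits(old, new):
    """List bit positions (0 = least significant) that differ."""
    delta = old ^ new
    return [bit for bit in range(delta.bit_length()) if (delta >> bit) & 1]


if __name__ == "__main__":
    for key, (before, after) in diff_sysctl(OLD_SYSCTL, NEW_SYSCTL).items():
        print(f"{key}: {before} -> {after}")
    print("sched_features bits that differ:", changed_bits(15, 895))
```

Which scheduler feature each differing bit corresponds to would have to be looked up in the scheduler source of the kernel in question.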
I see no new issues using the FOSS Radeon driver on an Xpress 200M laptop with 184.108.40.206-29. You're probably experiencing some RadeonHD-specific problem.
By "new" I mean that playing video has always been more sluggish in X than in Windows XP. I could never play 720p mp4 videos under X11 on this laptop (single-core 2 GHz Turion), but I can play them with 50-60% CPU usage under Windows XP using XVid.
For the record: Thinkpad X60s (Core Duo, Intel 945GM graphics)
Some further experimentation:
1) 'time sysctl -a' on 2.6.25:
This seems to indicate this is not just a graphics issue but a more general CPU usage regression.
2) testing with different boot options on 2.6.26:
- acpi=off highres=off nosmp noapic => SLOW
- highres=off nosmp => NOBOOT
- nosmp => NOBOOT (Also NOBOOT on 2.6.25)
- nosmp noapic => NOBOOT
- acpi=off nosmp => SLOW
- highres=off => SLOW
NOBOOT: Booting gets stuck at "Starting udev:". Power off/on gets stuck at BIOS after detecting CPU and between detecting and testing memory. Taking power off by removing power cord fixes this.
This seems an inverse problem compared to
3) testing by restoring the scheduler settings from 2.6.25:
sysctl -w kernel.sched_wakeup_granularity_ns=5000000
sysctl -w kernel.sched_features=15
- 2.6.26 (no boot options) SLOW
- 2.6.26 acpi=off nosmp SLOW
- 2.6.26 highres=off SLOW
I don't know how to test this further.
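One way to make the 'time sysctl -a' measurement from step 1 repeatable across kernels and boot options is a small wrapper that records real, user and sys time for a command. This is only a sketch, assuming a Unix-like system; the function name is illustrative and the numbers it prints are not comparable with the shell's 'time' output to the last digit.

```python
# Sketch: measure wall-clock, user and system time of a child command,
# similar in spirit to the shell's 'time'. Assumes a Unix-like system.
import os
import subprocess
import time


def time_command(argv):
    """Run argv, discard its output, and return timing in seconds."""
    before = os.times()
    start = time.monotonic()
    subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a non-zero exit status should not abort the timing
    )
    real = time.monotonic() - start
    after = os.times()
    return {
        "real": real,
        # CPU time accumulated by waited-for child processes
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
    }


if __name__ == "__main__":
    # The command timed in this report; run it on each kernel under test.
    result = time_command(["sysctl", "-a"])
    print("real {real:.3f}s  user {user:.3f}s  sys {sys:.3f}s".format(**result))
```

A large "sys" share, as in the 2.6.26 figures quoted in this bug, points at time spent inside the kernel rather than in the command itself.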
(In reply to comment #5)
> Some further experimentation:
> 1) 'time sysctl -a' on 2.6.25:
> on 2.6.26:
> real 0m12.799s
> user 0m0.020s
> sys 0m7.825s
Does it really take 13 seconds to run the command?
Yes :-(. Tell me if there are specific things I should be looking at.
Also, does booting with the kernel option "maxcpus=1" make any difference?
I did a yum update, and it started working fine without maxcpus=1. I did some testing to verify that I wasn't dreaming. The issue seems to be related to the radeonhd driver and kernel version. The results:
With xorg-x11-drv-radeonhd-1.2.1-3.7.20080724git, both kernel 220.127.116.11-29 and 18.104.22.168-45 are sluggish, with and without maxcpus=1. (But based on earlier tests, kernel-22.214.171.124-108 was fine.)
With xorg-x11-drv-radeonhd-1.2.1-3.9.20080917git, both kernel 126.96.36.199-29 and 188.8.131.52-45 are fine, with and without maxcpus=1.
After upgrading to xorg-x11-drv-radeonhd-1.2.1-3.9.20080917git, restarting X with just CTRL-ALT-BACKSPACE instead of rebooting still leaves the system sluggish. In other words, a reboot seems to be necessary to recover from having used a non-working radeonhd driver.
It seems something significant has changed in radeonhd driver and kernel which caused them to interact poorly.
Not sure if this is worth investigating further, and if so, by whom (kernel vs radeonhd maintainers, added in Cc:). I have a diff of the Xorg log for the old vs new radeonhd driver if anyone is interested.
FWIW, I am running nVidia, not ATI, and observed the slow down issues with 184.108.40.206-29 but not with 220.127.116.11-108, which I am running.
Thus, I don't believe that the radeon driver is the root cause issue. It's the kernel, not the video drivers.
Thus, the title of this bug is a bit misleading. The word 'radeonhd' should really be edited.
It would be good to know what change made the xorg driver faster. Can you attach the diff of the Xorg logs between the two versions?
Created attachment 318037 [details]
diff of slow vs fast xorg log.
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 9 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.