Bug 462180

Summary: radeonhd: kernel-2.6.26.3-29 has high CPU, kernel-2.6.25.14-108 is OK
Product: [Fedora] Fedora
Reporter: Pekka Savola <pekkas>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: medium
Version: 9
CC: airlied, deknuydt, gaburici, ngaywood, redhat-bugzilla, rhbugs, zing
Hardware: All   
OS: Linux   
Last Closed: 2009-07-14 17:06:31 UTC
Attachments:
diff of slow vs fast xorg log. (flags: none)

Description Pekka Savola 2008-09-13 14:14:21 UTC
Description of problem:

With kernel-2.6.25.14-108.fc9.i686 I can use kaffeine (DVB etc.) fine, and the CPU usage when watching videos with xine is reasonable:

 4634 psavola   20   0  286m  83m  55m S 37.9 11.0   0:28.75 kaffeine        
 4841 psavola   20   0  367m 117m  89m S 23.5 15.5   0:05.39 xine          

(Not sure if this is relevant, but I have had to use the '-V xshm' argument to see any video with xine.)

After upgrading to 2.6.26.3-29.fc9.i686, CPU usage jumped dramatically and video became very choppy:

 5522 psavola   20   0  295m  94m  60m S 75.4 12.4  24:18.99 kaffeine
 9854 psavola   20   0  335m 120m  94m S 39.1 15.9   0:12.45 xine

Rebooting back to 2.6.25.14-108 worked around the problem.

It seems that with 2.6.26.3-29 the system is less responsive overall, but this is difficult to prove.

I suspect something significant changed in the kernel upgrade that caused this severe performance regression.

I'm using xorg-x11-drv-radeonhd-1.2.1-3.7.20080724git.fc9.i386 on AMD Sempron 3800+ (1GB memory).

Version-Release number of selected component (if applicable):
kernel-2.6.26.3-29

How reproducible:
Boot to new kernel, start kaffeine or xine by opening a video.

Comment 1 Ralf Ertzinger 2008-09-13 15:36:42 UTC
I'm seeing similar things with current Rawhide kernels (started with kernel-PAE-2.6.27-0.317.rc5.git10.fc10.i686). Did not test video, but firefox is very sluggish. Booting an older kernel fixes things.

Comment 2 Pekka Savola 2008-09-14 09:30:53 UTC
Yes, this seems to affect non-graphics workloads as well.  E.g. Xorg sometimes seems to use 40%+ CPU even though it's doing nothing.  When top is running, top itself is using ~10% of CPU.

I've checked diffs in various logs to see if there is anything that might give a lead.  Xorg.0.log is essentially the same.  Dmesg log has the following kind of changes which may be relevant:

+x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
-Processor #0 15:15 APIC version 16
+SMP: Allowing 4 CPUs, 3 hotplug CPUs
+PERCPU: Allocating 40872 bytes of per cpu data
+NR_CPUS: 32, nr_cpu_ids: 4
-SLUB: Genslabs=12, HWalign=64, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
+SLUB: Genslabs=12, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
+Initializing cgroup subsys devices
-SMP alternatives: switching to UP code
-Freeing SMP alternatives: 20k freed

In particular, it seems as if the new kernel is preparing for 4 CPUs and is not switching to UP mode.  That may cause some differences.

In 'sysctl -a' differences the most prominent are:

-kernel.sched_wakeup_granularity_ns = 5000000
-kernel.sched_batch_wakeup_granularity_ns = 10000000
+kernel.sched_wakeup_granularity_ns = 10000000
...
-kernel.sched_features = 15
+kernel.sched_features = 895
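
For anyone repeating this comparison, here is a minimal sketch of how two saved 'sysctl -a' dumps could be compared key by key. It is not how the diff above was produced. The function names parse_sysctl and diff_sysctl are made up for illustration, and the sample input is limited to the three settings quoted above.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines from `sysctl -a` output into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # ignore anything that is not a 'key = value' line
            settings[key.strip()] = value.strip()
    return settings


def diff_sysctl(old, new):
    """Return (removed, added, changed) between two parsed dumps."""
    removed = {k: old[k] for k in old.keys() - new.keys()}
    added = {k: new[k] for k in new.keys() - old.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return removed, added, changed


# Sample data: only the settings quoted in this comment.
old_dump = """kernel.sched_wakeup_granularity_ns = 5000000
kernel.sched_batch_wakeup_granularity_ns = 10000000
kernel.sched_features = 15"""

new_dump = """kernel.sched_wakeup_granularity_ns = 10000000
kernel.sched_features = 895"""

removed, added, changed = diff_sysctl(parse_sysctl(old_dump), parse_sysctl(new_dump))
for key, (before, after) in sorted(changed.items()):
    print(f"{key}: {before} -> {after}")
for key in sorted(removed):
    print(f"{key}: only in old kernel")
```

Comparing by key rather than by line keeps settings that merely moved in the output from showing up as differences.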

Comment 3 Vasile Gaburici 2008-09-14 13:22:25 UTC
I see no new issues using the FOSS Radeon driver on an Xpress 200M laptop with 2.6.26.3-29. You're probably experiencing some RadeonHD-specific problem.

By "new" I mean that playing video has always been more sluggish in X than in Windows XP. I could never play 720p mp4 videos under X11 on this laptop (single-core 2 GHz Turion), but I can play them with 50-60% CPU usage under Windows XP using XVid.

Comment 4 Ralf Ertzinger 2008-09-14 13:36:45 UTC
For the record: Thinkpad X60s (Core Duo, Intel 945GM graphics)

Comment 5 Pekka Savola 2008-09-14 14:19:34 UTC
Some further experimentation:

1) 'time sysctl -a' on 2.6.25:


real    0m0.539s
user    0m0.006s
sys     0m0.155s
  

on 2.6.26:

real    0m12.799s
user    0m0.020s
sys     0m7.825s

This seems to indicate this is not just a graphics issue but a more general CPU usage regression.
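
To put numbers on the difference, a small sketch that converts the 'time' figures above to seconds and computes the ratio between the two kernels. The helper names parse_duration and slowdown are illustrative only, and it assumes the bash-style "NmS.SSSs" format shown above. By this arithmetic, wall-clock time is roughly 24 times higher and system time roughly 50 times higher on 2.6.26.

```python
import re


def parse_duration(text):
    """Convert bash `time` output such as '0m12.799s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied from the two 'time sysctl -a' runs above.
timings = {
    "2.6.25": {"real": "0m0.539s", "user": "0m0.006s", "sys": "0m0.155s"},
    "2.6.26": {"real": "0m12.799s", "user": "0m0.020s", "sys": "0m7.825s"},
}


def slowdown(field):
    """Ratio of the 2.6.26 time to the 2.6.25 time for one field."""
    new = parse_duration(timings["2.6.26"][field])
    old = parse_duration(timings["2.6.25"][field])
    return new / old


for field in ("real", "user", "sys"):
    print(f"{field}: {slowdown(field):.1f}x")
```

Most of the extra time is system time, which is consistent with the problem being in the kernel rather than in the sysctl binary itself.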

2) testing with different boot options on 2.6.26:

 - acpi=off highres=off nosmp noapic => SLOW
 - highres=off nosmp => NOBOOT
 - nosmp => NOBOOT  (Also NOBOOT on 2.6.25)
 - nosmp noapic => NOBOOT
 - acpi=off nosmp => SLOW
 - highres=off => SLOW

NOBOOT: Booting gets stuck at "Starting udev:".  A subsequent power off/on then gets stuck in the BIOS, after detecting the CPU and between detecting and testing memory.  Cutting power by removing the power cord fixes this.

This seems to be the inverse of the problem in
https://bugzilla.redhat.com/show_bug.cgi?id=405361.

3) testing by restoring the scheduler settings from 2.6.25:

sysctl -w kernel.sched_wakeup_granularity_ns=5000000
sysctl -w kernel.sched_features=15

 - 2.6.26 (no boot options) SLOW
 - 2.6.26 acpi=off nosmp SLOW
 - 2.6.26 highres=off SLOW
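
Since restoring the settings made no difference, it may be worth confirming that the values actually took effect. A minimal sketch follows, assuming the usual mapping of a sysctl key to a file under /proc/sys (dots become slashes). The names WANTED, sysctl_path, read_sysctl and mismatches are hypothetical, and a key that does not exist on the running kernel is reported as None.

```python
from pathlib import Path

# Values from 2.6.25 that step 3 restores on 2.6.26.
WANTED = {
    "kernel.sched_wakeup_granularity_ns": "5000000",
    "kernel.sched_features": "15",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key to its file under /proc/sys (dots become slashes)."""
    return Path(root, *key.split("."))


def read_sysctl(key, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(key, root).read_text().strip()
    except OSError:
        return None


def mismatches(wanted, root="/proc/sys"):
    """Return {key: (wanted, actual)} for every setting that differs."""
    result = {}
    for key, value in wanted.items():
        actual = read_sysctl(key, root)
        if actual != value:
            result[key] = (value, actual)
    return result


if __name__ == "__main__":
    for key, (want, got) in mismatches(WANTED).items():
        print(f"{key}: wanted {want}, found {got}")
```

No output means both values match. The simple dot-to-slash mapping is enough for the two kernel.* keys here but would not handle keys whose components themselves contain dots.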

I don't know how to test this further.

Comment 6 Chuck Ebbert 2008-09-27 15:21:44 UTC
(In reply to comment #5)
> Some further experimentation:
> 
> 1) 'time sysctl -a' on 2.6.25:
> on 2.6.26:
> 
> real    0m12.799s
> user    0m0.020s
> sys     0m7.825s
> 

Does it really take 13 seconds to run the command?

Comment 7 Pekka Savola 2008-09-27 15:41:20 UTC
Yes :-(.  Tell me if there are specific things I should be looking at.

Comment 8 Chuck Ebbert 2008-09-27 15:48:35 UTC
Also, does booting with the kernel option "maxcpus=1" make any difference?

Comment 9 Pekka Savola 2008-09-27 18:54:18 UTC
I ran a yum update, and it started working fine without maxcpus=1.  I did some testing to verify that I wasn't dreaming.  The issue seems to be related to the combination of radeonhd driver and kernel version. The results:

With xorg-x11-drv-radeonhd-1.2.1-3.7.20080724git, both kernel 2.6.26.3-29 and 2.6.26.5-45 are sluggish, with and without maxcpus=1. (But based on earlier tests, kernel-2.6.25.14-108 was fine.)

With xorg-x11-drv-radeonhd-1.2.1-3.9.20080917git, both kernel 2.6.26.3-29 and 2.6.26.5-45 are fine, with and without maxcpus=1.

After upgrading to xorg-x11-drv-radeonhd-1.2.1-3.9.20080917git but only restarting X with CTRL-ALT-BACKSPACE instead of rebooting, the system is still sluggish.  In other words, a reboot seems to be necessary to recover from having used a non-working radeonhd driver.

It seems something significant changed in both the radeonhd driver and the kernel that caused them to interact poorly.

Not sure if this is worth investigating further, and if so, by whom (kernel vs radeonhd maintainers, added in Cc:).  I have a diff of the xorg log between the old and new radeonhd if anyone is interested.

Comment 10 Marc Schwartz 2008-09-27 19:05:30 UTC
FWIW, I am running nVidia, not ATI, and observed the slowdown issues with 2.6.26.3-29 but not with 2.6.25.14-108, which I am running.

Thus, I don't believe that the radeon driver is the root cause. It's the kernel, not the video drivers.

Thus, the title of this bug is a bit misleading. The word 'radeonhd' should really be edited.

Comment 11 Chuck Ebbert 2008-09-30 03:54:58 UTC
It would be good to know what change made the xorg driver faster. Can you attach the diff of the Xorg logs between the two versions?

Comment 12 Pekka Savola 2008-09-30 04:07:56 UTC
Created attachment 318037 [details]
diff of slow vs fast xorg log.

Comment 13 Bug Zapper 2009-06-10 02:42:21 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 14 Bug Zapper 2009-07-14 17:06:31 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.