Bug 131010 - Xv video playback is very CPU intensive
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Dave Jones
Brian Brock
NeedsRetesting
Reported: 2004-08-26 13:01 EDT by Ralf Ertzinger
Modified: 2015-01-04 17:09 EST
Doc Type: Bug Fix
Last Closed: 2006-11-02 09:30:13 EST


Attachments
xvperf results for FC4 and Rawhide (26.03 KB, application/octet-stream)
2006-01-23 11:58 EST, Ralf Ertzinger
xvperf results for FC4 and RH, with oprofile output (29.36 KB, application/octet-stream)
2006-02-11 12:04 EST, Ralf Ertzinger
xorg* and libX* files (3.61 KB, application/octet-stream)
2006-04-23 16:25 EDT, Ralf Ertzinger

Description Ralf Ertzinger 2004-08-26 13:01:51 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2)
Gecko/20040809

Description of problem:
Machine in question is a 1200MHz AMD Duron, nVidia GeForce2MX, AGP bus.

When using this machine for video playback (player does not matter,
mplayer and xine both show the same behaviour), playback consumes huge
amounts of CPU time in the X server. For example, DVD playback uses
~60% CPU time, which I consider quite a lot given the CPU power. More
than half of the time is consumed by the X server.

X CPU consumption decreases when part of the video output window is
covered by other windows.

Switching video playback to use XShm makes almost no difference in
total CPU time consumption.
Because of this, video playback is quite susceptible to other processes
taking a bit more CPU time (even if it is just a spike lasting a second or so).

I remember using the same hardware under XFree86 to play back video
with a lot less CPU consumption. To be honest I do not remember if
this was using the nvidia binary drivers. Currently I am using the
stock x.org nv driver.

This problem is not specific to the current version of x.org; it has
been present ever since the change to x.org (maybe earlier).

Version-Release number of selected component (if applicable):
xorg-x11-6.7.99.902-4

How reproducible:
Always

Steps to Reproduce:
1. Play back a DVD on nvidia hardware, using Xv
2. Watch CPU consumption by the X server

Actual Results:  X server uses a lot of CPU time

Expected Results:  CPU usage ought to be modest (the actual work ought
to be done by the graphics card)

Additional info:
Comment 1 Mike A. Harris 2004-08-30 00:42:47 EDT
Please file a bug report in X.Org bugzilla upstream at the following
URL, and we will track the issue in the upstream bugzilla:

    http://bugs.freedesktop.org

It's important that X.org is directly made aware of such types of
bugs/problems to ensure that they get fixed before the next release.

Once you've filed your bug upstream, if you attach the bug URL
to this report, we'll track the issue in X.Org bugzilla and do
all followups there.

Thanks in advance.
Comment 2 Ralf Ertzinger 2004-08-31 16:28:42 EDT
After discussing this issue on the xine mailing list, and doing some
tests, I think this is not an x.org issue, but a kernel issue.

Since I do not know of a way to use a 2.4 kernel with current fedora
devel (if there is one, please point me to some instructions, so that
I can repeat the test using FC), I used a current knoppix distribution
to ensure repeatable testing conditions and to have the possibility to
use 2.4 and 2.6 series kernels with the same userspace.

The knoppix version used was 3.6 (2004-08-16), it contains kernels
2.4.27 and 2.6.7, XFree86 4.3.0 is used.
Hardware is as stated above (Duron 1200, GeForce2MX, 512MB RAM)

I started knoppix with 2.4 and 2.6, then used the xine shipped to play
a DVD chapter, watching CPU usage with top. The results are below 
(shown is the overall CPU usage, with XFree86 CPU usage in brackets)


      | Xv                |  Xshm
------+-------------------+-------------------
2.4   | 35-40% (15-17%)   | 45-50% (18-20%)
------+-------------------+-------------------
2.6   | 56-58% (37%)      | 38-40% (10-11%)


As can be seen, the CPU usage is much higher with 2.6, with the extra
cycles being consumed by XFree86 (the 2.6 behaviour wrt xv/xshm is
strange, too, but that's not the problem here)
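
One way to read the table is to subtract the X server's share from the overall figure, which leaves what the player and everything else used. The sketch below does that with the midpoints of the ranges above; using midpoints is an assumption made purely for illustration, and all names in it are hypothetical.

```python
# Midpoints of the ranges in the table above: (overall %, X server %).
# Taking midpoints is an illustrative assumption, not a measurement.
usage = {
    ("2.4", "Xv"):   (37.5, 16.0),
    ("2.4", "Xshm"): (47.5, 19.0),
    ("2.6", "Xv"):   (57.0, 37.0),
    ("2.6", "Xshm"): (39.0, 10.5),
}


def non_x_share(overall, xserver):
    """CPU share left for the player and everything else."""
    return overall - xserver


rest = {key: non_x_share(*vals) for key, vals in usage.items()}

for (kernel, output), value in sorted(rest.items()):
    print(f"{kernel} {output:5s} non-X share: {value:.1f}%")

# Increase of the X server's own share for Xv when going from 2.4 to 2.6.
xv_x_increase = usage[("2.6", "Xv")][1] - usage[("2.4", "Xv")][1]
print(f"X server increase for Xv: {xv_x_increase:.1f} percentage points")
```

With these midpoints the non-X share for Xv stays at roughly 20% under both kernels, so the increase under 2.6 sits almost entirely in the X server.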

Please reassign this bug to the kernel.
Comment 3 Ralf Ertzinger 2004-08-31 16:30:40 EDT
PS: the thread on xine-users can be found here:
http://sourceforge.net/mailarchive/forum.php?thread_id=5434405&forum_id=3438
Comment 4 Mike A. Harris 2004-09-09 03:58:24 EDT
Reassigning to kernel as per request in comment #2.
Comment 5 Ralf Ertzinger 2004-10-16 12:20:14 EDT
Further investigation shows that this is tied to the use of a frame
buffer console.

Booting with vesafb shows the high cpu usage scenario described above.
Booting with rivafb (or no fb at all) brings the overall CPU
usage down by ~20 percentage points (which would be quite normal usage
for DVD playback on this CPU class)
Comment 6 Mike A. Harris 2004-10-16 17:00:07 EDT
If using the kernel framebuffer, I would expect all video to be
slower, including Xv.  In this case, I don't consider it to be
a bug personally, but I'll leave that up to the kernel team to
decide, as this is filed against kernel.
Comment 7 Ralf Ertzinger 2004-10-17 06:05:45 EDT
Is it possible that this is caused by different mtrr settings?
Using vesafb allocates a write-through mapping for the first 4MB of
video memory (I think, I do not have the machine at hand). X later
complains that it can not extend this to the full 32MB of video memory
the card has.

rivafb on the other hand allocates a mtrr region for the full 32MB, so
X is happy.

As I said above, it is not that using a framebuffer generally slows
things down. There is no speed difference noticed by me between rivafb
and no frame buffer at all.
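
The MTRR theory can be checked by looking at /proc/mtrr and asking whether a single region covers the whole of video memory. The following is a minimal sketch of that check. The two sample listings are made up for illustration (the base address 0xe0000000 and the "write-combining" type are assumptions, not values read from the machine in this report), and the helper names are hypothetical.

```python
import re

# Matches lines in the usual /proc/mtrr layout, for example:
#   reg01: base=0xe0000000 (3584MB), size=   4MB: write-combining, count=1
MTRR_LINE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KMG])B: ([\w-]+), count=\d+"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mtrr(text):
    """Return a list of (base, size_in_bytes, type) tuples."""
    regions = []
    for line in text.splitlines():
        m = MTRR_LINE.match(line.strip())
        if m:
            base = int(m.group(2), 16)
            size = int(m.group(3)) * UNIT[m.group(4)]
            regions.append((base, size, m.group(5)))
    return regions


def covered(regions, base, size, mem_type):
    """True if one region of the given type contains [base, base + size)."""
    return any(
        t == mem_type and b <= base and base + size <= b + s
        for b, s, t in regions
    )


# Hypothetical listings: 4MB mapped (vesafb-like) vs. 32MB mapped (rivafb-like).
SMALL = """reg00: base=0x00000000 (   0MB), size= 512MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size=   4MB: write-combining, count=1"""
FULL = """reg00: base=0x00000000 (   0MB), size= 512MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size=  32MB: write-combining, count=1"""

VIDEO_BASE, VIDEO_SIZE = 0xE0000000, 32 << 20  # assumed 32MB card
print(covered(parse_mtrr(SMALL), VIDEO_BASE, VIDEO_SIZE, "write-combining"))
print(covered(parse_mtrr(FULL), VIDEO_BASE, VIDEO_SIZE, "write-combining"))
```

On a real system the text would come from reading /proc/mtrr, and the video memory base and size from the X log or /proc/iomem.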
Comment 8 Dave Jones 2005-02-13 00:09:19 EST
The 2.6.10 kernels should have vesafb creating MTRRs the size of video memory
(rather than the size of the framebuffer being used).

[at least I think it made it into 2.6.10 -- if not, it'll be in the 2.6.11 kernel]
Comment 9 Dave Jones 2005-10-05 20:33:12 EDT
How do things behave with the current kernels?
Comment 10 Ralf Ertzinger 2005-12-22 05:52:21 EST
I have updated to the latest RH yesterday after a long pause
(kernel-2.6.14-1.1777_FC5) and things have gotten rather worse, I am afraid. DVD
playback consumes >70% of CPU time, ~40%+ of that being in Xorg.
Comment 11 Ralf Ertzinger 2006-01-23 11:57:10 EST
I have written a program to measure the Xv throughput.

The machine I used is a Duron 1200MHz, 640MB RAM, GeForce2MX graphics card
using the stock nv driver.
The tested operating systems are FC4 (fully updated) and a reasonably recent
rawhide tree. Both versions shared one xorg.conf. Exact version numbers are
in the test results.

My test program uses the Xv extension to transfer a 512x512 pixel image
into a 512x512 pixel sized window, using shared memory transfers (both
the image and the window sizes are configurable, but these are the default
values). This is repeated often enough to be able to calculate the time
used for a single transfer. All tests were done using the following procedure:

1) Boot the OS in single user mode
2) Become root ("su -")
3) Compile the xvperf program (link to source below, compiled with
   "CFLAGS=$(rpm --eval %optflags) make")
4) start xfs
5) Create a simple X environment ("echo xterm -geometry 80x25 > .Xclients")
6) start X11 (startx)

This creates a very simple X environment without a window manager.

7) run "maketest.sh <path to xvperf binary>" (link below)

maketest.sh gets some system information (CPU, mtrr, iomem, X11 config
and log files) and then runs xvperf. All information is saved into
separate files.

The basic result of all the above is that Xv performance under current
RH is less than a third of that in FC4. Why that is I do not know.

Before I file another bug on this I'd like to ask people who are able
and willing to perform the same testing (RH and FC4 (or even FC3)) on
the same hardware to compare their performance values and notify me
whether they see the same as I do or not.

Attached to this mail are the log files produced by maketest.sh for
FC4 and RH.

The link to get the source for xvperf and for maketest.sh is
http://www.skytale.net/files/xvperf/

Some words of warning:

a) this is my first program dealing with X11 directly. I may be measuring
   shit.
b) The program has just been tested on a GeForce2MX (x86) and an ATI
   Rage128 (ppc). It may have bugs on other hardware (for all I know it
   probably has bugs on the hardware it was tested on)
c) xvperf will try to use a YUY2 transfer. Almost all cards ought to
   have such a transfer mode.
d) If you have a fast machine you may have to increase the number "2000"
   in the last line of maketest.sh to accommodate (xvperf will complain
   in xvperf.out)
Comment 12 Ralf Ertzinger 2006-01-23 11:58:51 EST
Created attachment 123578
xvperf results for FC4 and Rawhide
Comment 13 Dave Jones 2006-01-23 12:28:06 EST
due to the rapid rate of change upstream, rawhide kernels frequently have
extensive debugging options enabled to catch problems earlier, and get better
diagnostics when things go wrong.

It's highly likely that these are impacting Xv throughput right now.
Comment 14 Ralf Ertzinger 2006-01-23 12:30:35 EST
Is there a way to test if this is the case, i.e. attributable to kernel
debugging options vs. a real Xorg problem?
Comment 15 Dave Jones 2006-01-23 12:34:25 EST
install oprofile, and the debuginfo for the kernel

opcontrol --reset
opcontrol --start --vmlinux=/usr/lib/debug/lib/modules/`uname -r`/vmlinux
*use xv apps here*
opcontrol --stop

opreport -l | less
Comment 16 Ralf Ertzinger 2006-01-23 13:29:04 EST
A quick test with rawhide on my iBook shows that a major timeslice is spent in
_wordcopy_fwd_aligned from libc-2.3.90.so.

samples  %        image name         app name           symbol name
10317    61.1849  libc-2.3.90.so     libc-2.3.90.so     _wordcopy_fwd_aligned
5571     33.0388  vmlinux            vmlinux            ppc6xx_idle
196       1.1624  Xorg               Xorg               (no symbols)
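
Because a third of the samples are idle time, the copy routine's share of the time the CPU was actually busy is higher than the 61% shown. The sketch below works that out from the three rows quoted above. The parsing helper is hypothetical and only handles this simple column layout, and rows not quoted in this comment are treated as non-idle work.

```python
# The three opreport rows quoted above (symbol name as given in the text).
SAMPLE = """\
10317    61.1849  libc-2.3.90.so     libc-2.3.90.so     _wordcopy_fwd_aligned
5571     33.0388  vmlinux            vmlinux            ppc6xx_idle
196       1.1624  Xorg               Xorg               (no symbols)
"""


def parse_rows(text):
    """Split each row into samples, percent, image, app and symbol."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 4)  # keep "(no symbols)" in one piece
        if len(parts) == 5 and parts[0].isdigit():
            samples, percent, image, app, symbol = parts
            rows.append({
                "samples": int(samples),
                "percent": float(percent),
                "image": image,
                "app": app,
                "symbol": symbol.strip(),
            })
    return rows


def busy_share(rows, symbol, idle_symbol="ppc6xx_idle"):
    """Fraction of non-idle samples attributed to the given symbol."""
    idle = sum(r["percent"] for r in rows if r["symbol"] == idle_symbol)
    target = sum(r["percent"] for r in rows if r["symbol"] == symbol)
    return target / (100.0 - idle)


rows = parse_rows(SAMPLE)
share = busy_share(rows, "_wordcopy_fwd_aligned")
print(f"{share:.1%} of non-idle samples are in the libc copy routine")
```

Read this way, roughly nine out of ten busy samples on the iBook land in the libc word copy.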

Comparison tests between FC4 and Rawhide on the Duron machine will follow as
soon as I get at the machine (tomorrow, maybe the day after)
Comment 17 Ralf Ertzinger 2006-02-11 12:03:05 EST
I have run the above test with oprofile added on both FC4 and rawhide. The
results are attached. Having no experience with oprofile I do not know what to
make of the results since they are vastly different between FC4 and RH.
Comment 18 Ralf Ertzinger 2006-02-11 12:04:35 EST
Created attachment 124545
xvperf results for FC4 and RH, with oprofile output
Comment 19 Marius Andreiana 2006-04-02 06:33:27 EDT
On FC5, with  2.6.16-1.2080_FC5, I cannot play DVDs properly, the image pauses
every few seconds. My CPU: Intel(R) Celeron(R) CPU 1.70GHz

However, it worked ok for me on FC4.
Comment 20 Barrie Bremner 2006-04-03 16:43:31 EDT
Add a <AOL>me too!</AOL>. Rather annoying.

Linux 2.6.16-1.2080_FC5 i686 athlon 

CPU: AMD Athlon(TM) XP 2100+
Graphics card: Matrox G400 16Mb

Xine (playing DVDs) and tvtime both stutter and drop frames.

X CPU usage is ~ 65% when playing any sort of video.

Problem shows with both kernel-2.6.15-1.2054_FC5 and kernel-2.6.16-1.2080_FC5.
Comment 21 Ralf Ertzinger 2006-04-23 16:21:36 EDT
Good news, everyone.

Updated to current rawhide today, and Xv transfer speeds are massively improved.
Not quite what they were in FC4, but close.

Some further prodding reveals that this is unrelated to the booted kernel, i.e.
switching to an older kernel does not bring the slow transfers back.

So it's an X thing after all.

List of currently installed packages matching xorg* and libX* is attached. I may
be able to pry the old versions from the yum log files, if needed.
Comment 22 Ralf Ertzinger 2006-04-23 16:25:49 EDT
Created attachment 128131
xorg* and libX* files
Comment 23 W. Michael Petullo 2006-04-23 22:31:41 EDT
I have an iBook and use the radeon X11 driver.  The results from xvperf do NOT
differ much when run on FC5 vs. Rawhide.  Is this a driver-specific problem?

FC5:
Testing shared memory YUY2 transfer of a 512x512 image into a
512x512 window, 2000 iterations.
Starting calibrating loop
Starting measurement loop 1
8992240 usec (4496 usec/frame, 222.41 fps, 58304 kpps)
Starting measurement loop 2
9030704 usec (4515 usec/frame, 221.47 fps, 58056 kpps)
Starting measurement loop 3
9017565 usec (4508 usec/frame, 221.79 fps, 58141 kpps)
Starting measurement loop 4
9003452 usec (4501 usec/frame, 222.14 fps, 58232 kpps)

Rawhide:
Testing shared memory YUY2 transfer of a 512x512 image into a
512x512 window, 2000 iterations.
Starting calibrating loop
Starting measurement loop 1
8755773 usec (4377 usec/frame, 228.42 fps, 59879 kpps)
Starting measurement loop 2
8790375 usec (4395 usec/frame, 227.52 fps, 59643 kpps)
Starting measurement loop 3
8764734 usec (4382 usec/frame, 228.19 fps, 59818 kpps)
Starting measurement loop 4
8765606 usec (4382 usec/frame, 228.16 fps, 59812 kpps)
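
To put a number on "do NOT differ much", the per-frame figures can be recomputed from the total times and the two runs compared. This is a sketch only: the formulas are inferred from the output above (kpps taken to mean thousand pixels per second), not from the xvperf source, and the function name is hypothetical.

```python
def xv_metrics(total_usec, iterations=2000, width=512, height=512):
    """Derive usec/frame, fps and kpps from one measurement loop.

    Assumes kpps = width * height * fps / 1000, which matches the
    figures printed above but is not taken from the xvperf source.
    """
    usec_per_frame = total_usec / iterations
    fps = 1_000_000 / usec_per_frame
    kpps = width * height * fps / 1000
    return usec_per_frame, fps, kpps


# Total times of the four measurement loops quoted above.
FC5 = [8992240, 9030704, 9017565, 9003452]
RAWHIDE = [8755773, 8790375, 8764734, 8765606]


def mean(values):
    return sum(values) / len(values)


fc5_mean = mean(FC5)
rawhide_mean = mean(RAWHIDE)
speedup = fc5_mean / rawhide_mean - 1.0

usec, fps, kpps = xv_metrics(FC5[0])
print(f"FC5 loop 1: {usec:.0f} usec/frame, {fps:.2f} fps, {kpps:.0f} kpps")
print(f"Rawhide is {speedup:.1%} faster than FC5 on average")
```

The difference between the two averages comes out below 3%, which is small next to the factor of three reported for the nv driver in comment 11.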
Comment 24 Ralf Ertzinger 2006-04-24 10:43:08 EDT
This may well be. The transfer speeds on my (old, ATI Rage based) iBook have not
improved, either. The machine which saw the improvement was the nvidia based
duron mentioned above.
Comment 25 Dave Jones 2006-10-16 13:59:23 EDT
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.
Comment 26 Ralf Ertzinger 2006-11-02 09:30:13 EST
I think we can close this one. Whatever it was, it seems to have disappeared
along the way to FC6.
