From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040809
Description of problem:
Machine in question is a 1200MHz AMD Duron, nVidia GeForce2MX, AGP bus. When using this machine for video playback (player does not matter, mplayer and xine both show the same behaviour), playback consumes huge amounts of CPU time in the X server. For example, DVD playback uses ~60% CPU time, which I consider quite a lot given the CPU power. More than half of that time is consumed by the X server. X CPU consumption decreases when part of the video output window is covered by other windows. Switching video playback to use XShm makes almost no difference in total CPU time consumption. Due to this, video playback is quite susceptible to other processes taking a bit more CPU time (even if just a peak for a second or so).
I remember using the same hardware under XFree86 to play back video with a lot less CPU consumption. To be honest, I do not remember if this was using the nvidia binary drivers. Currently I am using the stock x.org nv driver. This problem is not specific to the current version of x.org; it has been present ever since the change to x.org (maybe earlier).
Version-Release number of selected component (if applicable): xorg-x11-6.7.99.902-4
How reproducible: Always
Steps to Reproduce:
1. Play back a DVD on nvidia hardware, using Xv
2. Watch CPU consumption by the X server
Actual Results: X server uses a lot of CPU time
Expected Results: CPU usage ought to be modest (the actual work ought to be done by the graphics card)
Additional info:
Please file a bug report in X.Org bugzilla upstream at the following URL, and we will track the issue in the upstream bugzilla: http://bugs.freedesktop.org It's important that X.org is directly made aware of such types of bugs/problems to ensure that they get fixed before the next release. Once you've filed your bug upstream, if you attach the bug URL to this report, we'll track the issue in X.Org bugzilla and do all followups there. Thanks in advance.
After discussing this issue on the xine mailing list, and doing some tests, I think this is not an x.org issue, but a kernel issue. Since I do not know of a way to use a 2.4 kernel with current fedora devel (if there is one, please point me to some instructions, so that I can repeat the test using FC), I used a current knoppix distribution to ensure repeatable testing conditions and to be able to use 2.4 and 2.6 series kernels with the same userspace.
The knoppix version used was 3.6 (2004-08-16). It contains kernels 2.4.27 and 2.6.7 and uses XFree86 4.3.0. Hardware is as stated above (Duron 1200, GeForce2MX, 512MB RAM).
I started knoppix with 2.4 and 2.6, then used the xine shipped with it to play a DVD chapter, watching CPU usage with top. The results are below (shown is the overall CPU usage, with XFree86 CPU usage in brackets):
      | Xv                | XShm
------+-------------------+-------------------
2.4   | 35-40% (15-17%)   | 45-50% (18-20%)
------+-------------------+-------------------
2.6   | 56-58% (37%)      | 38-40% (10-11%)
As can be seen, the CPU usage with Xv is much higher under 2.6, with the extra cycles being consumed by XFree86 (the 2.6 behaviour wrt Xv/XShm is strange, too, but that's not the problem here).
Please reassign this bug to the kernel.
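To make the "extra cycles are in XFree86" reading of the table explicit, here is a rough sketch that splits each measurement into the X server share and everything else. It takes the midpoint of each reported range, which is a simplification of mine and not part of the original measurements.

```python
# CPU usage ranges in percent, copied from the table above:
# (total_low, total_high), (x_low, x_high)
RESULTS = {
    ("2.4", "Xv"): ((35, 40), (15, 17)),
    ("2.4", "XShm"): ((45, 50), (18, 20)),
    ("2.6", "Xv"): ((56, 58), (37, 37)),
    ("2.6", "XShm"): ((38, 40), (10, 11)),
}


def split_usage(total_range, x_range):
    """Return (total, x_server, everything_else) using range midpoints.

    Midpoints are an assumption made for illustration only.
    """
    total = sum(total_range) / 2
    x_server = sum(x_range) / 2
    return total, x_server, total - x_server


for (kernel, mode), (total_range, x_range) in RESULTS.items():
    total, x_server, rest = split_usage(total_range, x_range)
    print(f"{kernel} {mode:5s} total {total:5.1f}%  X {x_server:5.1f}%  rest {rest:5.1f}%")
```

With Xv, the non-X share stays at roughly 20% on both kernels, while the X share roughly doubles on 2.6.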
PS: the thread on xine-users can be found here: http://sourceforge.net/mailarchive/forum.php?thread_id=5434405&forum_id=3438
Reassigning to kernel as per request in comment #2.
Further investigation shows that this is tied to the use of a frame buffer console. Booting with vesafb shows the high CPU usage scenario described above. Booting with rivafb (or no fb at all) brings the overall CPU usage down by ~20 percentage points (which would be quite normal usage for DVD playback on this CPU class).
If using the kernel framebuffer, I would expect all video to be slower, including Xv. In this case, I don't consider it to be a bug personally, but I'll leave that up to the kernel team to decide, as this is filed against kernel.
Is it possible that this is caused by different MTRR settings? Using vesafb allocates a write-through mapping for the first 4MB of video memory (I think; I do not have the machine at hand). X later complains that it cannot extend this to the full 32MB of video memory the card has. rivafb, on the other hand, allocates an MTRR region for the full 32MB, so X is happy. As I said above, it is not that using a framebuffer generally slows things down. I noticed no speed difference between rivafb and no frame buffer at all.
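For anyone who wants to check this on their own machine, here is a minimal sketch that parses /proc/mtrr-style text and reports whether a single region covers the whole video memory range. The sample contents, the aperture base 0xd0000000 and the memory type shown are hypothetical values chosen for illustration; they are not taken from the machine in this report. On a real system, read /proc/mtrr and take the base address from the X log or /proc/iomem.

```python
import re

# Matches the base and size fields of a /proc/mtrr line; the position of
# the memory type differs between kernel versions, so it is matched separately.
MTRR_LINE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), size=\s*(\d+)([KMG])B"
)
MTRR_TYPE = re.compile(
    r"write-combining|write-back|write-through|write-protect|uncachable"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mtrr(text):
    """Return a list of (base, size_in_bytes, type) tuples."""
    regions = []
    for line in text.splitlines():
        fields = MTRR_LINE.search(line)
        kind = MTRR_TYPE.search(line)
        if fields and kind:
            base = int(fields.group(2), 16)
            size = int(fields.group(3)) * UNIT[fields.group(4)]
            regions.append((base, size, kind.group(0)))
    return regions


def covering_region(regions, start, length):
    """Return the first region containing [start, start + length), else None."""
    for base, size, kind in regions:
        if base <= start and start + length <= base + size:
            return base, size, kind
    return None


# Hypothetical samples: a 4MB region (vesafb-like) vs. a 32MB region (rivafb-like).
SAMPLE_4MB = """\
reg00: base=0x00000000 (   0MB), size= 512MB: write-back, count=1
reg01: base=0xd0000000 (3328MB), size=   4MB: write-combining, count=1
"""
SAMPLE_32MB = SAMPLE_4MB.replace("size=   4MB", "size=  32MB")

VIDEO_BASE = 0xD0000000   # hypothetical aperture base
VIDEO_SIZE = 32 << 20     # 32MB of video memory, as stated above

print(covering_region(parse_mtrr(SAMPLE_4MB), VIDEO_BASE, VIDEO_SIZE))
print(covering_region(parse_mtrr(SAMPLE_32MB), VIDEO_BASE, VIDEO_SIZE))
```

The first call finds no covering region, the second one does, which is the difference described above.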
the 2.6.10 kernels should have vesafb creating MTRRs the size of video memory (rather than the size of the framebuffer being used). [at least I think it made it into 2.6.10 -- if not, it'll be in the 2.6.11 kernel]
how do things behave with the current kernels ?
I have updated to the latest RH yesterday after a long pause (kernel-2.6.14-1.1777_FC5) and things have gotten rather worse, I am afraid. DVD playback consumes >70% of CPU time, ~40%+ of that being in Xorg.
I have written a program to measure the Xv throughput. The machine I used is a Duron 1200MHz, 640MB RAM, GeForce2MX graphics card using the stock nv driver. The tested operating systems are FC4 (fully updated) and a reasonably recent rawhide (RH) tree. Both versions shared one xorg.conf. Exact version numbers are in the test results.
My test program uses the Xv extension to transfer a 512x512 pixel image into a 512x512 pixel sized window, using shared memory transfers (both the image and the window sizes are configurable, but these are the default values). This is repeated often enough to be able to calculate the time used for a single transfer.
All tests were done using the following procedure:
1) Boot the OS in single user mode
2) Become root ("su -")
3) Compile the xvperf program (link to source below, compiled with "CFLAGS=$(rpm --eval %optflags) make")
4) Start xfs
5) Create a simple X environment ("echo xterm -geometry 80x25 > .Xclients")
6) Start X11 (startx). This creates a very simple X environment without a window manager.
7) Run "maketest.sh <path to xvperf binary>" (link below). maketest.sh gets some system information (CPU, mtrr, iomem, X11 config and log files) and then runs xvperf. All information is saved into separate files.
The basic result of all the above is that Xv performance under current RH is less than a third of that in FC4. Why that is I do not know. Before I file another bug on this, I'd like to ask people who are able and willing to perform the same testing (RH and FC4 (or even FC3)) on the same hardware to compare their performance values and notify me whether they see the same as I do or not.
Attached to this mail are the log files produced by maketest.sh for FC4 and RH. The link to get the source for xvperf and for maketest.sh is http://www.skytale.net/files/xvperf/
Some words of warning:
a) This is my first program dealing with X11 directly. I may be measuring shit.
b) The program has just been tested on a GeForce2MX (x86) and an ATI Rage128 (ppc). It may have bugs on other hardware (for all I know it probably has bugs on the hardware it was tested on).
c) xvperf will try to use a YUY2 transfer. Almost all cards ought to have such a transfer mode.
d) If you have a fast machine, you may have to increase the number "2000" in the last line of maketest.sh to accommodate (xvperf will complain in xvperf.out).
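The measurement idea (repeat the transfer many times, divide the elapsed time by the iteration count) can be sketched as follows. This is not the xvperf source: `fake_transfer` is a hypothetical stand-in that merely copies a 512x512 YUY2-sized buffer (2 bytes per pixel) in memory, and the warm-up pass is my guess at what a calibrating loop is for.

```python
import time

WIDTH, HEIGHT = 512, 512
FRAME = bytearray(WIDTH * HEIGHT * 2)  # YUY2 uses 2 bytes per pixel


def fake_transfer():
    """Hypothetical stand-in for one Xv shared memory transfer."""
    return bytes(FRAME)  # just a memory copy, no X11 involved


def measure(transfer, iterations, loops=4):
    """Return the average time per transfer in microseconds, one value per loop."""
    for _ in range(iterations // 10 or 1):  # warm-up pass (assumed purpose)
        transfer()
    per_transfer_usec = []
    for _ in range(loops):
        start = time.perf_counter()
        for _ in range(iterations):
            transfer()
        elapsed_usec = (time.perf_counter() - start) * 1_000_000
        per_transfer_usec.append(elapsed_usec / iterations)
    return per_transfer_usec


if __name__ == "__main__":
    for number, usec in enumerate(measure(fake_transfer, 2000), start=1):
        print(f"measurement loop {number}: {usec:.1f} usec/frame")
```

Running several loops makes it easy to spot a noisy run; if the values vary a lot between loops, the iteration count is probably too low.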
Created attachment 123578 [details] xvperf results for FC4 and Rawhide
due to the rapid rate of change upstream, rawhide kernels frequently have extensive debugging options enabled to catch problems earlier, and get better diagnostics when things go wrong. It's highly likely that these are impacting Xv throughput right now.
Is there a way to test if this is the case, i.e. attributable to kernel debugging options vs. a real Xorg problem?
install oprofile, and the debuginfo for the kernel
opcontrol --reset
opcontrol --start --vmlinux=/usr/lib/debug/lib/modules/`uname -r`/vmlinux
*use xv apps here*
opcontrol --stop
opreport -l | less
A quick test with rawhide on my iBook shows that a major timeslice is spent in _wordcopy_fwd_aligned from libc-2.3.90.so.
samples  %        image name      app name        symbol name
10317    61.1849  libc-2.3.90.so  libc-2.3.90.so  _wordcopy_fwd_aligned
5571     33.0388  vmlinux         vmlinux         ppc6xx_idle
196       1.1624  Xorg            Xorg            (no symbols)
Comparison tests between FC4 and Rawhide on the Duron machine will follow as soon as I get at the machine (tomorrow, maybe the day after).
I have run the above test with oprofile added on both FC4 and rawhide. The results are attached. Having no experience with oprofile I do not know what to make of the results since they are vastly different between FC4 and RH.
Created attachment 124545 [details] xvperf results for FC4 and RH, with oprofile output
On FC5, with 2.6.16-1.2080_FC5, I cannot play DVDs properly, the image pauses every few seconds. My CPU: Intel(R) Celeron(R) CPU 1.70GHz However, it worked ok for me on FC4.
Add a <AOL>me too!</AOL>. Rather annoying. Linux 2.6.16-1.2080_FC5 i686 athlon CPU: AMD Athlon(TM) XP 2100+ Graphics card: Matrox G400 16Mb Xine (playing DVDs) and tvtime both stutter and drop frames. X CPU usage is ~ 65% when playing any sort of video. Problem shows with both kernel-2.6.15-1.2054_FC5 and kernel-2.6.16-1.2080_FC5.
Good news, everyone. Updated to current rawhide today, and Xv transfer speeds are massively improved. Not quite what they were in FC4, but close. Some further prodding reveals that this is unrelated to the booted kernel, i.e. switching to an older kernel does not bring the slow transfers back. So it's an X thing after all. List of currently installed packages matching xorg* and libX* is attached. I may be able to pry the old versions from the yum log files, if needed.
Created attachment 128131 [details] xorg* and libX* files
I have an iBook and use the radeon X11 driver. The results from xvperf do NOT differ much when run on FC5 vs. Rawhide. Is this a driver-specific problem?
FC5:
Testing shared memory YUY2 transfer of a 512x512 image into a 512x512 window, 2000 iterations.
Starting calibrating loop
Starting measurement loop 1 8992240 usec (4496 usec/frame, 222.41 fps, 58304 kpps)
Starting measurement loop 2 9030704 usec (4515 usec/frame, 221.47 fps, 58056 kpps)
Starting measurement loop 3 9017565 usec (4508 usec/frame, 221.79 fps, 58141 kpps)
Starting measurement loop 4 9003452 usec (4501 usec/frame, 222.14 fps, 58232 kpps)
Rawhide:
Testing shared memory YUY2 transfer of a 512x512 image into a 512x512 window, 2000 iterations.
Starting calibrating loop
Starting measurement loop 1 8755773 usec (4377 usec/frame, 228.42 fps, 59879 kpps)
Starting measurement loop 2 8790375 usec (4395 usec/frame, 227.52 fps, 59643 kpps)
Starting measurement loop 3 8764734 usec (4382 usec/frame, 228.19 fps, 59818 kpps)
Starting measurement loop 4 8765606 usec (4382 usec/frame, 228.16 fps, 59812 kpps)
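For readers comparing their own runs, the derived figures in the xvperf output above can be recomputed from the total time alone. The formulas below are inferred from the printed numbers (in particular, reading "kpps" as thousands of pixels per second is an assumption); they are not taken from the xvperf source.

```python
def derive_figures(total_usec, iterations, width, height):
    """Recompute per-frame time, frame rate and pixel rate from one loop.

    Formulas are inferred from the printed output, not from xvperf itself.
    """
    usec_per_frame = total_usec // iterations      # printed value looks truncated
    fps = iterations * 1_000_000 / total_usec      # frames per second
    kpps = width * height * fps / 1000             # assumed: kilo-pixels per second
    return usec_per_frame, fps, kpps


# First FC5 measurement loop from the output above
usec_per_frame, fps, kpps = derive_figures(8992240, 2000, 512, 512)
print(f"{usec_per_frame} usec/frame, {fps:.2f} fps, {kpps:.0f} kpps")
```

This reproduces the figures printed for the first FC5 loop, so the same function can be used to sanity-check results from other machines or image sizes.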
This may well be. The transfer speeds on my (old, ATI Rage based) iBook have not improved, either. The machine which saw the improvement was the nvidia based duron mentioned above.
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.
I think we can close this one. Whatever it was, it seems to have disappeared along the way to FC6.