Bug 815076 - RFE: improve Rhel6 guest video performance
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: xorg-x11-drv-qxl
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Søren Sandmann Pedersen
QA Contact: Desktop QE
Keywords: FutureFeature, Reopened
Depends On:
Blocks: 960058 814211
 
Reported: 2012-04-22 07:42 EDT by Yonit Halperin
Modified: 2014-06-18 05:15 EDT

Doc Type: Enhancement
Last Closed: 2014-02-20 10:34:13 EST
Type: Bug


Attachments
removing the pre-oom update_area io exits (3.04 KB, patch)
2012-04-22 07:46 EDT, Yonit Halperin
Description Yonit Halperin 2012-04-22 07:42:42 EDT
Description of problem:

RHEL 6 guest video playback is worse than Windows guest playback at the same bandwidth.
When playing a full-screen YouTube Flash movie in Firefox over a 6 Mbps connection, spice-server sometimes does not recognize the movie as a stream because the time differences between frames are too large. As a result, the frames are not MJPEG-compressed and performance is bad.

I've made some modifications to the driver in order to investigate this. The most significant problem is that each time a driver allocation fails, the driver io-exits with a call to UPDATE_AREA(all_primary_surface) and then io-exits again with an OOM call. The first io exit is unnecessary: OOM will handle rendering of the oldest drawables if needed.
Removing the UPDATE_AREA call improves the video performance.
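
To make the cost concrete, here is a minimal, self-contained simulation of the allocation retry path. It is only a sketch and not the driver's code: fake_device_t, io_update_area, io_notify_oom and device_alloc are hypothetical names, and the model assumes that only the OOM call frees device memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device; not the real QXL structures. */
typedef struct {
    size_t free_bytes;       /* device memory currently available */
    size_t reclaim_per_oom;  /* assumed amount released by one OOM call */
    unsigned io_exits;       /* number of guest-to-host exits so far */
} fake_device_t;

/* Models UPDATE_AREA(all_primary_surface): costs one io exit. */
static void io_update_area(fake_device_t *d)
{
    d->io_exits++;
}

/* Models the OOM notification: costs one io exit and frees memory. */
static void io_notify_oom(fake_device_t *d)
{
    d->io_exits++;
    d->free_bytes += d->reclaim_per_oom;
}

/*
 * Try to allocate `size` bytes, retrying up to `max_retries` times.
 * With pre_oom_update_area set, each failed attempt costs two io exits
 * (the behaviour described above); without it, one.
 */
bool device_alloc(fake_device_t *d, size_t size,
                  bool pre_oom_update_area, unsigned max_retries)
{
    assert(d != NULL);
    for (unsigned attempt = 0; ; attempt++) {
        if (d->free_bytes >= size) {
            d->free_bytes -= size;
            return true;
        }
        if (attempt == max_retries)
            return false;
        if (pre_oom_update_area)
            io_update_area(d);
        io_notify_oom(d);
    }
}
```

In this model, an allocation that needs two OOM rounds costs four io exits with the pre-OOM UPDATE_AREA call and two without it.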

In addition, there are other enhancements in the Windows driver that are worth considering for the RHEL 6 driver:
(1) Copy images from guest memory to device memory using SSE2.
(2) Do not compute a hash value for images that are suspected to be part of a video stream.
(3) Replace the lookup3 hash with murmur (Alon has already done this upstream; it needs to be brought into RHEL).

All the changes described here should affect not only video, but also the general performance of the driver.
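
As a rough illustration of items (2) and (3), the sketch below skips hashing for images flagged as probable video frames and uses a 32-bit MurmurHash3 for everything else. The choice of the MurmurHash3 x86_32 variant is an assumption (the upstream change may use a different murmur variant), and image_t, maybe_video and image_cache_key are hypothetical names, not driver API. How an image comes to be suspected of being video is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* MurmurHash3 x86_32 (assumed variant), reading blocks as little-endian. */
uint32_t murmur3_32(const uint8_t *data, size_t len, uint32_t seed)
{
    const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
    uint32_t h = seed;
    size_t nblocks = len / 4;

    assert(data != NULL);
    for (size_t i = 0; i < nblocks; i++) {
        uint32_t k = (uint32_t)data[4 * i]
                   | ((uint32_t)data[4 * i + 1] << 8)
                   | ((uint32_t)data[4 * i + 2] << 16)
                   | ((uint32_t)data[4 * i + 3] << 24);
        k *= c1; k = rotl32(k, 15); k *= c2;
        h ^= k; h = rotl32(h, 13); h = h * 5u + 0xe6546b64u;
    }

    /* Mix in the 0-3 trailing bytes. */
    const uint8_t *tail = data + nblocks * 4;
    uint32_t k = 0;
    switch (len & 3) {
    case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k ^= tail[0];
            k *= c1; k = rotl32(k, 15); k *= c2; h ^= k;
    }

    /* Finalization: force all bits of the state to avalanche. */
    h ^= (uint32_t)len;
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical image descriptor; maybe_video is set by the caller. */
typedef struct {
    const uint8_t *pixels;
    size_t size;
    bool maybe_video;
} image_t;

/*
 * Returns true and stores a cache key if the image should be hashed.
 * Returns false, without touching the pixels, for suspected video frames.
 */
bool image_cache_key(const image_t *img, uint32_t *key_out)
{
    assert(img != NULL && key_out != NULL);
    if (img->maybe_video)
        return false;
    *key_out = murmur3_32(img->pixels, img->size, 0);
    return true;
}
```

The point of the early return is that a video frame is unlikely to be seen again, so hashing it for a cache lookup is CPU time with little chance of a hit.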
Comment 1 Yonit Halperin 2012-04-22 07:46:50 EDT
Created attachment 579279 [details]
removing the pre-oom update_area io exits
Comment 2 Christophe Fergeau 2012-04-23 05:12:46 EDT
(In reply to comment #0)
> In addition there are other enhancements in the Windows driver that worth
> consideration for the Rhel6 driver:
> (1) copy images from guest memory to the device memory using SSE2 
> (2) Do not compute hash value for images that are suspected to be a part of a
> video stream 
> (3) replace lookup3 hash with murmur (Alon has already done it upstream. Need
> to take it to rhel).
> 
> All the changes described here should affect not only video, but the general
> performance of the driver.

If I'm not mistaken, your bug report is about improving the driver's performance with respect to bandwidth usage, while the points above would improve CPU usage. Without actual scenarios where CPU is a bottleneck, I'm not sure it is that important to optimize this.
Regarding (1), if the driver is using memcpy, then glibc may provide an SSE2 memcpy version through the STT_GNU_IFUNC mechanism (no idea if such an implementation is already available upstream and in RHEL).
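
For comparison with plain memcpy, an explicit SSE2 copy could look roughly like the sketch below. copy_image_bytes is a hypothetical name, the unaligned load/store intrinsics are chosen so that no alignment assumption is needed, and the code falls back to memcpy when the compiler does not target SSE2. Whether this beats the memcpy that glibc selects at run time is exactly what would need measuring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy len bytes from src to dst; the buffers must not overlap. */
void copy_image_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(len == 0 || (dst != NULL && src != NULL));
    if (len == 0)
        return;
#if defined(__SSE2__)
    size_t i = 0;
    /* Move 16 bytes per iteration through an XMM register. */
    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
    /* Copy the remaining 0-15 bytes. */
    memcpy(dst + i, src + i, len - i);
#else
    memcpy(dst, src, len);
#endif
}
```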
Comment 3 Yonit Halperin 2012-04-23 06:07:24 EDT
(In reply to comment #2)
> (In reply to comment #0)
> > In addition there are other enhancements in the Windows driver that worth
> > consideration for the Rhel6 driver:
> > (1) copy images from guest memory to the device memory using SSE2 
> > (2) Do not compute hash value for images that are suspected to be a part of a
> > video stream 
> > (3) replace lookup3 hash with murmur (Alon has already done it upstream. Need
> > to take it to rhel).
> > 
> > All the changes described here should affect not only video, but the general
> > performance of the driver.
> 
> Your bug report is about improving the driver's performance with respect to
> bandwidth usage if I'm not mistaken while the points above would improve CPU
> usage. Without actual scenarios where CPU is a bottleneck, I'm not sure it is
> that important to optimize this.
No, the bug is about CPU. The update_area issue affects video, for example, because the time difference between frames becomes larger, which makes it harder to classify the frames as a video stream.
For the SSE2 copy and the murmur hash: IIRC, experience with the Windows driver showed that they improved performance significantly. So I'm not sure it is unimportant to optimize them. It's at least worth investigating.
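
To show why larger gaps between frames hurt classification, here is a toy detector in C. It is not spice-server's actual heuristic: looks_like_stream, MAX_FRAME_GAP_MS and MIN_STREAM_FRAMES are hypothetical, and the thresholds are made-up values for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAME_GAP_MS  200u  /* hypothetical: largest gap still treated as video */
#define MIN_STREAM_FRAMES 5u    /* hypothetical: consecutive frames needed */

/*
 * ts_ms holds non-decreasing arrival times, in milliseconds, of updates
 * to the same screen region. The region is treated as a stream once
 * MIN_STREAM_FRAMES consecutive frames arrive with small enough gaps.
 */
bool looks_like_stream(const uint32_t *ts_ms, size_t n)
{
    size_t run = 1;

    assert(n == 0 || ts_ms != NULL);
    if (n == 0)
        return false;
    for (size_t i = 1; i < n; i++) {
        if (ts_ms[i] - ts_ms[i - 1] <= MAX_FRAME_GAP_MS)
            run++;
        else
            run = 1;  /* one late frame restarts the count */
        if (run >= MIN_STREAM_FRAMES)
            return true;
    }
    return false;
}
```

With these assumed thresholds, frames arriving every 40 ms are classified as a stream, while the same content with an occasional long stall (for instance from extra io exits) keeps resetting the run and is never classified.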

> Regarding (1) if the driver is using memcpy, then glibc may provide a SSE2
> memcpy version through the STT_GNU_IFUNC mechanism (no idea if such an
> implementation is already available upstream and in RHEL)
Comment 4 Christophe Fergeau 2012-04-23 06:23:50 EDT
(In reply to comment #3)
> For the sse2 and murmur hash - IIRC the windows driver experience showed it has
> improved performance significantly. So I'm not sure it is not important to
> optimize them. It's at least worth investigation.
> 

All I'm saying is that claiming something needs to be changed because it "improves performance" is useless. We need to know whether it improves CPU usage, throughput, ... and have some (possibly rough) measurements of what was improved. It's very easy to work on "performance improvements" that are visible in microbenchmarks but cause no visible change at all in the application, because the code that was optimized wasn't a bottleneck in the application.
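
A rough measurement of the kind asked for here could start from something like the following sketch, which reports average CPU time per call using the standard clock() function. cpu_seconds_per_call and sample_workload are hypothetical names, and a number obtained this way is still only a microbenchmark: it says nothing about whether the measured code is a bottleneck in the running driver.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef void (*workload_fn)(void *ctx);

/* Average CPU seconds consumed by one call to fn, over `iterations` calls. */
double cpu_seconds_per_call(workload_fn fn, void *ctx, unsigned iterations)
{
    assert(fn != NULL);
    if (iterations == 0)
        return 0.0;
    clock_t start = clock();
    for (unsigned i = 0; i < iterations; i++)
        fn(ctx);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / iterations;
}

/* Placeholder workload: touches every byte of a 4096-byte buffer. */
static void sample_workload(void *ctx)
{
    volatile unsigned char *buf = ctx;
    for (size_t i = 0; i < 4096; i++)
        buf[i] = (unsigned char)(buf[i] + 1);
}
```

Running the same harness over the old and new code paths gives a first CPU-cost comparison, which then has to be checked against end-to-end behaviour such as frame rate in the guest.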
Comment 7 Tomas Dosek 2012-09-11 05:09:47 EDT
Raising the priority of this bug, as we have customers affected by this issue.
Comment 8 RHEL Product and Program Management 2012-12-14 02:28:28 EST
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 13 Søren Sandmann Pedersen 2013-05-17 07:54:34 EDT
Reopening. I meant to cond-nak, not nak.
