Bug 705785

Summary: [RFE] Compile QXL drivers with SSE2 flags
Product: Red Hat Enterprise Linux 8 Reporter: Yaniv Kaul <ykaul>
Component: spice-qxl-xddmAssignee: ybendito
Status: CLOSED WONTFIX QA Contact: SPICE QE bug list <spice-qe-bugs>
Severity: low Docs Contact:
Priority: medium    
Version: ---CC: bsettle, cfergeau, dblechte, ddumas, mkrcmari, mtessun, Rhev-m-bugs, tpelka, ybendito
Target Milestone: rcKeywords: FutureFeature, Improvement, Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Windows   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-03-13 12:49:12 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Spice RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Yaniv Kaul 2011-05-18 13:15:54 UTC
Description of problem:
Currently the code needlessly tries to check if SSE2 is available and then use it. However, in all supported QEMU CPUs, we expose SSE2.

In addition, it looks like this hack has #ifndef _WIN64 , which means in WIN64 we don't use the fast (manually implemented) memcpy.

The compiler knows how to do it memcpy fast using SSE2 if we have the flag in the compiler to target SSE2.
(Note: need to see if it can do it even if the memory is not aligned, but in any case this can't hurt).


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 David Blechter 2011-06-30 11:49:52 UTC
too much pressure on the current qxl driver for beta. need to stabilize the driver first, set delel_ack-

Comment 2 RHEL Program Management 2011-06-30 11:55:12 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Comment 3 Andrew Cathrow 2011-06-30 12:36:17 UTC
Don't want this closed won't fix, moving to rhev-m future.
David we should revisit this potentially before 3.1

Comment 4 Yonit Halperin 2011-07-13 15:00:16 UTC
The driver doesn't use memcopy but rather RtlCopyMemory, which I think is not affected by this flag.

Implementing SSE for win64 will probably help video performance there.

Comment 11 David Blechter 2012-05-04 14:05:30 UTC
ack the bug. we can re-compile, the new binary has to be certified

Comment 12 Marian Krcmarik 2012-05-04 14:13:56 UTC
(In reply to comment #11)
> ack the bug. we can re-compile, the new binary has to be certified

This means WHQL process again?

Comment 13 Yonit Halperin 2012-05-04 17:02:59 UTC
(In reply to comment #11)
> ack the bug. we can re-compile, the new binary has to be certified

Please see comment #8. This flag is not relevant for kernel drivers. We implement the SSE2 memory copy by ourselves for the 32-bit driver. But we are missing this implementation for the 64-bit driver. I would have tried to compare hd video on windows 7 32-guest to windoes 7 64-guest, and if there is a major difference, consider implementing the SSE2 code for 64-guest. Or wait till we have a user space driver...

Comment 14 Yaniv Kaul 2012-05-08 08:20:35 UTC
I highly doubt win7 does not use SSE for memcpy, in kernel. In any case, I don't see the harm in adding this compilation switch.
Regardless, the fact we don't have it for Win7/64 is documented at https://bugs.freedesktop.org/show_bug.cgi?id=37457

Comment 15 David Blechter 2012-05-08 17:53:50 UTC
(In reply to comment #13)
> (In reply to comment #11)
> > ack the bug. we can re-compile, the new binary has to be certified
> 
> Please see comment #8. This flag is not relevant for kernel drivers. 

1. In this case we need to close this bug, or change the subject to "Implement the SSE2 code for the 64-bit kernel space driver"  
 
>We
> implement the SSE2 memory copy by ourselves for the 32-bit driver. 

2. Was it worse doing it? Any performance improvements were observed on Win32?

>But we are
> missing this implementation for the 64-bit driver. I would have tried to
> compare hd video on windows 7 32-guest to windoes 7 64-guest,

3. it is reasonable and easy step.

> and if there is a
> major difference, consider implementing the SSE2 code for 64-guest.

4. if difference, then the implementation is justified!
> Or wait
> till we have a user space driver...

Comment 16 David Blechter 2012-06-11 19:08:53 UTC
set devel_ack? and will re-visit in the rhevm-future

Comment 17 David Blechter 2013-05-10 13:46:37 UTC
Let's keep it for rhev-future, There is no rush to add to the rhevm release.

Comment 19 Alon Levy 2014-04-30 14:43:58 UTC
posted patches on spice-devel removing all SSE2 using fast_memcpy_* code after benchmarks showed it was slower then the internal RtlCopyMemory (which is ifdeffed to memcpy in current Windows development kits).

http://patchwork.freedesktop.org/patch/25265/

Comment 21 Christophe Fergeau 2015-06-04 14:43:59 UTC
Patch series was sent to spice-devel in http://lists.freedesktop.org/archives/spice-devel/2014-April/016774.html but this never made it upstream for some reason :(

Comment 22 Mike McCune 2016-03-28 22:26:18 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Comment 26 Sandro Bonazzola 2019-01-28 09:44:39 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 28 Sandro Bonazzola 2019-03-12 12:54:26 UTC
4.3.1 has been released, please re-target this bug as soon as possible.

Comment 29 Yaniv Kaul 2019-03-13 12:49:12 UTC
Closing, no one really did any effort to look at this since 2014 (comment 19 or so).