Bug 57471 - pci window memcpy optimizations
Product: eCos
Classification: Retired
Component: Patches and contributions
Hardware: i386 Linux
Priority: medium  Severity: medium
Assigned To: ecc-bugs-int
Reported: 2001-12-13 09:58 EST by Andrew Lunn
Modified: 2007-04-18 12:38 EDT
Doc Type: Bug Fix
Last Closed: 2003-06-20 12:08:47 EDT

Attachments
The patch (for 1.5.2) (29.35 KB, patch)
2001-12-13 09:59 EST, Andrew Lunn
.pdf file describing the work/results (for interest only) (105.64 KB, application/octet-stream)
2001-12-13 10:00 EST, Andrew Lunn

Description Andrew Lunn 2001-12-13 09:58:55 EST
Description of Contribution:

The i82559 network device drivers do a lot of half-word-aligned memcpy
calls to/from the PCI window. The memcpy function is not optimized for
this case and so falls back to byte-by-byte copies. On the EBSA platform
(and probably afe1) any access to the PCI window is slow, since the window
is uncached, unbuffered, non-burst, etc. It's much more efficient to do two
half-word accesses to normal memory and one word access to PCI window
memory. This patch adds functions which do this. Tests have shown
speedups ranging from 40% to nearly 4x.
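
To illustrate the access pattern described above, here is a minimal sketch. It is not the code from the attached patch: the function name and signature are made up for illustration, it only covers the direction from normal memory to the window, and it assumes a little-endian CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, NOT the function from the attached patch.
 * Copies `words` 32-bit words from a source that is only half-word
 * aligned (normal memory) to a word-aligned destination (standing in
 * for the slow PCI window). Little-endian only.
 */
static void copy_to_pci_window_le(volatile uint32_t *dst,
                                  const uint16_t *src,
                                  size_t words)
{
    assert(((uintptr_t)dst & 3u) == 0);  /* window side: word aligned */
    assert(((uintptr_t)src & 1u) == 0);  /* memory side: half-word aligned */

    while (words--) {
        /* Two cheap half-word reads from normal memory... */
        uint32_t lo = src[0];
        uint32_t hi = src[1];
        src += 2;
        /* ...combined into a single word write to the window. On a
         * little-endian CPU the lower address holds the low half. */
        *dst++ = lo | (hi << 16);
    }
}
```

The point of the design is that each word costs one access on the slow side of the bus instead of four byte accesses.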

This patch could be made more generic. At the moment it just modifies the
two i82559 drivers, and it will only work on little-endian machines. It
might be interesting to make these functions part of the PCI library. By
default memcpy could be used, but the hardware-specific part of the PCI
code could provide its own implementation optimised for the architecture.
Just an idea...
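
One way the "default to memcpy, let the platform override it" idea could look is sketched below. The macro and function names are hypothetical and do not exist in eCos; a platform port would define the macro before this point to substitute its own routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: neither the macro nor the function exists in eCos. */
#ifndef HAL_PCI_WINDOW_MEMCPY
/* No platform override supplied, so fall back to plain memcpy. */
# define HAL_PCI_WINDOW_MEMCPY(dst, src, n) memcpy((dst), (src), (n))
#endif

/* Drivers would call this instead of memcpy when one side of the copy
 * lies in the PCI window. */
static void *pci_window_memcpy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    HAL_PCI_WINDOW_MEMCPY(dst, src, n);
    return dst;
}
```

With this shape the generic driver stays free of endian-specific code, and only platforms that care about window performance need to provide anything.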

Version-Release number of selected component (if applicable): 1.5.2
Comment 1 Andrew Lunn 2001-12-13 09:59:45 EST
Created attachment 40487 [details]
The patch (for 1.5.2)
Comment 2 Andrew Lunn 2001-12-13 10:00:57 EST
Created attachment 40488 [details]
.pdf file describing the work/results (for interest only)
Comment 3 Jonathan Larmour 2001-12-13 11:31:58 EST
While this is obviously a good patch for you to have, I'm not entirely sure
about this going in generally. Firstly, the 82559 driver is generic, i.e.
cross-platform, and so we can't put in endian-specific dependencies.

Secondly, as alluded to in the recent eCos thread, it would be better to just:
a) make the generic memcpy more efficient for unaligned copies, possibly also
with a configuration-dependent choice of using a Duff's device copy

b) pull in architecture/target-specific optimizations. This requires a
framework to be defined, though.
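
For readers unfamiliar with the Duff's device mentioned in (a), a minimal byte-copy sketch follows. The function name is hypothetical and this is the textbook form of the technique, not eCos's memcpy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name. Byte copy unrolled eight times using Duff's device:
 * the switch jumps into the middle of the loop body to handle the
 * n % 8 leftover bytes on the first pass. */
static void *duff_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t passes;

    assert(dst != NULL && src != NULL);
    if (n == 0)
        return dst;

    passes = (n + 7) / 8;   /* number of trips through the do/while */
    switch (n % 8) {
    case 0: do { *d++ = *s++;
    case 7:      *d++ = *s++;
    case 6:      *d++ = *s++;
    case 5:      *d++ = *s++;
    case 4:      *d++ = *s++;
    case 3:      *d++ = *s++;
    case 2:      *d++ = *s++;
    case 1:      *d++ = *s++;
            } while (--passes > 0);
    }
    return dst;
}
```

This reduces loop overhead but still issues one access per byte, so on its own it would not address the slow-window case discussed in this bug.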
Comment 4 Andrew Lunn 2001-12-13 12:17:57 EST
The arm/xscale code will not help (much). It's designed for symmetric access
times for src and dst. That's very untrue for PCI window accesses. E.g. for
aligned word copies between normal memory I get around 90 Mbyte/s. For word
copies between the PCI window and normal memory I get about 16 Mbyte/s max.

So we need a memcpy optimized for normal memory and a memcpy optimized for PCI
window memory. I would put the PCI memcpy into the PCI library.

I would really see this code as half the code needed for the PCI library. It
should not be too hard to write big-endian code for the other half. (I don't
have a big-endian embedded system, but I could at least do some testing on a Sun
SPARC machine.)
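
As a rough sketch of what the "other half" amounts to: the only endian-dependent step is the order in which the two half words are combined into a word. The helper below is hypothetical and takes the byte order as a parameter purely so both cases can be shown side by side; real code would select one at build time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper. `first` is the half word read from the lower
 * address, `second` the one from the address two bytes above it. */
static uint32_t combine_halfwords(uint16_t first, uint16_t second,
                                  int big_endian)
{
    assert(big_endian == 0 || big_endian == 1);
    if (big_endian)
        /* Big endian: the lower address holds the most significant half. */
        return ((uint32_t)first << 16) | second;
    /* Little endian: the lower address holds the least significant half. */
    return ((uint32_t)second << 16) | first;
}
```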
Comment 5 Alex Schuilenburg 2003-06-20 12:08:47 EDT
This bug has moved to http://bugs.ecos.sourceware.org/show_bug.cgi?id=57471
