Description of problem:
NumPy uses BLAS for the dot() function only if it is built against ATLAS (math-atlas.sf.net); a generic BLAS library isn't sufficient. See http://scipy.org/scipy/numpy/ticket/667
If ATLAS isn't available at build time, dot() performance is very poor.
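One way to see which BLAS/LAPACK libraries a given NumPy build was linked against is numpy.show_config(). The following is only a minimal sketch: the helper name blas_build_info is made up for illustration, and the layout of the report differs between NumPy versions, so no particular output is assumed.

```python
import contextlib
import io


def blas_build_info():
    """Return NumPy's build configuration report as text, or None if NumPy is missing.

    Hypothetical helper: it only captures what numpy.show_config() prints.
    """
    try:
        import numpy
    except ImportError:
        return None
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        numpy.show_config()
    return buf.getvalue()


report = blas_build_info()
print(report if report is not None else "NumPy is not installed")
```

If the report mentions only a generic BLAS (or none at all), that would be consistent with the slow dot() described here, although the exact section names to look for depend on the NumPy version.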
Here are matrix multiplication benchmark results using the numpy currently in EPEL:
Double precision matrix multiplication test using NumPy.
Multiplying two NxN matrices.
For comparison, performance of the equivalent Fortran code using dgemm with an optimized BLAS:
Double precision matrix multiplication test
Matrix side size   Matmul (Gflops/s)   dgemm (Gflops/s)
               2               0.150              0.001
               4               0.303              0.005
               8               0.530              0.042
              16               0.829              0.245
              32               1.405              1.366
              64               1.669              3.533
             128               2.030              5.611
             256               2.369              6.792
             512               2.596              7.058
            1024               0.601              7.428
            2048               0.565              7.766
(The Matmul column is the performance of the F90 MATMUL intrinsic, which in this case is the same as one would get using the generic netlib BLAS library.) So for large matrices NumPy without ATLAS is extremely slow: at N = 1024 and above the generic code manages about 0.6 Gflops/s, against more than 7 Gflops/s for dgemm.
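To put a number on the gap, the ratio of the two columns can be computed directly from the table. This is just a sketch over the figures quoted above; the variable names are mine, and it says nothing about machines other than the one the table was measured on.

```python
# (matrix side, MATMUL Gflops/s, dgemm Gflops/s), copied from the table above
results = [
    (2, 0.150, 0.001),
    (4, 0.303, 0.005),
    (8, 0.530, 0.042),
    (16, 0.829, 0.245),
    (32, 1.405, 1.366),
    (64, 1.669, 3.533),
    (128, 2.030, 5.611),
    (256, 2.369, 6.792),
    (512, 2.596, 7.058),
    (1024, 0.601, 7.428),
    (2048, 0.565, 7.766),
]

# Ratio of dgemm to MATMUL throughput for each matrix size
speedups = {n: dgemm / matmul for n, matmul, dgemm in results}

# Smallest size at which dgemm beats MATMUL
crossover = min(n for n, matmul, dgemm in results if dgemm > matmul)

for n, ratio in sorted(speedups.items()):
    print("%5d  %6.2fx" % (n, ratio))
print("dgemm is faster from N = %d upward" % crossover)
```

For the two largest sizes the ratio comes out above 12, which is the range where the missing optimized BLAS hurts most.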
Python benchmark code below:
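# Matmul benchmark in python/numpy
import time

import numpy as npy


# NOTE: the "def" line, "import time" and the final call were missing from the
# listing as posted; the function name and the nn value used below are assumptions.
def matmul_bench(nn):
    """Matrix multiplication benchmark for nxn matrices until nnxnn."""
    n = 2
    print("Double precision matrix multiplication test using NumPy.")
    print("Multiplying two NxN matrices.")
    print("   N  Gflops/s")
    while n < nn:
        a = npy.random.rand(n, n)
        b = npy.random.rand(n, n)
        # n multiplications and n - 1 additions for each of the n*n outputs
        flops = (2 * float(n) - 1) * float(n)**2
        # Assuming an on average 1 gflop/s cpu, 1e9 flops takes about 1 second and
        # should be enough. We also do a maximum of 1e5 loops, since
        # for small arrays the overhead is large.
        loop = int(max(min(1.e9 / flops, 1e5), 1))
        t1 = time.time()
        for i in range(loop):
            c = npy.dot(a, b)
        t2 = time.time()
        perf = flops * loop / (t2 - t1) / 1.e9
        print("%4i" % n + "  " + "%6.3f" % perf)
        n *= 2
        # The posted listing is cut off after "if n > nn:"; that branch is not recoverable.


if __name__ == "__main__":
    matmul_bench(4096)  # assumed limit, so that sizes up to 2048 are measured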
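The operation count and the loop count used by the benchmark can be checked by hand. A minimal sketch using only the two formulas from the script; the helper names matmul_flops and loop_count and their keyword arguments are mine, not part of the original code.

```python
def matmul_flops(n):
    """Floating-point operations for one (n x n) times (n x n) product.

    Each of the n*n output elements needs n multiplications and n - 1 additions.
    """
    return (2 * n - 1) * n ** 2


def loop_count(flops, target=1e9, max_loops=1e5):
    """Repetitions needed for about `target` flops, clamped to [1, max_loops]."""
    return int(max(min(target / flops, max_loops), 1))


for n in (2, 64, 2048):
    f = matmul_flops(n)
    print("N = %4d: %d flops per product, %d repetitions" % (n, f, loop_count(f)))
```

Small matrices hit the 1e5 cap, so their timings are dominated by per-call overhead, while the largest sizes are multiplied only once.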
I've not been able to locate the RHEL or EPEL package I should BuildRequire in lieu of blas-devel. Can you point it out?
Uh, seems ATLAS is not in EPEL (yet). It's in Fedora though:
So I suppose this bug must be on hold until someone packages ATLAS for EPEL as well.
If you're responsible for numpy in Fedora as well, it could at least be fixed there, if it isn't already.
I'll try it out in Fedora, and I've pinged the atlas maintainer to investigate branching atlas for EPEL.
The Fedora atlas maintainer is in the process of updating atlas in Fedora. When that is completed, I'll rebuild numpy against it there; atlas will then be built for EL-5, and I'll rebuild numpy there as well.
Great, thanks a lot!
The atlas maintainer reports that atlas has been built for EL-5. Once it's pushed, I'll rebuild numpy.
Built for EL-5; it will be pushed to testing and then stable in due course.