Bug 461472 - NumPy should be built against ATLAS BLAS
Status: CLOSED NEXTRELEASE
Product: Fedora EPEL
Component: numpy
Version: el5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assignee: Gwyn Ciesla
QA Contact: Fedora Extras Quality Assurance

Reported: 2008-09-08 12:22 UTC by Janne Blomqvist
Modified: 2009-06-29 08:49 UTC
CC: tru
Doc Type: Bug Fix
Last Closed: 2009-04-03 13:27:38 UTC

Janne Blomqvist 2008-09-08 12:22:47 UTC
```
Description of problem:

NumPy uses BLAS for the dot() function only if built against ATLAS
(math-atlas.sf.net); a generic BLAS library isn't sufficient. See

http://scipy.org/scipy/numpy/ticket/667

If ATLAS isn't available, dot() performance is very poor. Here are
matrix multiplication benchmark results using the numpy currently in EPEL:

Double precision matrix multiplication test using NumPy.
Multiplying two NxN matrices.

   N   Gflops/s
===============
   2    0.007
   4    0.058
   8    0.301
  16    0.785
  32    1.166
  64    1.217
 128    0.839
 256    0.872
 512    0.469
1024    0.215
2048    0.125

With an optimized BLAS, the equivalent Fortran code using dgemm performs
as follows:

Double precision matrix multiplication test

 Matrix side size   Matmul (Gflops/s)   dgemm (Gflops/s)
=========================================================
                2               0.150              0.001
                4               0.303              0.005
                8               0.530              0.042
               16               0.829              0.245
               32               1.405              1.366
               64               1.669              3.533
              128               2.030              5.611
              256               2.369              6.792
              512               2.596              7.058
             1024               0.601              7.428
             2048               0.565              7.766

(The Matmul column is the performance of the F90 MATMUL intrinsic, which
in this case is the same as one would get using the generic netlib BLAS
library.)

So one can see that for large matrices, NumPy without ATLAS is just
incredibly slow: at N = 2048 it reaches 0.125 Gflops/s, against 7.766
Gflops/s for dgemm.

Python benchmark code below:

#!/usr/bin/python
# Matmul benchmark in python/numpy

import numpy as npy
import time

def mm_timing(nn):
    """Matrix multiplication benchmark for nxn matrices until nnxnn."""
    n = 2
    print "Double precision matrix multiplication test using NumPy."
    print "Multiplying two NxN matrices."
    print ""
    print "   N   Gflops/s"
    print "==============="
    while n < nn:
        a = npy.random.rand(n, n)
        b = npy.random.rand(n, n)
        flops = (2 * float(n) - 1) * float(n)**2
        # Assuming a CPU that averages 1 Gflop/s, 1e9 flops take about
        # 1 second, which should be enough. We also do a maximum of 1e5
        # loops, since for small arrays the overhead is large.
        loop = int(max(min(1.e9 / flops, 1e5), 1))
        t1 = time.time()
        for i in xrange(loop):
            c = npy.dot(a, b)
        t2 = time.time()
        perf = flops * loop / (t2 - t1) / 1.e9
        print "%4i" % n + "   " + "%6.3f" % perf
        n *= 2
        if n > nn:
            break

mm_timing(3000)
```

Gwyn Ciesla 2008-09-08 12:52:30 UTC
```
I've not been able to locate the RHEL or EPEL package I should
BuildRequire in lieu of blas-devel. Can you point it out?
```

Janne Blomqvist 2008-09-08 13:42:57 UTC
```
Uh, seems ATLAS is not in EPEL (yet). It's in Fedora though:

https://admin.fedoraproject.org/pkgdb/packages/name/atlas

So I suppose this bug must be on hold until someone packages ATLAS for
EPEL as well. If you're responsible for numpy in Fedora as well, it
could at least be fixed there, if it isn't already.
```

Gwyn Ciesla 2008-09-08 14:10:00 UTC
```
I'll try it out in Fedora, and I've pinged the atlas maintainer to
investigate branching atlas for EPEL.
```

Gwyn Ciesla 2008-09-08 14:31:56 UTC
```
The Fedora atlas maintainer is in the process of updating atlas in
Fedora. When that is completed, I'll rebuild numpy against it there;
then atlas will be built for EL-5, and I'll rebuild numpy there as well.
```

Janne Blomqvist 2008-09-08 16:07:16 UTC
```
Great, thanks a lot!
```

Gwyn Ciesla 2009-03-09 13:47:54 UTC
```
The atlas maintainer reports that atlas has been built for EL-5. Once
it's pushed, I'll rebuild numpy.
```

Gwyn Ciesla 2009-04-03 13:27:38 UTC
```
Built for EL-5, will be pushed to testing and stable in due course.
```
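
The benchmark script in the description is Python 2 (print statements, xrange). For anyone who wants to repeat the measurement on a current interpreter, the following is a rough Python 3 sketch of the same idea, not the reporter's code. It keeps the (2n - 1) * n**2 flop count and the 1e9-flop / 1e5-loop limits from the original; the function names (matmul_flops, loop_count, measure_gflops, run_benchmark) are made up for this sketch, and time.perf_counter() is swapped in for time.time().

```python
import time


def matmul_flops(n):
    """Floating-point operations in one dense n x n matrix product.

    Each of the n*n output entries needs n multiplications and n - 1
    additions, which gives (2n - 1) * n**2 in total.
    """
    return (2 * n - 1) * n ** 2


def loop_count(n, target_flops=1e9, max_loops=100_000):
    """Repetitions so that one measurement does roughly target_flops of work.

    The count is capped at max_loops (small matrices are dominated by call
    overhead) and is never less than 1.
    """
    return int(max(min(target_flops / matmul_flops(n), max_loops), 1))


def measure_gflops(multiply, n, loops):
    """Time `loops` calls of multiply() and return the rate in Gflop/s."""
    start = time.perf_counter()
    for _ in range(loops):
        multiply()
    elapsed = time.perf_counter() - start
    return matmul_flops(n) * loops / elapsed / 1e9


def run_benchmark(max_n=2048):
    """Print the Gflop/s rate of numpy.dot for n = 2, 4, 8, ... up to max_n."""
    import numpy as np  # imported here so the helpers above work without NumPy

    print("   N   Gflop/s")
    print("===============")
    n = 2
    while n <= max_n:
        a = np.random.rand(n, n)
        b = np.random.rand(n, n)
        rate = measure_gflops(lambda: np.dot(a, b), n, loop_count(n))
        print("%4d   %7.3f" % (n, rate))
        n *= 2
```

Calling run_benchmark(2048) covers the same matrix sizes as the tables in the description. To see which BLAS and LAPACK libraries a given NumPy build was linked against, numpy.show_config() prints the build information.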