Bug 1114964

Summary: no AVX support
Product: [Fedora] Fedora
Reporter: Dave Love <dave.love>
Component: fftw
Assignee: Conrad Meyer <cse.cem+redhatbugz>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 22
CC: cse.cem+redhatbugz, jchaloup, susi.lehtola
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-07-19 11:53:08 UTC
Attachments:
Description                                                 Flags
sandybridge benchfft real, double, avx                      none
sandybridge benchfft real, double, no avx                   none
sandybridge benchfft complex, double, avx                   none
sandybridge benchfft complex, double, no avx                none
Graph of AVX vs no-AVX (real and complex), powers of 2      none
Graph of AVX vs no-AVX (real and complex), non-powers of 2  none
Graph of AVX vs no-AVX (real and complex), powers of 2      none
Graph of AVX vs no-AVX (real and complex), non-powers of 2  none

Description Dave Love 2014-07-01 10:56:44 UTC
I was surprised to find that the packages don't have AVX support turned on.
The spec file just says

  (no avx as it is claimed to drastically slower)

That's not generally true, and specifically not on Sandy Bridge for the
double-precision (DP) 1D cases that seem to matter most for the chemistry codes
that consume so many cycles on typical HPC systems.  I can't do full benchfft
runs now due to maintenance, but will later; so far I have just run a few
different sizes interactively.  I can't find any relevant information on the
web, and Debian has AVX turned on.

I think this at least needs documenting with justification.  If there
are cases where there's a significant problem, and the maintainers can't fix them,
how about a runtime switch (via the environment), or at least a separate library?
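
A runtime switch of the kind suggested here could, in outline, combine a CPU
capability check with an environment override. The following is a minimal
Python sketch of that decision logic only, not FFTW code: the variable name
FFTW_NO_AVX and both function names are hypothetical, and the CPU check assumes
Linux /proc/cpuinfo-style text.

```python
import os

# Hypothetical variable name, chosen for illustration only; FFTW itself
# does not define it.
DISABLE_ENV_VAR = "FFTW_NO_AVX"


def cpu_has_avx(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that "avx2" alone does not count as "avx".
        if key.strip() == "flags" and "avx" in value.split():
            return True
    return False


def use_avx(cpuinfo_text, environ=None):
    """Decide whether AVX code paths should be used.

    AVX is used only if the CPU reports it and the (hypothetical) opt-out
    variable is unset, empty, or "0".
    """
    env = os.environ if environ is None else environ
    opted_out = env.get(DISABLE_ENV_VAR, "") not in ("", "0")
    return cpu_has_avx(cpuinfo_text) and not opted_out
```

On Linux one might call use_avx(open("/proc/cpuinfo").read()); under the
assumptions above, setting FFTW_NO_AVX=1 in the environment would force the
non-AVX path without rebuilding or swapping the library.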

Comment 1 Conrad Meyer 2014-07-01 14:31:49 UTC
If you have the time and resources to run the benchmark both ways and compare, please do.

(I started running benchfft on my personal machine, but it takes a long time.)

Comment 2 Dave Love 2014-07-03 15:58:25 UTC
(In reply to Conrad Meyer from comment #1)
> If you have the time and resources to run the benchmark both ways and compare,
> please do.

It might help to know under what circumstances the AVX is supposed to be slow.
Anyhow, I'll attach results with and without AVX (selected with an environment
variable, using a modified version of the library) on a sandybridge node.

I'm a bit confused by them, and the bench program in the source tests directory
gives better results for AVX in the cases I tried, but I don't have time to
investigate now.

Comment 3 Dave Love 2014-07-03 15:59:51 UTC
Created attachment 914522 [details]
sandybridge benchfft real, double, avx

Comment 4 Dave Love 2014-07-03 16:00:36 UTC
Created attachment 914523 [details]
sandybridge benchfft real, double, no avx

Comment 5 Dave Love 2014-07-03 16:01:39 UTC
Created attachment 914524 [details]
sandybridge benchfft complex, double, avx

Comment 6 Dave Love 2014-07-03 16:02:33 UTC
Created attachment 914525 [details]
sandybridge benchfft complex, double, no avx

Comment 7 Conrad Meyer 2014-07-03 16:29:06 UTC
Created attachment 914527 [details]
Graph of AVX vs no-AVX (real and complex), powers of 2

> It might help to know under what circumstances the AVX is supposed to be slow.

Honestly, I have no idea.

Comment 8 Conrad Meyer 2014-07-03 16:29:34 UTC
Created attachment 914528 [details]
Graph of AVX vs no-AVX (real and complex), non-powers of 2

Comment 9 Conrad Meyer 2014-07-03 16:31:05 UTC
Oh, hrm, plots may be bogus...

Comment 10 Conrad Meyer 2014-07-03 16:39:27 UTC
Created attachment 914531 [details]
Graph of AVX vs no-AVX (real and complex), powers of 2

Fixed the plot.

Comment 11 Conrad Meyer 2014-07-03 16:41:02 UTC
Created attachment 914532 [details]
Graph of AVX vs no-AVX (real and complex), non-powers of 2

Plot fixed. Doesn't seem to make much difference on the real mode (r2r), but for complex number FFT, AVX is the clear winner.

Thanks for collecting the data!

Comment 12 Susi Lehtola 2014-07-03 16:55:13 UTC
The original rationale was that GROMACS claims 20% worse performance with AVX-enabled FFTW. But I guess it should be enabled in the default FFTW library.

It sure would be useful if one could trigger the use of avx instructions in userland.

Comment 13 Conrad Meyer 2014-07-03 17:46:40 UTC
Rawhide build w/ AVX enabled: http://koji.fedoraproject.org/koji/taskinfo?taskID=7104966 (tested local build first and it finished fine).

Comment 14 Conrad Meyer 2014-07-03 17:49:56 UTC
> GROMACS claims 20% worse performance with AVX-enabled FFTW

They claim *up to* 20% worse, because they only use short transform lengths. I agree that it's not a good reason to disable AVX for the system library, especially when we can't reproduce those claims. (The attached graphs don't show anything close to 20% worse, and on larger transform sizes we see 40-60% improvement with AVX enabled.)
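
For anyone repeating the comparison, percentages like those above can be
computed per transform size from two benchmark runs. This is a minimal Python
sketch; it assumes each run has already been parsed into a mapping from
transform size to a higher-is-better throughput figure, and the function names
and sample numbers are made up for illustration, not taken from the attachments.

```python
def percent_change(noavx, avx):
    """Relative throughput change with AVX enabled, in percent.

    Positive means the AVX build is faster; negative means it is slower.
    """
    if noavx <= 0:
        raise ValueError("baseline throughput must be positive")
    return (avx - noavx) * 100.0 / noavx


def compare(noavx_run, avx_run):
    """Map each transform size present in both runs to its percent change."""
    sizes = sorted(noavx_run.keys() & avx_run.keys())
    return {n: percent_change(noavx_run[n], avx_run[n]) for n in sizes}


# Placeholder numbers for illustration only; NOT measured results.
noavx_run = {64: 1000.0, 4096: 2000.0}
avx_run = {64: 900.0, 4096: 3000.0}

for size, change in compare(noavx_run, avx_run).items():
    print(f"n={size}: {change:+.1f}%")
```

Reporting the change relative to the non-AVX baseline keeps "20% worse" and
"40-60% improvement" on the same scale, so small and large transform sizes can
be read off one table.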

Comment 15 Fedora Update System 2014-07-04 01:28:31 UTC
fftw-3.3.4-3.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/fftw-3.3.4-3.fc20

Comment 16 Fedora Update System 2014-07-10 00:23:39 UTC
fftw-3.3.4-3.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Jaroslav Reznik 2015-03-03 17:14:47 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 18 Dave Love 2015-03-30 20:53:09 UTC
For the information of anyone interested (not suggesting it needs action):
There's a recent change available from GROMACS-land to address what I assume is the reported problem with small transforms: <https://github.com/FFTW/fftw3/commit/b606e3191e5b65e2e13f67ef7dad5b1e7c40206c>.

Comment 19 Fedora End Of Life 2016-07-19 11:53:08 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.