Bug 1114964 - no AVX support
Summary: no AVX support
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: fftw
Version: 22
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Conrad Meyer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-01 10:56 UTC by Dave Love
Modified: 2016-07-19 11:53 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-19 11:53:08 UTC
Type: Bug
Embargoed:


Attachments
sandybridge benchfft real, double, avx (6.97 KB, text/plain)
2014-07-03 15:59 UTC, Dave Love
sandybridge benchfft real, double, no avx (6.96 KB, text/plain)
2014-07-03 16:00 UTC, Dave Love
sandybridge benchfft complex, double, avx (41.40 KB, text/plain)
2014-07-03 16:01 UTC, Dave Love
sandybridge benchfft complex, double, no avx (41.44 KB, text/plain)
2014-07-03 16:02 UTC, Dave Love
Graph of AVX vs no-AVX (real and complex), powers of 2 (8.81 KB, application/pdf)
2014-07-03 16:29 UTC, Conrad Meyer
Graph of AVX vs no-AVX (real and complex), non-powers of 2 (8.57 KB, application/pdf)
2014-07-03 16:29 UTC, Conrad Meyer
Graph of AVX vs no-AVX (real and complex), powers of 2 (6.98 KB, application/pdf)
2014-07-03 16:39 UTC, Conrad Meyer
Graph of AVX vs no-AVX (real and complex), non-powers of 2 (6.79 KB, application/pdf)
2014-07-03 16:41 UTC, Conrad Meyer

Description Dave Love 2014-07-01 10:56:44 UTC
I was surprised to find that the packages don't have AVX support turned on.
The spec file just says

  (no avx as it is claimed to drastically slower)

That's not generally true, and specifically not on sandybridge for the
double-precision 1D cases that seem to matter most for the chemistry codes
which use so many cycles on typical HPC systems.  I can't do full benchfft
runs now due to maintenance, but will later; so far I have only run a few
different sizes interactively.  I can't find any relevant info on the web,
and Debian has AVX on.

I think this at least needs documenting with justification.  If there
are cases where there's a significant problem, and the maintainers can't fix them,
how about a runtime switch (via the environment), or at least a separate library?
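
The runtime-switch idea can be sketched roughly as follows. This is only an illustration of the decision logic, not FFTW code: the variable name FFT_DISABLE_AVX and the function use_avx are hypothetical, and the library does not necessarily read any such setting.

```python
import os

# Values that count as "switch is on" for the hypothetical variable below.
_TRUTHY = {"1", "true", "yes", "on"}


def use_avx(cpu_has_avx, environ=None):
    """Decide whether AVX code paths should be used.

    FFT_DISABLE_AVX is a made-up variable name for this sketch only.
    AVX is used when the CPU supports it and the user has not opted out.
    """
    env = os.environ if environ is None else environ
    opted_out = env.get("FFT_DISABLE_AVX", "").strip().lower() in _TRUTHY
    return bool(cpu_has_avx) and not opted_out
```

With a check like this done once at planning time, a user hitting a slow case could opt out per process (for example by exporting the variable in a job script) without needing a separately built library.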

Comment 1 Conrad Meyer 2014-07-01 14:31:49 UTC
If you have the time and resources to benchmark both and compare, please do.

(I started running benchfft on my personal machine, but it takes a long time.)

Comment 2 Dave Love 2014-07-03 15:58:25 UTC
(In reply to Conrad Meyer from comment #1)
> If you have the time and resources to benchmark both and compare, please
> do.

It might help to know under what circumstances the AVX is supposed to be slow.
Anyhow, I'll attach results with and without AVX (selected with an environment
variable, using a modified version of the library) on a sandybridge node.

I'm a bit confused by them, and the bench program in the source tests directory
gives better results for AVX in the cases I tried, but I don't have time to
investigate now.

Comment 3 Dave Love 2014-07-03 15:59:51 UTC
Created attachment 914522
sandybridge benchfft real, double, avx

Comment 4 Dave Love 2014-07-03 16:00:36 UTC
Created attachment 914523
sandybridge benchfft real, double, no avx

Comment 5 Dave Love 2014-07-03 16:01:39 UTC
Created attachment 914524
sandybridge benchfft complex, double, avx

Comment 6 Dave Love 2014-07-03 16:02:33 UTC
Created attachment 914525
sandybridge benchfft complex, double, no avx

Comment 7 Conrad Meyer 2014-07-03 16:29:06 UTC
Created attachment 914527
Graph of AVX vs no-AVX (real and complex), powers of 2

> It might help to know under what circumstances the AVX is supposed to be slow.

Honestly, I have no idea.

Comment 8 Conrad Meyer 2014-07-03 16:29:34 UTC
Created attachment 914528
Graph of AVX vs no-AVX (real and complex), non-powers of 2

Comment 9 Conrad Meyer 2014-07-03 16:31:05 UTC
Oh, hrm, plots may be bogus...

Comment 10 Conrad Meyer 2014-07-03 16:39:27 UTC
Created attachment 914531
Graph of AVX vs no-AVX (real and complex), powers of 2

Fixed the plot.

Comment 11 Conrad Meyer 2014-07-03 16:41:02 UTC
Created attachment 914532
Graph of AVX vs no-AVX (real and complex), non-powers of 2

Plot fixed. It doesn't seem to make much difference in the real mode (r2r), but for complex FFTs, AVX is the clear winner.

Thanks for collecting the data!

Comment 12 Susi Lehtola 2014-07-03 16:55:13 UTC
The original rationale was that GROMACS claims 20% worse performance with avx enabled fftw. But I guess it should be enabled in the default fftw library.

It sure would be useful if one could trigger the use of avx instructions in userland.
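
Any userland trigger would first need to know whether the CPU offers AVX at all. A minimal sketch of that check, assuming an x86 Linux system where /proc/cpuinfo has a "flags" line; the helper names cpu_flags and has_avx are made up for illustration:

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from /proc/cpuinfo-style text as a set.

    Only the first "flags" line is read; on x86 Linux every core
    normally reports the same list.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the exact flag "avx" is listed ("avx2" alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)
```

Typical use would be has_avx(open("/proc/cpuinfo").read()). Other platforms expose this information differently, so this is not a portable test.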

Comment 13 Conrad Meyer 2014-07-03 17:46:40 UTC
Rawhide build w/ AVX enabled: http://koji.fedoraproject.org/koji/taskinfo?taskID=7104966 (tested local build first and it finished fine).

Comment 14 Conrad Meyer 2014-07-03 17:49:56 UTC
> GROMACS claims 20% worse performance with avx enabled fftw

They claim *up to* 20% worse, because they only use short transform lengths. I agree that it's not a good reason to disable AVX for the system library, especially when we can't reproduce those claims. (The attached graphs don't show anything close to 20% worse, and on larger transform sizes we see 40-60% improvement with AVX enabled.)
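
For anyone wanting to redo the comparison numerically rather than by eye, here is a small sketch. It assumes each benchmark run has already been reduced to a mapping from transform size to a speed figure such as MFLOPS; the parsing of the attached benchfft files is not shown, and the function name relative_change is hypothetical.

```python
def relative_change(avx_speed, noavx_speed):
    """Percent change of the AVX speed over the no-AVX speed per transform size.

    Positive values mean AVX is faster. Sizes missing from either run
    are skipped, so the two runs need not cover identical sizes.
    """
    common = sorted(set(avx_speed) & set(noavx_speed))
    return {
        n: 100.0 * (avx_speed[n] - noavx_speed[n]) / noavx_speed[n]
        for n in common
    }
```

A result of +50 for some size would mean AVX was 50% faster there, and -20 would correspond to the "20% worse" case under discussion.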

Comment 15 Fedora Update System 2014-07-04 01:28:31 UTC
fftw-3.3.4-3.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/fftw-3.3.4-3.fc20

Comment 16 Fedora Update System 2014-07-10 00:23:39 UTC
fftw-3.3.4-3.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Jaroslav Reznik 2015-03-03 17:14:47 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 18 Dave Love 2015-03-30 20:53:09 UTC
For the information of anyone interested (not suggesting it needs action):
There's a recent change available from Gromacs-land to address what I assume is the reported problem with small transforms: <https://github.com/FFTW/fftw3/commit/b606e3191e5b65e2e13f67ef7dad5b1e7c40206c>.

Comment 19 Fedora End Of Life 2016-07-19 11:53:08 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

