Bug 1556989 - gromacs: EwaldUnitTests and SimdUnitTests fail on ppc64le with gcc-8.0.1
Summary: gromacs: EwaldUnitTests and SimdUnitTests fail on ppc64le with gcc-8.0.1
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: gromacs
Version: 29
Hardware: ppc64le
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Christoph Junghans
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: PPCTracker
TreeView+ depends on / blocked
 
Reported: 2018-03-15 17:08 UTC by Christoph Junghans
Modified: 2018-08-14 10:10 UTC (History)
16 users (show)

See Also:
(edit)
Clone Of:
(edit)
Last Closed:


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
GNU Compiler Collection 84907 None None None 2019-02-13 11:17 UTC

Description Christoph Junghans 2018-03-15 17:08:44 UTC
Description of problem:
Gromacs' SimdUnitTests fails it:

/builddir/build/BUILD/gromacs-2018/src/gromacs/simd/tests/bootstrap_loadstore.cpp:112: Failure
Value of: pCopyDst[i]
  Actual: 5
Expected: pCopySrc[i]
Which is: 6
SIMD load or store not moving data correctly for element 0

Version-Release number of selected component (if applicable):
gromacs-2018.0 on f28, f29 with gcc-8

How reproducible:
Build gromacs-2018-1.fc29.1 (https://src.fedoraproject.org/rpms/gromacs/c/c3a5a134b34548618e3b98aea66b700b7fb4e122?branch=master) on f28 or f29 (currently with coming with gcc-8)

Steps to Reproduce:
1. fedpkg clone gromacs
2. cd gromacs
3. git checkout c3a5a134b34548618e3b98aea66b700b7fb4e122
4. fedpkg srpm
5. fedpkg build --scratch --srpm gromacs-2018-1.fc29.1.src.rpm

Actual results:
Test fail

Expected results:
Test should pass (and pass on f27 with older gcc)

Additional info:
Reported upstream here https://redmine.gromacs.org/issues/2421, they think it is a compiler bug.

https://koji.fedoraproject.org/koji/taskinfo?taskID=25294220

Comment 1 Jakub Jelinek 2018-03-15 18:27:56 UTC
I see gromacs-2018-1.fc29.2.src.rpm has been built successfully.
So, does the -1.fc29.2 contain some workaround for this?
https://kojipkgs.fedoraproject.org/packages/gromacs/2018/1.fc29.2/data/logs/ppc64le/build.log
has
      Start  7: EwaldUnitTests
 7/24 Test  #7: EwaldUnitTests ...................   Passed    0.41 sec
      Start 23: SimdUnitTests
23/24 Test #23: SimdUnitTests ....................   Passed    0.06 sec

Comment 2 Dan Horák 2018-03-15 20:39:07 UTC
It does have a workaround, if I see right - https://src.fedoraproject.org/rpms/gromacs/c/c5c17791036e12d4ddcd6e017b22f5e7bd33ffcb?branch=master - VSX is not used

Comment 3 Jakub Jelinek 2018-03-16 16:09:29 UTC
This is a gromacs bug.

I assume at minimum you need something like following, where vec_xl and vec_xst are proper intrinsics for unaligned vector loads and stores,
but looking around I see various *oadU* and *toreU* functions/function templates calling non-U suffixed ones, so that might need fixing too.
E.g. gatherLoadUTranspose uses simdLoad rather than simdLoadU etc., that looks wrong to me.

--- gromacs-2018/src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd_float.h.jj	2017-12-11 13:44:54.000000000 +0100
+++ gromacs-2018/src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd_float.h	2018-03-16 17:02:45.915186989 +0100
@@ -121,14 +121,14 @@ static inline SimdFloat gmx_simdcall
 simdLoadU(const float *m, SimdFloatTag = {})
 {
     return {
-               *reinterpret_cast<const __vector float *>(m)
+               vec_xl(0, m)
     };
 }
 
 static inline void gmx_simdcall
 storeU(float *m, SimdFloat a)
 {
-    *reinterpret_cast<__vector float *>(m) = a.simdInternal_;
+    vec_xst(a.simdInternal_, 0, m);
 }
 
 static inline SimdFloat gmx_simdcall
@@ -157,14 +157,14 @@ static inline SimdFInt32 gmx_simdcall
 simdLoadU(const std::int32_t *m, SimdFInt32Tag)
 {
     return {
-               *reinterpret_cast<const __vector int *>(m)
+               vec_xl(0, m)
     };
 }
 
 static inline void gmx_simdcall
 storeU(std::int32_t * m, SimdFInt32 a)
 {
-    *reinterpret_cast<__vector int *>(m) = a.simdInternal_;
+    vec_xst(a.simdInternal_, 0, m);
 }
 
 static inline SimdFInt32 gmx_simdcall
--- gromacs-2018/src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd_double.h.jj	2017-12-11 13:44:54.000000000 +0100
+++ gromacs-2018/src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd_double.h	2018-03-16 17:05:08.248153217 +0100
@@ -121,14 +121,14 @@ static inline SimdDouble gmx_simdcall
 simdLoadU(const double *m, SimdDoubleTag = {})
 {
     return {
-               *reinterpret_cast<const __vector double *>(m)
+               vec_xl(0, m)
     };
 }
 
 static inline void gmx_simdcall
 storeU(double *m, SimdDouble a)
 {
-    *reinterpret_cast<__vector double *>(m) = a.simdInternal_;
+    vec_xst(a.simdInternal_, 0, m);
 }
 
 static inline SimdDouble gmx_simdcall

Comment 4 Jakub Jelinek 2018-03-16 17:58:16 UTC
Feel free to point upstream at the above mentioned GCC http://gcc.gnu.org/PR84907 , which has been closed as INVALID.  -fsanitize=undefined would have likely diagnosed this bug.

Comment 5 Erik Lindahl 2018-03-16 21:50:03 UTC
When we first wrote this code (just when Power8 appeared), gcc-6 produced buggy assembly when using the intrinsics.

After discussing with Michael Gschwind from IBM (whose team wrote the VSX compiler layer for gcc), we were strongly instructed that we should NOT be using the low-level intrinsics directly, but the prescribed usage was (at least at the time) to simply dereference memory and let the compiler do its job.

Thus, if you claim this is invalid use, I would appreciate a pointer to GCC documentation, authors or mailing lists posts specifying that things have either changed, or that those instructions were incorrect for PPC64le/VSX - since they only instructions at least quite recently were the opposite ;-)

Comment 6 Bill Schmidt 2018-03-16 22:10:46 UTC
Hi Erik,

I'm a co-author and current maintainer of the ELF v2 ABI document that included these instructions.  Michael has sadly moved on from this project.

We've recently published errata against the document when we realized that the recommendations in the ABI actually constitute undefined behavior according to the C specification, and was broken with respect to alignment guarantees in GCC.  You can find the original document at https://openpowerfoundation.org/?resource_lib=64-bit-elf-v2-abi-specification-power-architecture, and the errata describing the changes at https://openpowerfoundation.org/?resource_lib=openpower-elfv2-errata-elfv2-abi-version-1-4.

Your existing code works with GCC 7 and earlier code, but optimizations that take alignment into account in the upcoming GCC 8 release expose its undefined behavior, leading to this bug report.

I apologize for the difficulties that this causes you, but indeed the right answer is to always use the vec_xl and vec_xst intrinsics for unaligned loads and stores.  Please feel free to contact me directly with any questions or concerns.

Thanks,

Bill Schmidt, Ph.D.
STSM and GCC Architect for Linux on Power
Linux Technology Center
IBM Corporation

Comment 7 Erik Lindahl 2018-03-16 22:17:10 UTC
Hi Bill,

Thanks - no need to apologize, I realize things change, and we always appreciate clear definitions.

1) We might not have tested with earlier gcc versions in a while, but does this mean it is now considered a valid bug if the intrinsics don't work e.g. on gcc-6?

2) Can we also count on the intrinsics working for all aligned/unaligned load/store operations on xlC? Looking back at my correspondence, it appears that's where we first saw problems a few years ago.

Cheers,

Erik

Comment 8 Bill Schmidt 2018-03-16 22:22:31 UTC
1) Yes, absolutely.  GCC 6 is still in maintenance until this summer, and we want to be sure everything is working.  To my knowledge, vec_xl and vec_xst are working well in GCC 6.  Please report any bugs at the GCC bugzilla site and add wschmidt@gcc.gnu.org to the CC list to get my attention directly.

2) To my knowledge, yes, but I know the XLC team will want to know of any problems.  We've worked closely with them on compatibility issues over the years, and I know that they've been actively testing the intrinsics.  If you find problems, I can definitely put you in touch with the right person.

Thanks very much for your kind understanding!  Please let me know if I can be of any further assistance.

Bill

Comment 9 Erik Lindahl 2018-03-16 22:32:39 UTC
Just for reference, fix available at https://gerrit.gromacs.org/#/c/7688/ - it will be in the next Gromacs patch release.

Comment 10 Christoph Junghans 2018-03-20 05:37:12 UTC
Build that included that patch: 
https://koji.fedoraproject.org/koji/taskinfo?taskID=25824985

Comment 11 Christoph Junghans 2018-04-11 20:29:53 UTC
gromacs-2018.1 (bug #1559202) doesn't fix the issue, hopefully gromacs-2018.2 will.

Comment 12 Erik Lindahl 2018-04-11 20:54:24 UTC
yeah, sorry - there was one more fix we needed ( https://gerrit.gromacs.org/#/c/7696/ ) that was uploaded the day after, but unfortunately 2018.1 was pushed out before that was committed on Mar 27. Please test the release-2018 branch for us if you want to make 100% sure - if that doesn't work we'll investigate before we push out 2018.2.

Comment 13 Christoph Junghans 2018-04-11 21:28:58 UTC
(In reply to Erik Lindahl from comment #12)
> yeah, sorry - there was one more fix we needed (
> https://gerrit.gromacs.org/#/c/7696/ ) that was uploaded the day after, but
> unfortunately 2018.1 was pushed out before that was committed on Mar 27.
> Please test the release-2018 branch for us if you want to make 100% sure -
> if that doesn't work we'll investigate before we push out 2018.2.
No worries, I have an eye on https://redmine.gromacs.org/issues/2421

Comment 14 Jan Kurik 2018-08-14 10:10:49 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.


Note You need to log in before you can comment on or make changes to this bug.