Bug 1709933 - Rebuild for MPI-based ARMCI
Summary: Rebuild for MPI-based ARMCI
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: ga
Version: epel7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: marcindulak
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-05-14 15:03 UTC by Dave Love
Modified: 2024-07-09 02:50 UTC
CC List: 6 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-07-09 02:50:59 UTC
Type: Bug
Embargoed:



Description Dave Love 2019-05-14 15:03:00 UTC
The package currently uses openib for ARMCI, so it won't run on a network that isn't supported by openib (e.g. TCP, whether or not that's sensible).  openib seems to be deprecated in favour of MPI anyhow.  For EPEL with openmpi 1.10, and probably Fedora with openmpi 2.whatever, I think that needs to be mpi-pr.  mpi-rma might be better for a later openmpi, but unfortunately the ARMCI site doesn't have the promised support matrix.  I don't know about MPICH.
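For reference, the switch being asked for is a configure-time change in the ga build, roughly like this (flag names as I read them from GA 5.x's configure --help, worth re-checking against the packaged version):

    # current EPEL build, approximately:
    ./configure --with-openib
    # proposed MPI-based runtime:
    ./configure --with-mpi-pr    # progress-rank port
    # or, for a newer MPI with solid RMA support:
    ./configure --with-mpi3      # MPI RMA port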

Comment 1 marcindulak 2019-05-17 18:24:13 UTC
See https://github.com/GlobalArrays/ga/issues/144

Comment 2 Jeff Hammond 2019-05-17 19:26:20 UTC
You should switch to MPI-PR, because it's widely portable and better for almost every purpose.  The likelihood that OPENIB is faster with Verbs is offset by its instability when the user allocates a large fraction of system memory or page registration is otherwise stressed.

You also need to stop using an obsolete version of Open-MPI.  Version 1.10 was released in 2015 and lacks a wide range of improvements, bug fixes, etc.  Using Open-MPI 1.x is equivalent to using Linux 2.6.x at this point.  Please use Open-MPI 3+ or MPICH 3+ if you want to have a reasonable expectation of things working properly.

The MPI-RMA of ARMCI from PNNL isn't better than MPI-PR (because MPI-PR works quite well).  If you want to use RMA, please look at ARMCI-MPI (https://github.com/pmodels/armci-mpi/), but I'm not recommending that.
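If you do want to experiment with ARMCI-MPI anyway, the wiring is roughly this (prefix path hypothetical, and GA's option for using an external ARMCI should be confirmed against its configure --help):

    # build ARMCI-MPI against the system MPI
    git clone https://github.com/pmodels/armci-mpi
    cd armci-mpi && ./autogen.sh
    ./configure CC=mpicc --prefix=$HOME/armci-mpi
    make -j && make install
    # then point GA's configure at the external ARMCI, something like:
    ./configure --with-armci=$HOME/armci-mpi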

For context, I am:
- An NWChem developer and power user who installs and runs the code on a wide range of Linux-based platforms.
- An outside contributor to Global Arrays (and the canonical ARMCI it includes).
- The lead developer of ARMCI-MPI since 2014, which means I know about essentially every MPI RMA bug in MPICH and Open-MPI that has existed or still exists.

Comment 3 Orion Poplawski 2019-05-19 22:34:51 UTC
RHEL7 now provides the openmpi3 package with version 3.0.2.

Comment 4 Dave Love 2019-05-20 10:57:40 UTC
It would be helpful to document the recommendation not to use MPI-RMA
(which I hadn't realized was separate from the mpi3 configuration
shipped with GA).  It was on my list to raise an issue about
recommendations.

The default RHEL7 OMPI is 1.10, against which we need to package
(assuming it works), but nwchem failed against my rebuild for 1.10 and
mpi-pr, as reported.  That build includes openmpi3 support, but I
haven't tested it yet.  I'd need to rebuild nwchem, which is rather
protracted; presumably it could use openblas64 and speed up building
if there was a scalapack64 package.  I'll try eventually and report
back with the openmpi3+mpi-pr that's building in copr, but I can't
easily test it on an HPC fabric.

[My non-trivial experience of openmpi is that newer versions have N
steps forwards and M>1 backwards.  It wouldn't surprise me to find
just different problems with the openmpi3 package.  We've found plenty
of problems with OMPI4, though the second release does allow building
a few more packages.]

Comment 5 Dave Love 2019-05-20 15:53:20 UTC
Apologies, scratch what I said about the openmpi 1.10 problem -- I was getting confused with lammps, context-switching too much recently.

The builds of nwchem and ga in the loveshack/livhpc copr work for openmpi and openmpi3, but the openmpi3 one was substantially slower for the one job I've run (over TCP -- whether or not that's sensible...).

Comment 6 Dave Love 2019-06-26 14:18:34 UTC
Any chance of getting this fixed?

Comment 7 marcindulak 2019-10-01 19:19:58 UTC
Regarding the specific ga version bump, we need to synchronize with Edoardo - he prepared the last update of the spec: https://github.com/edoapra/fedpkg/tree/master/ga
Taking ga-5.7 may be fine. 

The nwchem.spec we should target is https://src.fedoraproject.org/rpms/nwchem/tree/master, which corresponds to https://github.com/nwchemgit/nwchem/commit/03aaf4c85eecb8bf85b6f23783d650e6db1f4c43 (see https://github.com/nwchemgit/nwchem/issues/136), but I could update the spec to a more recent nwchem commit if needed.

Dave - if you could prepare a single updated spec based on https://src.fedoraproject.org/rpms/ga/tree/master, merging your changes, that builds on all supported Fedora (including Rawhide) and EPEL (I'm not sure we should drop epel6 yet) - I could submit that as updates.

Comment 8 Edoardo Apra 2019-10-01 21:54:19 UTC
Marcin
I have updated ga.spec in https://github.com/edoapra/fedpkg/tree/master/ga to enable use of ga 5.7:
https://github.com/edoapra/fedpkg/commit/8dfeb6d4d3eea9794b5474e0c483302cbd97fb30

We might want to set up different RPMs for TARGET=MPI-TS and TARGET=MPI-PR. MPI-TS would be the default, with MPI-PR only for advanced users.
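Roughly what I have in mind in the spec, as a sketch (subpackage names illustrative):

    %package openmpi
    Summary: Global Arrays for Open MPI, default MPI-TS port

    %package openmpi-mpipr
    Summary: Global Arrays for Open MPI, MPI-PR port for advanced users

    # in %build, one build tree per ARMCI target:
    %{_openmpi_load}
    mkdir ts && (cd ts && ../configure --with-mpi-ts && make %{?_smp_mflags})
    mkdir pr && (cd pr && ../configure --with-mpi-pr && make %{?_smp_mflags})
    %{_openmpi_unload}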

Comment 9 Jeff Hammond 2019-10-01 23:19:39 UTC
The problem here is that MPI-PR doesn't support nproc=1 but MPI-TS will not perform well for nproc>1 in most cases.  It would be lovely if we could select one or the other at runtime, but I don't think anyone has time to implement that.

ARMCI-MPI is often a better option for single-node usage.
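To be concrete about the nproc=1 limitation above: MPI-PR dedicates one rank per node to a progress engine, so even a nominally serial job needs an extra rank, e.g.

    # under MPI-PR: 1 compute rank + 1 progress rank per node
    mpirun -np 2 nwchem input.nw
    # a plain "nwchem input.nw" (nproc=1) will not start under MPI-PR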

Comment 10 Edoardo Apra 2019-10-02 01:34:55 UTC
We might want to have a series of RPMs for a variety of TARGET: MPI-PR, MPI-TS, ARMCI-MPI, OPENIB.
Then it's up to the users to pick the one they prefer.
Does anyone have any better solution?

Comment 11 Edoardo Apra 2019-10-02 21:29:15 UTC
Added MPI-PR rpms for openmpi
https://github.com/edoapra/fedpkg/commit/7cf84617080cd12e55ac6e2c5f8c031cc4f26990
Since I am not really proficient in RPM recipes, I would like to hear feedback about these changes.

Comment 12 Dave Love 2019-10-03 10:19:18 UTC
(In reply to marcindulak from comment #7)
> Dave - if you could prepare a single updated spec based on
> https://src.fedoraproject.org/rpms/ga/tree/master, merging your changes,
> that builds on all supported Fedora (including Rawhide) and EPEL (I'm not
> sure we should drop epel6 yet) - I could submit that as updates.

I did that in copr and tested it, but I think it needs substantial
changes, as the current version wouldn't pass review.  I'll fix it up.
I've tested with mpi-pr as Jeff suggested.  (I hadn't realized you
were a co-maintainer, as I've been chasing David Brown.  Thanks.)

Comment 13 Dave Love 2019-10-03 10:21:36 UTC
(In reply to Jeff Hammond from comment #9)
> ARMCI-MPI is often a better option for single-node usage.

Does that mean specifically your version rather than ga's?

Comment 14 Jeff Hammond 2019-10-05 00:12:40 UTC
Yes.  There are tradeoffs.  I use MPI-PR quite a bit but I think some users will be bothered by the lack of support for sequential execution, i.e. "nwchem input.nw".

Comment 15 marcindulak 2019-10-05 17:24:37 UTC
(In reply to Edoardo Apra from comment #8)
> Marcin
> I have updated ga.spec in https://github.com/edoapra/fedpkg/tree/master/ga
> to enable use of ga 5.7:
> https://github.com/edoapra/fedpkg/commit/8dfeb6d4d3eea9794b5474e0c483302cbd97fb30
> 
> We might want to set up different RPMs for TARGET=MPI-TS and TARGET=MPI-PR.
> MPI-TS would be the default, with MPI-PR only for advanced users.

We need to enable make check by adding to the spec
%define do_test 1

I see different types of failures:
- epel6 https://koji.fedoraproject.org/koji/taskinfo?taskID=38067905
- epel7 link failure https://koji.fedoraproject.org/koji/taskinfo?taskID=38067906
- f30 https://koji.fedoraproject.org/koji/taskinfo?taskID=38071693


It is not clear to me from the spec that the openmpi and openmpi-mpipr packages actually contain different builds.

I think we also need ga-openmpi3 and ga-openmpi3-mpipr subpackages specifically for epel7 in the spec.
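For the make check part, I mean something along these lines in the spec (a sketch; the conditional should match however the current spec already gates its tests):

    %define do_test 1
    ...
    %check
    %if 0%{?do_test}
    %{_openmpi_load}
    make check
    %{_openmpi_unload}
    %endif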

Comment 16 marcindulak 2019-10-05 17:56:57 UTC
(In reply to Jeff Hammond from comment #9)
> The problem here is that MPI-PR doesn't support nproc=1 but MPI-TS will not
> perform well for nproc>1 in most cases.  It would be lovely if we could
> select one or the other at runtime, but I don't think anyone has time to
> implement that.
> 
> ARMCI-MPI is often a better option for single-node usage.

At what NPROC scale do the performance differences between MPI-TS and MPI-PR become visible, and how large are they?

The purpose of the RPMs is to provide a baseline; anybody who uses nwchem on a cluster will compile the whole software stack anyway, with specific compilers and architectures, using e.g. https://github.com/easybuilders

Comment 17 Edoardo Apra 2019-10-06 19:09:05 UTC
(In reply to marcindulak from comment #15)
> (In reply to Edoardo Apra from comment #8)
> > Marcin
> > I have updated ga.spec in https://github.com/edoapra/fedpkg/tree/master/ga
> > to enable use of ga 5.7:
> > https://github.com/edoapra/fedpkg/commit/8dfeb6d4d3eea9794b5474e0c483302cbd97fb30
> > 
> > We might want to set up different RPMs for TARGET=MPI-TS and TARGET=MPI-PR.
> > MPI-TS would be the default, with MPI-PR only for advanced users.
> 
> We need to enable make check by adding to the spec
> %define do_test 1
> 
> I see different types of failures:
> - epel6 https://koji.fedoraproject.org/koji/taskinfo?taskID=38067905
> - epel7 link failure
> https://koji.fedoraproject.org/koji/taskinfo?taskID=38067906
> - f30 https://koji.fedoraproject.org/koji/taskinfo?taskID=38071693
> 
> 
> It is not clear to me from the spec that the openmpi and openmpi-mpipr
> packages actually contain different builds.
> 
> I think we also need ga-openmpi3 and ga-openmpi3-mpipr subpackages
> specifically for epel7 in the spec.

Yes, this rpm is not functional.
The two openmpi rpms end up both having been built with mpipr.
The problem comes from the fact that both builds install the (same) libraries into the same directory, so the second (mpipr) build overwrites the result of the first.
I am not quite sure how to fix this.
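Maybe separate install directories per target would work, along these lines (untested, and the -mpipr directory suffix is invented):

    # default MPI-TS build keeps the standard Open MPI libdir:
    ./configure --with-mpi-ts --libdir=%{_libdir}/openmpi/lib
    # MPI-PR build installs into its own tree so it cannot overwrite the first:
    ./configure --with-mpi-pr --libdir=%{_libdir}/openmpi-mpipr/lib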

Comment 18 marcindulak 2019-10-06 19:44:13 UTC
Yes, we can have only one openmpi build on fedora/epel6/epel8.

On epel7 only, in addition to the default MPI-TS, we can use the openmpi3 for the MPI-PR build.

https://src.fedoraproject.org/rpms/scalapack/blob/HEAD/f/scalapack.spec uses some clever bcond_with/without logic to build openmpi3 on epel7.
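Roughly, in the style of scalapack.spec (a sketch; the exact module name for epel7's openmpi3 needs to be confirmed):

    # enable the openmpi3 (MPI-PR) variant only on epel7
    %if 0%{?rhel} == 7
    %bcond_without openmpi3
    %else
    %bcond_with openmpi3
    %endif

    %if %{with openmpi3}
    %package openmpi3-mpipr
    Summary: Global Arrays for openmpi3, MPI-PR port
    %endif

    # and in %build:
    %if %{with openmpi3}
    module load mpi/openmpi3-%{_arch}
    ./configure --with-mpi-pr
    make %{?_smp_mflags}
    module purge
    %endif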

Comment 19 Dave Love 2019-10-07 10:02:58 UTC
The builds under https://copr.fedorainfracloud.org/coprs/loveshack/livhpc/build/1049581/ have the sort of (fairly extensive) changes I'd want as a reviewer, built for mpi-pr.
(I've run a previous version with mpi-pr before cleaning up the spec.)
I dropped things which don't seem relevant for Fedora, including static libraries.
Yes, I elided the tests due to undefined symbols I couldn't find defined anywhere; those presumably deserve a bug report.

I disagree with packaging that doesn't support multi-node use properly; that copr originally dates from when I ran an HPC system off packages.
I don't know what the detailed profile of typical nwchem modes is.  What is the typical effect of building with, say, gcc-8 on el7 or with avx support?
How does that compare with the difference between mpi-ts and mpi-pr at reasonable scale?

Comment 20 Edoardo Apra 2019-10-07 17:48:09 UTC
(In reply to Dave Love from comment #4)

> 
> The default RHEL7 OMPI is 1.10, against which we need to package
> (assuming it works), but nwchem failed against my rebuild for 1.10 and
> mpi-pr, as reported.  That build includes openmpi3 support, but I
> haven't tested it yet.  I'd need to rebuild nwchem, which is rather
> protracted; presumably it could use openblas64 and speed up building
> if there was a scalapack64 package. 

Since scalapack64 is not likely to surface anytime soon (the only related products are the ILP64 Intel MKL libraries), my suggestion is to stick with the default (32-bit integer) openblas libraries for maximum compatibility.

Comment 21 Fedora Admin user for bugzilla script actions 2023-04-26 00:09:36 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 22 Troy Dawson 2024-07-09 02:50:59 UTC
EPEL 7 entered end-of-life (EOL) status on 2024-06-30.

EPEL 7 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

