Bug 191436 - openmpi needs some love
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: openmpi
Version: rawhide
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Reported: 2006-05-11 18:03 EDT by Orion Poplawski
Modified: 2007-11-30 17:11 EST
Doc Type: Bug Fix
Last Closed: 2007-10-17 15:53:06 EDT

Attachments
Patch to update to 1.1.4 (6.72 KB, patch)
2007-02-09 15:02 EST, Orion Poplawski
patch to openmpi.spec to support different fortran compilers (8.52 KB, patch)
2007-07-12 18:08 EDT, Orion Poplawski
Patch to support multiple compilers (8.74 KB, patch)
2007-07-26 12:24 EDT, Orion Poplawski
Updated patch (11.78 KB, patch)
2007-07-27 18:52 EDT, Orion Poplawski

Description Orion Poplawski 2006-05-11 18:03:31 EDT
openmpi 1.0.2 has been released.  I just updated the spec and rebuilt on FC5
with no errors.  

I could check in and build the new version if you'd like.


diff -u -r1.1 openmpi.spec
--- openmpi.spec        23 Feb 2006 17:59:37 -0000      1.1
+++ openmpi.spec        11 May 2006 22:09:35 -0000
@@ -1,6 +1,5 @@
-%{?!dist:       %define dist .fe5}
 Name:           openmpi
-Version:        1.0.1
+Version:        1.0.2
 Release:        1%{dist}
 Summary:        Open Message Passing Interface

@@ -170,6 +169,9 @@


 %changelog
+* Thu May 11 2006 Orion Poplawski <orion@cora.nwra.com> - 1.0.2-1
+- Update to 1.0.2
+
 * Wed Feb 15 2006 Jason Vas Dias <jvdias@redhat.com> - 1.0.1-1
 - Import into Fedora Core
 - Resolve LAM clashes
Comment 1 Jason Vas Dias 2006-06-14 16:43:54 EDT
Fixed with openmpi-1.0.2, now in Extras.
Comment 2 Orion Poplawski 2006-07-03 12:33:07 EDT
And so of course they release 1.1.  Here's a patch:

RCS file: /cvs/extras/rpms/openmpi/devel/openmpi.spec,v
retrieving revision 1.2
diff -u -r1.2 openmpi.spec
--- openmpi.spec        12 Jun 2006 21:14:40 -0000      1.2
+++ openmpi.spec        3 Jul 2006 16:38:05 -0000
@@ -1,6 +1,5 @@
-%{?!dist:       %define dist .fe5}
 Name:           openmpi
-Version:        1.0.2
+Version:        1.1
 Release:        1%{dist}
 Summary:        Open Message Passing Interface

@@ -51,6 +50,7 @@
        --includedir=%{_includedir}/%{name} \
        --libdir=%{_libdir}/%{name} \
        --datadir=%{_datadir}/%{name}/help \
+       --mandir=%{_datadir}/%{name}/man \
        LDFLAGS='-Wl,-z,noexecstack' \
        CFLAGS="$CFLAGS $XCFLAGS" \
        CXXFLAGS="$CFLAGS $XCFLAGS" \
@@ -70,7 +70,6 @@
 # ^- provides "relpath" function
 rpath=`relpath ${RPM_BUILD_ROOT}/%{_bindir} ${RPM_BUILD_ROOT}/%{_datadir}/%{name}/bin`;
 mkdir -p ${RPM_BUILD_ROOT}/%{_datadir}/%{name}/bin;
-mkdir -p ${RPM_BUILD_ROOT}/%{_datadir}/%{name}/man;
 ln -s `relpath ${RPM_BUILD_ROOT}/%{_libdir}/%{name} ${RPM_BUILD_ROOT}/%{_datadir}/%{name}` ${RPM_BUILD_ROOT}/%{_datadir}/%{name}/lib;
 ln -s `relpath ${RPM_BUILD_ROOT}/%{_includedir}/%{name} ${RPM_BUILD_ROOT}/%{_datadir}/%{name}` ${RPM_BUILD_ROOT}/%{_datadir}/%{name}/include;
 # Links: (mpiCC,mpicxx)->mpicc,  (mpiexec,mpirun)->orterun
@@ -158,7 +157,7 @@
 %defattr(-,root,root,-)
 %{_bindir}/*
 %exclude %{_bindir}/orte*
-%exclude %{_bindir}/*run
+%exclude %{_bindir}/mpirun
 %exclude %{_bindir}/*exec
 %exclude %{_bindir}/*info
 %{_includedir}/*
@@ -170,6 +169,9 @@


 %changelog
+* Fri Jun 30 2006 Orion Poplawski <orion@cora.nwra.com> - 1.1-1
+- Upgrade to 1.1
+
 * Mon Jun 12 2006 Jason Vas Dias <jvdias@redhat.com> - 1.0.2-1
 - Upgrade to 1.0.2


This also fixes:
warning: File listed twice: /usr/bin/orterun

Also, this package finally has some man pages.  I set configure to put them in
/usr/share/openmpi/man to avoid conflicts, but we still don't really handle this
properly.
Comment 3 Jonathan Underwood 2006-08-30 05:33:38 EDT
And now version 1.1.1 has been released:

From http://www.open-mpi.org/community/lists/announce/2006/08/0007.php

The Open MPI Team, representing a consortium of research, academic, and
industry partners, is pleased to announce the release of Open MPI version
1.1.1. This release is mainly a bug fix release over the v1.1 release,
but there are a few minor new features. Version 1.1.1 can be downloaded from
the main Open MPI web site or any of its mirrors (mirrors will be updating
shortly).

We strongly recommend that all users upgrade to version 1.1.1 if possible. 
Comment 4 Orion Poplawski 2007-02-09 14:59:08 EST
And now 1.1.4 has been released.  Will attach my usual spec patch.
Comment 5 Orion Poplawski 2007-02-09 15:02:10 EST
Created attachment 147801 [details]
Patch to update to 1.1.4

* Fri Feb 09 2007 Orion Poplawski <orion@cora.nwra.com> - 1.1.4-1
- Update to 1.1.4
- Change %%{name} to openmpi where appropriate
- Change install to /usr/share/%%{name}-%%{mode}/bin from
  /usr/share/%%{name}/bin%%{mode} for cleaner separation and to support
  orterun --prefix
- Make links for the runtime binaries and libs so orterun --prefix works
- Move development wrappers into -devel
Comment 6 Orion Poplawski 2007-06-27 13:49:33 EDT
So, what's up with openmpi these days?  It's been languishing at 1.1, and 1.2.3
just came out.  Need a new maintainer or co-maintainer?  There seems to perhaps
be a need to rework the alternatives system.  Anything else holding things up?
Comment 7 Doug Ledford 2007-06-28 15:52:58 EDT
Two things: the alternatives system, and getting the various Infiniband/iWARP
libraries into Fedora so that openmpi can be built against them.  I've already
built openmpi-1.2.3 for RHEL, where the Infiniband/iWARP libraries already
exist, albeit in a format that's unlikely to get approved as a Fedora package:
the huge, mondo OFED tarball is what I use in RHEL for the Infiniband/iWARP
stuff, and it *really* needs to be broken out into the individual libraries and
such for Fedora.  I'll go ahead and build openmpi-1.2.3 for Fedora, but without
the OFED support for now.  I'll also keep the existing RHEL alternatives setup
for now.  That will at least get it in place.
Comment 8 Doug Ledford 2007-06-29 14:41:09 EDT
I built a new package, but I went ahead and took a swing at getting the
multi-install issues worked out.  Here are the things I've been told need to be
addressed on the multi-install stuff:

Ability to install multi-lib
Ability to install more than one version and have them all work
Ability to install the same version but built with different compilers and have
them all work

So, here's a few requirements I followed as I attempted to address these issues:

Install using the normal FHS locations (for the most part, aka not in /opt, but
not exactly in /usr/lib(64)/openmpi either).
Keep actual executable files in /usr/bin where they belong.
Keep global namespace pollution to a minimum (aka, don't put man pages in the
default manpath, we don't want to try and tackle man page conflicts between all
the different installs people want).

Given those various requirements, here's what I actually did.

First, I only support installing one copy of the openmpi base package (well, a
single multilib install would work just because rpm ignores the secondary arch
binary files, but it need not be installed).  The mpirun/mpiexec facility does
not, to my knowledge, need to match the version or arch of the mpi program being
run.

The openmpi-libs packages are fully separated off and should not conflict when
you install multiple versions of the same library that only differ by version or
by compiler.  In order to do this, the library path is set to
%{_libdir}/%{name}/%{version}-%{opt_cc} for each compile.  The opt_cc macro
allows you to define a compiler other than gcc without having to hack all
through the spec file.

The openmpi-devel package has the actual orte_wrapper binary.  It is still
located in /usr/bin, but to differentiate between installs, it is moved to
orte_wrapper-%{version}-%{opt_cc}-%{mode} (I would prefer that a single wrapper
binary deal with both arches on multilib arches, but that will require at least
some re-architecting of the way the wrapper gets its options, so that's best
left to upstream to decide if/how they are going to do it).  There are symlinks
for all the common names from /usr/share/%{name}/%{version}-%{opt_cc}/bin%{mode}
to the correct orte_wrapper binary.  The pkg-config and module files have been
changed to reflect this, and since the man pages are now in
/usr/share/%{name}/%{version}-%{opt_cc}/man, I attempted to put a path-prepend
MANPATH element in the module file to help in getting the correct man pages
(pkg-config doesn't support that of course).  Finally, although the header files
should likely be mostly identical in lots of cases, there are chances for
conflicts when installing multiple versions, so the headers are similarly placed
in different directories based on version and compiler (although arch is not
counted in headers, that's handled by a fixup in the spec file so most headers
are identical between arches).

Let me know if that works for you guys.  It's present in the 1.2.3-1.fc8 build
in rawhide.
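
As a rough illustration of the layout described in this comment, the sketch below expands the per-build locations for one version/compiler/mode combination. The input values (version 1.2.3, gcc, 64-bit, /usr/lib64 standing in for %{_libdir}) are examples chosen for this sketch, not values read from the actual spec file, and the shell variable names are hypothetical.

```shell
#!/bin/sh
# Illustrative only: compose the install locations described in comment 8.
# All input values are example assumptions, not taken from the real spec.
name=openmpi
version=1.2.3
opt_cc=gcc          # stands in for the %{opt_cc} macro
mode=64             # stands in for the %{mode} macro
libdir=/usr/lib64   # stands in for %{_libdir} on a 64-bit arch

# Libraries: %{_libdir}/%{name}/%{version}-%{opt_cc}
mpi_libdir="$libdir/$name/$version-$opt_cc"
# Wrapper binary stays in /usr/bin but gets a unique name
wrapper="/usr/bin/orte_wrapper-$version-$opt_cc-$mode"
# Directory holding symlinks for the common command names
linkdir="/usr/share/$name/$version-$opt_cc/bin$mode"
# Man pages kept out of the default manpath
mandir="/usr/share/$name/$version-$opt_cc/man"

printf '%s\n' "$mpi_libdir" "$wrapper" "$linkdir" "$mandir"
```

Two builds that differ only in version or compiler end up in disjoint directories under this scheme, which is the property the comment is after.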
Comment 9 Orion Poplawski 2007-07-12 18:06:54 EDT
I'm reopening because I just managed to get some time to test this out.  First
off, thanks for all the work on this - it's greatly appreciated.

The first issue for me, though, is that it's not the C compiler that I change
for other builds, it's the Fortran compiler.  This opens a big kettle of worms
if we have multiple C and multiple Fortran compilers and the various
combinations.  Personally, I don't really need the extra "name" suffix to be
tied directly to the names of the compilers, just that it be an easily changed
name.  In my patch I make a "suffix" variable that initially is the combination
of the C and Fortran compilers, but that people can change independently.

Also, if you change the C or Fortran compiler, they almost certainly won't
accept $RPM_OPT_FLAGS.

Finally, if you do rebuild the package with a different compiler, you will
almost certainly change the name of the package so you can have different
versions installed at the same time.  With %{mpidir} including both %{name} and
a compiler in the path, it ends up somewhat redundant.  Also, you end up needing
to change %{name} to openmpi in a few other places.  Not really a big deal since
this spec is very openmpi specific already.

I'll attach an initial version of a patch to support these changes.  I'm off for
a week's vacation though, so I won't be able to respond for a while.
Comment 10 Orion Poplawski 2007-07-12 18:08:52 EDT
Created attachment 159103 [details]
patch to openmpi.spec to support different fortran compilers

This allows me to change the following:

< Name: 	  openmpi
---
> Name: 	  openmpi-ifort
60c60
< %define opt_fc gfortan
---
> %define opt_fc ifort
63c63
< %define opt_fflags $RPM_OPT_FLAGS
---
> %define opt_fflags -O2 -axWPT -i-static
67c67
< %define cname %{opt_cc}-%{opt_fc}
---
> %define cname ifort
107c107
< %define priority 10
---
> %define priority 20
109c109
< %define priority 11
---
> %define priority 21

To build a version with ifort without changing anything but variables.
Comment 11 Jonathan Underwood 2007-07-12 18:43:23 EDT
Hi Orion,

I can't help wondering why are you wanting to patch a Fedora package to cope
with compilers that are not shipped with Fedora (and are not free)? It seems a
bit unfair to expect the package maintainer to take on the burden of maintaining
a spec file to build outside of Fedora. Did I misunderstand?
Comment 12 Doug Ledford 2007-07-13 00:13:28 EDT
For MPI libraries anyway, it's all too common :-(  Not that I like it, but I
understand it in the context of compiler A is 5% more performant on task X while
compiler B is 5% more performant on task Y.  For regular users, that isn't a big
deal.  For someone running a week long job on a 1000 node cluster, 5% is a big
deal.  Here's an example of how ugly spec files can be over this issue, from the
mvapich spec file:

%build
OPTIMIZATION_FLAG="-O3 -fno-strict-aliasing"
CONFIG_ENABLE_F77="--enable-f77"
CONFIG_ENABLE_F90="--enable-f90"
conffile=mvapich.conf
#############################################################################
# Compiler definition
# GNU compilers
%if %(test "%{compiler}" = "gcc" && echo 1 || echo 0)
    export CC=gcc
    export CXX=g++
    if ( which gfortran &>/dev/null ); then
        # new gcc version
        export FC=gfortran
        export F77=gfortran
        export F90=gfortran
        export F77_GETARGDECL=" "
    elif ( which g77 &>/dev/null ); then
        # old gcc version
        export FC=g77
        export F77=g77
        export F90=g77
    fi
    export CFLAGS="-Wall"
    export FFLAGS=
    export CXXFLAGS=""
    export F90FLAGS=""
    export CONFIG_FLAGS=""
%endif
# Intel compiler
%if %(test "%{compiler}" = "intel" && echo 1 || echo 0)
    export CC=icc
    export CXX=icc
    export FC=ifort
    export F90=$FC
    export CFLAGS="-D__INTEL_COMPILER"
    export FFLAGS=""
    export CXXFLAGS=
    export CCFLAGS="-lstdc++"
    export F90FLAGS=$FFLAGS
    export CONFIG_FLAGS=""
    export COMPILER_CONFIG="--enable-f90modules --with-romio"
%endif
# Pathscale compiler
%if %(test "%{compiler}" = "pathscale" && echo 1 || echo 0)
    export CC=pathcc
    export CXX=pathCC
    export FC=pathf90
    export F90=pathf90
    export F77=pathf90
    export CFLAGS=""
    export FFLAGS=""
    export CXXFLAGS=""
    export CCFLAGS=""
    export F90FLAGS=$FFLAGS
    export CONFIG_FLAGS=""
    export COMPILER_CONFIG="--enable-f90modules --with-romio"
%endif
# PGI compiler
%if %(test "%{compiler}" = "pgi" && echo 1 || echo 0)
    export CC=pgcc
    export CXX=pgCC
    export FC=pgf77
    export F90=pgf90
    export CFLAGS="-Msignextend -B -DPGI"
    export FFLAGS=""
    export CXXFLAGS=""
    export F90FLAGS=$FFLAGS
    export CONFIG_FLAGS=""
    export OPTIMIZATION_FLAG=""
%endif
#############################################################################


I still haven't decided yet whether to leave that or just chunk the whole schmeal.
Comment 13 Doug Ledford 2007-07-13 00:15:43 EDT
Oh, and the relevance of that post was just to illustrate that upstream takes
the approach that the c/c++/fortran compilers are a bundle.  If you want ifort,
then you build an icc based openmpi.  Since they can be installed side by side,
you don't lose anything and it cuts down on the very complexity that Orion
brought up.
Comment 14 Jonathan Underwood 2007-07-13 05:48:31 EDT
Right - I understand the desire only too well (myself using intel compilers with
openmpi on fedora/RHEL systems). However, I am strongly of the opinion that we
shouldn't be crudding up Fedora spec files with code that is not relevant to the
Fedora distribution. The point here is ease of maintenance and QA. For example,
Fedora of course doesn't include the intel compilers. Therefore to burden the
maintainer with maintaining reams of code in the spec file for
out-of-distribution purposes seems unreasonable. The point about spec files in
Fedora is they should be as clean as possible to allow community maintenance.
Comment 15 Orion Poplawski 2007-07-26 12:11:45 EDT
I think the changes I suggest are very modest and don't "crud up" the spec file
and allow downstream users of the srpm an easy way to build custom versions that
are compatible with the default version. 

As for bundling c/fortran, I certainly don't do it here.  I use gcc everywhere
and ifort or pgf90 as desired.  Perhaps change %cname to be:

%define cname gnu

in the default spec to avoid "gcc-gfortran" which is indeed quite clunky.
Comment 16 Orion Poplawski 2007-07-26 12:24:49 EDT
Created attachment 160036 [details]
Patch to support multiple compilers

Fixes a typo (gfortan) and changes the cname default to "gnu".  Also allows
multi-word compilers.
Comment 17 Orion Poplawski 2007-07-26 16:30:16 EDT
With the current setup you cannot install two different openmpi-devel packages
because they require the base openmpi package and they conflict:


# yum -y install openmpi-ifort-devel

=============================================================================
 Package                 Arch       Version          Repository        Size
=============================================================================
Installing:
 openmpi-ifort-devel     x86_64     1.2.3-4.fc5.cora.1  CoRA              302 k
Installing for dependencies:
 openmpi-ifort           x86_64     1.2.3-4.fc5.cora.1  CoRA              135 k
 openmpi-ifort-libs      x86_64     1.2.3-4.fc5.cora.1  CoRA              1.0 M

Transaction Check Error:
  file /usr/bin/ompi_info from install of openmpi-ifort-1.2.3-4.fc5.cora.1 conflicts with file from package openmpi-1.2.3-4.el5.cora.1
  file /usr/bin/orted from install of openmpi-ifort-1.2.3-4.fc5.cora.1 conflicts with file from package openmpi-1.2.3-4.el5.cora.1
  file /usr/bin/orterun from install of openmpi-ifort-1.2.3-4.fc5.cora.1 conflicts with file from package openmpi-1.2.3-4.el5.cora.1

However, I think openmpi-devel really only requires openmpi-libs, not openmpi,
and this change allows you to have multiple openmpi-devel (and -libs) installed.

Also, though not a big deal since I don't think the man thing works at all,
/etc/alternatives/mpi-run-man and mpi-exec-man point to:

/usr/share/openmpi/1.2.3-gnu/man/man1/mpirun.1.gz
/usr/share/openmpi/1.2.3-gnu/man/man1/orterun.1.gz

which don't exist.  The actual names are:

/usr/share/openmpi/1.2.3-gnu/man/man1/mpirun.1
/usr/share/openmpi/1.2.3-gnu/man/man1/orterun.1

It looks like they don't get compressed because they aren't in /usr/share/man.
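
A quick way to spot this kind of mismatch is to list symlinks whose targets do not exist. The sketch below is a generic check, not something from the package; the function name list_dangling is hypothetical, the demo runs in a scratch directory with made-up file names, and readlink is assumed to be available (it is part of GNU coreutils).

```shell
#!/bin/sh
# list_dangling DIR: print every symlink in DIR whose target does not exist.
list_dangling() {
    for link in "$1"/*; do
        # -L: the entry is a symlink; ! -e: following it leads nowhere
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            printf '%s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
}

# Self-contained demo in a scratch directory (names are examples only).
demo=$(mktemp -d)
touch "$demo/mpirun.1"                           # uncompressed page exists
ln -s "$demo/mpirun.1.gz" "$demo/mpi-run-man"    # points at a .gz that was never created
ln -s "$demo/mpirun.1" "$demo/mpi-run-ok"        # resolves correctly
dangling=$(list_dangling "$demo")
printf '%s\n' "$dangling"
rm -rf "$demo"
```

On an installed system the same function could be pointed at /etc/alternatives, for example `list_dangling /etc/alternatives`.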
Comment 18 Orion Poplawski 2007-07-27 18:52:17 EDT
Created attachment 160147 [details]
Updated patch

A more serious issue is that the bin32/64 directory structure breaks the
orterun --prefix option because it looks for "bin".  This updated patch moves
the "%{mode}" into the %{mpidir} name.

Also, openmpi uses the basename of the configured libdir to determine the
library path for --prefix.  Currently, this is "1.2.3-gnu-64".  I've changed
the install so that it installs /usr/share/openmpi/1.2.3-gnu-64/lib then moves
the files to /usr/lib64/openmpi/1.2.3-gnu-64 and makes links back to the share
directory.

Lastly, --prefix looks for orted in <prefix>/bin.  This means we need a link to
orted there.  Problem then becomes what package to put it in.  Gah, this is
becoming a big can of worms.
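
The install-then-relocate step described in this comment can be sketched roughly as follows. This is one reading of the description, run against a scratch directory rather than a real build root; the library file name is a stand-in and the variable names are hypothetical. A real spec would also need the link targets to exclude the build root (for example by using relative links), which this sketch does not attempt.

```shell
#!/bin/sh
# Scratch tree standing in for the RPM build root; all names are examples.
root=$(mktemp -d)
prefix="$root/usr/share/openmpi/1.2.3-gnu-64"
libdir="$root/usr/lib64/openmpi/1.2.3-gnu-64"

# Step 1: "make install" would populate <prefix>/lib; fake one library here.
mkdir -p "$prefix/lib" "$libdir"
touch "$prefix/lib/libexample.so"

# Step 2: move each file to the real libdir and leave a symlink behind,
# so that <prefix>/lib still resolves when orterun --prefix looks there.
for f in "$prefix"/lib/*; do
    base=$(basename "$f")
    mv "$f" "$libdir/$base"
    ln -s "$libdir/$base" "$prefix/lib/$base"
done

# Record the outcome before cleaning up the scratch tree.
if [ -L "$prefix/lib/libexample.so" ] && [ -f "$libdir/libexample.so" ]; then
    status=linked
else
    status=broken
fi
echo "$status"
rm -rf "$root"
```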
Comment 19 Doug Ledford 2007-10-17 15:53:06 EDT
The man page issue should be fixed.

The bin32/bin64 doesn't work with --prefix because it isn't supposed to.  The
whole --prefix thing is intended to work when you have situations like openmpi
installed in /opt/openmpi-1.2.4 and it then looks under that for bin/lib/man/etc
and does not try to work with the current directory structure.  Trying to make
--prefix work when *not* using this sort of directory setup is an exercise in
futility.

The issue mentioned about installing the openmpi-ifort-{libs,devel} alongside
the regular packages should be resolved as openmpi-devel only requires
openmpi-libs now.  Also given that it should be easy to install the icc version
alongside the gcc version and simply choose which you link against to get the
right compiler, I'm going to agree with Orion that trying to support mixed
compiler environments during build is not something we want in the spec file.
Comment 20 Jonathan Underwood 2007-10-17 16:04:19 EDT
(In reply to comment #19)
> The issue mentioned about installing the openmpi-ifort-{libs,devel} alongside
> the regular packages should be resolved as openmpi-devel only requires
> openmpi-libs now.  Also given that it should be easy to install the icc version
> alongside the gcc version and simply choose which you link against to get the
> right compiler, I'm going to agree with Orion that trying to support mixed
> compiler environments during build is not something we want in the spec file.

I am confused here, as you say you agree with Orion, but then go on to say
"trying to support mixed compiler environments during build is not something we
want in the spec file", which seems to disagree with Orion. Am I misunderstanding?
Comment 21 Doug Ledford 2007-10-17 16:42:03 EDT
Sorry, misattribution.  I agree with you, Jonathan ;-)
