Bug 433841

Summary: mpirun is MIA on openmpi rpms
Product: Red Hat Enterprise Linux 5 Reporter: Gurhan Ozen <gozen>
Component: openmpi Assignee: Doug Ledford <dledford>
Status: CLOSED ERRATA QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 5.2 CC: atodorov, bugzilla, ddomingo, jburke
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2008-0481 Doc Type: Bug Fix
Doc Text:
(all architectures) A bug in previous versions of openmpi and lam may prevent you from upgrading these packages. This bug manifests in the following error (when attempting to upgrade openmpi or lam):

error: %preun(openmpi-[version]) scriptlet failed, exit status 2

As such, you need to manually remove older versions of openmpi and lam in order to install their latest versions. To do so, use the following rpm command:

rpm -qa | grep '^openmpi-\|^lam-' | xargs rpm -e --noscripts --allmatches
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-05-21 15:25:06 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 391221, 391231, 458752    

Description Gurhan Ozen 2008-02-21 17:50:49 UTC
Description of problem:
mpirun wrapper is missing in the openmpi rpms.

# rpm -q openmpi
openmpi-1.2.5-2.el5
# which mpirun
/usr/bin/which: no mpirun in
(/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)


Version-Release number of selected component (if applicable):
openmpi-1.2.5-2.el5

How reproducible:
Very

Steps to Reproduce:
1. Install openmpi-1.2.5-2.el5 and look for mpirun 
  
Actual results:


Expected results:


Additional info:

Comment 3 Gurhan Ozen 2008-04-03 02:24:44 UTC
I am failing this bug based on the latest packages in the RHEA-2008:8176-02 advisory.
Installing the packages produces a lot of errors:

Preparing...                ########################################### [100%]
   1:openmpi-debuginfo      ########################################### [  7%]
   2:lam-debuginfo          ########################################### [ 13%]
   3:openmpi-debuginfo      ########################################### [ 20%]
   4:lam-debuginfo          ########################################### [ 27%]
   5:mpi-selector           ########################################### [ 33%]
   6:openmpi-libs           ########################################### [ 40%]
   7:openmpi-libs           ########################################### [ 47%]
   8:lam-libs               ########################################### [ 53%]
ERROR: Cannot read from source directory (/usr/lib64/lam/etc)
error: %post(lam-libs-7.1.2-12.el5.x86_64) scriptlet failed, exit status 1
   9:lam                    ########################################### [ 60%]
  10:openmpi                ########################################### [ 67%]
  11:openmpi-devel          ########################################### [ 73%]
  12:lam-devel              ########################################### [ 80%]
  13:lam-libs               ########################################### [ 87%]
ERROR: Cannot read from source directory (/usr/lib/lam/etc)
error: %post(lam-libs-7.1.2-12.el5.i386) scriptlet failed, exit status 1
  14:lam-devel              ########################################### [ 93%]
  15:openmpi-devel          ########################################### [100%]
error: %preun(openmpi-devel-1.2.5-2.el5.i386) scriptlet failed, exit status 2
error: %preun(openmpi-libs-1.2.5-2.el5.i386) scriptlet failed, exit status 2
error: %preun(openmpi-1.2.5-2.el5.x86_64) scriptlet failed, exit status 2
error: %preun(openmpi-devel-1.2.5-2.el5.x86_64) scriptlet failed, exit status 2
error: %preun(openmpi-libs-1.2.5-2.el5.x86_64) scriptlet failed, exit status 2


Comment 4 Doug Ledford 2008-04-03 13:52:30 UTC
Suggest release note:

Both the lam and openmpi packages previously shipped in RHEL5 contain a bug in
the package installation and removal scripts.  This bug can prevent users from
being able to remove or upgrade these packages.  If a user encounters this error
from any of the lam or openmpi packages:

error: %preun(openmpi-1.2.5-2.el5.x86_64) scriptlet failed, exit status 2

then that package will have to be removed manually.  The easiest way to
accomplish this task would be to run this shell command:

rpm -qa | grep '^openmpi-\|^lam-' | xargs rpm -e --noscripts --allmatches

This will remove all installed openmpi or lam rpms.  After running this command
the user will be able to install the latest version of these packages which
resolves the underlying bug.
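
Before running the removal command, it may help to preview which packages the pattern selects. The following is a minimal sketch, not part of the advisory: the function name `select_mpi_pkgs` is hypothetical, the sample list stands in for real `rpm -qa` output (the mpi-selector and bash version strings are made up), and the pattern is written in POSIX extended syntax, which should match the same names as the `grep` expression in the command.

```shell
# Hypothetical helper: filter a package list (one name per line on stdin)
# down to the openmpi and lam packages.
select_mpi_pkgs() {
    grep -E '^(openmpi|lam)-'
}

# Sample input standing in for `rpm -qa` output (illustrative names only).
printf '%s\n' \
    'openmpi-1.2.5-2.el5' \
    'openmpi-libs-1.2.5-2.el5' \
    'lam-libs-7.1.2-12.el5' \
    'mpi-selector-1.0.0' \
    'bash-3.2' | select_mpi_pkgs
```

On a real system, `rpm -qa | select_mpi_pkgs` would list the removal candidates without removing anything. Note that mpi-selector is not matched, because the pattern is anchored to the start of the package name.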

Comment 5 Doug Ledford 2008-04-03 17:55:39 UTC
Note that the errors are from the previous package and as such can't be avoided.
 The best alternative for that problem is the release note above.

Comment 6 Don Domingo 2008-04-04 01:29:18 UTC
added to RHEL5.2 release notes under "Known Issues":

<quote>
A bug in previous versions of openmpi and lam may prevent you from upgrading
these packages. This bug manifests in the following error (when attempting to
upgrade openmpi or lam):

error: %preun(openmpi-[version]) scriptlet failed, exit status 2

As such, you need to manually remove older versions of openmpi and lam in order
to install their latest versions.
</quote>

please advise (before April 15) if any further revisions are required. thanks!

Comment 7 Doug Ledford 2008-04-04 01:47:03 UTC
We either need to have the suggested method of removing packages in the release
note or in a knowledge base entry that the release note points to.  Manual
attempts to uninstall the rpms will fail without the --noscripts flag to rpm,
and most people don't know about it.

Comment 8 Don Domingo 2008-04-04 03:13:29 UTC
understood. revising as follows:

<quote>
As such, you need to manually remove older versions of openmpi and lam in order
to install their latest versions. To do so, use the following rpm command:

rpm -qa | grep '^openmpi-\|^lam-' | xargs rpm -e --noscripts --allmatches
</quote>

please advise (before April 15) if any further revisions are required. thanks!

Comment 9 Gurhan Ozen 2008-04-09 03:23:21 UTC
Failing this because, when mpi-selector is set to make the 32-bit openmpi the
default on an x86_64 box, we still get no mpirun: the i386 package isn't
included in the x86_64 file list.

The openmpi-1.2.5-4.el5.i386.rpm package should be included in the x86_64 arch as well:

# mpi-selector --query
default:openmpi-1.2.5-gcc-i386
level:system
# which mpirun 
/usr/bin/which: no mpirun in
(/usr/lib/openmpi/1.2.5-gcc/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)


Comment 10 Alexander Todorov 2008-04-09 09:32:02 UTC
Gurhan,
shouldn't mpi-selector be fixed to set the 64bit mpirun as default on x86_64
instead of including the 32bit package in the repo?

Thanks.

Comment 11 Doug Ledford 2008-04-09 15:18:48 UTC
No, mpi-selector doesn't set either one as default.  He was attempting to use
openmpi in 32bit mode and that's what failed.  The way the packages are set up
now, you *do* need the base package in a multilib environment to be multilib as
well.  That was an oversight on my part that I'll get corrected today.
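
One way to confirm the symptom from comment 9 is to check whether the bin directory that mpi-selector placed on PATH actually contains an executable mpirun. This is a rough sketch only: the function name `has_mpirun` is hypothetical, and the directory is the one shown in the PATH output of comment 9.

```shell
# Hypothetical check: does the given bin directory hold an executable mpirun?
has_mpirun() {
    [ -x "$1/mpirun" ]
}

# Directory taken from the PATH shown in comment 9.
dir=/usr/lib/openmpi/1.2.5-gcc/bin
if has_mpirun "$dir"; then
    echo "mpirun found in $dir"
else
    echo "no mpirun in $dir (is the base openmpi package for this arch installed?)"
fi
```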

Comment 12 Chris Schanzle 2008-04-17 18:42:53 UTC
I think the fix in comment #8 may be incomplete, as I have recently discovered
that upgrading my CentOS (sorry, just trying to help) systems from 5.0 to 5.1
broke a lot of openmpi* alternatives links.  Removing the packages per #8 and
installing updated ones does not fix the broken links, and you end up with two
instances of priority "10" /var/lib/alternatives/mpicc links, one still broken
and the fixed ones masked (listed second).  E.g., after removal and reinstall,
/etc/alternatives/mpirun contains:
...
/usr/bin/opal_wrapper-32
10
/usr/bin/opal_wrapper-32
/usr/bin/opal_wrapper-32
/usr/bin/opal_wrapper-32
/usr/bin/opal_wrapper-32
/usr/bin/opal_wrapper-32
/usr/bin/opal_wrapper-1.2.3-gcc-32
10
/usr/bin/opal_wrapper-1.2.3-gcc-32
/usr/bin/opal_wrapper-1.2.3-gcc-32
/usr/bin/opal_wrapper-1.2.3-gcc-32
/usr/bin/opal_wrapper-1.2.3-gcc-32
/usr/bin/opal_wrapper-1.2.3-gcc-32

I believe the proper cleanup procedure is:

alternatives --remove mpilibs32 \
  /usr/lib/openmpi/openmpi.ld.conf
alternatives --remove mpicc \
  /usr/bin/opal_wrapper-32
alternatives --remove mpi-run /usr/bin/orterun
rpm -qa | egrep '^(openmpi-|lam-)' | \
  xargs --verbose rpm -e --noscripts  --allmatches
yum -y install openmpi-\* lam-\*
alternatives --install /usr/bin/mpirun mpi-run \
 /usr/bin/orterun 10 --slave /usr/bin/mpiexec \
 mpi-exec /usr/bin/orterun \
 --slave /usr/share/man/man1/mpirun.1.gz mpi-run-man \
 /usr/share/openmpi/1.2.3-gcc/man/man1/mpirun.1 \
 --slave /usr/share/man/man1/mpiexec.1.gz mpi-exec-man \
 /usr/share/openmpi/1.2.3-gcc/man/man1/orterun.1

The last alternatives command fixes broken man page references for
orterun.1.gz -> orterun.1 and mpirun.1.gz -> mpirun.1, discussed in a different
openmpi bug report.

Comment 13 Doug Ledford 2008-04-17 18:49:54 UTC
The latest openmpi rpms completely do away with the alternatives stuff and use
mpi-selector instead.  In your case, you are seeing dangling alternatives links.
 Those can be fixed by this operation:

%post libs
# In order to work when upgrading from older packages that used alternatives
# instead, we will need to forcibly remove all the old incarnations of the
# alternatives support, and then install our mpi-selector support.

# We never used anything other than orterun for the mpi-run link
/usr/sbin/alternatives --remove mpi-run /usr/bin/orterun >/dev/null 2>&1

# but, we used opal_wrapper, and then later opal_wrapper-version-gcc-mode for
# mpicc....this little grep/awk thing catches all of them that might still
# be installed
/usr/sbin/alternatives --display mpicc | grep priority | grep opal_wrapper | \
  awk '{ system("/usr/sbin/alternatives --remove mpicc "$1) }'

# ditto for possible variants of mpilibs mpilibs%{mode}, get them all
for i in mpilibs mpilibs%{mode}; do
  /usr/sbin/alternatives --display $i | grep priority | grep openmpi | \
    awk '{ system("/usr/sbin/alternatives --remove '$i' "$1) }'
done

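To see what the grep/awk step for mpicc would do without touching the alternatives database, the same filter can be run against captured text and made to print the removal commands rather than execute them. This is a sketch under assumptions: the sample lines mimic the "path - priority N" layout that `alternatives --display` is expected to print, the paths are taken from comment 12, and the function name `plan_mpicc_removals` is hypothetical.

```shell
# Hypothetical dry-run variant of the mpicc cleanup: reads
# `alternatives --display mpicc`-style text on stdin and prints the
# removal command for each opal_wrapper entry instead of running it.
plan_mpicc_removals() {
    grep priority | grep opal_wrapper |
        awk '{ print "/usr/sbin/alternatives --remove mpicc " $1 }'
}

# Sample text (assumed layout; paths from comment 12).
plan_mpicc_removals <<'EOF'
mpicc - status is auto.
 link currently points to /usr/bin/opal_wrapper-32
/usr/bin/opal_wrapper-32 - priority 10
/usr/bin/opal_wrapper-1.2.3-gcc-32 - priority 10
EOF
```

Only the lines that mention both "priority" and "opal_wrapper" pass the filter, so the "link currently points to" line is skipped and one removal command is printed per registered wrapper. On a live system the input would instead come from `/usr/sbin/alternatives --display mpicc`.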

Comment 14 Chris Schanzle 2008-04-17 19:02:27 UTC
Thanks for the clarification, looking forward to the new packages.  Thanks for
the effort!

Comment 19 errata-xmlrpc 2008-05-21 15:25:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0481.html


Comment 20 Don Domingo 2008-06-02 23:14:14 UTC
Hi,

the RHEL4.7 release notes deadline is on June 17, 2008 (Tuesday). they will
undergo a final proofread before being dropped to translation, at which point no
further additions or revisions will be entertained.

a mockup of the RHEL4.7 release notes can be viewed here:
http://intranet.corp.redhat.com/ic/intranet/RHEL4u7relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don

Comment 21 Ryan Lerch 2008-08-07 23:39:30 UTC
Tracking this bug for the Red Hat Enterprise Linux 5.3 Release Notes. 

This Release Note is currently located in the Known Issues section.

Comment 22 Ryan Lerch 2008-08-07 23:39:30 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.