Description of problem:
OpenMPI packages break when upgrading. OpenMPI programs no longer run after the upgrade, and ldd can't find the libraries.

How reproducible:
always

Steps to Reproduce:
1) up2date openmpi-libs
or
2) rpm -U openmpi-libs*.rpm

Actual results:
ldd test.out
    libmpi.so.0 => not found
    libopen-rte.so.0 => not found
    libopen-pal.so.0 => not found

Expected results:
ldd test.out
    libmpi.so.0 => /usr/lib64/openmpi/1.2.3-gcc/libmpi.so.0 (0x0000002a95580000)
    libopen-rte.so.0 => /usr/lib64/openmpi/1.2.3-gcc/libopen-rte.so.0 (0x0000002a95713000)
    libopen-pal.so.0 => /usr/lib64/openmpi/1.2.3-gcc/libopen-pal.so.0 (0x0000002a9586c000)

Additional info:
The 1.1.1 version of openmpi-libs has an (erroneous?) check for an upgrade in the RPM preuninstall scriptlet:

==openmpi-devel==
if [ "$1" -eq 0 ]; then
    alternatives --remove mpicc /usr/bin/opal_wrapper-64
fi
====

In the 1.2.3 version this has been corrected:

===
alternatives --remove mpicc /usr/bin/opal_wrapper-1.2.3-gcc-64
===

Running the 1.1.1 preuninstall scriptlets by hand and then running ldconfig fixes the alternatives setup.

The openmpi and openmpi-devel packages also have problems with the alternatives setup.
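To check a binary for this symptom quickly, the "not found" lines can be filtered out of the ldd output. This is only a minimal sketch: missing_libs is a hypothetical helper name, not part of any package, and it assumes ldd prints unresolved libraries in the "name => not found" form shown in the report above.

```shell
#!/bin/sh
# missing_libs: hypothetical helper. Reads ldd output on stdin and prints
# the name of every library that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage against the program from this report:
#   ldd test.out | missing_libs
```

An empty result means every shared library was resolved.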
ADDITIONAL NOTES FROM SUPPORT ENGINEERS

Upgrading the openmpi packages breaks the links in /etc/alternatives/mpi*.

The problem is caused by the pre-uninstall scriptlet in the older version of the package, which deletes the links only when the package is being removed, not when it is being upgraded.

ex: for libmpi version 1.1.1:

preuninstall scriptlet (using /bin/sh):
if [ "$1" -eq 0 ]; then
    alternatives --remove mpilibs32 /usr/lib/openmpi/openmpi.ld.conf
fi
postuninstall program: /sbin/ldconfig

The openmpi packages use the alternatives system (see man alternatives) to provide the locations of their files on the system. On an upgrade, the preuninstall scriptlet of the older version does not delete the alternatives links in /etc/alternatives/mpi*. This causes problems with the new packages, which change the location of these files: since the links in /etc/alternatives already exist, they are not overwritten with links to the newer locations. Programs that use the links in /etc/alternatives/mpi* to access these files then fail.

The newer version of the packages addresses this by removing the links in all situations.

version 1.2.3:

preuninstall scriptlet (using /bin/sh):
alternatives --remove mpilibs32 /usr/lib/openmpi/1.2.3-gcc/openmpi.ld.conf
postuninstall program: /sbin/ldconfig

To fix this for the packages already installed, we will have to run commands similar to:

alternatives --remove mpilibs32 /usr/lib/openmpi/openmpi.ld.conf
alternatives --install /etc/ld.so.conf.d/mpi32.conf \
    mpilibs32 /usr/lib/openmpi/1.2.3-gcc/openmpi.ld.conf 10

This should be fixed by modifying the scripts in the newer packages to delete the links, if they exist, before recreating them.
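The manual repair above could be wrapped in a dry-run helper so the commands can be reviewed before anything is changed. This is a sketch under assumptions: repair_cmds is a hypothetical name, it only prints commands rather than running them, and the trailing ldconfig step is taken from the original report's note that ldconfig was run after the manual cleanup.

```shell
#!/bin/sh
# repair_cmds: hypothetical dry-run helper. Prints the commands that would
# re-point one alternatives entry; it does not execute them.
#   $1 link path    $2 alternatives name    $3 old target
#   $4 new target   $5 priority
repair_cmds() {
    printf 'alternatives --remove %s %s\n' "$2" "$3"
    printf 'alternatives --install %s %s %s %s\n' "$1" "$2" "$4" "$5"
    printf '/sbin/ldconfig\n'
}

# Values from the notes above (32-bit library entry):
repair_cmds /etc/ld.so.conf.d/mpi32.conf mpilibs32 \
    /usr/lib/openmpi/openmpi.ld.conf \
    /usr/lib/openmpi/1.2.3-gcc/openmpi.ld.conf 10
```

After reviewing the printed commands, they can be run as root by hand.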
Actually, the analysis of the problem is slightly wrong. The problem is that the old version checked for update and the new one doesn't. You wouldn't think this would be a problem, except that the %pre/%post scriptlets for the package you are upgrading to get run *prior* to the %preun/%postun scriptlets for the package you are upgrading from. So, when you install the latest openmpi package, the %post scriptlets install the proper alternatives links, but then the %preun scriptlet removes them as part of its cleanup.

Furthermore, since the %preun scriptlet that's actually causing this problem comes from the old package already on the system, fixing the %preun scriptlets in our current package does *not* solve the problem. It will require a two-step upgrade process: first we ship a package with the %preun scriptlets fixed (which will still have a broken installation, because the existing package on the machine has the broken %preun scriptlet), and then we ship an update that simply reruns the upgrade process against the package with the fixed %preun scriptlet, so that the installation is no longer busted. This is, of course, unless Nalin or someone more knowledgeable in RPM foo than I can tell me another way to resolve the problem.

There is, fortunately, a workaround in the meantime. Once you've upgraded the system to the current openmpi rpms, simply re-run the upgrade as a forced upgrade to the same packages. When you run an rpm upgrade of a package to itself (i.e., upgrade openmpi-1.2.3-1 to openmpi-1.2.3-1, using the --force option to rpm since otherwise it will say the package is already installed), it will *not* run the %preun/%postun scripts at all. Alternatively, you can simply uninstall and reinstall the rpms (if the uninstall fails for some packages because the alternatives removal fails, just run the uninstall with the rpm --noscripts option). Either of those will work around the problem.
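The scriptlet ordering described above can be laid out as a small table. This is an illustration only: scriptlet_order is a hypothetical function that prints the order and the value each scriptlet receives in $1 (the number of instances of the package that will remain installed once the operation completes). It does not call rpm, and it leaves out triggers and transaction-level scriptlets.

```shell
#!/bin/sh
# scriptlet_order: hypothetical illustration. Prints, one per line,
# "<package> <scriptlet> <value of $1>" in the order RPM runs them.
scriptlet_order() {
    case $1 in
        install)
            echo "new %pre 1"
            echo "new %post 1"
            ;;
        upgrade)
            # The new package's scriptlets run first; the old package's
            # %preun/%postun run afterwards with $1 = 1.
            echo "new %pre 2"
            echo "new %post 2"
            echo "old %preun 1"
            echo "old %postun 1"
            ;;
        erase)
            echo "old %preun 0"
            echo "old %postun 0"
            ;;
    esac
}

scriptlet_order upgrade
```

A %preun body guarded by `[ "$1" -eq 0 ]` therefore runs only in the erase case, while an unguarded one also runs on upgrade, after the new package's %post has already completed.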
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2008-0662.html