Bug 996860 - building of a simple MPI fortran program fails
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: openmpi
Version: 18
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Assigned To: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-08-14 03:55 EDT by Jos de Kloe
Modified: 2013-12-19 07:41 EST
Last Closed: 2013-12-18 22:54:13 EST
Type: Bug

Attachments:
output of the verbose version of the problematic command (5.29 KB, text/plain)
2013-12-18 04:58 EST, Jos de Kloe

Description Jos de Kloe 2013-08-14 03:55:00 EDT
Description of problem:

After upgrading from F16 to F18, I can no longer build Fortran programs that use MPI. The build fails at the link stage. On previous Fedora versions the same Fortran code compiled just fine.

Version-Release number of selected component (if applicable):

openmpi-1.6.3-7.fc18.x86_64
gcc-gfortran-4.7.2-8.fc18.x86_64
hwloc-1.4.2-2.fc18.x86_64
numactl-libs-2.0.7-7.fc18.x86_64

How reproducible:

always

Steps to Reproduce:
1. Download some sample MPI Fortran code from
http://people.sc.fsu.edu/~jburkardt/f_src/hello_mpi/hello_mpi.html
(I used the f90 example http://people.sc.fsu.edu/~jburkardt/f_src/hello_mpi/hello_mpi.f90).
2. Try to build it with this command:
/usr/lib64/openmpi/bin/mpif90 -o prog.mpi hello_mpi.f90

Actual results:

The linking stage produces these errors:

>/usr/lib64/openmpi/bin/mpif90 -o prog.mpi fortran_mpi_example.F90
/lib64/libhwloc.so.5: undefined reference to `migrate_pages@libnuma_1.2'
/lib64/libhwloc.so.5: undefined reference to `mbind@libnuma_1.1'
/lib64/libhwloc.so.5: undefined reference to `set_mempolicy@libnuma_1.1'
/lib64/libhwloc.so.5: undefined reference to `get_mempolicy@libnuma_1.1'
collect2: error: ld returned 1 exit status

Expected results:

An executable named prog.mpi should have been produced.

Additional info:
Comment 1 Orion Poplawski 2013-08-14 21:57:35 EDT
I'm afraid I can't reproduce this.  Two thoughts:

- You mention "fortran_mpi_example.F90" but I can't find that.
- What do "rpm -V hwloc numactl-libs" and "yum list hwloc numactl-libs" report?
Comment 2 Jos de Kloe 2013-08-15 05:05:27 EDT
Sorry about "fortran_mpi_example.F90": I renamed the file locally and forgot to mention it in the bug report. It is the same file as the hello_mpi.f90 mentioned in my report above.

"rpm -V hwloc numactl-libs" gives no output at all

"yum list hwloc numactl-libs" reports:
Installed Packages
hwloc.x86_64         1.4.2-2.fc18   @f18-x86_64-everything
numactl-libs.x86_64  2.0.7-7.fc18   @Fedora-18-source
Available Packages
hwloc.i686           1.4.2-2.fc18   f18-x86_64-everything
numactl-libs.i686    2.0.7-7.fc18   f18-x86_64-everything

We keep a local repository mirror for the fedora packages, which explains the unfamiliar repository names. 

Disabling the local repository I see similar output:
yum list --disablerepo='*' --enablerepo=fedora hwloc numactl-libs
Loaded plugins: langpacks, presto, refresh-packagekit
Installed Packages
hwloc.x86_64        1.4.2-2.fc18    @f18-x86_64-everything
numactl-libs.x86_64 2.0.7-7.fc18    @Fedora-18-source
Available Packages
hwloc.i686          1.4.2-2.fc18    fedora
numactl-libs.i686   2.0.7-7.fc18    fedora

I downloaded the packages from our local repository mirror and compared them with the ones from the public Fedora repository. They are identical, so that should not be the problem.
I also tested on a standalone laptop running Fedora 19 with latest updates, and there everything runs fine.

My feeling is that this could also be a local configuration issue on our side, but I have no idea where to start looking or what could be wrong.
Comment 3 Jos de Kloe 2013-12-18 04:58:34 EST
Created attachment 838222 [details]
output of the verbose version of the problematic command
Comment 4 Jos de Kloe 2013-12-18 05:00:18 EST
I found the cause of the trouble.
The problem is that we also have the Portland Group (PGI) Fortran compiler installed, and it ships with its own libnuma version.
The library search path seems to be mixed up, causing the PGI-supplied version to be picked up by mpif90 as well.

This reproduces the reported error:

>/usr/lib64/openmpi/bin/mpif90 -o testprogram_mpi /opt/pgi/linux86-64/11.10/lib/libnuma.so.1 testfile_mpi.F90
/lib64/libhwloc.so.5: undefined reference to `migrate_pages@libnuma_1.2'
/lib64/libhwloc.so.5: undefined reference to `mbind@libnuma_1.1'
/lib64/libhwloc.so.5: undefined reference to `set_mempolicy@libnuma_1.1'
/lib64/libhwloc.so.5: undefined reference to `get_mempolicy@libnuma_1.1'
collect2: error: ld returned 1 exit status

This seems to work just fine:

>/usr/lib64/openmpi/bin/mpif90 -o testprogram_mpi /usr/lib64/libnuma.so.1 testfile_mpi.F90

However, it still puzzles me why this happens. The path to /opt/pgi/ is nowhere to be found in my environment settings, and the LIBRARY_PATH definition reported by running 'mpif90 -v' does not contain it either.
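
The undefined references above name versioned symbols such as mbind@libnuma_1.1. One way to compare the two libnuma.so.1 copies is to look at the version tags each one exports. The following is only a sketch: the helper name symbol_versions is hypothetical, and it assumes the usual `objdump -T` layout in which the last two fields of a line are the version tag and the symbol name.

```shell
# symbol_versions SYMBOL
# Reads `objdump -T` output on stdin and prints the version tag(s)
# attached to SYMBOL, one per line, with any parentheses removed.
# Assumption: the last two fields of a matching line are "<version> <name>".
# Lines where the field before the name is a plain hex number (the size
# column of an unversioned symbol) are skipped.
symbol_versions() {
    awk -v sym="$1" '
        $NF == sym {
            v = $(NF - 1)
            gsub(/[()]/, "", v)
            if (v !~ /^[0-9a-fA-F]+$/) print v
        }'
}
```

It could be used as `objdump -T /usr/lib64/libnuma.so.1 | symbol_versions mbind` and again with the copy under /opt/pgi/. If one copy prints a tag and the other prints nothing, that would be consistent with the link errors, though it does not by itself explain why that copy is being searched.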

The /usr/bin/pgf90 command on my system is a simple script that sets an environment variable to the license file and then calls /opt/pgi/linux86-64/11.10/bin/pgf90. I don't think this should interfere with mpif90 in any way.

For reference the full output of the command:

/usr/lib64/openmpi/bin/mpif90 -v -o testprogram_mpi testfile_mpi.F90

is attached.
Comment 5 Orion Poplawski 2013-12-18 22:54:13 EST
Other places to check are /etc/ld.so.conf and /etc/ld.so.conf.d/*.  Output of ldd /lib64/libhwloc.so.5 would be interesting.

Meanwhile, I'm going to close this.
Comment 6 Jos de Kloe 2013-12-19 07:36:03 EST
Thanks for the hint. You are right, we have a conf file named
/etc/ld.so.conf.d/knmi.conf which contains these (and other) lines:

/opt/pgi/linux86-64/11.10/lib
/opt/pgi/linux86-64/11.10/libso

I agree this is not a Fedora issue and can be closed.
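
For anyone hitting the same symptom, a search of the loader configuration for a suspect directory can be sketched as follows. The helper name conf_entries_matching is hypothetical; it is just a thin wrapper around grep, and the pattern and paths are whatever applies on the system being checked.

```shell
# conf_entries_matching PATTERN PATH...
# Prints "file:line:text" for every line matching PATTERN in the given
# files, searching directories recursively. -H forces the file name to be
# shown even when a single file is given.
conf_entries_matching() {
    pattern=$1
    shift
    grep -rHn -- "$pattern" "$@" 2>/dev/null
}
```

On the system described here the call would be `conf_entries_matching /opt/pgi /etc/ld.so.conf /etc/ld.so.conf.d`, which should list the two knmi.conf lines quoted above.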
Comment 7 Jos de Kloe 2013-12-19 07:41:07 EST
For your reference, here is the output of the ldd command:

>ldd /lib64/libhwloc.so.5
        linux-vdso.so.1 =>  (0x00007fff7b3fe000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003d48200000)
        libnuma.so.1 => /lib64/libnuma.so.1 (0x0000003d5ea00000)
        libpci.so.3 => /lib64/libpci.so.3 (0x0000003d4ca00000)
        libxml2.so.2 => /lib64/libxml2.so.2 (0x0000003d52200000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003d47e00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003d47a00000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003d4b200000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003d48600000)
        libz.so.1 => /lib64/libz.so.1 (0x0000003d48e00000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x0000003d4e600000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003d49200000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003d48a00000)

There is no hint here that anything is wrong, which may be a bug in its own right. Shouldn't the ldd command take /etc/ld.so.conf and the files under /etc/ld.so.conf.d/ into account?
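
ldd prints only the one copy of each library that the run-time loader resolves. To see whether the loader cache holds more than one libnuma.so.1, the output of `ldconfig -p` can be filtered. This is a sketch under assumptions: the helper name cached_copies is hypothetical, and it assumes the usual `ldconfig -p` line format of "name (flags) => path" with no spaces in the path.

```shell
# cached_copies SONAME
# Reads `ldconfig -p` output on stdin and prints the path of every cache
# entry whose library name is exactly SONAME.
# Assumption: each entry looks like "<name> (<flags>) => <path>".
cached_copies() {
    awk -v name="$1" '$1 == name && $(NF - 1) == "=>" { print $NF }'
}
```

Usage would be `ldconfig -p | cached_copies libnuma.so.1`. More than one path in the output means several copies are visible through the configured directories, even though ldd reports only one of them.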
