Description of problem:
mpiexec prints an error "Bad address", seemingly only on the i686 arch:
https://koji.fedoraproject.org/koji/taskinfo?taskID=94103358. Adding
"--mca btl self,tcp --mca btl_tcp_if_include 127.0.0.1/24" to the mpiexec
arguments makes the error disappear.

Version-Release number of selected component (if applicable):
openmpi-4.1.4-7.fc38

How reproducible:

Steps to Reproduce:
1. Create libgomp-test.spec:

```
Name:           libgomp-test
Version:        1.0.0
Release:        1%{?dist}
Summary:        libgomp test
License:        GPLv3+

BuildRequires:  openssh-clients
BuildRequires:  openmpi-devel
BuildRequires:  libgomp
BuildRequires:  gcc
BuildRequires:  strace
BuildRequires:  hostname
BuildRequires:  time

%description

%check
export TIMEOUT_OPTS='--preserve-status --kill-after 10 60'
%{_openmpi_load}
# https://github.com/mikaem/mpi-examples/blob/master/helloworld.cpp
cat <<EOF > hello.c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    printf("Hello world! from rank %d"
           " out of %d processors\n",
           world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}
EOF
mpicc -fopenmp hello.c -o hello
timeout ${TIMEOUT_OPTS} time -- env -i PATH=$MPI_BIN:/bin OMP_NUM_THREADS=1 mpiexec --mca btl self,tcp --mca btl_tcp_if_include 127.0.0.1/24 --allow-run-as-root -np 2 ./hello
timeout ${TIMEOUT_OPTS} time -- env -i PATH=$MPI_BIN:/bin OMP_NUM_THREADS=1 mpiexec --mca btl_tcp_if_include 127.0.0.1/24 --allow-run-as-root -np 2 ./hello
timeout ${TIMEOUT_OPTS} time -- env -i PATH=$MPI_BIN:/bin OMP_NUM_THREADS=1 mpiexec --allow-run-as-root -np 2 ./hello
%{_openmpi_unload}
```

2. rpmbuild -bs libgomp-test.spec
3.
koji build --nowait --scratch f38 ~/rpmbuild/SRPMS/libgomp-test-1.0.0-1.fc36.src.rpm

Actual results:
```
--------------------------------------------------------------------------
A system call failed during shared memory initialization that should
not have.  It is likely that your MPI job will now either abort or
experience performance degradation.

  Local host:  d279f57c8afa4f999af7064835646a20
  System call: mmap(2)
  Error:       Bad address (errno 14)
Hello world! from rank 0 out of 2 processors
Hello world! from rank 1 out of 2 processors
```

Expected results:
```
Hello world! from rank 0 out of 2 processors
Hello world! from rank 1 out of 2 processors
```

Additional info:
Despite the "Bad address" error, the hello world MPI program above succeeds. On the other hand:

1) elk i686 (https://koji.fedoraproject.org/koji/taskinfo?taskID=94086999) prints "libgomp: Thread creation failed: Bad address" and exits with status 1 when running its tests. The tests pass regardless, as they produce the expected output files.

2) gpaw i686 (https://koji.fedoraproject.org/koji/taskinfo?taskID=94085529) prints "Bad address" and fails with "Fatal Python error: Segmentation fault" when running its tests. Adding "--mca btl_tcp_if_include 127.0.0.1/24" to the mpiexec arguments allows the tests to pass.
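Side note on applying the workaround: OpenMPI also reads any MCA parameter from an `OMPI_MCA_`-prefixed environment variable, so the same settings can be exported once in a %check section instead of repeating the `--mca` flags on every mpiexec invocation. A minimal sketch, using the parameter values from the report above:

```shell
# Set the MCA parameters from the workaround via the environment;
# OpenMPI treats OMPI_MCA_<name> as equivalent to "--mca <name> <value>".
export OMPI_MCA_btl=self,tcp
export OMPI_MCA_btl_tcp_if_include=127.0.0.1/24

# Any subsequent mpiexec in this shell picks these up, e.g.:
#   mpiexec -np 2 ./hello
echo "btl=$OMPI_MCA_btl if_include=$OMPI_MCA_btl_tcp_if_include"
```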
Another case of "Segmentation fault - invalid memory reference" is reported in bug #2152521. Maybe the openmpi spec itself should include a %check section that performs some basic functionality tests?
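A rough sketch of what such a %check could look like, reusing the %{_openmpi_load}/%{_openmpi_unload} macros from the reproducer above (the test program and mpiexec flags here are illustrative; running mpiexec inside a build chroot may need extra options such as --allow-run-as-root, as in the reproducer):

```
%check
%{_openmpi_load}
cat > conftest.c <<'EOF'
#include <mpi.h>
int main(void) { MPI_Init(NULL, NULL); MPI_Finalize(); return 0; }
EOF
mpicc conftest.c -o conftest
mpiexec -np 2 ./conftest
%{_openmpi_unload}
```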
I'm seeing this as well in the elpa test suite.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 38 development cycle. Changing version to 38.
Just a heads up that with 5.0, openmpi is dropping support for 32-bit architectures, so this bug is unlikely to be fixed any time soon, and support will disappear completely once 5.0 is released and packaged for Fedora.