Created attachment 361848 [details]
log from a failed x86_64 build

Description of problem:
freefem++-3.5 testsuite fails consistently on rawhide, but it works with openmpi-1.3.3-2.fc11 on F11.

Version-Release number of selected component (if applicable):
openmpi-1.3.3-5

How reproducible:
Always

Steps to Reproduce:
1. /usr/bin/koji build --scratch dist-f12 'cvs://cvs.fedoraproject.org/cvs/pkgs?rpms/freefem++/devel#freefem++-3_5-1_fc12'

Actual results:
[xenbuilder4.fedora.phx.redhat.com:18024] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ess_hnp_module.c at line 130
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_plm_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
...
and so on.

Expected results:
The tests should pass.

Additional info:
Some more failed scratch builds:
https://koji.fedoraproject.org/koji/taskinfo?taskID=1693624
https://koji.fedoraproject.org/koji/taskinfo?taskID=1693607
https://koji.fedoraproject.org/koji/taskinfo?taskID=1693589

Note these strange lines in the log:
++ MPI_COMPILER='openmpi-x86_64%{_cc_name_suffix}'
...
++ MPI_HOME=/usr/lib64/openmpi@
...
Created attachment 361849 [details] log from a failed x86_64 build
I get an error when I attempt to access the link in the attachment. Can you attach the actual log instead of just a link to it? I wasn't able to get freefem++ to build locally on my rawhide box until I enclosed the %build section in "exec bash << EOF" and "EOF" lines. Apparently the /etc/profile.d/modules.sh script contains some bash-isms that don't work when run by /bin/sh. That might be a bug in the environment-modules package. However, once I edited the spec file, freefem++-3.5-2 compiled against openmpi-1.3.3-6.fc12 without problem.
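For anyone trying the same trick, here is a minimal sketch of the here-document wrapper described in the comment above. It is not the actual freefem++ spec change: the body is a placeholder that uses a bash-only array so that it is clear bash, not /bin/sh, is interpreting it, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the body below stands in for a %build section that needs bash.
# Quoting 'EOF' stops the outer /bin/sh from expanding anything in the body.
out=$(bash << 'EOF'
# Arrays are a bash-ism; a strict POSIX sh such as dash rejects this syntax.
parts=(configure make check)
echo "${#parts[@]}"
EOF
)
echo "bash ran the body and counted $out steps"
```

With an unquoted EOF, as written in the comment, the outer shell expands $variables and command substitutions in the body before bash ever sees them, so which form is appropriate depends on where the variables are supposed to be set.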
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
New builds are failing in exactly the same way on koji/dist-f13. See for yourself: koji build --scratch --arch=x86_64 dist-f13 'cvs://cvs.fedoraproject.org/cvs/pkgs?rpms/freefem++/devel#HEAD' It works fine in mock.
Created attachment 376386 [details] build log from a failed x86_64 build Build log from: http://koji.fedoraproject.org/koji/taskinfo?taskID=1851955
Created attachment 376391 [details] log from a failed local mock build (rawhide/x86_64) I was wrong: the local mock rawhide build fails, too.
ping? still fails with current rawhide (in mock)
I'm not sure that this can reasonably be expected to work. The error is a failure to run the openmpi runtime during the build process. While it is expected that you can build an openmpi-using app in a build root, there is no guarantee that you can run the same app in the build root. It may have worked in the past, but it may have been pure coincidence that the defaults for a totally unconfigured openmpi install allowed you to run a single-process mpi job. We can look into it, but I make no promises that running an mpi job in a build root will be officially supported.
Is it possible to configure the openmpi environment in an automated way as part of the build process then? I don't mind adding a few lines of shell script before running the testsuite in %check.
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle. Changing version to '13'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
I don't believe you can run openmpi programs on the builders, because you can't make any assumptions about the network configurations on the build machines. (Or even that they have networking at all.)
This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
I realize this bug is closed, but I just hit this problem myself while building some packages for internal consumption and found a workaround. The issue is described more clearly here:
http://permalink.gmane.org/gmane.comp.clustering.open-mpi.user/966

The workaround I used was simply to add
BuildRequires: rsh
to my spec file. An alternative would be "touch /usr/bin/rsh", I suppose.

I do wonder if openmpi-devel should Require the rsh package so that (single-host) openmpi can be run during %build/%check. Anyway, posting this here in case someone else runs into the problem in the future.
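A guard along these lines could go in %check so that the testsuite is skipped with a clear message instead of dying in orte_init. This is only a sketch, and it assumes what the workaround above implies: that the failure comes from no rsh executable being found in the build root. The function name have_rsh is made up for this example and is not part of openmpi or rpm.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if an rsh executable is found on PATH.
# Assumption (implied by the workaround above, not confirmed here): the
# orte_init failure goes away once rsh is present in the build root.
have_rsh() {
    command -v rsh >/dev/null 2>&1
}

if have_rsh; then
    echo "rsh found; the MPI testsuite can be attempted"
else
    echo "rsh not found; add 'BuildRequires: rsh' or skip the MPI tests"
fi
```

Whether to skip the tests or fail the build when rsh is missing is a packaging decision; the check itself only reports what is on PATH.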