Bug 737043

Summary: Module load/unload drops /bin from PATH
Product: [Fedora] Fedora Reporter: Petr Machata <pmachata>
Component: environment-modules Assignee: Orion Poplawski <orion>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: mnewsome, orion, susi.lehtola
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: environment-modules-3.2.8a-3.fc16 Doc Type: Bug Fix
Last Closed: 2011-10-09 19:51:30 UTC Type: ---

Description Petr Machata 2011-09-09 12:49:55 UTC
Description of problem:
This might well be in the openmpi module unload or the mpich2 load; I'm not sure which is responsible for what.  Also, the issue is not exactly deterministic: I've been seeing this bug on and off, and reissuing the build would usually "fix" the problem.

Anyway, exhibit A:
  http://koji.fedoraproject.org/koji/getfile?taskID=3338498&name=build.log

which is part of a larger build task:
  http://koji.fedoraproject.org/koji/taskinfo?taskID=3338496

Note the lines matching '++ PATH':
++ PATH=/usr/lib/openmpi/bin:/usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/bin:/usr/local/sbin:/builddir/.local/bin:/builddir/bin
++ PATH=/usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/usr/local/sroot/bin:/usr/local/sbin:/builddir/.local/bin:/builddir/bin

The first line is from the openmpi module load, the second from the mpich2 load.  /bin is present in the first line but absent from the second.  So at some point between the first load and the second one (inclusive), /bin is dropped from the PATH.
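
Checking whether /bin is "present" has to be done per colon-separated component, because a plain substring search would also match /usr/bin or the tail of /usr/local/sroot/bin. A minimal sketch of such a check follows; the function name `path_contains` is hypothetical and is not part of environment-modules.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper for illustration only.
 * Return true if `dir` appears as a complete colon-separated component of
 * `path`. A substring match is not enough: "/usr/bin" must not count as
 * "/bin". */
bool path_contains(const char *path, const char *dir)
{
    size_t dirlen = strlen(dir);
    const char *p = path;

    while (*p != '\0') {
        /* Find the end of the current component. */
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len == dirlen && strncmp(p, dir, len) == 0)
            return true;
        if (end == NULL)
            break;          /* that was the last component */
        p = end + 1;        /* step past the colon */
    }
    return false;
}
```

Applied to the two PATH values quoted above, this reports /bin as present in the first and missing from the second.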

The build itself is done between the load and unload, and is run in a subshell, so I don't think that can be the culprit here:

%{_openmpi_load}
MPI_COMPILER=openmpi-%{_arch}
export MPI_COMPILER
( mkdir $MPI_COMPILER
   ... etc ...
)
%{_openmpi_unload}

The following line placed immediately after the unload commands fixes the build for me again:
export PATH=/bin${PATH:+:}$PATH

Version-Release number of selected component (if applicable):
As I said, it's been an on-and-off thing for me.  Currently I'm seeing it for the following:
  environment-modules    i686      3.2.8a-2.fc15
  mpich2-devel           i686      1.4.1-1.fc17
  openmpi-devel          i686      1.5-4.fc16

The following build might be of interest, too:
  http://koji.fedoraproject.org/koji/buildinfo?buildID=262768

Here, the x86_64 build progressed far enough to show that it is not affected by the above bug.

Unfortunately, same-version rebuilds (repeated submissions of a failed build with the same version) overwrite older failures, so I can't point you to more examples of the same.

How reproducible:
Sometimes.

Steps to Reproduce:
1. fetch boost
2. build release 5 srpm (current HEAD is 6)
3. issue scratch build
  
Actual results:
The build passes only some of the time

Expected results:
It always passes

Additional info:
I'll try to trim the above to a reasonable test case, but perhaps you know what the deal is off-hand.

Comment 1 Petr Machata 2011-09-09 15:36:33 UTC
Cutting this down to a reasonable reproducer has been unsuccessful, but annotating the boost spec works reasonably well:
  http://koji.fedoraproject.org/koji/getfile?taskID=3339014&name=build.log

The lines of interest are '+ echo ---'.  In particular (edited to fit):

# This is for serial build.  Why mpich2 is in PATH I don't know.
+ /usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]
+ /usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]

# This is for openMPI build
+ /usr/lib/openmpi/bin:/usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]
+ /usr/lib/openmpi/bin:/usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]
+ /usr/lib/openmpi/bin:/usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]
# openMPI unload took place here.  /bin is gone, /sbin too, /usr/bin stays
+ /usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/usr/local/sroot/bin:/usr/local/sbin:/builddir/.local/bin:/builddir/bin

So the bug seems to be either in the openMPI unload module or in the module loading harness itself.

Comment 2 Orion Poplawski 2011-09-22 16:48:53 UTC
*** Bug 728187 has been marked as a duplicate of this bug. ***

Comment 3 Susi Lehtola 2011-09-22 18:42:32 UTC
(In reply to comment #1)
> # This is for serial build.  Why mpich2 is in PATH I don't know.
> + /usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]
> + /usr/lib/mpich2/bin:/usr/sbin:/usr/bin:/sbin:/bin:[...]

See bug #647147. This is intentional misbehavior on the part of the mpich2 maintainer. I've just asked the FPC to clarify the MPI guidelines in this regard.

Comment 4 Orion Poplawski 2011-09-22 18:53:48 UTC
I've found an overlapping strcpy() call in Remove_Path() that is likely the source of this.  New build submitted: http://koji.fedoraproject.org/koji/taskinfo?taskID=3371315.  When that is completed and a new rawhide repo is done, please try building against that.  If that helps I'll submit updates.
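
For readers unfamiliar with this class of bug: the C standard leaves strcpy() undefined when source and destination overlap, which is exactly what happens when a path component is deleted by copying the tail of a string over its own earlier part. The result can vary by libc, architecture, and string length, which would be consistent with the intermittent failures reported above. The sketch below is not the actual Remove_Path() code or its fix; `remove_path_component` is a hypothetical helper that shows the usual remedy, memmove(), which is specified to handle overlapping regions.

```c
#include <string.h>

/* Hypothetical helper, NOT the real Remove_Path() from environment-modules.
 * Removes the first component equal to `dir` from the colon-separated list
 * in `path`, editing the buffer in place.
 * Returns 1 if a component was removed, 0 otherwise. */
int remove_path_component(char *path, const char *dir)
{
    size_t dirlen = strlen(dir);
    char *p = path;

    while (*p != '\0') {
        char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len == dirlen && strncmp(p, dir, len) == 0) {
            if (end != NULL) {
                /* Source and destination overlap here, so
                 * strcpy(p, end + 1) would be undefined behavior.
                 * memmove() handles overlap; the + 1 also copies the
                 * terminating NUL. */
                memmove(p, end + 1, strlen(end + 1) + 1);
            } else if (p > path) {
                /* Last component: cut it off with its leading colon. */
                p[-1] = '\0';
            } else {
                /* Only component: leave an empty string. */
                *p = '\0';
            }
            return 1;
        }
        if (end == NULL)
            break;
        p = end + 1;
    }
    return 0;
}
```

Removing /usr/lib/openmpi/bin from the front of a PATH-like string with this helper shifts the remainder left intact, so /sbin and /bin survive the unload.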

Comment 5 Susi Lehtola 2011-09-30 15:35:21 UTC
GROMACS now builds in rawhide. Please build and tag in F16 buildroot as well.

Comment 6 Fedora Update System 2011-09-30 15:55:36 UTC
environment-modules-3.2.8a-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/environment-modules-3.2.8a-3.fc16

Comment 7 Susi Lehtola 2011-09-30 17:20:35 UTC
Could you do the buildroot tag?

Comment 8 Orion Poplawski 2011-09-30 17:26:56 UTC
Submitted.

Comment 9 Fedora Update System 2011-09-30 19:49:50 UTC
Package environment-modules-3.2.8a-3.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing environment-modules-3.2.8a-3.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/environment-modules-3.2.8a-3.fc16
then log in and leave karma (feedback).

Comment 10 Fedora Update System 2011-10-09 19:51:30 UTC
environment-modules-3.2.8a-3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.