Bug 2320997 - Loading of MPI module with %{_mpich_load} and %{_openmpi_load} is broken in F40 and F39
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: Lmod
Version: 40
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Orion Poplawski
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-10-22 14:15 UTC by Sandro
Modified: 2024-11-01 03:42 UTC (History)

Fixed In Version: Lmod-8.7.53-1.fc40 Lmod-8.7.53-1.fc39 Lmod-8.7.53-1.el9 Lmod-8.7.53-1.el8 Lmod-8.7.53-1.fc41
Clone Of:
Environment:
Last Closed: 2024-11-01 02:43:59 UTC
Type: ---
Embargoed:



Description Sandro 2024-10-22 14:15:18 UTC
Executing those macros kills the build. This seems to be due to a non-zero exit status when sourcing environment-modules, although no specific error is printed.

This seems to be the same issue we saw a while ago [1], and it boils down to which package is selected to provide `environment(modules)`. If the honor falls upon Lmod, the build breaks. If it happens to be `environment-modules`, everything works.

Last time this was hotfixed by having rpm-mpi-hooks depend on environment-modules instead of environment(modules) [2]. However, that fix was only applied to rawhide at the time. Ultimately, I think whatever package provides environment(modules) should also be usable in our build environments. In other words, I consider the dependency change a hack rather than a solution.

[1] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/SL4IL4SWBYIUZVGXNHTX6CRYDHPIXWMO/#7DPVV7PRXTR73BSXHQXLD6R5LANFIZUV
[2] https://src.fedoraproject.org/rpms/rpm-mpi-hooks/c/90f8dfbf925a815820cd1bcdd469c8827a757b19?branch=rawhide

Reproducible: Always

Steps to Reproduce:
1. fedpkg clone python-lfpy
2. cd python-lfpy
3. fedpkg --release f40 mockbuild --no-cleanup-after
Actual Results:  
Build fails if Lmod is selected as provider of environment(modules).

Expected Results:  
Build succeeds regardless of selected environment(modules) provider.

Both packages, Lmod and environment-modules, provide /etc/profile.d/modules.sh. The version from the latter appears to work while the former doesn't. Or I'm missing something and the issue is manifesting somewhere else entirely.
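
One way to test this hypothesis inside the build chroot is to source the script in a subshell and look at its exit status. RPM build scriptlets run under `sh -e`, so any non-zero status from the sourced file aborts the build even when nothing is printed. The helper name `source_status` is hypothetical; this is a minimal sketch, not a confirmed diagnosis:

```shell
# Hypothetical helper: print the exit status of sourcing a file.
# The subshell keeps a failing script from aborting the calling shell.
source_status() {
    status=0
    ( . "$1" ) >/dev/null 2>&1 || status=$?
    echo "$status"
}

# Show what the profile script resolves to (it may be a symlink),
# then report whether sourcing it succeeds.
script=/etc/profile.d/modules.sh
if [ -e "$script" ]; then
    readlink -f "$script"
    echo "exit status: $(source_status "$script")"
fi
```

Anything other than `exit status: 0` would be enough to stop an `sh -e` scriptlet at that line.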

For F40/F39 there's a workaround by explicitly specifying environment-modules as a build dependency. However, the reverse, specifying Lmod to get a breaking build in F41/rawhide, does not work. It will install both packages and apparently environment-modules gets to install environment-modules.

I'm not sure that's desirable either. Maybe two providers of the same capability should conflict with each other?
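
To see which packages satisfy the capability in a given chroot, the RPM database can be queried directly. A small sketch; the function name `modules_providers` is made up for illustration:

```shell
# Hypothetical helper: list installed packages that provide environment(modules).
modules_providers() {
    rpm -q --whatprovides 'environment(modules)' --qf '%{NAME}\n'
}

# More than one line of output means several providers are installed side by side.
if command -v rpm >/dev/null 2>&1; then
    modules_providers || echo "no installed package provides environment(modules)"
fi
```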

Comment 1 Sandro 2024-10-22 14:16:59 UTC
cc'ing the maintainers of environment-modules and rpm-mpi-hooks for awareness

Comment 2 Sandro 2024-10-22 14:25:13 UTC
I've been a bit hasty and sloppy. Let me clarify...

(In reply to Sandro from comment #0)
> The execution of those macros kills the build. It seems due to a non-zero
> exit status when sourcing environment-modules.

when sourcing /etc/profile.d/modules.sh

> For F40/F39 there's a workaround by explicitly specifying
> environment-modules as a build dependency. However, the reverse, specifying
> Lmod to get a breaking build in F41/rawhide, does not work. It will install
> both packages and apparently environment-modules gets to install
> environment-modules.

gets to install /etc/profile.d/modules.sh

Let me know if anything else is unclear.

Comment 3 hannes 2024-10-22 20:58:10 UTC
Quite a few packages are affected by this issue:
https://koschei.fedoraproject.org/affected-by/Lmod?epoch1=0&version1=8.7.37&release1=1.fc40&epoch2=0&version2=8.7.48&release2=1.fc40&collection=f40

I came across this issue just recently when trying to update gretl in f40; the build failed:
https://koji.fedoraproject.org/koji/buildinfo?buildID=2572608

Comment 4 Cristian Le 2024-10-23 00:38:41 UTC
I was working on a cp2k update for unrelated reasons and made a change that removes the `source /etc/profile.d/modules.sh` line, leaving only the plain `module load` commands. The build seems to have detected the MPI environments just fine. I'm not sure why it is working, though. I will re-run the builds for F40 and check the build logs again.
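
One way to narrow this down is to check whether `module` is already defined in the build shell before anything is sourced. The guard below is only a sketch: the function name `ensure_module_cmd` is hypothetical, and whether `module` is in fact predefined in the build environment is an assumption that this report does not confirm.

```shell
# Hypothetical helper: source the profile script only if `module` is not
# already available as a function or command in the current shell.
ensure_module_cmd() {
    if command -v module >/dev/null 2>&1; then
        echo "module already defined"
    elif [ -r "${1:-/etc/profile.d/modules.sh}" ]; then
        . "${1:-/etc/profile.d/modules.sh}"
        echo "sourced ${1:-/etc/profile.d/modules.sh}"
    else
        echo "module unavailable"
        return 1
    fi
}

# Usage in a spec scriptlet: ensure_module_cmd
```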

Comment 5 Orion Poplawski 2024-10-23 14:00:47 UTC
I think this is fixed in later Lmod releases. Starting some tests now...

Comment 6 Orion Poplawski 2024-10-23 14:05:40 UTC
Confirmed that 8.7.53 is working.  FYI - for future debugging, this helps:

export LMOD_SH_DBG_ON=1
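
A sketch of how this might be used when chasing a silent failure: enable shell tracing in a subshell, source the script, and record the exit status. The assumption here, not confirmed in this report, is that the variable only has a visible effect while tracing (`set -x`) is active; `trace_source` is a hypothetical helper name.

```shell
# Hypothetical helper: source a script with tracing on and report its status.
trace_source() {
    status=0
    (
        export LMOD_SH_DBG_ON=1   # from the comment above; effect assumed to depend on tracing
        set -x                    # print each command as it runs (to stderr)
        . "$1"
    ) 2>&1 || status=$?
    echo "exit status: $status"
}

# Usage inside the chroot: trace_source /etc/profile.d/modules.sh
```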

Comment 7 Fedora Update System 2024-10-23 14:09:07 UTC
FEDORA-2024-14d553a254 (Lmod-8.7.53-1.fc39) has been submitted as an update to Fedora 39.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-14d553a254

Comment 8 Fedora Update System 2024-10-23 14:09:08 UTC
FEDORA-EPEL-2024-7ca00cd70b (Lmod-8.7.53-1.el9) has been submitted as an update to Fedora EPEL 9.
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2024-7ca00cd70b

Comment 9 Fedora Update System 2024-10-23 14:09:08 UTC
FEDORA-2024-b883c27c18 (Lmod-8.7.53-1.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-b883c27c18

Comment 10 Fedora Update System 2024-10-23 14:09:09 UTC
FEDORA-2024-602fe3db71 (Lmod-8.7.53-1.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-602fe3db71

Comment 11 Fedora Update System 2024-10-23 14:09:10 UTC
FEDORA-EPEL-2024-14060f4306 (Lmod-8.7.53-1.el8) has been submitted as an update to Fedora EPEL 8.
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2024-14060f4306

Comment 12 Orion Poplawski 2024-10-23 14:11:21 UTC
I've also submitted buildroot overrides for the Fedora releases.

Comment 13 Sandro 2024-10-23 19:53:47 UTC
First of all thanks for the fix. Purely out of curiosity, do you happen to know or have a pointer to what got fixed where?

When investigating the issue, I also tried running the first two commands %{_mpich_load} expands to inside the mock chroot. Nothing blew up. I just did so again with `export LMOD_SH_DBG_ON=1` (still on Lmod-8.7.49). It didn't give me any additional output.
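
For anyone wanting to repeat that experiment, `rpm --eval` prints what a macro expands to; an undefined macro is echoed back unchanged. A minimal sketch (the helper name `show_macro` is hypothetical, and the macro is only defined when the package shipping it is installed):

```shell
# Hypothetical helper: print a macro's expansion, or flag it as undefined.
show_macro() {
    expansion=$(rpm --eval "$1")
    if [ "$expansion" = "$1" ]; then
        echo "$1 is not defined" >&2
        return 1
    fi
    printf '%s\n' "$expansion"
}

# Usage inside the mock chroot: show_macro '%{_mpich_load}'
```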

Anyway, I can confirm I'm able to build again (in Koji with the override) using Lmod as the environment(modules) provider.

Comment 14 Fedora Update System 2024-10-24 01:57:29 UTC
FEDORA-2024-602fe3db71 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-602fe3db71`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-602fe3db71

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 15 Fedora Update System 2024-10-24 02:08:22 UTC
FEDORA-EPEL-2024-7ca00cd70b has been pushed to the Fedora EPEL 9 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2024-7ca00cd70b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2024-10-24 02:14:31 UTC
FEDORA-EPEL-2024-14060f4306 has been pushed to the Fedora EPEL 8 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2024-14060f4306

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Fedora Update System 2024-10-24 02:19:09 UTC
FEDORA-2024-14d553a254 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-14d553a254`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-14d553a254

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2024-10-24 02:23:26 UTC
FEDORA-2024-b883c27c18 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-b883c27c18`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-b883c27c18

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 19 Fedora Update System 2024-11-01 02:43:59 UTC
FEDORA-2024-b883c27c18 (Lmod-8.7.53-1.fc40) has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 20 Fedora Update System 2024-11-01 03:17:23 UTC
FEDORA-2024-14d553a254 (Lmod-8.7.53-1.fc39) has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 21 Fedora Update System 2024-11-01 03:17:28 UTC
FEDORA-EPEL-2024-7ca00cd70b (Lmod-8.7.53-1.el9) has been pushed to the Fedora EPEL 9 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 22 Fedora Update System 2024-11-01 03:36:32 UTC
FEDORA-EPEL-2024-14060f4306 (Lmod-8.7.53-1.el8) has been pushed to the Fedora EPEL 8 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 23 Fedora Update System 2024-11-01 03:42:04 UTC
FEDORA-2024-602fe3db71 (Lmod-8.7.53-1.fc41) has been pushed to the Fedora 41 stable repository.
If problem still persists, please make note of it in this bug report.

