Bug 2302033

Summary: Requires: environment(modules) fails to install environment-modules leading to openmpi no longer providing libmpi.so.40()(64bit)(openmpi-x86_64) but libmpi.so.40()(64bit)
Product: [Fedora] Fedora Reporter: david08741
Component: rpm-mpi-hooks Assignee: Sandro Mani <manisandro>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: rawhide CC: dledford, gui1ty, hladky.jiri, jkolarik, manisandro, mblaha, nsella, orion, pkfed, pkratoch, rpm-software-management
Target Milestone: --- Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Last Closed: 2024-08-01 13:14:04 UTC Type: ---
Bug Depends On:    
Bug Blocks: 2301933, 2301934, 2301935, 2301936, 2301937, 2301938, 2301939, 2301940, 2301941, 2301942, 2301943, 2301944, 2301945, 2301946, 2301947, 2301948, 2301949, 2301950, 2301951, 2301952, 2301953, 2301954, 2301955, 2301956, 2301957, 2301958, 2301959, 2301960, 2301961, 2301962, 2301963, 2301964, 2301965, 2301966, 2301967, 2301968, 2301969, 2301970, 2301971, 2301972, 2301973, 2301974, 2301975, 2301976, 2301977, 2301978, 2301979, 2301980, 2301981, 2301982, 2301983, 2301984, 2301985    

Description david08741 2024-07-31 11:32:32 UTC
There are a number of FTI bugs that are caused by a change in the provides of openmpi.

  - nothing provides libmpi.so.40()(64bit)(openmpi-x86_64) needed by bout++-openmpi-5.1.0-13.fc41.x86_64

5.0.5 provides:
libmpi.so.40()(64bit)
5.0.3 did provide:
libmpi.so.40()(64bit)(openmpi-x86_64)



Reproducible: Always

Steps to Reproduce:
1. dnf5 install bout++-openmpi

Actual Results:  

  - nothing provides libmpi.so.40()(64bit)(openmpi-x86_64) needed by bout++-openmpi-5.1.0-13.fc41.x86_64

Expected Results:  
bout++ installs

Comment 1 david08741 2024-07-31 11:43:04 UTC
I think all the bugs 2301933 to 2301980 are also affected by this.
Should I add them all?

Comment 2 Sandro Mani 2024-07-31 13:13:37 UTC
I believe this is because environment-modules has fallen off the dependency chain somehow and hence rpm-mpi-hooks fails to load the MPI environment and hence to generate the correct provides. rpm-mpi-hooks actually has

Requires:       environment(modules)

which makes me suspect a dnf5 issue? Explicitly Requiring: environment-modules fixes the issue, I've done this in rpm-mpi-hooks-8-10.fc41. As soon as this build is ready, rebuilding openmpi should be sufficient.

Comment 3 Sandro 2024-07-31 17:22:05 UTC
I can confirm that the explicit `Requires: environment-modules` solves the issue. I did a scratch build of openmpi before I knew of the root cause. That build picked up the fixed rpm-mpi-hooks and produced correct metadata.

Logs of the openmpi build with wrong / incomplete metadata: https://koji.fedoraproject.org/koji/buildinfo?buildID=2519082

Build with fixed rpm-mpi-hooks: https://koji.fedoraproject.org/koji/taskinfo?taskID=121299407

Previous release of openmpi with good meta data: https://koji.fedoraproject.org/koji/buildinfo?buildID=2499148

I'm reassigning to dnf5 for now. From there it can trickle down further once it becomes clear what's actually causing the issue (libsolv?).

Extending on comment 0, here is the full difference of provides between openmpi 5.0.5-1 and 5.0.3-3:

$ rpm -q --provides -p openmpi-5.0.5-1.fc41.x86_64.rpm
config(openmpi) = 5.0.5-1.fc41
libmpi.so.40()(64bit)
libmpi_java.so.40()(64bit)
libmpi_mpifh.so.40()(64bit)
libmpi_usempi_ignore_tkr.so.40()(64bit)
libmpi_usempif08.so.40()(64bit)
liboshmem.so.40()(64bit)
mpi
openmpi = 5.0.5-1.fc41
openmpi(x86-64) = 5.0.5-1.fc41

$ rpm -q --provides -p openmpi-5.0.3-3.fc41.x86_64.rpm
config(openmpi) = 5.0.3-3.fc41
libmpi.so.40()(64bit)(openmpi-x86_64)
libmpi_java.so.40()(64bit)(openmpi-x86_64)
libmpi_mpifh.so.40()(64bit)(openmpi-x86_64)
libmpi_usempi_ignore_tkr.so.40()(64bit)(openmpi-x86_64)
libmpi_usempif08.so.40()(64bit)(openmpi-x86_64)
liboshmem.so.40()(64bit)(openmpi-x86_64)
mpi
openmpi = 5.0.3-3.fc41
openmpi(x86-64) = 5.0.3-3.fc41
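
A minimal sketch of how a listing like the ones above could be checked mechanically. The helper name `missing_mpi_suffix` is made up for illustration, and the sample input is a shortened copy of the 5.0.5-1 listing; it prints every shared-library provide that lacks the `(openmpi-x86_64)` suffix.

```shell
#!/bin/bash
# Hypothetical helper (not part of rpm-mpi-hooks): read a provides listing on
# stdin and print each 64-bit library provide that ends right after
# "()(64bit)", i.e. one that is missing the MPI suffix "(openmpi-x86_64)".
missing_mpi_suffix() {
    grep -E '^lib[^ ]*\.so\.[0-9]+\(\)\(64bit\)$' || true
}

# Shortened sample copied from the 5.0.5-1 listing in this comment.
provides_505='config(openmpi) = 5.0.5-1.fc41
libmpi.so.40()(64bit)
liboshmem.so.40()(64bit)
mpi
openmpi = 5.0.5-1.fc41'

printf '%s\n' "$provides_505" | missing_mpi_suffix
```

On a real package the input could come from `rpm -q --provides -p openmpi-5.0.5-1.fc41.x86_64.rpm | missing_mpi_suffix`; an empty result would suggest the MPI-suffixed provides were generated.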

Comment 4 Sandro 2024-07-31 17:50:24 UTC
(In reply to Sandro Mani from comment #2)
> I believe this is because environment-modules has fallen off the dependency
> chain somehow and hence rpm-mpi-hooks fails to load the MPI environment and
> hence to generate the correct provides. rpm-mpi-hooks actually has
> 
> Requires:       environment(modules)
> 
> which makes me suspect a dnf5 issue? Explicitly Requiring:
> environment-modules fixes the issue, I've done this in
> rpm-mpi-hooks-8-10.fc41. As soon as this build is ready, rebuilding openmpi
> should be sufficient.

I checked both the 5.0.5-1 and 5.0.3-3 official builds. Neither mentions environment-modules or environment(modules) in build.log or root.log. What am I missing?

I also saw that we have two packages providing environment(modules):

$ fedrq pkgs --resolve 'environment(modules)'
Lmod-8.7.44-2.fc41.x86_64
environment-modules-5.4.0-2.fc41.x86_64

Comment 5 Sandro Mani 2024-07-31 20:51:27 UTC
In https://kojipkgs.fedoraproject.org//packages/openmpi/5.0.5/2.fc41/data/logs/x86_64/root.log you can see that environment-modules is being installed.

Comment 6 Sandro 2024-07-31 21:34:22 UTC
Yes. But that's the rebuild after you changed rpm-mpi-hooks from "Requires: environment(modules)" to "Requires: environment-modules". I was curious if it got installed in the 5.0.3-3 build, which has the correct meta data.

Comment 7 Todd Zullinger 2024-08-01 00:44:23 UTC
The logs for https://koji.fedoraproject.org/koji/buildinfo?buildID=2519082 at https://kojipkgs.fedoraproject.org//packages/openmpi/5.0.5/1.fc41/data/logs/x86_64/root.log show that Lmod was installed.  That provides 'environment(modules)' so it is entirely valid for it to be installed when you have 'Requires: environment(modules)' in your package.

If you really do need 'environment-modules' and not 'Lmod' (which it seems you do), then you should use that as the Requires. In other words, keep the change you made but don't look at it as a workaround -- it fixes a bug in the previous Requires, which was ambiguous.

Comment 8 Orion Poplawski 2024-08-01 03:57:18 UTC
The issue is that an update to lua-lpeg 1.1.0 broke lua-json and thus Lmod - see bz#2302036.

Comment 9 Sandro 2024-08-01 06:56:28 UTC
(In reply to Orion Poplawski from comment #8)
> The issue is that an update to lua-lpeg 1.1.0 broke lua-json and thus Lmod -
> see bz#2302036.

Right. I saw a failed build for an updated `Lmod`, but didn't look into why it failed. With `Lmod` broken, that makes me think...

(In reply to Todd Zullinger from comment #7)
> If you really do need 'environment-modules' and not 'Lmod' (which it seems
> you do), then you should use that as the Requires. In other words, keep the
> change you made but don't look at it as a workaround -- it fixes a bug in
> the previous Requires, which was ambiguous.

Does `rpm-mpi-hooks` really need `environment-modules`? As pointed out in comment 4, neither openmpi build (5.0.3-3 or 5.0.5-1) had it installed in the buildroot. Yet one produced correct meta data, while the other, with a broken Lmod, failed to do so.

At the same time installing `environment-modules` allowed the 5.0.5-2 build to succeed with correct meta data output. So, it looks like either `Lmod` or `environment-modules` can be used to produce the required correct meta data. If there's a preference for one over the other, I agree, it should be specified explicitly.

Comment 10 Marek Blaha 2024-08-01 07:00:52 UTC
As said in comment 7, "environment(modules)" is provided by two packages:

$ dnf provides 'environment(modules)'
Lmod-8.7.44-2.fc41.x86_64 : Environmental Modules System in Lua
Repo        : rawhide
Matched from:
Provide    : environment(modules)

environment-modules-5.4.0-2.fc41.x86_64 : Provides dynamic modification of a user's
                                        : environment
Repo        : rawhide
Matched from:
Provide    : environment(modules)

For the solver both solutions are valid and you cannot rely on which one is selected. Btw, openmpi-5.0.3-3.fc41 (the previous build with good metadata) was also built with Lmod installed as the "environment(modules)" provider.
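
A rough sketch of how to flag such an ambiguous capability, assuming one package string per line as printed by `fedrq pkgs --resolve` in comment 4 (output of `dnf repoquery --whatprovides 'environment(modules)'` should work too). The helper name `count_providers` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read package strings (name-version-release.arch, one
# per line) on stdin, strip the trailing "-version-release.arch" part, and
# print how many distinct package names remain.
count_providers() {
    sed -E 's/-[^-]+-[^-]+$//' | sort -u | grep -c . || true
}

# Sample input copied from the fedrq output in comment 4.
providers='Lmod-8.7.44-2.fc41.x86_64
environment-modules-5.4.0-2.fc41.x86_64'

n=$(printf '%s\n' "$providers" | count_providers)
if [ "$n" -gt 1 ]; then
    echo "ambiguous: $n packages provide this capability"
fi
```

If the count is greater than one, a `Requires:` on the capability alone does not pin down which package ends up in the buildroot.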

That said, I don't think dnf5 can resolve this issue. Reassigning to Lmod component for further investigation.

Comment 11 Sandro 2024-08-01 07:10:16 UTC
I suppose this has turned out to be a duplicate of bug 2302036, where the Lmod issue is discussed. Though, I find the `Lmod` vs. `environment-modules` discussion interesting.

Comment 12 Sandro Mani 2024-08-01 07:30:56 UTC
> Does `rpm-mpi-hooks` really need `environment-modules`?

See https://src.fedoraproject.org/rpms/rpm-mpi-hooks/blob/rawhide/f/mpi.prov, ultimately it needs to be able to do "module load <module>". And I believe Lmod does provide this functionality, even though I was not aware of its existence before. As I see it, explicitly requiring environment-modules in rpm-mpi-hooks should do no harm, and I guess this issue can be closed, as #2302036 is tracking the Lmod issue.

Comment 13 Orion Poplawski 2024-08-01 13:14:04 UTC
All rpm-mpi-hooks needs is a working 'module' implementation - either environment-modules or Lmod.  I'm fine with it being fixed to environment-modules though.
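
A minimal sketch of what "a working 'module' implementation" could be checked like in a buildroot. Assumptions, none of which are taken from rpm-mpi-hooks itself: the init script is at /etc/profile.d/modules.sh, both implementations accept `module --version`, and the function name `module_available` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical check: succeed only if `module` exists AND actually runs.
# A defined-but-broken `module` (e.g. Lmod with a broken Lua dependency)
# is expected to fail the `module --version` call.
module_available() {
    # Assumed default location of the init script; override via $1.
    init=${1:-/etc/profile.d/modules.sh}
    # Source the init script if it is readable; ignore its output.
    [ -r "$init" ] && . "$init" >/dev/null 2>&1
    type module >/dev/null 2>&1 && module --version >/dev/null 2>&1
}

if module_available; then
    echo "module command works"
else
    echo "no working module command"
fi
```

This only tests that the command runs; it does not verify that `module load <module>` succeeds for a particular MPI module.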