Bug 1396213

Summary: [RHEL-7.5] libpsm2-compat: package up libpsm_infinipath.so.1 as private library
Product: Red Hat Enterprise Linux 7 Reporter: Tuomo Soini <tis>
Component: libpsm2 Assignee: Honggang LI <honli>
Status: CLOSED ERRATA QA Contact: Mike Stowell <mstowell>
Severity: low Docs Contact:
Priority: low    
Version: 7.5 CC: bhu, dledford, honli, infiniband-qe, jshortt, knweiss, mstowell, pandrade, rdma-dev-team, russell.w.mcguire, s.renatscher, tumeya
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libpsm2-10.3.8-3.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 17:46:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1442258    
Attachments:
Description Flags
Proposed fix for libpsm2 none

Description Tuomo Soini 2016-11-17 17:46:56 UTC
infinipath-psm-3.3-22_g4abbc60_open.2.el7 obsoletes libpsm2-compat-10.2.33-1.el7

infinipath-psm is an older package than libpsm2-compat, so this must be an error.

Related bugs which are private and not visible to normal humans: 1272022

Comment 1 Honggang LI 2016-11-18 02:14:10 UTC
(In reply to Tuomo Soini from comment #0)
> infinipath-psm-3.3-22_g4abbc60_open.2.el7 obsoletes
> libpsm2-compat-10.2.33-1.el7
> 
> infinipath-psm is an older package than libpsm2-compat, so this must be an error.

This is not an error, even though infinipath-psm is older than libpsm2-compat. It is intentional. The libpsm2 library was created to support Intel OPA-HFI1 hardware. libpsm2 has a psm1 shim that allows psm1 applications (built for TrueScale QIB hardware) to run over OPA-HFI1 hardware. However, libpsm2 does *NOT* support TrueScale QIB hardware. As a result, libpsm2-compat (a.k.a. the psm1 shim) conflicts with infinipath-psm (the original PSM1 library).

If you have TrueScale hardware, you have to link your applications against the infinipath-psm library, not libpsm2-compat. That is why infinipath-psm obsoletes libpsm2-compat.

Thanks

Comment 3 Tuomo Soini 2016-11-18 06:43:40 UTC
That doesn't work the way you described.

Obsoletes: means that if you have libpsm2-compat installed on your machine and you update your system, the obsoleting package replaces libpsm2-compat.

That means the libpsm2-compat package shouldn't be in the distro at all.

Comment 4 Tuomo Soini 2016-11-18 06:48:11 UTC
And there is no Conflicts: in the libpsm2-compat package at all.

If you say Obsoletes: is correct, that means you should change this bug to component libpsm2 and start deprecating the libpsm2-compat sub-package, which is obsolete.

Comment 5 Honggang LI 2016-11-18 07:43:59 UTC
(In reply to Tuomo Soini from comment #3)
> That doesn't work the way you described.
> 
> Obsoletes: means if you have libpsm2-compat installed on your machine and
> you update your system, Obsoleting package replaces libpsm2-compat.

Both infinipath-psm and libpsm2-compat provide 'libpsm_infinipath.so.1'.

# yum provides 'libpsm_infinipath.so.1*'
infinipath-psm-3.3-22_g4abbc60_open.2.el7.x86_64 : QLogic PSM Libraries
Provides    : libpsm_infinipath.so.1()(64bit)

libpsm2-compat-10.2.33-1.el7.x86_64 : Support for MPIs linked with PSM1
Provides    : libpsm_infinipath.so.1(PSM_1.0)(64bit)
Provides    : libpsm_infinipath.so.1()(64bit)


If you need infinipath-psm, that means you have TrueScale QIB (psm1) hardware. In that case you should link your application against libpsm_infinipath.so.1 from infinipath-psm. Obsoleting removes libpsm2-compat, so you do not have to set up the "LD_LIBRARY_PATH" environment variable. This is important for mock, the rpm build system.

> That means libpsm2-compat package shouldn't be in distro at all.

You will need libpsm_infinipath.so.1 from libpsm2-compat to run psm1 applications over OPA-HFI1 hardware.

Comment 6 Honggang LI 2016-11-18 07:50:32 UTC
(In reply to Tuomo Soini from comment #4)
> And there is no Conflicts: in libpsm2-compat package at all.
> 
> If you say Obsoletes: is correct that means you should change this bug to
> component libpsm2 and start deprecating libpsm2-compat sub-package which is
> obsolete.

Yes, you are right, we should make libpsm2-compat conflict with infinipath-psm. But we will not deprecate the libpsm2-compat sub-package, because psm1 over OPA-HFI1 needs it.

Comment 7 Tuomo Soini 2016-11-18 07:57:20 UTC
Note: it will still not work if you don't remove the Obsoletes: tag. Ever.

Comment 8 Tuomo Soini 2016-11-18 14:12:32 UTC
Let me rephrase it: making libpsm2-compat conflict with infinipath-psm won't help as long as infinipath-psm carries Obsoletes: libpsm2-compat. infinipath-psm is still treated as the replacement, and yum will replace an installed libpsm2-compat with infinipath-psm.

So whatever you do, you can't use the Obsoletes: tag here. Obsoletes is one of the most powerful tags in rpm: it tells yum that the obsoleted package shouldn't be installed any more (and has been removed from the distribution) and is to be replaced with the obsoleting package.

Comment 9 Doug Ledford 2016-11-19 04:22:04 UTC
(In reply to Tuomo Soini from comment #8)
> Let me rephrase it: libpsm2-compat Conflicts infinipath-psm won't help if
> there is Obsoletes libpsm2-compat in infinipath-psm - infinipath-psm is still
> replacement and yum will replace installed libpsm2-compat with
> infinipath-psm.
> 
> So whatever you do you can't use Obsoletes: tag here - obsoletes is one of
> the most powerful tags in rpm and it instructs yum telling obsoleted package
> shouldn't be installed any more (and is removed from distribution) and is to
> be replaced with obsoleting package.

This is all correct. You are right that Obsoletes: is one of the most powerful tags. The situation here is that Intel made a decision that we simply can't go along with. They made a bad decision to replace all libpsm.so.1 instances with a replacement lib that transparently works with old libpsm1 apps but only supports the new hfi1 (OPA) hardware. Every cluster that used qib hardware, worked previously, and used the PSM library interface would have suddenly stopped working. We obsolete the libpsm2-compat library solely to stop their library from breaking existing clusters. So now, if an app links against libpsm2, it will work on hfi1 (OPA) hardware by default, and if it is linked against libpsm1 (InfiniBand), it will continue to work and not be broken by the addition of libpsm2. This is NOTABUG and things are working as intended.

Comment 10 Tuomo Soini 2016-11-19 09:46:21 UTC
Ok. So you want an older installed libpsm2-compat to be replaced with infinipath-psm on upgrade. That means there should be a versioned Obsoletes:.

Obsoletes: libpsm2-compat < 10.2.33-1

That would be the correct thing to do if an upgrade should replace libpsm2-compat with infinipath-psm.

What you currently have is an unversioned Obsoletes:, which also obsoletes the current version of libpsm2-compat. That makes libpsm2-compat unusable on any system, because yum update on a system with libpsm2-compat replaces it with infinipath-psm. So even a new version of libpsm2-compat is not usable.
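
The difference between the versioned and unversioned forms can be sketched with a toy model in Python. This is only an illustration, not rpm's actual algorithm: the comparison is a simplified numeric one rather than rpm's own version comparison, the function names are made up, and the 10.1.0-1.el7 build is hypothetical.

```python
import re

def parse_evr(version_release):
    """Turn 'version-release' into a tuple of integers for comparison.

    Simplification: only the numeric fields are kept, so '1.el7' becomes
    (1, 7). Real rpm also handles letters, epochs, and tildes.
    """
    return tuple(int(n) for n in re.findall(r"\d+", version_release))

def is_obsoleted(installed_evr, obsoletes_bound=None):
    """Return True if an installed package matches an Obsoletes: entry.

    obsoletes_bound=None models an unversioned 'Obsoletes: name'.
    Otherwise it models 'Obsoletes: name < obsoletes_bound'.
    """
    if obsoletes_bound is None:
        return True  # unversioned: every version of the package matches
    return parse_evr(installed_evr) < parse_evr(obsoletes_bound)

# Unversioned: the current build is obsoleted as well.
print(is_obsoleted("10.2.33-1.el7"))
# Versioned: the current build survives, a hypothetical older one does not.
print(is_obsoleted("10.2.33-1.el7", "10.2.33-1"))
print(is_obsoleted("10.1.0-1.el7", "10.2.33-1"))
```

Under this model only the versioned form leaves the current libpsm2-compat build installable, which is the distinction being argued here.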

Comment 12 Doug Ledford 2016-11-21 19:56:36 UTC
(In reply to Tuomo Soini from comment #10)
> Ok. So you want older installed libpsm2-compat to be replaced with
> infinipath-psm on upgrade. That means there should be versioned obsolete.
> 
> Obsoletes: libpsm2-compat < 10.2.33-1
> 
> That would be the correct thing to do if upgrade should replace
> libpsm2-compat with infinipath-psm.
> 
> What you currently have is unversioned Obsoletes: which makes libpsm2-compat
> current version also obsoleted causing libpsm2-compat not to be usable on
> any system because yum update on system with libpsm2-compat replaces it with
> infinipath-psm. So that makes even new version of libpsm2-compat not usable.

Yes, that is exactly right. libpsm2-compat exists for one reason only: to allow older applications linked against the library from infinipath-psm to be run on new clusters using hfi1 hardware. Intel has no intention of modifying libpsm2-compat to ever work with the older qib hardware. In Intel's mind, this was fine because they ship their software with their clusters, and so they would ship the old software with the old qib based clusters, and their new software with their new clusters, and everything would work just fine.

However, Fedora doesn't ship a specific version of our OS for a given cluster, we simply have our OS. The mere presence of libpsm2-compat on a qib cluster renders that cluster inoperable. The very idea of libpsm2-compat was such a bad idea in its execution that I have, more than once, had words with Intel about the issue. Their engineers understand just how bad an idea it was, and it was forced on them by their product management folks. There is nothing we can do about it now. I would be just as happy if we didn't build the libpsm2-compat package at all, but it is kept around for the odd circumstance where someone really, truly wants it. But for our OS, an unversioned Obsoletes: is *exactly* what we want. Intel has every intention to keep releasing libpsm2 updates, and so the version number will keep going up, but they have no intention of fixing the fact that libpsm2-compat renders qib clusters dead. In order to protect our installed base of qib hardware from being rendered inoperable by libpsm2, that unversioned Obsoletes: must remain. Anyone truly wanting to run libpsm2-compat will need to uninstall infinipath-psm, add infinipath-psm to the exclude list for yum/dnf, and then install libpsm2-compat.
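
The exclude-list step of that workaround could look roughly like the following Python sketch. It assumes the standard space-separated `exclude` option in the `[main]` section of yum.conf and works on a sample string rather than the real /etc/yum.conf; the helper name `add_exclude` and the sample content are made up for illustration.

```python
import configparser
import io

# Sample content only; a real system would read /etc/yum.conf instead.
SAMPLE_YUM_CONF = """\
[main]
gpgcheck=1
"""

def add_exclude(conf_text, package):
    """Return conf_text with `package` added to the [main] exclude list."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    current = parser.get("main", "exclude", fallback="").split()
    if package not in current:
        current.append(package)
    parser.set("main", "exclude", " ".join(current))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

print(add_exclude(SAMPLE_YUM_CONF, "infinipath-psm"))
```

With infinipath-psm excluded, the package carrying the Obsoletes: is no longer a candidate during updates, so it should not replace an installed libpsm2-compat.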

Comment 13 Tuomo Soini 2016-11-22 09:19:55 UTC
I'd still say this is a bug.

Comment 14 Tuomo Soini 2016-11-22 09:23:51 UTC
Created attachment 1222646 [details]
Proposed fix for libpsm2

Comment 15 Doug Ledford 2016-11-22 12:58:04 UTC
No, everything is exactly as it needs to be, even as atypical as it might be.

Comment 16 russell.w.mcguire 2017-06-22 03:46:02 UTC
I think I am missing something on this issue:

Is it a conflict when the filename in question resides in different paths?
 
https://bugzilla.redhat.com/attachment.cgi?id=1222646&action=diff#a/libpsm2.spec_sec3

It clearly states in the spec file that this is a known filename conflict, and it therefore puts the file in a different location to avoid conflicts, i.e. /lib64/psm2-compat/ and not /lib64/.
I would also note that the libpsm2-compat rpm doesn't ship with any -devel packages, i.e. .so or .h files, which makes it impossible to link against and so avoids conflicts during any kind of build/compile/link stage.

So is there a rule for RPMs that says any filename, regardless of location, is a conflict? If so, doesn't this preclude any sort of cross-compile chains? I thought RPM only yelled if the file had the same name and location?

Intel put this together so that it can co-exist with both libpsm2 and libinfinipath on the same machine at the same time. If this is not possible, then please advise on how to make it possible. Last I checked, the spec file we ship works just fine: I can install all three RPMs on the system without conflicts using normal YUM and/or RPM installations, as long as I avoid the spec file shipped by RHEL, i.e. use an internal build without the Obsoletes: line.

They co-exist just fine at runtime and in the filesystem. So why the conflict?

[lib64]# find . | grep libpsm
./libpsm2.so
./libpsm2.so.2
./libpsm_infinipath.so.1.16
./libpsm_infinipath.so.1
./libpsm2.so.2.1
./psm2-compat/libpsm_infinipath.so.1

Comment 17 Tuomo Soini 2017-06-22 07:04:31 UTC
I see a clear problem in the packaging:

./libpsm_infinipath.so.1
./psm2-compat/libpsm_infinipath.so.1

That means both packages provide the same library - that can not happen.

Both versions of libpsm_infinipath.so.1 MUST work the same, in which case there shouldn't be a psm2-compat package at all. If it is needed, Intel has screwed up badly.

Comment 18 Honggang LI 2017-06-22 07:32:02 UTC
(In reply to russell.w.mcguire from comment #16)
> I think I am missing something on this issue:
> 
> Is it a conflict when the filename in question resides in different paths?

Yes, because RPM dependency resolution does not check the *FULL* path of the file. For example, openmpi-xxx.rpm requires 'libpsm_infinipath.so.1()(64bit)'. openmpi does not require "/usr/lib64/psm2-compat/libpsm_infinipath.so.1" or "/usr/lib64/libpsm_infinipath.so.1".

There is an issue similar to a race condition when you install openmpi with YUM if 1) both infinipath-psm and libpsm2-compat are available in the YUM repo and 2) infinipath-psm and libpsm2-compat do not conflict with each other. Sometimes YUM will install infinipath-psm for openmpi, but it may instead pick libpsm2-compat, which causes openmpi to fail. This is a real issue we found when we were testing RHEL-7.x. That is why we mark infinipath-psm conflicts with libpsm2-compat.


# yum provides 'libpsm_infinipath.so.1*'
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
infinipath-psm-3.3-25_g326b95a_open.1.el7.x86_64 : QLogic PSM Libraries
Repo        : beaker-Server
Matched from:
Provides    : libpsm_infinipath.so.1()(64bit)



libpsm2-compat-10.2.63-2.el7.x86_64 : Support for MPIs linked with PSM1
Repo        : beaker-Server-optional
Matched from:
Provides    : libpsm_infinipath.so.1(PSM_1.0)(64bit)
Provides    : libpsm_infinipath.so.1()(64bit)


# rpm -qpl libpsm2-compat-10.2.63-2.el7.x86_64.rpm | grep libpsm_infinipath.so.1

/usr/lib64/psm2-compat/libpsm_infinipath.so.1
    ^^^^^^^^^^^^^^^^^^^

# rpm -qpl infinipath-psm-3.3-25_g326b95a_open.1.el7.x86_64.rpm | grep libpsm_infinipath.so.1
/usr/lib64/libpsm_infinipath.so.1
/usr/lib64/libpsm_infinipath.so.1.16
    ^^^^^^^^^^^^^^^^^^^^^
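
The ambiguity can be reproduced with a small Python model of capability matching. It only mirrors the `yum provides` output above and makes no attempt to model how yum chooses among the candidates; the `providers` helper is made up for illustration.

```python
# Capabilities as listed in the `yum provides` output above.
PROVIDES = {
    "infinipath-psm": {"libpsm_infinipath.so.1()(64bit)"},
    "libpsm2-compat": {
        "libpsm_infinipath.so.1(PSM_1.0)(64bit)",
        "libpsm_infinipath.so.1()(64bit)",
    },
}

def providers(capability, provides=PROVIDES):
    """Return the sorted names of packages that provide `capability`.

    Matching is done on the capability string only; the install path of
    the library file never enters the comparison.
    """
    return sorted(name for name, caps in provides.items() if capability in caps)

# openmpi requires this capability, and both packages satisfy it.
print(providers("libpsm_infinipath.so.1()(64bit)"))
# Only libpsm2-compat carries the versioned PSM_1.0 capability.
print(providers("libpsm_infinipath.so.1(PSM_1.0)(64bit)"))
```

Because two packages match the unversioned capability, the resolver needs some relation between them (Obsoletes, Conflicts, or a filtered Provides) to make the outcome deterministic.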

Comment 19 Doug Ledford 2017-06-22 13:25:58 UTC
(In reply to Tuomo Soini from comment #17)
> I see clear problem in packaging:
> 
> ./libpsm_infinipath.so.1
> ./psm2-compat/libpsm_infinipath.so.1
> 
> That means both packages provide same library - that can not happen.

Honggang's explanation of why this breaks rpm is helpful for understanding this blanket statement.  However, putting both of these libraries in the same package would resolve that issue.

> both versions of libpsm_infinipath.so.1 MUST work same, in this case there
> shouldn't be psm2-compat package at all. If that is needed, intel has
> screwed up badly.

They have done *exactly* that.

I will skip the gory details, so long story short, we can't just ship their psm2-compat library as it would brick all of our existing qib hardware based clusters, yet Intel really wants us to ship that library as it's absolutely needed so that commercially developed software, that can not be recompiled on a new cluster, and that was originally compiled against psm1, will work on newer hfi1 based clusters.

Given that Intel is our partner, we (meaning the Red Hat employees on this bug) are obligated to do our best to help them out of their predicament, no matter how well deserved it is :-/

My only suggested solution would be to combine the psm1 and psm2 libraries into a single source package, build psm2, psm2-devel, psm2-compat, psm1, and psm1-devel, but when packaging things up put the psm2-compat and psm1 together into just a psm1 package, put both libraries in the psm1 package in non-standard locations, and use the alternatives mechanism to default the qib version of the psm1 library to being standard, with the option for the user to select the hfi1 version instead.  I selected the alternatives mechanism because this is always going to be a system wide decision, and one that users need not be aware of, so the modules mechanism isn't really the right way to do things.  That's my suggestion if you want to be able to resolve this.
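
The selection logic behind that suggestion can be sketched in Python as follows. This is only a model of the idea, not the alternatives tool itself: it works in a temporary directory, the `psm1-qib` directory name and the priorities are made up, and `select_alternative` is a hypothetical helper.

```python
import os
import tempfile

def select_alternative(link_path, candidates):
    """Point link_path at the candidate with the highest priority.

    candidates maps target path -> integer priority, mirroring how an
    alternatives group in automatic mode picks its default.
    """
    target = max(candidates, key=candidates.get)
    if os.path.islink(link_path):
        os.remove(link_path)
    os.symlink(target, link_path)
    return target

with tempfile.TemporaryDirectory() as root:
    # Hypothetical non-standard locations for the two builds of the library.
    qib = os.path.join(root, "psm1-qib", "libpsm_infinipath.so.1")
    hfi1 = os.path.join(root, "psm2-compat", "libpsm_infinipath.so.1")
    for path in (qib, hfi1):
        os.makedirs(os.path.dirname(path))
        open(path, "w").close()

    link = os.path.join(root, "libpsm_infinipath.so.1")
    # qib gets the higher (made-up) priority so it is the system default.
    chosen = select_alternative(link, {qib: 20, hfi1: 10})
    print(os.path.basename(os.path.dirname(chosen)))
```

An administrator on an hfi1 cluster would then switch the link to the other target once, system wide, which matches the point that this is not a per-user decision.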

Comment 20 Tuomo Soini 2017-08-04 07:43:47 UTC
> There is an issue like race-condition when you install openmpi with YUM, if
> 1) both infinipath-psm and libpsm2-compat were available in the YUM repo
> and 2) infinipath-psm and libpsm2-compat does not conflict each other.
> Sometime, YUM will install infinipath-psm for openmpi, but it may pick up
> libpsm2-compat which will cause openmpi failed. This is a real issue we
> found when we were testing RHEL-7.x. That is why we mark infinipath-psm
> conflicts with libpsm2-compat.

That last thing is what explains what is wrong. You used Obsoletes: libpsm2-compat when Conflicts: libpsm2-compat would have been the correct thing to do.

Comment 21 Honggang LI 2017-08-04 08:12:07 UTC
(In reply to Tuomo Soini from comment #20)

> > found when we were testing RHEL-7.x. That is why we mark infinipath-psm
> > conflicts with libpsm2-compat.
    ^^^^^^^^^
Sorry, wrong word; it should be obsoletes.

> That last thing is what explains what is wrong. You used Obsoletes:
> libpsm2-compat when Conflicts: libpsm2-compat would have been correct thing
> to do.

We have machines with only QIB or only OPA hardware installed. We want infinipath-psm to automatically replace libpsm2-compat when openmpi is installed on those machines. It works well for both QIB machines and OPA machines, because infinipath-psm obsoletes libpsm2-compat.

If 1) infinipath-psm conflicts with libpsm2-compat and 2) libpsm2-compat is installed, the command "yum install -y openmpi infinipath-psm" will fail with an error message like:

Error: infinipath-psm conflicts with installed libpsm2-compat

Then you have to manually remove libpsm2-compat and re-run "yum install -y openmpi infinipath-psm".

Comment 22 Tuomo Soini 2017-08-04 10:04:52 UTC
Yes, but Obsoletes: causes libpsm2-compat to always be replaced with infinipath-psm: you can install libpsm2-compat, but on the next update the obsolete takes over.

Comment 23 Honggang LI 2017-08-07 03:31:44 UTC
(In reply to Tuomo Soini from comment #22)
> Yes. but Obsoletes: causes libpsm2-compat always to be replaced with
> infinipath-psm: you can install libpsm2-compat but on next update obsolete
> takes over.

Neither 'Obsoletes' nor 'Conflicts' is a perfect solution for this issue. But 'Obsoletes' is the best for Red Hat customers. So, I'm closing this as WONTFIX.

Comment 24 Paulo Andrade 2017-12-26 20:07:06 UTC
  The libraries are completely different, and infinipath-psm really should not
obsolete libpsm2-compat. The infinipath-psm package should not have a library
with the same name and major as the one in libpsm2-compat.

  I do not see infinipath-psm upstream being notified or consulted about it.

  The easiest approach now is to change libpsm2-compat to not provide
libpsm_infinipath.so.1, which would still be wrong (the right owner is libpsm2-compat,
which has versioned symbols, etc., for proper backwards compatibility updates), but
reasonable, since it does not break existing programs and/or libraries, which are likely using
LD_LIBRARY_PATH or some other method to pick the compat library.

  See
https://fedoraproject.org/wiki/Packaging:AutoProvidesAndRequiresFiltering#Private_Libraries

  The correct solution would be for infinipath-psm to change the library name and/or
major. But doing the right thing may now cause more problems than the solution of not
having the packages conflict due to both having a library with same name and major.
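
  The effect of treating the compat library as private can be sketched in Python. This is
a simplified model under stated assumptions: real rpm derives Provides from the ELF soname
rather than the file name, the exclusion pattern is only a guess at what a path filter such
as __provides_exclude_from might contain, and auto_provides is a made-up helper.

```python
import re

# Hypothetical exclusion pattern for the private library directory.
PRIVATE_DIR = re.compile(r"^/usr/lib64/psm2-compat/.*$")

def auto_provides(files, exclude_from=PRIVATE_DIR):
    """Return the library capabilities a package would advertise.

    Shared objects whose path matches exclude_from are treated as
    private and generate no Provides entry.
    """
    provides = []
    for path in files:
        name = path.rsplit("/", 1)[-1]
        if ".so" in name and not exclude_from.match(path):
            provides.append(name + "()(64bit)")
    return provides

# The compat copy lives in a private directory and advertises nothing.
print(auto_provides(["/usr/lib64/psm2-compat/libpsm_infinipath.so.1"]))
# The copy in the standard library directory is still advertised.
print(auto_provides(["/usr/lib64/libpsm_infinipath.so.1"]))
```

  With the Provides filtered this way, only one package would satisfy a requirement on
libpsm_infinipath.so.1()(64bit), while programs that select the compat library through
LD_LIBRARY_PATH would keep working.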

Comment 30 Honggang LI 2018-01-05 00:04:02 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1531317

Opened this infinipath-psm bug to remove the Obsoletes: relationship between infinipath-psm and libpsm2-compat.

Comment 38 errata-xmlrpc 2018-04-10 17:46:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0954