| Summary: | [RHEL-7.5] libpsm2-compat: package up libpsm_infinipath.so.1 as private library | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Tuomo Soini <tis> | ||||
| Component: | libpsm2 | Assignee: | Honggang LI <honli> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Mike Stowell <mstowell> | ||||
| Severity: | low | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 7.5 | CC: | bhu, dledford, honli, infiniband-qe, jshortt, knweiss, mstowell, pandrade, rdma-dev-team, russell.w.mcguire, s.renatscher, tumeya | ||||
| Target Milestone: | rc | Keywords: | Reopened | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libpsm2-10.3.8-3.el7 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-04-10 17:46:09 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1442258 | ||||||
| Attachments: |
|
||||||
|
Description
Tuomo Soini
2016-11-17 17:46:56 UTC
(In reply to Tuomo Soini from comment #0)
> infinipath-psm-3.3-22_g4abbc60_open.2.el7 obsoletes
> libpsm2-compat-10.2.33-1.el7
>
> infinipath-psm is older package than libpsm2-compat so this must be a error.

This is not an error, even though infinipath-psm is older than libpsm2-compat. It is intentional.

The libpsm2 library was created to support Intel OPA-HFI1 hardware. libpsm2 has a psm1 shim to allow psm1 applications (built for TrueScale QIB hardware) to run over OPA-HFI1 hardware. However, libpsm2 does *NOT* support TrueScale QIB hardware. As a result, libpsm2-compat, a.k.a. the psm1 shim, conflicts with infinipath-psm (the original PSM1 library).

If you have TrueScale hardware, you have to link your applications against the infinipath-psm library, not libpsm2-compat. That is why infinipath-psm obsoletes libpsm2-compat.

Thanks

That doesn't work the way you described.

Obsoletes: means that if you have libpsm2-compat installed on your machine and you update your system, the obsoleting package replaces libpsm2-compat. That means the libpsm2-compat package shouldn't be in the distro at all.

And there is no Conflicts: in the libpsm2-compat package at all.

If you say Obsoletes: is correct, that means you should change this bug to component libpsm2 and start deprecating the libpsm2-compat sub-package, which is obsolete.

(In reply to Tuomo Soini from comment #3)
> That doesn't work the way you described.
>
> Obsoletes: means if you have libpsm2-compat installed on your machine and
> you update your system, Obsoleting package replaces libpsm2-compat.

Both infinipath-psm and libpsm2-compat provide 'libpsm_infinipath.so.1':
# yum provides 'libpsm_infinipath.so.1*'
infinipath-psm-3.3-22_g4abbc60_open.2.el7.x86_64 : QLogic PSM Libraries
Provides : libpsm_infinipath.so.1()(64bit)

libpsm2-compat-10.2.33-1.el7.x86_64 : Support for MPIs linked with PSM1
Provides : libpsm_infinipath.so.1(PSM_1.0)(64bit)
Provides : libpsm_infinipath.so.1()(64bit)

If you need infinipath-psm, that means you have TrueScale QIB (psm1) hardware. Then you should link your application against libpsm_infinipath.so.1 from infinipath-psm. Obsoleting will delete libpsm2-compat, so you do not have to set up the environment variable "LD_LIBRARY_PATH". This is important for mock, the rpm build system.

> That means libpsm2-compat package shouldn't be in distro at all.

You will need libpsm_infinipath.so.1 from libpsm2-compat to run psm1 applications over OPA-HFI1 hardware.

(In reply to Tuomo Soini from comment #4)
> And there is no Conflicts: in libpsm2-compat package at all.
>
> If you say Obsoletes: is correct that means you should change this bug to
> component libpsm2 and start deprecating libpsm2-compat sub-package which is
> obsolete.

Yes, you are right, we should make libpsm2-compat conflict with infinipath-psm. But we would not deprecate the libpsm2-compat sub-package, because psm1 over OPA-HFI1 needs it.

Note: it will still not work if you don't remove the Obsoletes: tag. Ever.

Let me rephrase it: libpsm2-compat Conflicts infinipath-psm won't help if there is Obsoletes libpsm2-compat in infinipath-psm - infinipath-psm is still the replacement, and yum will replace an installed libpsm2-compat with infinipath-psm.

So whatever you do, you can't use the Obsoletes: tag here. Obsoletes is one of the most powerful tags in rpm: it tells yum that the obsoleted package shouldn't be installed any more (and is removed from the distribution) and is to be replaced with the obsoleting package.
(In reply to Tuomo Soini from comment #8)
> Let me rephrase it: libpsm2-compat Conflicts infinipath-psm won't help if
> there is Obsoletes libpsm2-compat in infinipath-psm - infinipath-psm is still
> replacement and yum will replace installed libpsm2-compat with
> infinipath-psm.
>
> So whatever you do you can't use Obsoletes: tag here - obsoletes is one of
> the most powerful tags in rpm and it instructs yum telling obsoleted package
> shouldn't be installed any more (and is removed from distribution) and is to
> be replaced with obsoleting package.

This is all correct. And it's right. You are correct that Obsoletes: is one of the most powerful tags. The situation here is that Intel made a decision that we simply can't go along with. They made a bad decision to replace all libpsm.so.1 instances with a replacement lib that transparently worked with old libpsm1 apps but only supported the new hfi1 (OPA) hardware. Every cluster that used qib hardware, worked previously, and used the PSM library interface would have suddenly stopped working. We obsolete the libpsm2 compat library solely to stop their library from breaking existing clusters. So, now, if an app links against libpsm2, it will work on hfi1 (OPA) hardware by default, and if it is linked against libpsm1 (InfiniBand), it will continue to work and not be broken by the addition of libpsm2.

This is NOTABUG and things are working as intended.

Ok. So you want an older installed libpsm2-compat to be replaced with infinipath-psm on upgrade. That means there should be a versioned obsolete:

Obsoletes: libpsm2-compat < 10.2.33-1

That would be the correct thing to do if an upgrade should replace libpsm2-compat with infinipath-psm.

What you currently have is an unversioned Obsoletes:, which makes the current version of libpsm2-compat obsoleted as well. As a result libpsm2-compat is not usable on any system, because yum update on a system with libpsm2-compat replaces it with infinipath-psm.
So that makes even the new version of libpsm2-compat unusable.

(In reply to Tuomo Soini from comment #10)
> Ok. So you want older installed libpsm2-compat to be replaced with
> infinipath-psm on upgrade. That means there should be versioned obsolete.
>
> Obsoletes: libpsm2-compat < 10.2.33-1
>
> That would be the correct thing to do if upgrade should replace
> libpsm2-compat with infinipath-psm.
>
> What you currently have is unversioned Obsoletes: which makes libpsm2-compat
> current version also obsoleted causing libpsm2-compat not to be usable on
> any system because yum update on system with libpsm2-compat replaces it with
> infinipath-psm. So that makes even new version of libpsm2-compat not usable.

Yes, that is exactly right. The libpsm2-compat package exists for one reason only: to allow older applications linked against the library from infinipath-psm to be run on new clusters using hfi1 hardware. Intel has no intention of modifying libpsm2-compat to ever work with the older qib hardware. In Intel's mind, this was fine because they ship their software with their clusters, and so they would ship the old software with the old qib based clusters, and their new software with their new clusters, and everything would work just fine.

However, Fedora doesn't ship a specific version of our OS for a given cluster, we simply have our OS. The mere presence of libpsm2-compat on a qib cluster renders that cluster inoperable. The very idea of libpsm2-compat was such a bad idea in its execution that I have, more than once, had words with Intel about the issue. Their engineers understand just how bad an idea it was, and it was forced on them by their product management folks. There is nothing we can do about it now. I would be just as happy if we didn't build the libpsm2-compat package at all, but it is kept around for the odd circumstance where someone really, truly wants it. But for our OS, an unversioned Obsoletes: is *exactly* what we want.
Intel has every intention to keep releasing libpsm2 updates, and so the version number will keep going up, but they have no intention of fixing the fact that libpsm2-compat renders qib clusters dead. In order to protect our installed base of qib hardware from being rendered inoperable by libpsm2, that unversioned Obsoletes: must remain. Anyone truly wanting to run libpsm2-compat will need to uninstall infinipath-psm, add infinipath-psm to the exclude list for yum/dnf, and then install libpsm2-compat.

I'd still say this is a bug.

Created attachment 1222646 [details]
Proposed fix for libpsm2
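
The difference between the unversioned Obsoletes: that ships today and the versioned one suggested above can be sketched as a small model. This is only an illustration, not rpm's actual algorithm: `parse_evr` and `is_obsoleted` are hypothetical helpers, the version comparison is simplified to dotted integers (real rpm comparison also handles epochs, letters and dist tags such as `.el7`), and `10.2.20-1` is a made-up older version used purely for contrast.

```python
import re


def parse_evr(version):
    """Split 'version-release' into a tuple of integers.

    Simplification: real rpm version comparison also handles epochs,
    alphabetic segments and dist tags, which this sketch ignores.
    """
    return tuple(int(part) for part in re.split(r"[.-]", version))


def is_obsoleted(installed_version, constraint=None):
    """Return True if an installed package matches an Obsoletes: entry.

    constraint is None for an unversioned Obsoletes:, or an
    (operator, version) pair such as ("<", "10.2.33-1").
    """
    if constraint is None:
        # Unversioned: every version of the named package matches.
        return True
    op, limit = constraint
    installed, limit = parse_evr(installed_version), parse_evr(limit)
    if op == "<":
        return installed < limit
    if op == "<=":
        return installed <= limit
    raise ValueError(f"unsupported operator: {op}")


# "Obsoletes: libpsm2-compat < 10.2.33-1" as proposed in this thread.
versioned = ("<", "10.2.33-1")

# 10.2.20-1 is hypothetical; the other two versions appear in this bug.
for version in ("10.2.20-1", "10.2.33-1", "10.2.63-2"):
    print(version,
          "unversioned:", is_obsoleted(version),
          "versioned:", is_obsoleted(version, versioned))
```

Under this model the unversioned form matches every libpsm2-compat build, including future ones, while the versioned form stops matching once the installed package reaches 10.2.33-1.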
No, everything is exactly as it needs to be, even as atypical as it might be.

I think I am missing something on this issue:

Is it a conflict when the filename in question resides in different paths?

https://bugzilla.redhat.com/attachment.cgi?id=1222646&action=diff#a/libpsm2.spec_sec3

It clearly states in the spec file that this is a known filename conflict, and thus puts the file in a different location to avoid conflicts, i.e. /lib64/psm2-compat/ and not /lib64/.

Also, I would note that the libpsm2-compat rpm doesn't ship with any -devel packages, i.e. .so or .h files, thus making it impossible to link against, which avoids conflicts during any kind of build/compile/link state.

So is there a rule for RPMs that says any filename, regardless of location, is a conflict? If so, doesn't this preclude any sort of cross compile chains? I thought RPM only yelled if the file had the same name and location?

Intel put this together to co-exist with both libpsm2 and libinfinipath on the same machine at the same time. If this is not possible, then please advise on how to make this possible.

Last I checked, the spec file we ship works just fine, i.e. I can place all three RPMs into the system without conflicts using normal YUM and/or RPM installations, if I avoid the spec file shipped by RHEL, i.e. internally built without the obsoletes line. They co-exist just fine at runtime and in the filesystem.

So why the conflict?

[lib64]# find . | grep libpsm
./libpsm2.so
./libpsm2.so.2
./libpsm_infinipath.so.1.16
./libpsm_infinipath.so.1
./libpsm2.so.2.1
./psm2-compat/libpsm_infinipath.so.1

I see a clear problem in packaging:

./libpsm_infinipath.so.1
./psm2-compat/libpsm_infinipath.so.1

That means both packages provide the same library - that can not happen.

Both versions of libpsm_infinipath.so.1 MUST work the same, and in that case there shouldn't be a psm2-compat package at all. If that is needed, Intel has screwed up badly.
(In reply to russell.w.mcguire from comment #16)
> I think I am missing something on this issue:
>
> Is it a conflict when the filename in question resides in different paths?

Yes, because RPM dependency resolution does not check the *FULL* path of the file. For example, openmpi-xxx.rpm requires 'libpsm_infinipath.so.1()(64bit)'. openmpi does not require "/usr/lib64/psm2-compat/libpsm_infinipath.so.1" or "/usr/lib64/libpsm_infinipath.so.1".

There is an issue similar to a race condition when you install openmpi with YUM, if 1) both infinipath-psm and libpsm2-compat are available in the YUM repo and 2) infinipath-psm and libpsm2-compat do not conflict with each other. Sometimes YUM will install infinipath-psm for openmpi, but it may pick libpsm2-compat instead, which will cause openmpi to fail. This is a real issue we found when we were testing RHEL-7.x. That is why we mark infinipath-psm conflicts with libpsm2-compat.

# yum provides 'libpsm_infinipath.so.1*'
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
infinipath-psm-3.3-25_g326b95a_open.1.el7.x86_64 : QLogic PSM Libraries
Repo : beaker-Server
Matched from:
Provides : libpsm_infinipath.so.1()(64bit)

libpsm2-compat-10.2.63-2.el7.x86_64 : Support for MPIs linked with PSM1
Repo : beaker-Server-optional
Matched from:
Provides : libpsm_infinipath.so.1(PSM_1.0)(64bit)
Provides : libpsm_infinipath.so.1()(64bit)

# rpm -qpl libpsm2-compat-10.2.63-2.el7.x86_64.rpm | grep libpsm_infinipath.so.1
/usr/lib64/psm2-compat/libpsm_infinipath.so.1
^^^^^^^^^^^^^^^^^^^

# rpm -qpl infinipath-psm-3.3-25_g326b95a_open.1.el7.x86_64.rpm | grep libpsm_infinipath.so.1
/usr/lib64/libpsm_infinipath.so.1
/usr/lib64/libpsm_infinipath.so.1.16
^^^^^^^^^^^^^^^^^^^^^

(In reply to Tuomo Soini from comment #17)
> I see clear problem in packaging:
>
> ./libpsm_infinipath.so.1
> ./psm2-compat/libpsm_infinipath.so.1
>
> That means both packages provide same library - that can not happen.

Honggang's explanation of why this breaks rpm is helpful for understanding this blanket statement. However, putting both of these libraries in the same package would resolve that issue.

> both versions of libpsm_infinipath.so.1 MUST work same, in this case there
> shouldn't be psm2-compat package at all. If that is needed, intel has
> screwed up badly.

They have done *exactly* that. I will skip the gory details, so long story short: we can't just ship their psm2-compat library, as it would brick all of our existing qib hardware based clusters, yet Intel really wants us to ship that library, as it's absolutely needed so that commercially developed software that can not be recompiled on a new cluster, and that was originally compiled against psm1, will work on newer hfi1 based clusters.
Given that Intel is our partner, we (meaning the Red Hat employees on this bug) are obligated to do our best to help them out of their predicament, no matter how well deserved it is :-/

My only suggested solution would be to combine the psm1 and psm2 libraries into a single source package and build psm2, psm2-devel, psm2-compat, psm1, and psm1-devel. When packaging things up, put psm2-compat and psm1 together into just a psm1 package, put both libraries in the psm1 package in non-standard locations, and use the alternatives mechanism to make the qib version of the psm1 library the default, with the option for the user to select the hfi1 version instead. I selected the alternatives mechanism because this is always going to be a system wide decision, and one that users need not be aware of, so the modules mechanism isn't really the right way to do things.

That's my suggestion if you want to be able to resolve this.

> There is an issue like race-condition when you install openmpi with YUM, if
> 1) both infinipath-psm and libpsm2-compat were available in the YUM repo
> and 2) infinipath-psm and libpsm2-compat does not conflict each other.
> Sometime, YUM will install infinipath-psm for openmpi, but it may pick up
> libpsm2-compat which will cause openmpi failed. This is a real issue we
> found when we were testing RHEL-7.x. That is why we mark infinipath-psm
> conflicts with libpsm2-compat.
That last thing is what explains what is wrong. You used Obsoletes: libpsm2-compat when Conflicts: libpsm2-compat would have been the correct thing to do.
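
The ambiguity described in the quoted comment can be illustrated with a minimal model of capability-based lookup. This is a sketch, not yum's resolver: `PROVIDES`, `PATHS` and `providers_of` are hypothetical names, and the data is copied from the `yum provides` and `rpm -qpl` output quoted earlier in this bug.

```python
# Capabilities as listed by `yum provides` in this thread.
PROVIDES = {
    "infinipath-psm": {"libpsm_infinipath.so.1()(64bit)"},
    "libpsm2-compat": {
        "libpsm_infinipath.so.1(PSM_1.0)(64bit)",
        "libpsm_infinipath.so.1()(64bit)",
    },
}

# Install locations as listed by `rpm -qpl`. The lookup below never
# consults these paths, which is the point of the illustration.
PATHS = {
    "infinipath-psm": "/usr/lib64/libpsm_infinipath.so.1",
    "libpsm2-compat": "/usr/lib64/psm2-compat/libpsm_infinipath.so.1",
}


def providers_of(capability, provides=PROVIDES):
    """Return every package that advertises the requested capability."""
    return sorted(pkg for pkg, caps in provides.items() if capability in caps)


# openmpi requires the soname capability, not a file path.
candidates = providers_of("libpsm_infinipath.so.1()(64bit)")
print(candidates)
for pkg in candidates:
    print(pkg, "installs to", PATHS[pkg])
```

Both packages come back as candidates even though their files live in different directories, so a resolver that picks either one satisfies the dependency on paper. Obsoletes: and Conflicts: are two different ways of removing that choice.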
(In reply to Tuomo Soini from comment #20)
> > found when we were testing RHEL-7.x. That is why we mark infinipath-psm
> > conflicts with libpsm2-compat.
    ^^^^^^^^^
Sorry, wrong word; it should be "obsoletes".

> That last thing is what explains what is wrong. You used Obsoletes:
> libpsm2-compat when Conflicts: libpsm2-compat would have been correct thing
> to do.

We have machines with only QIB or only OPA hardware installed. We want infinipath-psm to automatically replace libpsm2-compat when openmpi is installed on those machines. It works well for both QIB machines and OPA machines, because infinipath-psm obsoletes libpsm2-compat.

In case 1) infinipath-psm conflicts with libpsm2-compat and 2) libpsm2-compat is installed, the command "yum install -y openmpi infinipath-psm" will fail with an error message like:

Error: infinipath-psm conflicts with installed libpsm2-compat

Then you have to manually delete libpsm2-compat and re-run "yum install -y openmpi infinipath-psm".

Yes, but Obsoletes: causes libpsm2-compat always to be replaced with infinipath-psm: you can install libpsm2-compat, but on the next update the obsolete takes over.

(In reply to Tuomo Soini from comment #22)
> Yes. but Obsoletes: causes libpsm2-compat always to be replaced with
> infinipath-psm: you can install libpsm2-compat but on next update obsolete
> takes over.

Neither 'Obsoletes' nor 'Conflicts' is a perfect solution for this issue. But 'Obsoletes' is the best for Red Hat customers. So, I'm closing this as WONTFIX.

The libraries are completely different, and infinipath-psm really should not obsolete libpsm2-compat. The infinipath-psm package should not have a library with the same name and major version as the one in libpsm2-compat. I do not see infinipath-psm upstream being notified or consulted about it.
The easiest approach now is to change libpsm2-compat to not provide libpsm_infinipath.so.1. That would still be wrong (the right owner is libpsm2-compat, which has versioned symbols, etc., for proper backwards compatibility updates), but it is reasonable, because it does not break existing programs and/or libraries, which are likely using LD_LIBRARY_PATH or some other method to pick the compat library. See https://fedoraproject.org/wiki/Packaging:AutoProvidesAndRequiresFiltering#Private_Libraries

The correct solution would be for infinipath-psm to change the library name and/or major version. But doing the right thing may now cause more problems than the solution of not having the packages conflict due to both having a library with the same name and major version.

https://bugzilla.redhat.com/show_bug.cgi?id=1531317

Opened this infinipath-psm bug to delete the obsolete between infinipath-psm and libpsm2-compat.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0954 |
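
The private-library approach proposed above (libpsm2-compat no longer providing libpsm_infinipath.so.1) can be pictured with a small model of path-based Provides filtering. This is a sketch under stated assumptions, not the actual packaging change: `LIBRARIES`, `PRIVATE_DIR` and `advertised_sonames` are hypothetical names, the soname attached to each file is assumed from the package listings in this bug, and the filter pattern is illustrative only.

```python
import re

# Library files per package (paths from the rpm -qpl output in this bug),
# each mapped to the soname it is assumed to carry.
LIBRARIES = {
    "infinipath-psm": {
        "/usr/lib64/libpsm_infinipath.so.1.16": "libpsm_infinipath.so.1",
    },
    "libpsm2-compat": {
        "/usr/lib64/psm2-compat/libpsm_infinipath.so.1": "libpsm_infinipath.so.1",
    },
}

# Assumed filter: anything under the psm2-compat directory is private.
PRIVATE_DIR = re.compile(r"^/usr/lib64/psm2-compat/")


def advertised_sonames(files, private=PRIVATE_DIR):
    """Return the sonames a package would advertise after filtering."""
    return {soname for path, soname in files.items() if not private.match(path)}


providers = sorted(
    pkg
    for pkg, files in LIBRARIES.items()
    if "libpsm_infinipath.so.1" in advertised_sonames(files)
)
print(providers)
```

With the compat library treated as private, only infinipath-psm remains a provider of the soname in this model, so the two packages no longer compete for the same dependency and programs that want the compat library have to locate it explicitly, for example through LD_LIBRARY_PATH as mentioned above.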