Bug 1841354
Summary:          Please rebase hwloc to 2.2
Product:          Red Hat Enterprise Linux 8
Reporter:         Ben Woodard <woodard>
Component:        hwloc
Assignee:         Prarit Bhargava <prarit>
Status:           CLOSED ERRATA
QA Contact:       Jeff Bastian <jbastian>
Severity:         medium
Docs Contact:     Jaroslav Klech <jklech>
Priority:         unspecified
Version:          8.0
CC:               andrew.wellington, brian, carl, codonell, ernunes, fj-lsoft-kernel-it, infiniband-qe, jarod, jbastian, lgoncalv, mgrondona, prarit, rvr, tgummels, tmichael, triegel, trix, williams, zguo
Target Milestone: rc
Keywords:         Rebase
Target Release:   8.0
Hardware:         All
OS:               Linux
Doc Type:         Enhancement
Doc Text:
    .hwloc rebased to version 2.2.0
    The `hwloc` package has been upgraded to version 2.2.0, which provides the following change:
    * The `hwloc` functionality can report details on Nonvolatile Memory Express (NVMe) drives including total disk size and sector size.
Last Closed:      2021-05-18 14:41:53 UTC
Type:             Bug
Bug Blocks:       1732982, 1850084, 1877365
Description
Ben Woodard
2020-05-28 23:15:46 UTC
It looks like any system with an NVMe drive should work for testing. Testing lstopo from RHEL-8.2.0 vs upstream git:

[root@raven ~]# lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1       259:0    0 465.8G  0 disk
├─nvme0n1p1   259:1    0     4M  0 part
├─nvme0n1p2   259:2    0     1G  0 part /boot
└─nvme0n1p3   259:3    0 464.8G  0 part
  ├─rhel-root 253:0    0    50G  0 lvm  /
  ├─rhel-swap 253:1    0    16G  0 lvm  [SWAP]
  └─rhel-home 253:2    0 398.8G  0 lvm  /home

[root@raven ~]# lstopo --version
lstopo 1.11.9
[root@raven ~]# lstopo --of xml | grep -A10 -i -e nvme

[root@raven ~]# /opt/hwloc/2.2.0/bin/lstopo --version
lstopo 2.2.0rc2-git
[root@raven ~]# /opt/hwloc/2.2.0/bin/lstopo --of xml | grep -A10 -i -e nvme
      <object type="OSDev" gp_index="63" name="nvme0n1" subtype="Disk" osdev_type="0">
        <info name="Size" value="488386584"/>
        <info name="SectorSize" value="512"/>
        <info name="LinuxDeviceID" value="259:0"/>
        <info name="Vendor" value="Samsung"/>
        <info name="Model" value="Samsung SSD 970 EVO Plus 500GB"/>
        <info name="SerialNumber" value="S4P2NF0M324748B"/>
      </object>
    </object>
  </object>
</object>

(In reply to Jeff Bastian from comment #5)
> Prarit,
>
> What were you comparing? Version 1.11.9 to 2.2.0? I see a lot more than
> whitespace changes, for example:
> -  <object type="Package" os_index="0" cpuset="0x00000fff"
>    complete_cpuset="0x00000fff" online_cpuset="0x00000fff"
>    allowed_cpuset="0x00000fff">
> +  <object type="Package" os_index="0" cpuset="0x00000fff"
>    complete_cpuset="0x00000fff" nodeset="0x00000001"
>    complete_nodeset="0x00000001" gp_index="3">

Yeah, I'm seeing the same thing. I cannot decide if this is a good thing or not. I think it's okay to have the extra data. What do you think?

P.

Opening this bugzilla up by explicit customer request.

The only thing I worry about is if someone has scripts or tools that were parsing this data. In the example, online_cpuset and allowed_cpuset go away, and instead you get nodeset, complete_nodeset, and gp_index. This change could break stuff.

Do we need to convert hwloc into a dnf module so we can have both version 1.x and 2.x installed in parallel?

The thing is that 1.x does not have a way to express the architecture of the current generation of server-class machines, which is why 2.x was created. I think it is arguable that wrong info is more of a problem than a compatibility break.

(In reply to Ben Woodard from comment #10)
> The thing is that 1.x does not have a way to express the architecture of
> the current generation of server-class machines, which is why 2.x was
> created. I think it is arguable that wrong info is more of a problem
> than a compatibility break.

I feel the same way, jbastian. We're in a "damned if we do, damned if we don't" case here. I fall on the side of making sure the code returns the correct data.

P.

(In reply to Jeff Bastian from comment #9)
> The only thing I worry about is if someone has scripts or tools that were
> parsing this data. In the example, online_cpuset and allowed_cpuset go
> away, and instead you get nodeset, complete_nodeset, and gp_index. This
> change could break stuff.
>
> Do we need to convert hwloc into a dnf module so we can have both version
> 1.x and 2.x installed in parallel?

I'd prefer not to do that. I think it is still expected that RHEL8 will have functional changes like this. If this were RHEL8.6 or later, I would agree that a dnf module is necessary.

P.

Ok. Let's move forward with the rebase then!
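For anyone who wants to get at the new NVMe details programmatically rather than by grepping lstopo XML, the same "Size" and "SectorSize" info keys shown above are also reachable through the hwloc 2.x C API. The following is only a minimal illustrative sketch (the file name is arbitrary and it is not part of this bug's test plan); it assumes the hwloc-devel 2.2 headers and builds with something like gcc nvme_info.c -lhwloc:

    /* nvme_info.c: illustrative sketch only, not part of this bug's test plan.
     * Lists block OS devices and the "Size"/"SectorSize" info keys that
     * hwloc 2.x attaches to them (the same keys shown in the XML above). */
    #include <hwloc.h>
    #include <stdio.h>

    int main(void)
    {
        hwloc_topology_t topo;
        hwloc_obj_t dev = NULL;

        hwloc_topology_init(&topo);
        /* I/O devices are filtered out by default; keep the interesting ones. */
        hwloc_topology_set_io_types_filter(topo, HWLOC_TYPE_FILTER_KEEP_IMPORTANT);
        hwloc_topology_load(topo);

        while ((dev = hwloc_get_next_osdev(topo, dev)) != NULL) {
            if (dev->attr->osdev.type != HWLOC_OBJ_OSDEV_BLOCK)
                continue;
            const char *size   = hwloc_obj_get_info_by_name(dev, "Size");
            const char *sector = hwloc_obj_get_info_by_name(dev, "SectorSize");
            /* "Size" is reported in KiB: 488386584 KiB ~= the 465.8G disk above. */
            printf("%s: Size=%s KiB, SectorSize=%s bytes\n",
                   dev->name, size ? size : "?", sector ? sector : "?");
        }

        hwloc_topology_destroy(topo);
        return 0;
    }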
http://people.redhat.com/tgummels/.pdir/.llnl-ccc4ca1b257fe34cc317ace66264f86d
User: 6264f86d
Pass: %hP2shpR9

hwloc build at above location

*** Bug 1732988 has been marked as a duplicate of this bug. ***

Basic functionality tests and the built-in self tests (*) passed:
https://beaker.engineering.redhat.com/jobs/4614350

(*) I need to report a bug upstream in the gather-topology self-test. The test fails because it finds info on the memory modules (DIMMs) when scanning the raw /sys, but it does not record the memory module info when running /usr/bin/hwloc-gather-topology, and so the diff of the two outputs fails. I don't think this is a show-stopper bug, though. I've modified our Beaker task to skip gather-topology for now.
http://pkgs.devel.redhat.com/cgit/tests/hwloc/commit/?id=0887f5f0977d87f0be60687958a95addf26e048f

Further manual testing on a Lenovo 4-socket Cooper Lake server also looked good. I noticed one odd bit of info, but this could be a firmware error on this particular Lenovo system: CPUs P#32 and P#34 are swapped. (A small program to double-check this mapping is sketched below, after the attachment note.) lstopo graphical output is attached too.

[root@lenovo-ts7z60-01 ~]# rpm -qa hwloc\* | sort
hwloc-2.2.0-1.el8.x86_64
hwloc-debugsource-2.2.0-1.el8.x86_64
hwloc-devel-2.2.0-1.el8.x86_64
hwloc-gui-2.2.0-1.el8.x86_64
hwloc-libs-2.2.0-1.el8.x86_64
hwloc-plugins-2.2.0-1.el8.x86_64

[root@lenovo-ts7z60-01 ~]# lscpu | grep NUMA
NUMA node(s):        4
NUMA node0 CPU(s):   0-7,32-39
NUMA node1 CPU(s):   8-15,40-47
NUMA node2 CPU(s):   16-23,48-55
NUMA node3 CPU(s):   24-31,56-63

[root@lenovo-ts7z60-01 ~]# lstopo-no-graphics
Machine (121GB total)
  Package L#0
    NUMANode L#0 (P#0 30GB)
    L3 L#0 (36MB)
      L2 L#0 (1024KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#34)   # <-- should be 32?
      L2 L#1 (1024KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#1)
        PU L#3 (P#33)   # <-- ok
      L2 L#2 (1024KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#2)
        PU L#5 (P#32)   # <-- should be 34?
      L2 L#3 (1024KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#3)
        PU L#7 (P#35)
      L2 L#4 (1024KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#4)
        PU L#9 (P#36)
      L2 L#5 (1024KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#5)
        PU L#11 (P#37)
      L2 L#6 (1024KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#6)
        PU L#13 (P#38)
      L2 L#7 (1024KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#7)
        PU L#15 (P#39)
    HostBridge
      PCI 00:11.5 (SATA)
      PCI 00:17.0 (SATA)
      PCIBridge
        PCIBridge
          PCI 02:00.0 (VGA)
    HostBridge
      PCIBridge
        PCI 33:00.0 (RAID)
          Block(Disk) "sda"
  Package L#1
    NUMANode L#1 (P#1 31GB)
    L3 L#1 (36MB)
      L2 L#8 (1024KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
        PU L#16 (P#8)
        PU L#17 (P#40)
      L2 L#9 (1024KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
        PU L#18 (P#9)
        PU L#19 (P#41)
      L2 L#10 (1024KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
        PU L#20 (P#10)
        PU L#21 (P#42)
      L2 L#11 (1024KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
        PU L#22 (P#11)
        PU L#23 (P#43)
      L2 L#12 (1024KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
        PU L#24 (P#12)
        PU L#25 (P#44)
      L2 L#13 (1024KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
        PU L#26 (P#13)
        PU L#27 (P#45)
      L2 L#14 (1024KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
        PU L#28 (P#14)
        PU L#29 (P#46)
      L2 L#15 (1024KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
        PU L#30 (P#15)
        PU L#31 (P#47)
    HostBridge
      PCIBridge
        PCI 6d:00.0 (Ethernet)
          Net "ens5f0"
        PCI 6d:00.1 (Ethernet)
          Net "ens5f1"
  Package L#2
    NUMANode L#2 (P#2 31GB)
    L3 L#2 (36MB)
      L2 L#16 (1024KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
        PU L#32 (P#16)
        PU L#33 (P#48)
      L2 L#17 (1024KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
        PU L#34 (P#17)
        PU L#35 (P#49)
      L2 L#18 (1024KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
        PU L#36 (P#18)
        PU L#37 (P#50)
      L2 L#19 (1024KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
        PU L#38 (P#19)
        PU L#39 (P#51)
      L2 L#20 (1024KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20
        PU L#40 (P#20)
        PU L#41 (P#52)
      L2 L#21 (1024KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21
        PU L#42 (P#21)
        PU L#43 (P#53)
      L2 L#22 (1024KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22
        PU L#44 (P#22)
        PU L#45 (P#54)
      L2 L#23 (1024KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23
        PU L#46 (P#23)
        PU L#47 (P#55)
  Package L#3
    NUMANode L#3 (P#3 30GB)
    L3 L#3 (36MB)
      L2 L#24 (1024KB) + L1d L#24 (32KB) + L1i L#24 (32KB) + Core L#24
        PU L#48 (P#24)
        PU L#49 (P#56)
      L2 L#25 (1024KB) + L1d L#25 (32KB) + L1i L#25 (32KB) + Core L#25
        PU L#50 (P#25)
        PU L#51 (P#57)
      L2 L#26 (1024KB) + L1d L#26 (32KB) + L1i L#26 (32KB) + Core L#26
        PU L#52 (P#26)
        PU L#53 (P#58)
      L2 L#27 (1024KB) + L1d L#27 (32KB) + L1i L#27 (32KB) + Core L#27
        PU L#54 (P#27)
        PU L#55 (P#59)
      L2 L#28 (1024KB) + L1d L#28 (32KB) + L1i L#28 (32KB) + Core L#28
        PU L#56 (P#28)
        PU L#57 (P#60)
      L2 L#29 (1024KB) + L1d L#29 (32KB) + L1i L#29 (32KB) + Core L#29
        PU L#58 (P#29)
        PU L#59 (P#61)
      L2 L#30 (1024KB) + L1d L#30 (32KB) + L1i L#30 (32KB) + Core L#30
        PU L#60 (P#30)
        PU L#61 (P#62)
      L2 L#31 (1024KB) + L1d L#31 (32KB) + L1i L#31 (32KB) + Core L#31
        PU L#62 (P#31)
        PU L#63 (P#63)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)

Created attachment 1721605 [details]
lstopo graphical output
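The swapped P#32/P#34 numbering noted in the Lenovo test above can be cross-checked without the full lstopo dump. The sketch below is purely illustrative (it is not the Beaker task, and the file name is arbitrary): it prints each core's logical index together with the OS-level P# indexes of its SMT siblings, which can then be compared against /sys/devices/system/cpu/cpuN/topology/thread_siblings_list or the vendor's documented mapping. Build with gcc core_pus.c -lhwloc:

    /* core_pus.c: illustrative sketch, not part of the Beaker task.
     * Print each core's logical index and the physical (P#) indexes of its
     * PUs, the same numbers lstopo shows. */
    #include <hwloc.h>
    #include <stdio.h>

    int main(void)
    {
        hwloc_topology_t topo;
        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

        int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
        for (int i = 0; i < ncores; i++) {
            hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
            hwloc_obj_t pu = NULL;
            printf("Core L#%u:", core->logical_index);
            /* Walk the PUs contained in this core's cpuset. */
            while ((pu = hwloc_get_next_obj_inside_cpuset_by_type(topo, core->cpuset,
                                                                  HWLOC_OBJ_PU, pu)) != NULL)
                printf(" P#%u", pu->os_index);
            printf("\n");
        }

        hwloc_topology_destroy(topo);
        return 0;
    }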
Hi,

> WARNING: The new version of `hwloc` is not binary compatible with the previous one. As a result, users with third
> party applications compiled against `hwloc` existing in RHEL 8.0 up to 8.3 will need to re-compile, and possibly
> re-adjust, their applications to work on RHEL 8.4 and onwards.

However, hwlog is assumed to be Application Compatibility level 2 according to
https://access.redhat.com/support/offerings/production/scope_moredetail#Red_Hat_Enterprise_Linux_version_8
and
https://access.redhat.com/articles/rhel8-abi-compatibility

- Compatibility level 2
  APIs and ABIs are stable within the lifetime of a single major release. Compatibility level 2 application
  interfaces will not change from minor release to minor release and can be relied upon by the application to be
  stable for the duration of the major release. Compatibility level 2 is the default for packages in Red Hat
  Enterprise Linux 8. Packages not identified as having another compatibility level may be considered
  compatibility level 2.

Regards,
Ikarashi

(In reply to fj-lsoft-kernel-it from comment #43)
> However, hwlog is assumed to be Application Compatibility level 2

Sorry, I mean hwloc here.

Regards,
Ikarashi

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (hwloc bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1589

Hi,

Can we get a hwloc compat package for the previous hwloc version? As noted in previous comment 43
https://bugzilla.redhat.com/show_bug.cgi?id=1841354#c43 this change has broken ABI compatibility for an
Application Compatibility level 2 package.

Regards,
Andrew

(In reply to andrew.wellington from comment #48)
> Can we get a hwloc compat package for the previous hwloc version? As noted
> in previous comment 43
> https://bugzilla.redhat.com/show_bug.cgi?id=1841354#c43 this change has
> broken ABI compatibility for an Application Compatibility level 2 package.

Andrew,

Could you please file a new bug specifically asking for a compat package and explain your specific requirements?

Thank you.

@andrew.wellington.au Did you file the bug report as requested in
https://bugzilla.redhat.com/show_bug.cgi?id=1841354#c49?

I'm really not sure how an ABI-breaking package update was allowed in a minor release update. My (perhaps naïve) understanding is that this is not supposed to happen in a minor release. A compat package seems like the least that should have been done.

Yes, he filed bug 1979150. compat-hwloc1 is already available in CentOS Stream 8, and is expected to be added to RHEL when 8.5 is released.

Yes that is correct. compat-hwloc1 should be available in centos-stream-8.

P.

So I should be finding it in http://mirror.centos.org/centos/8-stream/AppStream/x86_64/os/Packages/? I don't see it there.

NVM. I found it in http://mirror.centos.org/centos/8-stream/BaseOS/x86_64/os/Packages/
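For third-party code caught by this ABI break (comment 43 and later), upstream's porting notes essentially come down to checking the hwloc API version at both build time and run time. The sketch below is a generic illustration of that pattern, not something shipped in the RHEL package, and the file name is arbitrary:

    /* abi_check.c: generic sketch of the hwloc 1.x/2.x porting pattern,
     * not taken from this bug or from the RHEL packaging. */
    #include <hwloc.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Refuse to run if the headers we built against and the library loaded
         * at run time sit on opposite sides of the 1.x/2.x ABI break. */
        if ((hwloc_get_api_version() >> 16) != (HWLOC_API_VERSION >> 16)) {
            fprintf(stderr, "hwloc ABI mismatch: built 0x%x, running 0x%x\n",
                    HWLOC_API_VERSION, hwloc_get_api_version());
            return EXIT_FAILURE;
        }

        hwloc_topology_t topo;
        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

    #if HWLOC_API_VERSION >= 0x00020000
        /* hwloc 2.x: NUMA nodes are attached as memory children. */
        int nodes = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NUMANODE);
    #else
        /* hwloc 1.x: NUMA nodes are ordinary HWLOC_OBJ_NODE levels. */
        int nodes = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NODE);
    #endif
        printf("NUMA nodes: %d\n", nodes);

        hwloc_topology_destroy(topo);
        return 0;
    }

Applications that cannot be rebuilt against hwloc 2.x would instead rely on the compat-hwloc1 runtime package discussed above.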