Bug 1841354

Summary: Please rebase hwloc to 2.2
Product: Red Hat Enterprise Linux 8
Reporter: Ben Woodard <woodard>
Component: hwloc
Assignee: Prarit Bhargava <prarit>
Status: CLOSED ERRATA
QA Contact: Jeff Bastian <jbastian>
Severity: medium
Docs Contact: Jaroslav Klech <jklech>
Priority: unspecified
Version: 8.0
CC: andrew.wellington, brian, carl, codonell, ernunes, fj-lsoft-kernel-it, infiniband-qe, jarod, jbastian, lgoncalv, mgrondona, prarit, rvr, tgummels, tmichael, triegel, trix, williams, zguo
Target Milestone: rc
Keywords: Rebase
Target Release: 8.0
Hardware: All
OS: Linux
Doc Type: Enhancement
Doc Text:
.hwloc rebased to version 2.2.0
The `hwloc` package has been upgraded to version 2.2.0, which provides the following change:
* The `hwloc` functionality can report details on Nonvolatile Memory Express (NVMe) drives, including total disk size and sector size.
Last Closed: 2021-05-18 14:41:53 UTC
Type: Bug
Bug Blocks: 1732982, 1850084, 1877365    
Attachments:
  lstopo graphical output

Description Ben Woodard 2020-05-28 23:15:46 UTC
Description of problem:
The current version of hwloc has several problems with the API which make it impossible to represent current architectures. One problem is the ability to detect on-node storage. This isn't available in the current version of hwloc. This is a complex issue which include that as part of the reason: https://github.com/flux-framework/flux-core/pull/2944 

Another place where this becomes evident: 
https://github.com/open-mpi/hwloc/issues/399#issuecomment-625048305

"Regarding the size, it was added in master when we started the big 2.0 rework after branching 1.11 in 2015. I was planning to backport it but the appropriate attributes weren't decided yet, so I waited and forgot about it, sorry :/ So looks you'll need 2.0 here. At least my git branch 1.11 still doesn't show the size."

Even though the RHEL 7 version does expose the existence of the block devices, it seems to scramble the XML info. RHEL 7 (1.11.8):

❯ ./lstopo --of xml | grep -A 10 -i -e nvme
            <object type="PCIDev" os_index="4096" name="HGST, Inc. Ultrastar SN200 Series NVMe SSD" pci_busid="0000:01:00.0" pci_type="0108 [1c58:0023] [1c58:0023] 02" pci_link_speed="3.938462">
              <info name="PCIVendor" value="HGST, Inc."/>
              <info name="PCIDevice" value="Ultrastar SN200 Series NVMe SSD"/>
              <info name="PCISlot" value="1"/>
              <object type="OSDev" name="nvme0n1" osdev_type="0">
                <info name="LinuxDeviceID" value="259:0"/>
                <info name="Model" value="HUSMR7616BDP301"/>
                <info name="SerialNumber" value="SDM000023759"/>
                <info name="Type" value="Disk"/>
              </object>
            </object>
          </object>
2.1.0:

❯ lstopo --of xml | grep -A 10 -i -e nvme
              <info name="PCIDevice" value="Ultrastar SN200 Series NVMe SSD"/>
              <object type="OSDev" gp_index="384" name="nvme0n1" subtype="Disk" osdev_type="0">
                <info name="Size" value="1562813784"/>
                <info name="SectorSize" value="4096"/>
                <info name="LinuxDeviceID" value="259:0"/>
                <info name="Model" value="HUSMR7616BDP301"/>
                <info name="SerialNumber" value="SDM000023759"/>
              </object>
            </object>
          </object>
Version-Release number of selected component (if applicable):
1.11.8
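
For reference, the 2.x API exposes this storage data through OSDev objects. Roughly something like the following (an untested sketch, assuming the documented 2.x calls; compile with -lhwloc) should print the size info that 1.11.x can't provide:

  #include <stdio.h>
  #include <hwloc.h>

  int main(void)
  {
      hwloc_topology_t topology;
      hwloc_obj_t obj = NULL;

      hwloc_topology_init(&topology);
      /* I/O objects are filtered out by default; keep them so OSDev
       * block devices show up in the topology. */
      hwloc_topology_set_io_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_ALL);
      hwloc_topology_load(topology);

      while ((obj = hwloc_get_next_osdev(topology, obj)) != NULL) {
          const char *size, *sector;
          if (obj->attr->osdev.type != HWLOC_OBJ_OSDEV_BLOCK)
              continue;
          size = hwloc_obj_get_info_by_name(obj, "Size");        /* in kB */
          sector = hwloc_obj_get_info_by_name(obj, "SectorSize");
          printf("%s: Size=%s kB, SectorSize=%s\n", obj->name,
                 size ? size : "?", sector ? sector : "?");
      }

      hwloc_topology_destroy(topology);
      return 0;
  }

The Size and SectorSize info attributes in the 2.1.0 XML above are the same data.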

Comment 2 Jeff Bastian 2020-06-01 17:39:53 UTC
It looks like any system with an NVMe drive should work for testing.  Testing lstopo from RHEL-8.2.0 vs upstream git:

[root@raven ~]# lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1       259:0    0 465.8G  0 disk 
├─nvme0n1p1   259:1    0     4M  0 part 
├─nvme0n1p2   259:2    0     1G  0 part /boot
└─nvme0n1p3   259:3    0 464.8G  0 part 
  ├─rhel-root 253:0    0    50G  0 lvm  /
  ├─rhel-swap 253:1    0    16G  0 lvm  [SWAP]
  └─rhel-home 253:2    0 398.8G  0 lvm  /home

[root@raven ~]# lstopo --version
lstopo 1.11.9

[root@raven ~]# lstopo --of xml | grep -A10 -i -e nvme

[root@raven ~]# /opt/hwloc/2.2.0/bin/lstopo --version
lstopo 2.2.0rc2-git

[root@raven ~]# /opt/hwloc/2.2.0/bin/lstopo --of xml | grep -A10 -i -e nvme
          <object type="OSDev" gp_index="63" name="nvme0n1" subtype="Disk" osdev_type="0">
            <info name="Size" value="488386584"/>
            <info name="SectorSize" value="512"/>
            <info name="LinuxDeviceID" value="259:0"/>
            <info name="Vendor" value="Samsung"/>
            <info name="Model" value="Samsung SSD 970 EVO Plus 500GB"/>
            <info name="SerialNumber" value="S4P2NF0M324748B"/>
          </object>
        </object>
      </object>
    </object>

Comment 7 Prarit Bhargava 2020-06-08 15:45:36 UTC
(In reply to Jeff Bastian from comment #5)
> Prarit,
> 
> What were you comparing?  Version 1.11.9 to 2.2.0?  I see a lot more than
> whitespace changes, for example:
> -    <object type="Package" os_index="0" cpuset="0x00000fff" complete_cpuset="0x00000fff" online_cpuset="0x00000fff" allowed_cpuset="0x00000fff">
> +    <object type="Package" os_index="0" cpuset="0x00000fff" complete_cpuset="0x00000fff" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="3">
> 

Yeah, I'm seeing the same thing.  I cannot decide if this is a good thing or not.  I think it's okay to have the extra data.  What do you think?

P.

Comment 8 Ben Woodard 2020-06-08 17:14:09 UTC
Opening this bugzilla up by explicit customer request.

Comment 9 Jeff Bastian 2020-06-08 17:31:26 UTC
The only thing I worry about is whether someone has scripts or tools that were parsing this data.  In the example, online_cpuset and allowed_cpuset go away, and instead you get nodeset, complete_nodeset, and gp_index.  This change could break stuff.
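
For illustration, the same removal is visible at the C level, since these XML attributes mirror struct fields; a fragment based on my reading of the 1.x headers and the 2.x upgrade notes:

  hwloc_cpuset_t allowed = obj->allowed_cpuset;   /* hwloc 1.x only; the field
                                                     was removed in 2.0, so
                                                     this fails to compile */

  /* hwloc 2.x: the allowed set is a topology-level query instead. */
  hwloc_const_cpuset_t allowed2 = hwloc_topology_get_allowed_cpuset(topology);

For XML consumers, the 2.x API also appears to provide hwloc_topology_export_xml() with HWLOC_TOPOLOGY_EXPORT_XML_FLAG_V1 to write 1.x-compatible XML, which might be an escape hatch for old scripts.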

Do we need to convert hwloc into a dnf module so we can have both version 1.x and 2.x installed in parallel?

Comment 10 Ben Woodard 2020-06-08 17:36:36 UTC
The thing is, 1.x has no way to express the architecture of the current generation of server-class machines, which is why 2.x was created. I think it is arguable that wrong info is more of a problem than a compatibility break.
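
To make that concrete: in 2.x, NUMA nodes are attached to the tree as memory children of a normal object instead of being containers, which is the structural change that lets 2.x describe these machines. A rough, untested sketch (assuming the 2.x API) that walks each package's local NUMA nodes:

  #include <stdio.h>
  #include <hwloc.h>

  int main(void)
  {
      hwloc_topology_t topology;
      hwloc_obj_t pkg = NULL;

      hwloc_topology_init(&topology);
      hwloc_topology_load(topology);

      /* 2.x: NUMA nodes hang off a normal object (here, the package) as
       * "memory children" rather than containing the cores themselves. */
      while ((pkg = hwloc_get_next_obj_by_type(topology, HWLOC_OBJ_PACKAGE,
                                               pkg)) != NULL) {
          hwloc_obj_t mem;
          for (mem = pkg->memory_first_child; mem != NULL; mem = mem->next_sibling)
              printf("Package P#%u: NUMANode P#%u (%llu bytes local)\n",
                     pkg->os_index, mem->os_index,
                     (unsigned long long) mem->attr->numanode.local_memory);
      }

      hwloc_topology_destroy(topology);
      return 0;
  }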

Comment 11 Prarit Bhargava 2020-06-09 14:36:49 UTC
(In reply to Ben Woodard from comment #10)
> The thing is, 1.x has no way to express the architecture of the current
> generation of server-class machines, which is why 2.x was created. I think
> it is arguable that wrong info is more of a problem than a compatibility
> break.

I feel the same way, jbastian.  We're in a "damned if we do, damned if we don't" situation here.  I fall on the side of making sure the code returns the correct data.

P.

Comment 12 Prarit Bhargava 2020-06-10 13:20:42 UTC
(In reply to Jeff Bastian from comment #9)
> The only thing I worry about is if someone has scripts or tools that were
> parsing this data.  In the example, online_cpuset and allowed_cpuset go
> away, and instead you get nodeset, complete_nodeset, and gp_index.  This
> change could break stuff.
> 
> Do we need to convert hwloc into a dnf module so we can have both version
> 1.x and 2.x installed in parallel?

I'd prefer not to do that.  I think that it is still expected that RHEL8 will have functional changes like this.  If this were RHEL8.6 or later, I would agree that a dnf module is necessary.

P.

Comment 13 Jeff Bastian 2020-06-11 16:26:41 UTC
Ok.  Let's move forward with the rebase then!

Comment 24 Travis Gummels 2020-07-29 13:45:06 UTC
http://people.redhat.com/tgummels/.pdir/.llnl-ccc4ca1b257fe34cc317ace66264f86d
User: 6264f86d
Pass: %hP2shpR9

The hwloc build is at the location above.

Comment 25 Prarit Bhargava 2020-09-09 12:49:25 UTC
*** Bug 1732988 has been marked as a duplicate of this bug. ***

Comment 35 Jeff Bastian 2020-10-14 21:22:41 UTC
Basic functionality tests and the built-in self tests (*) passed:

https://beaker.engineering.redhat.com/jobs/4614350

(*) I need to report a bug upstream in the gather-topology self-test.  The test fails because it finds info on the memory modules (DIMMs) when scanning the raw /sys, but it does not record the memory module info when running /usr/bin/hwloc-gather-topology, and so the diff of the two outputs fails.  I don't think this is a show-stopper bug, though.  I've modified our Beaker task to skip gather-topology for now.

http://pkgs.devel.redhat.com/cgit/tests/hwloc/commit/?id=0887f5f0977d87f0be60687958a95addf26e048f


Further manual testing on a Lenovo 4-socket Cooper Lake server also looked good.  I noticed one odd bit of info, but this could be a firmware error on this particular Lenovo system: CPUs P#32 and P#34 are swapped.  lstopo graphical output is attached too.

[root@lenovo-ts7z60-01 ~]# rpm -qa hwloc\* | sort
hwloc-2.2.0-1.el8.x86_64
hwloc-debugsource-2.2.0-1.el8.x86_64
hwloc-devel-2.2.0-1.el8.x86_64
hwloc-gui-2.2.0-1.el8.x86_64
hwloc-libs-2.2.0-1.el8.x86_64
hwloc-plugins-2.2.0-1.el8.x86_64

[root@lenovo-ts7z60-01 ~]# lscpu | grep NUMA
NUMA node(s):        4
NUMA node0 CPU(s):   0-7,32-39
NUMA node1 CPU(s):   8-15,40-47
NUMA node2 CPU(s):   16-23,48-55
NUMA node3 CPU(s):   24-31,56-63

[root@lenovo-ts7z60-01 ~]# lstopo-no-graphics 
Machine (121GB total)
  Package L#0
    NUMANode L#0 (P#0 30GB)
    L3 L#0 (36MB)
      L2 L#0 (1024KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#34)                                                # <-- should be 32?
      L2 L#1 (1024KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#1)
        PU L#3 (P#33)                                                # <-- ok
      L2 L#2 (1024KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#2)
        PU L#5 (P#32)                                                # <-- should be 34?
      L2 L#3 (1024KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#3)
        PU L#7 (P#35)
      L2 L#4 (1024KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#4)
        PU L#9 (P#36)
      L2 L#5 (1024KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#5)
        PU L#11 (P#37)
      L2 L#6 (1024KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#6)
        PU L#13 (P#38)
      L2 L#7 (1024KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#7)
        PU L#15 (P#39)
    HostBridge
      PCI 00:11.5 (SATA)
      PCI 00:17.0 (SATA)
      PCIBridge
        PCIBridge
          PCI 02:00.0 (VGA)
    HostBridge
      PCIBridge
        PCI 33:00.0 (RAID)
          Block(Disk) "sda"
  Package L#1
    NUMANode L#1 (P#1 31GB)
    L3 L#1 (36MB)
      L2 L#8 (1024KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
        PU L#16 (P#8)
        PU L#17 (P#40)
      L2 L#9 (1024KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
        PU L#18 (P#9)
        PU L#19 (P#41)
      L2 L#10 (1024KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
        PU L#20 (P#10)
        PU L#21 (P#42)
      L2 L#11 (1024KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
        PU L#22 (P#11)
        PU L#23 (P#43)
      L2 L#12 (1024KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
        PU L#24 (P#12)
        PU L#25 (P#44)
      L2 L#13 (1024KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
        PU L#26 (P#13)
        PU L#27 (P#45)
      L2 L#14 (1024KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
        PU L#28 (P#14)
        PU L#29 (P#46)
      L2 L#15 (1024KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
        PU L#30 (P#15)
        PU L#31 (P#47)
    HostBridge
      PCIBridge
        PCI 6d:00.0 (Ethernet)
          Net "ens5f0"
        PCI 6d:00.1 (Ethernet)
          Net "ens5f1"
  Package L#2
    NUMANode L#2 (P#2 31GB)
    L3 L#2 (36MB)
      L2 L#16 (1024KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
        PU L#32 (P#16)
        PU L#33 (P#48)
      L2 L#17 (1024KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
        PU L#34 (P#17)
        PU L#35 (P#49)
      L2 L#18 (1024KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
        PU L#36 (P#18)
        PU L#37 (P#50)
      L2 L#19 (1024KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
        PU L#38 (P#19)
        PU L#39 (P#51)
      L2 L#20 (1024KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20
        PU L#40 (P#20)
        PU L#41 (P#52)
      L2 L#21 (1024KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21
        PU L#42 (P#21)
        PU L#43 (P#53)
      L2 L#22 (1024KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22
        PU L#44 (P#22)
        PU L#45 (P#54)
      L2 L#23 (1024KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23
        PU L#46 (P#23)
        PU L#47 (P#55)
  Package L#3
    NUMANode L#3 (P#3 30GB)
    L3 L#3 (36MB)
      L2 L#24 (1024KB) + L1d L#24 (32KB) + L1i L#24 (32KB) + Core L#24
        PU L#48 (P#24)
        PU L#49 (P#56)
      L2 L#25 (1024KB) + L1d L#25 (32KB) + L1i L#25 (32KB) + Core L#25
        PU L#50 (P#25)
        PU L#51 (P#57)
      L2 L#26 (1024KB) + L1d L#26 (32KB) + L1i L#26 (32KB) + Core L#26
        PU L#52 (P#26)
        PU L#53 (P#58)
      L2 L#27 (1024KB) + L1d L#27 (32KB) + L1i L#27 (32KB) + Core L#27
        PU L#54 (P#27)
        PU L#55 (P#59)
      L2 L#28 (1024KB) + L1d L#28 (32KB) + L1i L#28 (32KB) + Core L#28
        PU L#56 (P#28)
        PU L#57 (P#60)
      L2 L#29 (1024KB) + L1d L#29 (32KB) + L1i L#29 (32KB) + Core L#29
        PU L#58 (P#29)
        PU L#59 (P#61)
      L2 L#30 (1024KB) + L1d L#30 (32KB) + L1i L#30 (32KB) + Core L#30
        PU L#60 (P#30)
        PU L#61 (P#62)
      L2 L#31 (1024KB) + L1d L#31 (32KB) + L1i L#31 (32KB) + Core L#31
        PU L#62 (P#31)
        PU L#63 (P#63)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)
  Misc(MemoryModule)

Comment 36 Jeff Bastian 2020-10-14 21:24:39 UTC
Created attachment 1721605 [details]
lstopo graphical output

Comment 43 Fujitsu kernel team 2021-03-03 02:38:30 UTC
Hi,

> WARNING: The new version of `hwloc` is not binary compatible with the previous one. As a result, users with third
> party applications compiled against `hwloc` existing in RHEL 8.0 up to 8.3 will need to re-compile, and possibly
> re-adjust, their applications to work on RHEL 8.4 and onwards.

However, hwlog is assumed to be Application Compatibility level 2
according to
https://access.redhat.com/support/offerings/production/scope_moredetail#Red_Hat_Enterprise_Linux_version_8
and
https://access.redhat.com/articles/rhel8-abi-compatibility

- Compatibility level 2
  APIs and ABIs are stable within the lifetime of a single major release. Compatibility level 2 application interfaces
  will not change from minor release to minor release and can be relied upon by the application to be stable for the
  duration of the major release. Compatibility level 2 is the default for packages in Red Hat Enterprise Linux 8.
  Packages not identified as having another compatibility level may be considered compatibility level 2.

Regards,
Ikarashi

Comment 44 Fujitsu kernel team 2021-03-03 02:47:55 UTC
(In reply to fj-lsoft-kernel-it from comment #43)
> 
> However, hwlog is assumed to be Application Compatibility level 2

Sorry, I meant hwloc here.

Regards,
Ikarashi

Comment 47 errata-xmlrpc 2021-05-18 14:41:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (hwloc bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1589

Comment 48 andrew.wellington 2021-07-02 00:41:19 UTC
Hi,

Can we get a hwloc compat package for the previous hwloc version? As noted in comment 43 (https://bugzilla.redhat.com/show_bug.cgi?id=1841354#c43), this change has broken ABI compatibility for an Application Compatibility level 2 package.

Regards,
Andrew

Comment 49 Carlos O'Donell 2021-07-02 15:06:40 UTC
(In reply to andrew.wellington from comment #48)
> Can we get a hwloc compat package for the previous hwloc version? As noted
> in previous comment 43
> https://bugzilla.redhat.com/show_bug.cgi?id=1841354#c43 this change has
> broken ABI compatibility for an Application Compatibility level 2 package.

Andrew, Could you please file a new bug specifically asking for a compat package and explain your specific requirements? Thank you.

Comment 50 Brian J. Murrell 2021-09-01 18:26:55 UTC
@andrew.wellington.au Did you file the bug report as requested in https://bugzilla.redhat.com/show_bug.cgi?id=1841354#c49?

I'm actually not quite sure how an ABI-breaking package update was done in a minor release update.

My (perhaps naïve) understanding is that this is not supposed to be allowed to happen in a minor release update.  A compat package seems like the least that should have been done.

Comment 51 Carl George 🤠 2021-09-01 22:23:22 UTC
Yes, he filed bug 1979150.  compat-hwloc1 is already available in CentOS Stream 8, and is expected to be added to RHEL when 8.5 is released.

Comment 52 Prarit Bhargava 2021-09-02 01:20:40 UTC
Yes, that is correct.  compat-hwloc1 should be available in centos-stream-8.

P.

Comment 53 Brian J. Murrell 2021-09-02 17:30:13 UTC
So I should be finding it in http://mirror.centos.org/centos/8-stream/AppStream/x86_64/os/Packages/?  I don't see it there.

Comment 54 Brian J. Murrell 2021-09-02 17:33:39 UTC
NVM.  I found it in http://mirror.centos.org/centos/8-stream/BaseOS/x86_64/os/Packages/