Bug 1872708 - libvirt does not report numa info on amd epyc hosts [rhel-av-8.2.1z]
Summary: libvirt does not report numa info on amd epyc hosts [rhel-av-8.2.1z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.2
Assignee: Michal Privoznik
QA Contact: Jing Qi
URL:
Whiteboard:
Depends On: 1860231
Blocks: 1883580
 
Reported: 2020-08-26 13:27 UTC by Oneata Mircea Teodor
Modified: 2020-12-21 19:39 UTC
CC List: 36 users

Fixed In Version: libvirt-6.0.0-25.4.el8
Doc Type: Bug Fix
Doc Text:
Cause: When generating the "virsh capabilities" XML (which describes host capabilities and, in the context of this bug, host NUMA nodes), libvirt queries various system attributes. The host NUMA layout is constructed with libnuma. Because NUMA nodes may be disjoint (as was learned back in 2010), libvirt relied on a quirk of the numa_node_to_cpus() API, which returned a bitmask with all bits set for a non-existent NUMA node. As it turned out, with growing thread counts it is possible for the API to return an all-bits-set bitmask that is nevertheless a valid result: it simply means that all CPUs belong to the given NUMA node.
Consequence: Because of this misuse of the libnuma API, libvirt did not report a NUMA node even though the host had one.
Fix: Use the proper libnuma API to check whether a node with the given ID exists, instead of relying on the odd behaviour of numa_node_to_cpus() (see the illustrative sketch below the bug header).
Result: libvirt now reports NUMA nodes on CPUs with a large number of threads.
Clone Of: 1860231
: 1883580
Environment:
Last Closed: 2020-10-12 09:00:41 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4221 0 None None None 2020-10-12 09:01:01 UTC
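
To make the Doc Text above concrete, here is a minimal, self-contained sketch. It is not the actual libvirt code; the helper names node_exists_old()/node_exists_new() and the standalone program are assumptions for illustration. It contrasts the broken all-bits-set heuristic with a node-existence check based on numa_nodes_ptr, the API referenced by the backport discussed in comment 9. Build with "gcc -o numa-sketch numa-sketch.c -lnuma".

/* Illustrative sketch only -- not libvirt source code. */
#include <stdio.h>
#include <numa.h>

/* Old heuristic (broken on high-thread-count hosts): treat an all-bits-set
 * CPU mask returned by numa_node_to_cpus() as "this node does not exist".
 * On a single-node, 128-thread EPYC host the mask legitimately has every
 * bit set, so the only node was wrongly treated as non-existent. */
static int node_exists_old(int node)          /* hypothetical helper name */
{
    struct bitmask *cpus = numa_allocate_cpumask();
    int exists = 1;

    if (numa_node_to_cpus(node, cpus) == 0) {
        unsigned long i, nset = 0;

        for (i = 0; i < cpus->size; i++)
            if (numa_bitmask_isbitset(cpus, i))
                nset++;
        if (nset == cpus->size)               /* all bits set => assumed missing */
            exists = 0;
    }
    numa_bitmask_free(cpus);
    return exists;
}

/* Fixed approach: ask libnuma directly whether the node is present, using
 * the numa_nodes_ptr bitmask of all nodes the system has. */
static int node_exists_new(int node)          /* hypothetical helper name */
{
    return numa_bitmask_isbitset(numa_nodes_ptr, node);
}

int main(void)
{
    int node;

    if (numa_available() < 0)
        return 1;
    for (node = 0; node <= numa_max_node(); node++)
        printf("node %d: old heuristic=%d, numa_nodes_ptr check=%d\n",
               node, node_exists_old(node), node_exists_new(node));
    return 0;
}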

Comment 8 Jing Qi 2020-09-11 08:23:07 UTC
Verified with libvirt-daemon-6.0.0-25.3.module+el8.2.1+8038+fbea6a05.x86_64 on a dell-per6515 machine:

# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
node 0 size: 31413 MB
node 0 free: 16381 MB
node distances:
node   0 
  0:  10 

Part of the output from "virsh capabilities":

  <topology>
      <cells num='1'>
        <cell id='0'>
          <memory unit='KiB'>32167400</memory>
          <pages unit='KiB' size='4'>5420410</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>10</pages>
          <distances>
            <sibling id='0' value='10'/>
          </distances>
          <cpus num='128'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,64'/>
            <cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1,65'/>
            <cpu id='2' socket_id='0' die_id='0' core_id='2' siblings='2,66'/>
            <cpu id='3' socket_id='0' die_id='0' core_id='3' siblings='3,67'/>
            <cpu id='4' socket_id='0' die_id='0' core_id='4' siblings='4,68'/>
            <cpu id='5' socket_id='0' die_id='0' core_id='5' siblings='5,69'/>
            <cpu id='6' socket_id='0' die_id='0' core_id='6' siblings='6,70'/>
            <cpu id='7' socket_id='0' die_id='0' core_id='7' siblings='7,71'/>
            <cpu id='8' socket_id='0' die_id='0' core_id='8' siblings='8,72'/>
         ...
            <cpu id='127' socket_id='0' die_id='0' core_id='63' siblings='63,127'/>
          </cpus>
        </cell>
      </cells>

Comment 9 Jiri Denemark 2020-09-11 17:35:44 UTC
Looks like "virnuma: Use numa_nodes_ptr when checking available NUMA nodes"
should also be backported to avoid the regression described in bug 1876956.
Right, Michal?

Comment 10 Michal Privoznik 2020-09-11 18:13:42 UTC
Yes, we need that patch too.

Comment 12 Jing Qi 2020-09-15 01:11:07 UTC
Verified with libvirt-daemon-6.0.0-25.4.module+el8.2.1+8038+fbea6a05.x86_64 on a dell-per6515 machine:

# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
node 0 size: 31413 MB
node 0 free: 16381 MB
node distances:
node   0 
  0:  10 

Part of the output from "virsh capabilities":

  <topology>
      <cells num='1'>
        <cell id='0'>
          <memory unit='KiB'>32167400</memory>
          <pages unit='KiB' size='4'>5420410</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>10</pages>
          <distances>
            <sibling id='0' value='10'/>
          </distances>
          <cpus num='128'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,64'/>
            <cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1,65'/>
            <cpu id='2' socket_id='0' die_id='0' core_id='2' siblings='2,66'/>
            <cpu id='3' socket_id='0' die_id='0' core_id='3' siblings='3,67'/>
            <cpu id='4' socket_id='0' die_id='0' core_id='4' siblings='4,68'/>
            <cpu id='5' socket_id='0' die_id='0' core_id='5' siblings='5,69'/>
            <cpu id='6' socket_id='0' die_id='0' core_id='6' siblings='6,70'/>
            <cpu id='7' socket_id='0' die_id='0' core_id='7' siblings='7,71'/>
            <cpu id='8' socket_id='0' die_id='0' core_id='8' siblings='8,72'/>
         ...
            <cpu id='127' socket_id='0' die_id='0' core_id='63' siblings='63,127'/>
          </cpus>
        </cell>
      </cells>

Also verified on an hp-dl380g10 machine with libvirt-daemon-6.0.0-25.4.module+el8.2.1+8038+fbea6a05.x86_64:

# numactl -H
available: 8 nodes (0-7)
node 0 cpus: 0 1 16 17
node 0 size: 15736 MB
node 0 free: 14312 MB
node 1 cpus: 2 3 18 19
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 4 5 20 21
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 6 7 22 23
node 3 size: 0 MB
node 3 free: 0 MB
node 4 cpus: 8 9 24 25
node 4 size: 16104 MB
node 4 free: 15522 MB
node 5 cpus: 10 11 26 27
node 5 size: 0 MB
node 5 free: 0 MB
node 6 cpus: 12 13 28 29
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus: 14 15 30 31
node 7 size: 0 MB
node 7 free: 0 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  16  16  16  32  32  32  32 
  1:  16  10  16  16  32  32  32  32 
  2:  16  16  10  16  32  32  32  32 
  3:  16  16  16  10  32  32  32  32 
  4:  32  32  32  32  10  16  16  16 
  5:  32  32  32  32  16  10  16  16 
  6:  32  32  32  32  16  16  10  16 
  7:  32  32  32  32  16  16  16  10 

Part of the output from "virsh capabilities":
<topology>
      <cells num='8'>
        <cell id='0'>
          <memory unit='KiB'>16113944</memory>
          <pages unit='KiB' size='4'>4028486</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='16'/>
            <sibling id='2' value='16'/>
            <sibling id='3' value='16'/>
            <sibling id='4' value='32'/>
            <sibling id='5' value='32'/>
            <sibling id='6' value='32'/>
            <sibling id='7' value='32'/>
          </distances>
          <cpus num='4'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,16'/>
            <cpu id='1' socket_id='0' die_id='0' core_id='4' siblings='1,17'/>
            <cpu id='16' socket_id='0' die_id='0' core_id='0' siblings='0,16'/>
            <cpu id='17' socket_id='0' die_id='0' core_id='4' siblings='1,17'/>
          </cpus>
        </cell>
        <cell id='1'>
          <pages unit='KiB' size='4'>0</pages>
          <distances>
            <sibling id='0' value='16'/>
            <sibling id='1' value='10'/>
            <sibling id='2' value='16'/>
            <sibling id='3' value='16'/>
            <sibling id='4' value='32'/>
            <sibling id='5' value='32'/>
            <sibling id='6' value='32'/>
            <sibling id='7' value='32'/>
          </distances>
          <cpus num='4'>
            <cpu id='2' socket_id='0' die_id='0' core_id='8' siblings='2,18'/>
            <cpu id='3' socket_id='0' die_id='0' core_id='12' siblings='3,19'/>
            <cpu id='18' socket_id='0' die_id='0' core_id='8' siblings='2,18'/>
            <cpu id='19' socket_id='0' die_id='0' core_id='12' siblings='3,19'/>
          </cpus>
        </cell>
....
  <cell id='7'>
          <pages unit='KiB' size='4'>0</pages>
          <distances>
            <sibling id='0' value='32'/>
            <sibling id='1' value='32'/>
            <sibling id='2' value='32'/>
            <sibling id='3' value='32'/>
            <sibling id='4' value='16'/>
            <sibling id='5' value='16'/>
            <sibling id='6' value='16'/>
            <sibling id='7' value='10'/>
          </distances>
          <cpus num='4'>
            <cpu id='14' socket_id='1' die_id='0' core_id='24' siblings='14,30'/>
            <cpu id='15' socket_id='1' die_id='0' core_id='28' siblings='15,31'/>
            <cpu id='30' socket_id='1' die_id='0' core_id='24' siblings='14,30'/>
            <cpu id='31' socket_id='1' die_id='0' core_id='28' siblings='15,31'/>
          </cpus>
        </cell>
      </cells>
    </topology>

Comment 18 errata-xmlrpc 2020-10-12 09:00:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:8.2 and virt-devel:8.2 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4221

