Bug 1872708

Summary: libvirt does not report NUMA info on AMD EPYC hosts [rhel-av-8.2.1z]
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Oneata Mircea Teodor <toneata>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Jing Qi <jinqi>
Severity: urgent
Priority: urgent
Version: 8.2
CC: ailan, apevec, berrange, cfields, cfontain, chhu, chrisw, cswanson, dgilbert, dhellard, dmarchan, dyuan, fbaudin, fiezzi, hakhande, jdenemar, jinqi, jsuchane, lmen, mkalinin, mprivozn, mtessun, mzhan, ovs-team, pmannidi, pveiga, rhos-maint, rpawlik, samasud, skramaja, smooney, toneata, virt-maint, xuzhang, yalzhang, ymankad
Target Milestone: rc
Keywords: Upstream, ZStream
Target Release: 8.2
Flags: pm-rhel: mirror+
Hardware: x86_64
OS: Linux
Fixed In Version: libvirt-6.0.0-25.4.el8
Doc Type: Bug Fix
Doc Text:
Cause: When generating the "virsh capabilities" XML (which describes host capabilities, including the host NUMA nodes), libvirt queries various system attributes. The host NUMA layout is constructed using libnuma. Because NUMA nodes may be disjoint (as was learned in 2010), libvirt relied on a quirk of the numa_node_to_cpus() API: for a nonexistent NUMA node it returned a bitmask with all bits set. But as it turned out, with growing thread counts a bitmask with all bits set can also be a valid result; it simply means that all CPUs belong to the given NUMA node.
Consequence: Because of this misuse of the libnuma API, libvirt did not report a NUMA node even though the host had one.
Fix: Use the proper libnuma API to check whether a node with a given ID exists, rather than relying on the quirky behavior of numa_node_to_cpus() (a sketch of this change follows the header fields below).
Result: libvirt now reports the NUMA node on hosts whose CPUs have many threads.
Clone Of: 1860231
Last Closed: 2020-10-12 09:00:41 UTC
Type: Bug
Bug Depends On: 1860231    
Bug Blocks: 1883580    
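
To make the Doc Text above concrete, here is a minimal standalone sketch of the logic change, assuming only the standard libnuma API; it is not the actual libvirt code. The old heuristic treated an all-set CPU mask returned by numa_node_to_cpus() as "this node does not exist"; the fix asks libnuma directly whether the node is present via the numa_nodes_ptr node mask. Build with: gcc numacheck.c -lnuma (file name hypothetical).

/* Sketch of the old vs. new node-existence check, not the libvirt source. */
#include <numa.h>
#include <stdio.h>

/* Old heuristic: a nonexistent node made numa_node_to_cpus() return a
 * bitmask with all bits set.  On hosts with enough CPUs, a node that
 * really contains every CPU yields the same all-set mask, so such a
 * node was wrongly treated as absent (the bug). */
static int node_exists_old(int node)
{
    struct bitmask *mask = numa_allocate_cpumask();
    unsigned int i;
    int exists = 0;

    if (numa_node_to_cpus(node, mask) == 0) {
        for (i = 0; i < mask->size; i++) {
            if (!numa_bitmask_isbitset(mask, i)) {
                exists = 1;   /* at least one clear bit => "real" node */
                break;
            }
        }
    }
    numa_bitmask_free(mask);
    return exists;
}

/* Fixed approach: ask libnuma directly whether the node is known. */
static int node_exists_new(int node)
{
    return numa_bitmask_isbitset(numa_nodes_ptr, node);
}

int main(void)
{
    int node;

    if (numa_available() < 0)
        return 1;

    for (node = 0; node <= numa_max_node(); node++)
        printf("node %d: old check=%d, new check=%d\n",
               node, node_exists_old(node), node_exists_new(node));
    return 0;
}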

Comment 8 Jing Qi 2020-09-11 08:23:07 UTC
Verified with libvirt-daemon-6.0.0-25.3.module+el8.2.1+8038+fbea6a05.x86_64 on a dell-per6515 machine:

# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
node 0 size: 31413 MB
node 0 free: 16381 MB
node distances:
node   0 
  0:  10 

Part of the output from "virsh capabilities":

  <topology>
      <cells num='1'>
        <cell id='0'>
          <memory unit='KiB'>32167400</memory>
          <pages unit='KiB' size='4'>5420410</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>10</pages>
          <distances>
            <sibling id='0' value='10'/>
          </distances>
          <cpus num='128'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,64'/>
            <cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1,65'/>
            <cpu id='2' socket_id='0' die_id='0' core_id='2' siblings='2,66'/>
            <cpu id='3' socket_id='0' die_id='0' core_id='3' siblings='3,67'/>
            <cpu id='4' socket_id='0' die_id='0' core_id='4' siblings='4,68'/>
            <cpu id='5' socket_id='0' die_id='0' core_id='5' siblings='5,69'/>
            <cpu id='6' socket_id='0' die_id='0' core_id='6' siblings='6,70'/>
            <cpu id='7' socket_id='0' die_id='0' core_id='7' siblings='7,71'/>
            <cpu id='8' socket_id='0' die_id='0' core_id='8' siblings='8,72'/>
         ...
            <cpu id='127' socket_id='0' die_id='0' core_id='63' siblings='63,127'/>
          </cpus>
        </cell>
      </cells>

Comment 9 Jiri Denemark 2020-09-11 17:35:44 UTC
Looks like "virnuma: Use numa_nodes_ptr when checking available NUMA nodes"
should also be backported to avoid the regression described in bug 1876956.
Right, Michal?

Comment 10 Michal Privoznik 2020-09-11 18:13:42 UTC
Yes, we need that patch too.
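
For context, a minimal sketch of the distinction that follow-up patch relies on, based on my reading of the commit named in comment 9 rather than the actual libvirt diff: numa_all_nodes_ptr covers only nodes on which the calling task may allocate memory, while numa_nodes_ptr covers every node the system has, including memory-less nodes such as the 0 MB nodes in the hp-dl380g10 output in comment 12.

/* Prints which of the two libnuma node masks contains each node. */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    int node;

    if (numa_available() < 0)
        return 1;

    for (node = 0; node <= numa_max_node(); node++)
        printf("node %d: numa_nodes_ptr=%d numa_all_nodes_ptr=%d\n",
               node,
               numa_bitmask_isbitset(numa_nodes_ptr, node),
               numa_bitmask_isbitset(numa_all_nodes_ptr, node));
    return 0;
}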

Comment 12 Jing Qi 2020-09-15 01:11:07 UTC
Verified with libvirt-daemon-6.0.0-25.4.module+el8.2.1+8038+fbea6a05.x86_64 on a dell-per6515 machine:

# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
node 0 size: 31413 MB
node 0 free: 16381 MB
node distances:
node   0 
  0:  10 

Part of the output from "virsh capabilities":

  <topology>
      <cells num='1'>
        <cell id='0'>
          <memory unit='KiB'>32167400</memory>
          <pages unit='KiB' size='4'>5420410</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>10</pages>
          <distances>
            <sibling id='0' value='10'/>
          </distances>
          <cpus num='128'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,64'/>
            <cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1,65'/>
            <cpu id='2' socket_id='0' die_id='0' core_id='2' siblings='2,66'/>
            <cpu id='3' socket_id='0' die_id='0' core_id='3' siblings='3,67'/>
            <cpu id='4' socket_id='0' die_id='0' core_id='4' siblings='4,68'/>
            <cpu id='5' socket_id='0' die_id='0' core_id='5' siblings='5,69'/>
            <cpu id='6' socket_id='0' die_id='0' core_id='6' siblings='6,70'/>
            <cpu id='7' socket_id='0' die_id='0' core_id='7' siblings='7,71'/>
            <cpu id='8' socket_id='0' die_id='0' core_id='8' siblings='8,72'/>
         ...
            <cpu id='127' socket_id='0' die_id='0' core_id='63' siblings='63,127'/>
          </cpus>
        </cell>
      </cells>

Verified on an hp-dl380g10 machine with libvirt-daemon-6.0.0-25.4.module+el8.2.1+8038+fbea6a05.x86_64:

# numactl -H
available: 8 nodes (0-7)
node 0 cpus: 0 1 16 17
node 0 size: 15736 MB
node 0 free: 14312 MB
node 1 cpus: 2 3 18 19
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 4 5 20 21
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 6 7 22 23
node 3 size: 0 MB
node 3 free: 0 MB
node 4 cpus: 8 9 24 25
node 4 size: 16104 MB
node 4 free: 15522 MB
node 5 cpus: 10 11 26 27
node 5 size: 0 MB
node 5 free: 0 MB
node 6 cpus: 12 13 28 29
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus: 14 15 30 31
node 7 size: 0 MB
node 7 free: 0 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  16  16  16  32  32  32  32 
  1:  16  10  16  16  32  32  32  32 
  2:  16  16  10  16  32  32  32  32 
  3:  16  16  16  10  32  32  32  32 
  4:  32  32  32  32  10  16  16  16 
  5:  32  32  32  32  16  10  16  16 
  6:  32  32  32  32  16  16  10  16 
  7:  32  32  32  32  16  16  16  10 

Part of the output from "virsh capabilities":
<topology>
      <cells num='8'>
        <cell id='0'>
          <memory unit='KiB'>16113944</memory>
          <pages unit='KiB' size='4'>4028486</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='16'/>
            <sibling id='2' value='16'/>
            <sibling id='3' value='16'/>
            <sibling id='4' value='32'/>
            <sibling id='5' value='32'/>
            <sibling id='6' value='32'/>
            <sibling id='7' value='32'/>
          </distances>
          <cpus num='4'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,16'/>
            <cpu id='1' socket_id='0' die_id='0' core_id='4' siblings='1,17'/>
            <cpu id='16' socket_id='0' die_id='0' core_id='0' siblings='0,16'/>
            <cpu id='17' socket_id='0' die_id='0' core_id='4' siblings='1,17'/>
          </cpus>
        </cell>
        <cell id='1'>
          <pages unit='KiB' size='4'>0</pages>
          <distances>
            <sibling id='0' value='16'/>
            <sibling id='1' value='10'/>
            <sibling id='2' value='16'/>
            <sibling id='3' value='16'/>
            <sibling id='4' value='32'/>
            <sibling id='5' value='32'/>
            <sibling id='6' value='32'/>
            <sibling id='7' value='32'/>
          </distances>
          <cpus num='4'>
            <cpu id='2' socket_id='0' die_id='0' core_id='8' siblings='2,18'/>
            <cpu id='3' socket_id='0' die_id='0' core_id='12' siblings='3,19'/>
            <cpu id='18' socket_id='0' die_id='0' core_id='8' siblings='2,18'/>
            <cpu id='19' socket_id='0' die_id='0' core_id='12' siblings='3,19'/>
          </cpus>
        </cell>
....
        <cell id='7'>
          <pages unit='KiB' size='4'>0</pages>
          <distances>
            <sibling id='0' value='32'/>
            <sibling id='1' value='32'/>
            <sibling id='2' value='32'/>
            <sibling id='3' value='32'/>
            <sibling id='4' value='16'/>
            <sibling id='5' value='16'/>
            <sibling id='6' value='16'/>
            <sibling id='7' value='10'/>
          </distances>
          <cpus num='4'>
            <cpu id='14' socket_id='1' die_id='0' core_id='24' siblings='14,30'/>
            <cpu id='15' socket_id='1' die_id='0' core_id='28' siblings='15,31'/>
            <cpu id='30' socket_id='1' die_id='0' core_id='24' siblings='14,30'/>
            <cpu id='31' socket_id='1' die_id='0' core_id='28' siblings='15,31'/>
          </cpus>
        </cell>
      </cells>
    </topology>
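
The cross-check performed above (comparing the numactl view with the cells in "virsh capabilities") could also be scripted. Below is a hypothetical sketch using the public libvirt and libnuma C APIs; the substring counting is naive and purely for illustration. Build with: gcc cellcheck.c -lvirt -lnuma (file name hypothetical).

/* Compares the number of <cell id=...> elements libvirt reports with
 * the node count libnuma sees on the same host. */
#include <libvirt/libvirt.h>
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    virConnectPtr conn;
    char *caps;
    const char *p;
    int cells = 0;

    if (numa_available() < 0)
        return 1;
    conn = virConnectOpenReadOnly("qemu:///system");
    if (!conn)
        return 1;

    caps = virConnectGetCapabilities(conn);  /* caller must free */
    if (!caps) {
        virConnectClose(conn);
        return 1;
    }

    for (p = caps; (p = strstr(p, "<cell id=")) != NULL; p++)
        cells++;

    printf("capabilities cells: %d, libnuma nodes (0..max): %d\n",
           cells, numa_max_node() + 1);

    free(caps);
    virConnectClose(conn);
    return 0;
}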

Comment 18 errata-xmlrpc 2020-10-12 09:00:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:8.2 and virt-devel:8.2 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4221