Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1786309

Summary: RFE: support for reporting NUMA HMAT information of host in capabilities
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Daniel Berrangé <berrange>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Jing Qi <jinqi>
Severity: high
Docs Contact:
Priority: high
Version: 8.0
CC: jdenemar, jinqi, jsuchane, lhuang, lmen, mprivozn, xuzhang, yalzhang, yuhuang
Target Milestone: rc
Keywords: FutureFeature, Triaged, Upstream
Target Release: 8.3
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-7.5.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Cloned To: 1842612 (view as bug list)
Environment:
Last Closed: 2021-11-16 07:49:56 UTC
Type: Feature Request
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version: 7.5.0
Embargoed:
Bug Depends On: 1664185, 1730098, 1842612    
Bug Blocks: 1745059    

Description Daniel Berrangé 2019-12-24 10:31:06 UTC
Description of problem:
Libvirt capabilities XML currently reports NUMA node distance information as reported by the ACPI SLIT tables.

In Linux 5.1, support was merged for a new standard known as ACPI HMAT (Heterogeneous Memory Attribute Table). This provides various improvements over SLIT, including the ability to have memory-only nodes and the reporting of cache locality information:

https://lore.kernel.org/patchwork/cover/862903/

The libvirt capabilities XML needs to be extended to report the new information available from HMAT tables.

NB, AFAICT, RHEL-8 kernels do not yet have HMAT reporting backported, and it is unclear whether this will be done or will wait for RHEL-9. So this BZ is mostly a placeholder to remind us of work needed at some point in the future. It might be useful to implement it now as a way to test the implementation of bug 1786303 in the context of Fedora / upstream.
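(As a quick sanity check of whether a given kernel exposes HMAT data at all, upstream kernels publish it via sysfs; the paths below are the upstream layout, and their presence on any particular RHEL kernel is an assumption:)

# HMAT-derived performance data, per target node and access class
# (expected entries: read_latency, write_latency, read_bandwidth,
#  write_bandwidth, plus symlinks to the initiator nodes):
ls /sys/devices/system/node/node0/access0/initiators/

# HMAT memory-side cache description, if the platform reports one
# (expected entries: size, line_size, indexing, write_policy):
ls /sys/devices/system/node/node0/memory_side_cache/index1/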

Version-Release number of selected component (if applicable):
libvirt-5.10.0

Comment 10 Jing Qi 2021-06-08 06:55:48 UTC
Michal, 
Sorry, I don't have a bare-metal machine with HMAT either. 

Jing Qi

Comment 11 Jing Qi 2021-06-10 04:15:13 UTC
Michal,
I borrowed a machine from another team, and it supports HMAT. I installed Fedora Rawhide and I'll send you the machine info. 

Jing Qi

Comment 13 Michal Privoznik 2021-06-10 13:58:35 UTC
(In reply to Jing Qi from comment #11)
> Michal,
> I borrowed a machine from another team, and it supports HMAT. I installed
> Fedora Rawhide and I'll send you the machine info. 
> 
> Jing Qi

Thank you so much! It really helped when preparing v2.

Comment 14 Michal Privoznik 2021-06-15 09:11:11 UTC
Merged upstream:

7d97d7af9e vircaps2xmltest: Introduce HMAT test case
0cc6f8931f capabilities: Expose NUMA interconnects
0d7e62348e numa_conf: Expose virNumaInterconnect formatter
6ad17e290e numa_conf: Rename virDomainNumaInterconnect* to virNumaInterconnect*
5c359377a0 capabilities: Expose NUMA memory side cache
03ba98b259 numa_conf: Expose virNumaCache formatter
b0b7554229 numa_conf: Rename virDomainCache* to virNumaCache*
d6a6ed94f2 capabilities: Separate <cpu/> formatting into a function
137e765891 schemas: Allow zero <cpu/> for capabilities
5899bfd795 tests: glib-ify vircaps2xmltest

v7.4.0-127-g7d97d7af9e
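(For reference, that release string is simply the output of git describe on the newest of those commits, i.e. 127 commits on top of the v7.4.0 tag:

$ git describe 7d97d7af9e
v7.4.0-127-g7d97d7af9e
)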

Comment 15 Jing Qi 2021-06-16 00:43:48 UTC
Verified with libvirt upstream version v7.4.0-133-ga323c5e8b7.

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
node 0 size: 64027 MB
node 0 free: 59994 MB
node 1 cpus: 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
node 1 size: 64329 MB
node 1 free: 63997 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

# virsh capabilities

Part of the result:
<topology>
  <cells num='2'>
    <cell id='0'>
      <memory unit='KiB'>65563940</memory>
      <pages unit='KiB' size='4'>16390985</pages>
      <pages unit='KiB' size='2048'>0</pages>
      <pages unit='KiB' size='1048576'>0</pages>
      <distances>
        <sibling id='0' value='10'/>
        <sibling id='1' value='20'/>
      </distances>
      <cpus num='56'>
        <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,56'/>
        <cpu id='1' socket_id='0' die_id='0' core_id='1' siblings='1,57'/>
        <cpu id='2' socket_id='0' die_id='0' core_id='2' siblings='2,58'/>
        ...
        <cpu id='83' socket_id='0' die_id='0' core_id='27' siblings='27,83'/>
      </cpus>
    </cell>
    <cell id='1'>
      <memory unit='KiB'>65873408</memory>
      <pages unit='KiB' size='4'>16468352</pages>
      <pages unit='KiB' size='2048'>0</pages>
      <pages unit='KiB' size='1048576'>0</pages>
      <distances>
        <sibling id='0' value='20'/>
        <sibling id='1' value='10'/>
      </distances>
      <cpus num='56'>
        <cpu id='28' socket_id='1' die_id='0' core_id='0' siblings='28,84'/>
        <cpu id='29' socket_id='1' die_id='0' core_id='1' siblings='29,85'/>
        ...
        <cpu id='107' socket_id='1' die_id='0' core_id='23' siblings='51,107'/>
        <cpu id='108' socket_id='1' die_id='0' core_id='24' siblings='52,108'/>
        <cpu id='109' socket_id='1' die_id='0' core_id='25' siblings='53,109'/>
        <cpu id='110' socket_id='1' die_id='0' core_id='26' siblings='54,110'/>
        <cpu id='111' socket_id='1' die_id='0' core_id='27' siblings='55,111'/>
      </cpus>
    </cell>
  </cells>
  <interconnects>
    <latency initiator='0' target='0' type='read' value='7600'/>
    <latency initiator='0' target='0' type='write' value='7600'/>
    <latency initiator='1' target='1' type='read' value='7600'/>
    <latency initiator='1' target='1' type='write' value='7600'/>
    <bandwidth initiator='0' target='0' type='read' value='1832960' unit='KiB'/>
    <bandwidth initiator='0' target='0' type='write' value='1955840' unit='KiB'/>
    <bandwidth initiator='1' target='1' type='read' value='1832960' unit='KiB'/>
    <bandwidth initiator='1' target='1' type='write' value='1955840' unit='KiB'/>
  </interconnects>
</topology>
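(To pull just the new elements out of the full capabilities dump, something like xmllint from libxml2, assuming it is installed, can be used:)

# virsh capabilities | xmllint --xpath '//host/topology/interconnects' -

This prints only the <interconnects> element shown above.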

Comment 16 Jing Qi 2021-07-05 07:24:51 UTC
Tested in :
libvirt-daemon-7.5.0-1.module+el8.5.0+11664+59f87560.x86_64
qemu-kvm-6.0.0-21.module+el8.5.0+11555+e0ab0d09.x86_64

In the L1 guest, add the configuration below (a sketch of the QEMU options it maps to follows step 2) - 
1. <cpu mode='host-model' check='partial'>
    <numa>
      <cell id='0' cpus='0-23' memory='4194304' unit='KiB' discard='yes'>
        <cache level='1' associativity='direct' policy='writeback'>
          <size value='10' unit='KiB'/>
          <line value='8' unit='B'/>
        </cache>
        <cache level='2' associativity='full' policy='writethrough'>
          <size value='128' unit='KiB'/>
          <line value='16' unit='B'/>
        </cache>
      </cell>
      <cell id='1' memory='2097152' unit='KiB'>
        <cache level='1' associativity='direct' policy='writeback'>
          <size value='10' unit='KiB'/>
          <line value='8' unit='B'/>
        </cache>
      </cell>
      <interconnects>
        <latency initiator='0' target='0' type='access' value='5'/>
        <latency initiator='0' target='0' type='read' value='6'/>
        <latency initiator='0' target='0' type='write' value='7'/>
        <latency initiator='0' target='1' type='access' value='10'/>
        <latency initiator='0' target='1' type='read' value='11'/>
        <latency initiator='0' target='1' type='write' value='12'/>
        <bandwidth initiator='0' target='0' type='access' value='204800' unit='KiB'/>
        <bandwidth initiator='0' target='0' type='read' value='205824' unit='KiB'/>
        <bandwidth initiator='0' target='0' type='write' value='206848' unit='KiB'/>
        <bandwidth initiator='0' target='1' type='access' value='102400' unit='KiB'/>
        <bandwidth initiator='0' target='1' type='read' value='103424' unit='KiB'/>
        <bandwidth initiator='0' target='1' type='write' value='104448' unit='KiB'/>
      </interconnects>
    </numa>
  </cpu>

2. In the L2 guest, the HMAT info can be listed by "virsh capabilities":

      <cells num='2'>
        <cell id='0'>
          <memory unit='KiB'>3821924</memory>
          <pages unit='KiB' size='4'>955481</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='20'/>
          </distances>
          <cache level='1' associativity='direct' policy='writeback'>
            <size value='10' unit='KiB'/>
            <line value='8' unit='B'/>
          </cache>
          <cache level='2' associativity='full' policy='writethrough'>
            <size value='128' unit='KiB'/>
            <line value='16' unit='B'/>
          </cache>
          <cpus num='24'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0'/>
            <cpu id='1' socket_id='1' die_id='0' core_id='0' siblings='1'/>
            <cpu id='2' socket_id='2' die_id='0' core_id='0' siblings='2'/>
            <cpu id='3' socket_id='3' die_id='0' core_id='0' siblings='3'/>
            <cpu id='4' socket_id='4' die_id='0' core_id='0' siblings='4'/>
            <cpu id='5' socket_id='5' die_id='0' core_id='0' siblings='5'/>
            <cpu id='6' socket_id='6' die_id='0' core_id='0' siblings='6'/>
            <cpu id='7' socket_id='7' die_id='0' core_id='0' siblings='7'/>
            <cpu id='8' socket_id='8' die_id='0' core_id='0' siblings='8'/>
            <cpu id='9' socket_id='9' die_id='0' core_id='0' siblings='9'/>
            <cpu id='10' socket_id='10' die_id='0' core_id='0' siblings='10'/>
            <cpu id='11' socket_id='11' die_id='0' core_id='0' siblings='11'/>
            <cpu id='12' socket_id='12' die_id='0' core_id='0' siblings='12'/>
            <cpu id='13' socket_id='13' die_id='0' core_id='0' siblings='13'/>
            <cpu id='14' socket_id='14' die_id='0' core_id='0' siblings='14'/>
            <cpu id='15' socket_id='15' die_id='0' core_id='0' siblings='15'/>
            <cpu id='16' socket_id='16' die_id='0' core_id='0' siblings='16'/>
            <cpu id='17' socket_id='17' die_id='0' core_id='0' siblings='17'/>
            <cpu id='18' socket_id='18' die_id='0' core_id='0' siblings='18'/>
            <cpu id='19' socket_id='19' die_id='0' core_id='0' siblings='19'/>
            <cpu id='20' socket_id='20' die_id='0' core_id='0' siblings='20'/>
            <cpu id='21' socket_id='21' die_id='0' core_id='0' siblings='21'/>
            <cpu id='22' socket_id='22' die_id='0' core_id='0' siblings='22'/>
            <cpu id='23' socket_id='23' die_id='0' core_id='0' siblings='23'/>
          </cpus>
        </cell>
        <cell id='1'>
          <memory unit='KiB'>2063748</memory>
          <pages unit='KiB' size='4'>515937</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='20'/>
            <sibling id='1' value='10'/>
          </distances>
          <cache level='1' associativity='direct' policy='writeback'>
            <size value='10' unit='KiB'/>
            <line value='8' unit='B'/>
          </cache>
          <cpus num='0'/>
        </cell>
      </cells>
      <interconnects>
        <latency initiator='0' target='0' type='read' value='6'/>
        <latency initiator='0' target='0' type='write' value='7'/>
        <latency initiator='0' target='1' type='read' value='11'/>
        <latency initiator='0' target='1' type='write' value='12'/>
        <bandwidth initiator='0' target='0' type='read' value='205824' unit='KiB'/>
        <bandwidth initiator='0' target='0' type='write' value='206848' unit='KiB'/>
        <bandwidth initiator='0' target='1' type='read' value='103424' unit='KiB'/>
        <bandwidth initiator='0' target='1' type='write' value='104448' unit='KiB'/>
      </interconnects>
    </topology>
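For reference, libvirt turns a guest NUMA description like the one in step 1 into QEMU's HMAT options. A minimal illustrative sketch of the corresponding command-line fragment, assuming QEMU >= 5.0 (where -machine hmat=on and the hmat-lb / hmat-cache -numa subtypes exist), showing only the first latency entry and the level-1 cache of cell 0, and not the exact command libvirt generates:

  -machine hmat=on \
  -object memory-backend-ram,id=ram0,size=4G \
  -numa node,nodeid=0,cpus=0-23,memdev=ram0 \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
  -numa hmat-cache,node-id=0,size=10K,level=1,associativity=direct,policy=write-back,line=8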

But there is an issue related to HMAT. Can you please help confirm whether a bug should be filed?
The configuration below can be saved in "virsh edit", but it then fails schema validation - 
        <bandwidth initiator='0' target='0' cache='1' type='access' value='208896' unit='KiB'/>
        <bandwidth initiator='0' target='0' cache='1' type='read' value='209920' unit='KiB'/>
        <bandwidth initiator='0' target='0' cache='1' type='write' value='210944' unit='KiB'/>

        <bandwidth initiator='0' target='1' cache='1' type='access' value='105472' unit='KiB'/>
        <bandwidth initiator='0' target='1' cache='1' type='read' value='106496' unit='KiB'/>
        <bandwidth initiator='0' target='1' cache='1' type='write' value='107520' unit='KiB'/>

error: XML document failed to validate against schema: Unable to validate doc against /usr/share/libvirt/schemas/domain.rng
Extra element cpu in interleave
Element domain failed to validate content
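(The failure can also be reproduced outside of "virsh edit" by validating a dumped config against the shipped schema; $DOM stands in for the domain name here:)

# virsh dumpxml $DOM > guest.xml    # then add the cache='1' bandwidth lines
# virt-xml-validate guest.xml
or equivalently:
# xmllint --noout --relaxng /usr/share/libvirt/schemas/domain.rng guest.xml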

Comment 17 Michal Privoznik 2021-07-07 12:13:05 UTC
(In reply to Jing Qi from comment #16)
> Tested in :
> libvirt-daemon-7.5.0-1.module+el8.5.0+11664+59f87560.x86_64
> qemu-kvm-6.0.0-21.module+el8.5.0+11555+e0ab0d09.x86_64
> 

> But there is an issue related to HMAT. Can you please help confirm whether
> a bug should be filed?
> The configuration below can be saved in "virsh edit", but it then fails
> schema validation - 
>         <bandwidth initiator='0' target='0' cache='1' type='access'
> value='208896' unit='KiB'/>
>         <bandwidth initiator='0' target='0' cache='1' type='read'
> value='209920' unit='KiB'/>
>         <bandwidth initiator='0' target='0' cache='1' type='write'
> value='210944' unit='KiB'/>
> 
>         <bandwidth initiator='0' target='1' cache='1' type='access'
> value='105472' unit='KiB'/>
>         <bandwidth initiator='0' target='1' cache='1' type='read'
> value='106496' unit='KiB'/>
>         <bandwidth initiator='0' target='1' cache='1' type='write'
> value='107520' unit='KiB'/>
> 
> error: XML document failed to validate against schema: Unable to validate
> doc against /usr/share/libvirt/schemas/domain.rng
> Extra element cpu in interleave
> Element domain failed to validate content

Yes, this is a bug. You can file a new bug or return this one. The fix is trivial enough; I just had to double-check the ACPI specification to see whether such a combination makes sense, and it does.

Comment 18 Jing Qi 2021-07-08 01:30:14 UTC
Filed bug 1980162 for the schema validation issue.

Comment 21 Jing Qi 2021-07-16 07:45:48 UTC
Setting this to verified per comment 16.

Comment 23 errata-xmlrpc 2021-11-16 07:49:56 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684