Bug 1819060 - RFE: support for configuring CPU 'dies' in guest topology - RHV side
Summary: RFE: support for configuring CPU 'dies' in guest topology - RHV side
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.0
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: meital avital
URL:
Whiteboard: libvirt_RHV_INT
Depends On: 1785207 1813395 1821592
Blocks:
 
Reported: 2020-03-31 06:32 UTC by jiyan
Modified: 2021-08-17 16:40 UTC (History)

Fixed In Version:
Clone Of: 1785207
Environment:
Last Closed: 2021-08-17 16:38:23 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.5?
pm-rhel: planning_ack?
pm-rhel: devel_ack?
pm-rhel: testing_ack?


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-38014 0 None None None 2021-08-17 16:40:08 UTC

Description jiyan 2020-03-31 06:32:37 UTC
+++ This bug was initially created as a clone of Bug #1785207 +++

Description of problem:
Latest generation CPUs introduced a new level in the topology referred to as a "die", sitting between the socket & core. 

QEMU added support for this in the -smp argument in 4.1.0, and libvirt needs to expose it in the guest XML configuration:

https://libvirt.org/formatdomain.html#elementsCPU

For example:

  <vcpu placement='static'>12</vcpu>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='2' dies='3' cores='2' threads='1'/>
  </cpu>
 
With such a config we should get

  -smp 12,sockets=2,dies=3,cores=2,threads=1
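
As a sanity check on the numbers above, the vCPU count should equal the product of the four topology levels (2 x 3 x 2 x 1 = 12). The following is only a minimal sketch of that arithmetic; `vcpu_count` and `smp_arg` are hypothetical helper names, not part of libvirt or QEMU, and the real command line libvirt generates may carry additional options.

```python
def vcpu_count(sockets, dies, cores, threads):
    """Total vCPUs implied by a sockets/dies/cores/threads topology."""
    return sockets * dies * cores * threads


def smp_arg(sockets, dies, cores, threads):
    """Format the topology in the style of the -smp argument quoted above.

    Hypothetical helper for illustration only.
    """
    total = vcpu_count(sockets, dies, cores, threads)
    return (
        f"-smp {total},sockets={sockets},dies={dies},"
        f"cores={cores},threads={threads}"
    )


# Topology from the example XML: 2 sockets, 3 dies, 2 cores, 1 thread.
print(smp_arg(2, 3, 2, 1))  # -smp 12,sockets=2,dies=3,cores=2,threads=1
```

If the value in <vcpu> does not match this product, the topology and the vCPU count disagree and the configuration is worth re-checking.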

And inside the guest we should see topology:


# hwloc-ls
Machine (7724MB total)
  NUMANode L#0 (P#0 7724MB)
  Package L#0
    L3 L#0 (16MB)
      L2 L#0 (4096KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
      L2 L#1 (4096KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L3 L#1 (16MB)
      L2 L#2 (4096KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
      L2 L#3 (4096KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L3 L#2 (16MB)
      L2 L#4 (4096KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
      L2 L#5 (4096KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
  Package L#1
    L3 L#3 (16MB)
      L2 L#6 (4096KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
      L2 L#7 (4096KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
    L3 L#4 (16MB)
      L2 L#8 (4096KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
      L2 L#9 (4096KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
    L3 L#5 (16MB)
      L2 L#10 (4096KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
      L2 L#11 (4096KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)


Note 'Package' here maps to 'socket' in libvirt terminology. So the first level below the package is the 'die' and next level is the 'core'. We could introduce hyperthreads if we want yet another level.

Note that on the latest upstream / Fedora kernels there are new sysfs files "die_id", "die_cpus" and "die_cpus_list" under

  /sys/devices/system/cpu/cpuXXX/topology/

which can also be used to validate the guest topology.
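
One possible way to do that validation from inside the guest is to read die_id for every CPU. This is a rough sketch, assuming the kernel exposes the die_id file; `read_die_ids` is a hypothetical helper name, and the `root` parameter exists only so the function can be pointed at a test directory.

```python
import re
from pathlib import Path


def read_die_ids(root="/sys/devices/system/cpu"):
    """Return {cpu_index: die_id} for every CPU that exposes topology/die_id.

    CPUs without the file (older kernels) are skipped, so an empty
    result means die information is not available.
    """
    base = Path(root)
    result = {}
    if not base.is_dir():
        return result
    for cpu_dir in base.iterdir():
        # Match cpu0, cpu1, ... but not cpufreq, cpuidle, etc.
        match = re.fullmatch(r"cpu(\d+)", cpu_dir.name)
        if match is None:
            continue
        die_file = cpu_dir / "topology" / "die_id"
        if die_file.is_file():
            result[int(match.group(1))] = int(die_file.read_text().strip())
    return result


if __name__ == "__main__":
    die_ids = read_die_ids()
    if not die_ids:
        print("no die_id files found (kernel may not expose them)")
    for cpu, die in sorted(die_ids.items()):
        print(f"cpu{cpu}: die_id={die}")
```

With the example topology above, each package would be expected to show three distinct die_id values, but that depends on the guest kernel actually providing the file.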

I'm not sure whether this has been backported to RHEL8 kernels yet.

Version-Release number of selected component (if applicable):
libvirt-5.10.0-1

--- Additional comment from Daniel Berrangé on 2019-12-20 15:22:48 UTC ---

Patches at 

https://www.redhat.com/archives/libvir-list/2019-December/msg01249.html

Comment 1 jiyan 2020-03-31 06:38:54 UTC
I filed this bug because I found that CPU dies appear to be enabled by default on the RHV side.

RHV host Version:
kernel-4.18.0-193.el8.x86_64
libvirt-6.0.0-15.module+el8.2.0+6106+b6345808.x86_64
vdsm-4.40.9-1.el8ev.x86_64
qemu-kvm-4.2.0-16.module+el8.2.0+6092+4f2391c1.x86_64

VM's dumpxml output:
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Cascadelake-Server</model>
    <topology sockets='16' dies='1' cores='8' threads='1'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='mds-no'/>
    <feature policy='disable' name='hle'/>
    <feature policy='disable' name='rtm'/>
    <feature policy='require' name='tsx-ctrl'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='mpx'/>
    <feature policy='require' name='pku'/>
    <numa>
      <cell id='0' cpus='0-7,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124,126' memory='524288' unit='KiB'/>
      <cell id='1' cpus='8-15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127' memory='524288' unit='KiB'/>
    </numa>
  </cpu>

Output of "virsh capabilities":
    <topology>
      <cells num='2'>
        <cell id='0'>
          <memory unit='KiB'>32416392</memory>
          <pages unit='KiB' size='4'>8104098</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='21'/>
          </distances>
          <cpus num='20'>
            <cpu id='0' socket_id='0' die_id='0' core_id='0' siblings='0,20'/>
            <cpu id='2' socket_id='0' die_id='0' core_id='4' siblings='2,22'/>
            <cpu id='4' socket_id='0' die_id='0' core_id='1' siblings='4,24'/>

On the libvirt side, CPU dies are not enabled by default, but RHV seems to enable them.
Even on a physical host that does not support this feature, "dies" equals 1.

So I filed this bug to check whether RHV should do something about dies; please close this bug if I made a mistake.

Comment 2 Michal Skrivanek 2020-06-23 12:35:09 UTC
This request is not currently committed to 4.4.z, moving it to 4.5

Comment 3 Arik 2021-08-17 16:38:23 UTC
(In reply to jiyan from comment #1)
> In libvirt side, we do not enable cpu dies by default, but RHV does.
> Even on physical host which does not support this function, the "dies" will
> equal 1.
> 
> So I filed this bug to see whether RHV should do something about the dies,
> pls close this bug if I made a mistake.

RHV doesn't specify or enable 'dies'; it's libvirt that sets it to 1 when omitted:
"The dies attribute is optional and will default to 1 if omitted, while the other attributes are all mandatory."

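The defaulting rule quoted above can be illustrated with a small sketch that reads a <topology> element and treats a missing dies attribute as 1. This is an illustration only, not libvirt's own code; `topology_from_xml` is a hypothetical helper name, and the XML strings are trimmed versions of the <cpu> element from comment 1.

```python
import xml.etree.ElementTree as ET


def topology_from_xml(cpu_xml):
    """Parse a <cpu> element and return its topology as a dict.

    'dies' is optional and defaults to 1, mirroring the documented
    libvirt behaviour; the other attributes are mandatory.
    """
    topo = ET.fromstring(cpu_xml).find("topology")
    if topo is None:
        raise ValueError("no <topology> element found")
    return {
        "sockets": int(topo.attrib["sockets"]),
        "dies": int(topo.get("dies", "1")),
        "cores": int(topo.attrib["cores"]),
        "threads": int(topo.attrib["threads"]),
    }


# What a management layer might send (no dies attribute) ...
without_dies = "<cpu><topology sockets='16' cores='8' threads='1'/></cpu>"
# ... and what dumpxml shows afterwards.
with_dies = "<cpu><topology sockets='16' dies='1' cores='8' threads='1'/></cpu>"

print(topology_from_xml(without_dies) == topology_from_xml(with_dies))  # True
```

Under this reading, dies='1' in the dumped XML does not by itself indicate that anything above libvirt requested it.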
