Bug 1289368

Summary: [RFE] libvirt support for CAT
Product: Red Hat Enterprise Linux 7
Reporter: Marcelo Tosatti <mtosatti>
Component: libvirt
Assignee: Martin Kletzander <mkletzan>
Status: CLOSED ERRATA
QA Contact: Luyao Huang <lhuang>
Severity: high
Docs Contact: Yehuda Zimmerman <yzimmerm>
Priority: high
Version: 7.3
CC: atelang, brpeters, dbayly, dyuan, fbaudin, jdenemar, jsuchane, juzhang, kchamart, ksanagi, lcapitulino, lhuang, lmiksik, mkletzan, mtessun, mtosatti, pezhang, pliu, pragyansri.pathi, rbalakri, sujith_pandel, tumeya, xfu, xiaolong.wang, xuzhang
Target Milestone: rc
Keywords: FutureFeature
Target Release: 7.5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-3.9.0-10.el7
Doc Type: Enhancement
Doc Text:
CAT support added to *libvirt* on specific CPU models

The *libvirt* service now supports Cache Allocation Technology (CAT) on specific CPU models. This enables guest virtual machines to have part of their host's CPU cache allocated for their vCPU threads. For details on configuring this feature, see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/index.html#sect_VTOG-vCPU_cache_reservation.
Story Points: ---
Clone Of:
Clones: 1299678, 1547015 (view as bug list)
Environment:
Last Closed: 2018-04-10 10:33:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1288964, 1410144    
Bug Blocks: 1513282, 1299678, 1468650, 1469590, 1490967, 1522983, 1547015    

Description Marcelo Tosatti 2015-12-07 23:26:07 UTC
The kernel-side CAT support tracked in the bug below requires corresponding libvirt support:


Bug 1288964 - [Intel 7.3 FEAT] Enable Cache Allocation Technology (CAT)

Comment 2 Brad Peters 2017-08-28 21:30:15 UTC
Marcelo, what's the scope of the work here? 

Is this just an update to a newer libvirt release?

Comment 3 Marcelo Tosatti 2017-10-07 00:24:50 UTC
(In reply to Brad Peters from comment #2)
> Marcelo, what's the scope of the work here? 
> 
> Is this just an update to a newer libvirt release?

Yes. Martin can comment on upstream libvirt acceptance.

Comment 4 Martin Kletzander 2017-10-10 10:29:13 UTC
(In reply to Brad Peters from comment #2)
Yes, hopefully it will be.  There were some spanners in the works, some rough edges here and there, but I have it planned for completion soon.

Comment 5 Martin Kletzander 2017-11-13 08:51:36 UTC
Patch for initial support posted upstream:

https://www.redhat.com/archives/libvir-list/2017-November/msg00424.html

Comment 6 Pragyan Pathi 2017-12-14 22:38:38 UTC
Karen Eli will talk with Martin about the details of the testing requirements.

Comment 10 Luyao Huang 2018-02-07 08:49:02 UTC
Verified this bug with libvirt-3.9.0-11.el7.x86_64:

S1: resctrl mounted without CDP (Code and Data Prioritization)

0. prepare a machine which supports Intel CAT (look for the cat_l3 flag):

# lscpu |grep cat
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
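
A more targeted check than scanning the full flags line (the flag names come from the output above; the grep itself is just a sketch):

# grep -m1 -o 'cat_l3\|cdp_l3' /proc/cpuinfo
cat_l3
cdp_l3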


1. mount resctrl without cdp:

# mount -t resctrl resctrl  /sys/fs/resctrl/
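
To confirm the mount took effect (the output line is what one would expect, not captured from this host):

# mount -t resctrl
resctrl on /sys/fs/resctrl type resctrl (rw,relatime)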

2. check the virsh caps output:

virsh # capabilities 

    <cache>
      <bank id='0' level='3' type='both' size='20' unit='MiB' cpus='0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30'>
        <control granularity='1' unit='MiB' type='both' maxAllocs='16'/>
      </bank>
      <bank id='1' level='3' type='both' size='20' unit='MiB' cpus='1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31'>
        <control granularity='1' unit='MiB' type='both' maxAllocs='16'/>
      </bank>
    </cache>
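
The control values are derived from the kernel's resctrl info directory; a sketch of where to cross-check them, assuming the standard resctrl layout (maxAllocs comes from num_closids, and a 20-bit cbm_mask over a 20 MiB bank gives the 1 MiB granularity):

# cat /sys/fs/resctrl/info/L3/num_closids
16
# cat /sys/fs/resctrl/info/L3/cbm_mask
fffff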

3. restrict the default resctrl group to the low 8 cache ways, leaving the rest free for guest allocations:

# echo "L3:0=000ff;1=000ff" > /sys/fs/resctrl/schemata

4. add <cachetune> elements to the vm1 XML:

# virsh edit vm1
...
  <vcpu placement='static' cpuset='0-2' current='2'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='10-20'/>
    <vcpupin vcpu='2' cpuset='0-9'/>
    <cachetune vcpus='0'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
      <cache id='1' level='3' type='both' size='2' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='1'>
      <cache id='0' level='3' type='both' size='2' unit='MiB'/>
      <cache id='1' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='3'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
      <cache id='1' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
  </cputune>
...

5. start guest:

# virsh start vm1
Domain vm1 started

6. check the resctrl dir; libvirt creates one group per <cachetune>, named qemu-<domain id>-<domain name>-vcpus_<vcpus>, while vCPU 2, which has no <cachetune>, stays in the default group:

# ll /sys/fs/resctrl/
total 0
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus_list
dr-xr-xr-x. 4 root root 0 Feb  7 03:28 info
dr-xr-xr-x. 4 root root 0 Feb  7 03:28 mon_data
dr-xr-xr-x. 2 root root 0 Feb  7 03:28 mon_groups
drwxr-xr-x. 4 root root 0 Feb  7 03:28 qemu-1-vm1-vcpus_0
drwxr-xr-x. 4 root root 0 Feb  7 03:28 qemu-1-vm1-vcpus_1
drwxr-xr-x. 4 root root 0 Feb  7 03:28 qemu-1-vm1-vcpus_3
-rw-r--r--. 1 root root 0 Feb  7 03:19 schemata
-rw-r--r--. 1 root root 0 Feb  7 01:25 tasks

# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_0/tasks 
42754
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_0/schemata 
    L3:0=00100;1=00300
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_1/tasks 
42755
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_1/schemata 
    L3:0=00600;1=00400
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/tasks 
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/schemata 
    L3:0=00800;1=00800
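
These masks line up with the XML from step 4: vcpus_0 got one way on cache 0 (0=00100, 1 MiB) and two ways on cache 1 (1=00300, 2 MiB), and none of the guest masks overlap the default group's 000ff. Counting the set bits per mask (same python3 assumption as above):

# for m in 00100 00300 00600 00400 00800; do python3 -c "print('$m:', bin(0x$m).count('1'))"; done
00100: 1
00300: 2
00600: 2
00400: 1
00800: 1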


7. hotplug 2 vcpus (the vcpus_3 tasks file above is empty because vCPU 3 is offline while current='2'):

# virsh setvcpus vm1 4

8. recheck the vcpus 3 dir:

# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/tasks
43182
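
The new TID should be the QEMU thread for vCPU 3; one way to confirm, as a sketch (the pgrep pattern and the 'CPU 3/KVM' thread name are assumptions about how libvirt and QEMU name things):

# ps -L -o tid,comm -p $(pgrep -f 'guest=vm1') | grep 'CPU 3'
43182 CPU 3/KVM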

9. unplug vcpus:

# virsh setvcpus vm1 2

10. recheck the vcpus 3 dir:

# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/tasks

11. destroy guest:

# virsh destroy vm1
Domain vm1 destroyed

12. check the /sys/fs/resctrl/ dir; the vm1-related groups should have been deleted:

# ll /sys/fs/resctrl/
total 0
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus_list
dr-xr-xr-x. 4 root root 0 Feb  7 03:28 info
dr-xr-xr-x. 4 root root 0 Feb  7 03:28 mon_data
dr-xr-xr-x. 2 root root 0 Feb  7 03:28 mon_groups
-rw-r--r--. 1 root root 0 Feb  7 03:19 schemata
-rw-r--r--. 1 root root 0 Feb  7 01:25 tasks
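
A scripted version of the same check; no output means the cleanup worked:

# ls -d /sys/fs/resctrl/qemu-* 2>/dev/null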

Comment 11 Luyao Huang 2018-02-07 09:05:11 UTC
S2: resctrl mounted with CDP:

1. mount resctrl with cdp:

# mount -t resctrl resctrl -o cdp /sys/fs/resctrl/

2. restart libvirtd

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

3. check the virsh caps:

virsh # capabilities 
...
    <cache>
      <bank id='0' level='3' type='both' size='20' unit='MiB' cpus='0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30'>
        <control granularity='1' unit='MiB' type='code' maxAllocs='8'/>
        <control granularity='1' unit='MiB' type='data' maxAllocs='8'/>
      </bank>
      <bank id='1' level='3' type='both' size='20' unit='MiB' cpus='1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31'>
        <control granularity='1' unit='MiB' type='code' maxAllocs='8'/>
        <control granularity='1' unit='MiB' type='data' maxAllocs='8'/>
      </bank>
    </cache>
...
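
With CDP every allocation consumes a pair of class IDs (one code mask plus one data mask), which is why maxAllocs drops from 16 to 8; the kernel splits the info directory accordingly (standard resctrl layout, values as expected for this host):

# cat /sys/fs/resctrl/info/L3CODE/num_closids
8
# cat /sys/fs/resctrl/info/L3DATA/num_closids
8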

4. restrict the default group's L3DATA and L3CODE masks to the low 8 ways, leaving the rest free for guest allocations:

# echo "L3DATA:0=ff;1=ff" > /sys/fs/resctrl/schemata
# echo "L3CODE:0=ff;1=ff" > /sys/fs/resctrl/schemata

5. add cachetune in guest xml:

  <vcpu placement='static' cpuset='0-2' current='2'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='10-20'/>
    <vcpupin vcpu='2' cpuset='0-9'/>
    <cachetune vcpus='0'>
      <cache id='1' level='3' type='code' size='2' unit='MiB'/>
      <cache id='0' level='3' type='data' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='1'>
      <cache id='0' level='3' type='code' size='2' unit='MiB'/>
      <cache id='1' level='3' type='code' size='1' unit='MiB'/>
      <cache id='0' level='3' type='data' size='2' unit='MiB'/>
      <cache id='1' level='3' type='data' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='3'>
      <cache id='0' level='3' type='code' size='1' unit='MiB'/>
      <cache id='0' level='3' type='data' size='1' unit='MiB'/>
    </cachetune>
  </cputune>

6. start guest:

# virsh start vm1
Domain vm1 started

7. check the values in the resctrl dir; a bank with no <cache> entry for a group (bank 1 for vcpus_3 here) keeps the default mask:

# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_0/schemata 
L3DATA:0=00100;1=000ff
L3CODE:0=000ff;1=00300
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_0/tasks 
46238
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_1/schemata 
L3DATA:0=00600;1=00100
L3CODE:0=00300;1=00400
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_1/tasks 
46239
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/schemata 
L3DATA:0=00800;1=000ff
L3CODE:0=00400;1=000ff
# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/tasks 

8. hotplug vcpu:

# virsh setvcpus vm1 4

9. recheck vcpus 3 resctrl dir:

# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/tasks 
46588

10. unplug vcpu:

# virsh setvcpus vm1 2


11. recheck vcpus 3 resctrl dir:

# cat /sys/fs/resctrl/qemu-1-vm1-vcpus_3/tasks

12. destroy guest:

# virsh destroy vm1
Domain vm1 destroyed

13. check the resctrl dir:

# ll /sys/fs/resctrl/
total 0
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus_list
dr-xr-xr-x. 5 root root 0 Feb  7 03:49 info
dr-xr-xr-x. 4 root root 0 Feb  7 03:55 mon_data
dr-xr-xr-x. 2 root root 0 Feb  7 03:55 mon_groups
-rw-r--r--. 1 root root 0 Feb  7 03:56 schemata
-rw-r--r--. 1 root root 0 Feb  7 01:25 tasks

Comment 12 Luyao Huang 2018-02-07 09:13:57 UTC
S3: check that the groups are not lost after a libvirtd restart:

1. prepare a guest with cachetune settings:
# virsh dumpxml vm1
...
  <cputune>
    <vcpupin vcpu='0' cpuset='10-20'/>
    <vcpupin vcpu='2' cpuset='0-9'/>
    <cachetune vcpus='0'>
      <cache id='1' level='3' type='code' size='2' unit='MiB'/>
      <cache id='0' level='3' type='data' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='1'>
      <cache id='0' level='3' type='code' size='2' unit='MiB'/>
      <cache id='1' level='3' type='code' size='1' unit='MiB'/>
      <cache id='0' level='3' type='data' size='2' unit='MiB'/>
      <cache id='1' level='3' type='data' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='3'>
      <cache id='0' level='3' type='code' size='1' unit='MiB'/>
      <cache id='0' level='3' type='data' size='1' unit='MiB'/>
    </cachetune>
  </cputune>
...


2. start guest:

# virsh start vm1
Domain vm1 started


3. check the resctrl dir (the domain id in the group names is now 2, since this is the second run of vm1):

# ll /sys/fs/resctrl/
total 0
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus_list
dr-xr-xr-x. 5 root root 0 Feb  7 04:04 info
dr-xr-xr-x. 4 root root 0 Feb  7 04:04 mon_data
dr-xr-xr-x. 2 root root 0 Feb  7 04:04 mon_groups
drwxr-xr-x. 4 root root 0 Feb  7 04:08 qemu-2-vm1-vcpus_0
drwxr-xr-x. 4 root root 0 Feb  7 04:08 qemu-2-vm1-vcpus_1
drwxr-xr-x. 4 root root 0 Feb  7 04:08 qemu-2-vm1-vcpus_3
-rw-r--r--. 1 root root 0 Feb  7 03:56 schemata
-rw-r--r--. 1 root root 0 Feb  7 01:25 tasks


4. restart libvirtd:

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

5. destroy guest:

# virsh destroy vm1
Domain vm1 destroyed

6. check the resctrl dir and make sure libvirt removed the per-guest subdirectories:

# ll /sys/fs/resctrl/
total 0
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus
-rw-r--r--. 1 root root 0 Feb  7 01:25 cpus_list
dr-xr-xr-x. 5 root root 0 Feb  7 04:04 info
dr-xr-xr-x. 4 root root 0 Feb  7 04:04 mon_data
dr-xr-xr-x. 2 root root 0 Feb  7 04:04 mon_groups
-rw-r--r--. 1 root root 0 Feb  7 03:56 schemata
-rw-r--r--. 1 root root 0 Feb  7 01:25 tasks

Comment 13 Luyao Huang 2018-02-07 09:34:57 UTC
S4: start multiple guests with cachetune settings to reach the limit of the L3 cache allocations:

1. start 3 guests with the following cachetune settings:


vm1:

  <cputune>
    <vcpupin vcpu='0' cpuset='10-20'/>
    <vcpupin vcpu='2' cpuset='0-9'/>
    <cachetune vcpus='0'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
      <cache id='1' level='3' type='both' size='2' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='1'>
      <cache id='0' level='3' type='both' size='2' unit='MiB'/>
      <cache id='1' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='3'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
  </cputune>

vm2:

  <cputune>
    <vcpupin vcpu='0' cpuset='10-20'/>
    <vcpupin vcpu='2' cpuset='0-9'/>
    <cachetune vcpus='0'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
      <cache id='1' level='3' type='both' size='2' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='1'>
      <cache id='0' level='3' type='both' size='2' unit='MiB'/>
      <cache id='1' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='3'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
      <cache id='1' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
  </cputune>


vm3:

  <cputune>
    <vcpupin vcpu='0' cpuset='10-20'/>
    <vcpupin vcpu='2' cpuset='0-9'/>
    <cachetune vcpus='0'>
      <cache id='0' level='3' type='both' size='3' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='1'>
      <cache id='1' level='3' type='both' size='5' unit='MiB'/>
    </cachetune>
    <cachetune vcpus='2'>
      <cache id='0' level='3' type='both' size='1' unit='MiB'/>
    </cachetune>
  </cputune>


2. check the schemata value for each resctrl group:

# cat /sys/fs/resctrl/*/schemata 
    L3:0=00100;1=00300
    L3:0=00600;1=00400
    L3:0=00800;1=000ff
    L3:0=01000;1=01800
    L3:0=06000;1=02000
    L3:0=08000;1=04000
    L3:0=70000;1=000ff
    L3:0=000ff;1=f8000
    L3:0=80000;1=000ff

3. try to start another guest:

# virsh start vm4
error: Failed to start domain vm4
error: unsupported configuration: Not enough room for allocation of 1048576 bytes for level 3 cache 0 scope type 'both'
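
This failure is expected: OR-ing together the nine cache-0 masks from step 2 already covers all 20 ways of bank 0 (the 000ff entries belong to groups that fall back to the default mask on that bank), so no free way is left for a further 1 MiB (1048576-byte) allocation. A quick check, assuming python3:

# python3 -c "print(hex(0x00100|0x00600|0x00800|0x01000|0x06000|0x08000|0x70000|0x000ff|0x80000))"
0xfffff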

Comment 14 Luyao Huang 2018-02-08 03:06:19 UTC
Verified this bug per comments 10, 11, 12 and 13.

Comment 20 Takuma Umeya 2018-03-23 16:41:32 UTC
*** Bug 1546273 has been marked as a duplicate of this bug. ***

Comment 21 Karl Hastings 2018-03-24 05:34:29 UTC
*** Bug 1531405 has been marked as a duplicate of this bug. ***

Comment 23 errata-xmlrpc 2018-04-10 10:33:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704