Bug 1542901 - Deprecate support for monitoring Intel CMT `perf` events `cmt`, `mbm_local` and `mbm_total`
Summary: Deprecate support for monitoring Intel CMT `perf` events `cmt`, `mbm_local` ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: z3
Target Release: 14.0 (Rocky)
Assignee: Kashyap Chamarthy
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On: 1532553
Blocks: 1397165 1539427
Reported: 2018-02-07 10:14 UTC by Daniel Berrangé
Modified: 2019-09-09 16:53 UTC (History)

Fixed In Version: openstack-nova-18.2.1-0.20190509150811.8e130e2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1532553
Environment:
Last Closed: 2019-07-02 19:45:01 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Launchpad 1751073 0 None None None 2018-02-22 15:00:01 UTC
OpenStack gerrit 565242 0 None MERGED libvirt: Deprecate support for monitoring Intel CMT `perf` events 2020-03-11 10:36:23 UTC
Red Hat Product Errata RHBA-2019:1670 0 None None None 2019-07-02 19:45:32 UTC

Comment 1 Daniel Berrangé 2018-02-07 10:20:16 UTC
Summary for OpenStack impact:

 - Libvirt supports enabling perf event reporting per guest using <perf ../> XML in guest XML
    https://libvirt.org/formatdomain.html#elementsPerf
 - OpenStack has the ability to enable this support via the nova.conf setting "enabled_perf_events" in the [libvirt] section
 - Although libvirt supports many events, OpenStack only supports the cmt, mbmt and mbml perf events
 - Upstream kernel decided the perf framework integration with the cmt, mbmt and mbml events was broken by design and entirely deleted it
 - Upstream kernel has provided a new approach to cmt, mbmt and mbml info reporting that is *not* using the perf framework
 - RHEL-7.5 kernel has backported this change
 - There's unlikely to be any way for libvirt to make this functionality magically re-appear, given the kernel changes. The new approach is completely incompatible with what was done before

IOW, if someone has set  "enabled_perf_events" in nova.conf previously, they will be unable to start any guest, once they upgrade to the RHEL-7.5 kernel.
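
For illustration, here is a minimal Python sketch of how an operator might check whether a given nova.conf lists any of the affected events before upgrading the kernel. It is not part of Nova; the function name and the sample config text are hypothetical, and the set of names simply collects both spellings that appear in this report.

```python
import configparser

# Event names used for Intel CMT in this report. Both the short and the
# long spellings appear in the thread, so the sketch checks for either.
CMT_EVENT_NAMES = {"cmt", "mbml", "mbmt", "mbm_local", "mbm_total"}


def find_cmt_perf_events(conf_text):
    """Return the Intel CMT events listed in [libvirt]/enabled_perf_events."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # The option defaults to an empty list, so a missing section or
    # option is treated as "nothing enabled".
    raw = parser.get("libvirt", "enabled_perf_events", fallback="")
    events = [item.strip() for item in raw.split(",") if item.strip()]
    return [event for event in events if event in CMT_EVENT_NAMES]


# Hypothetical nova.conf fragment, for demonstration only.
sample = """
[libvirt]
enabled_perf_events = cmt, mbml, cpu_cycles
"""
print(find_cmt_perf_events(sample))
```

A non-empty result would mean the host is exposed to the guest start-up failure described above once it runs a kernel without the perf integration.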

Comment 2 Kashyap Chamarthy 2018-02-07 11:22:31 UTC
(In reply to Daniel Berrange from comment #1)
> Summary for OpenStack impact:
> 
>  - Libvirt supports enabling perf event reporting per guest using <perf ../>
> XML in guest XML
>     https://libvirt.org/formatdomain.html#elementsPerf
>  - OpenStack has the ability to enable this support by using nova.conf setting
> "enabled_perf_events" in [libvirt] section
>  - Although libvirt supports many events, openstack only supports the cmt,
> mbmt and mbml perf events
>  - Upstream kernel decided the perf framework integration with cmt, mbmt and
> mbml  events was broken by design and entirely deleted it
>  - Upstream kernel has provided a new approach to cmt, mbmt and mbml  info
> reporting that is *not* using perf framework
>  - RHEL-7.5 kernel has backported this change
>  - There's unlikely to be any way for libvirt to make this functionality
> magically re-appear, given the kernel changes. The new approach is
> completely incompatible with what was done before
> 
> IOW, if someone has set  "enabled_perf_events" in nova.conf previously, they
> will be unable to start any guest, once they upgrade to the RHEL-7.5 kernel.

So at least the consolation here is that 'enabled_perf_events' Nova config attribute defaults to an empty list, therefore not affecting out-of-the-box behavior.

So given this change in RHEL 7.5 kernel behavior, RHOS Nova (and upstream docs too) should loudly and clearly document this.  And tell users _not_ to enable those events via the 'enabled_perf_events' attribute?

Comment 3 Daniel Berrangé 2018-02-07 11:27:21 UTC
Yes, at the very least we need to release note and kbase this issue.

Upstream, there are three possible directions:

 1. Extend nova's support for perf events, so it can enable more than just the cmt, mbmt, mbml features, to make it useful again. I'm unclear if there's any real benefit to this though - depends if there's any monitoring apps that actually care about collecting other perf data items

 2. Simply delete the perf events feature code from nova entirely

 3. Change to support whatever new way of reporting cmt/mbmt/mbml info libvirt provides (if any)

I'm leaning towards (2), but before doing that we should wait to see what, if anything, libvirt does wrt the new infrastructure for reporting cmt/mbmt/mbml information, so we can see if (3) is appropriate. It may take a while before this becomes clear.

Comment 4 Matthew Booth 2018-02-16 16:16:48 UTC
Dan, how does this affect live migration 7.4->7.5 or vice versa? I'm guessing the target domain will fail to come up if it has perf events, and migration will fail early? I guess we could pre-remove perf events for the target domain (same as we tweak a couple of other bits of target domain xml before starting migration), but that sounds like a huge headache of careful upgrade orchestration. Is there anything we can do at the destination such that a non-upgraded Nova sending domain xml with perf events that we can't support will still work?

Comment 5 Daniel Berrangé 2018-02-16 16:33:34 UTC
Yes, it would impact migration as the perf events are no longer supported. So Nova would need to drop those events from the config, or require that the admin has disabled them. There's not much we can nicely do at the target host.
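
To make "drop those events from the config" concrete, here is a rough Python sketch that strips the three events from the <perf> element of a guest XML document. This is only an illustration under stated assumptions, not what Nova does: the function name and the sample XML are hypothetical, and the element layout follows the libvirt <perf> documentation linked in comment #1.

```python
import xml.etree.ElementTree as ET

# libvirt's names for the Intel CMT perf events, as used in comment #1.
CMT_EVENT_NAMES = {"cmt", "mbmt", "mbml"}


def drop_cmt_perf_events(domain_xml):
    """Return the domain XML with Intel CMT <event> entries removed."""
    root = ET.fromstring(domain_xml)
    for perf in root.findall("perf"):
        # findall() returns a list, so removing children here is safe.
        for event in perf.findall("event"):
            if event.get("name") in CMT_EVENT_NAMES:
                perf.remove(event)
        # Drop the <perf> element entirely if nothing is left in it.
        if len(perf) == 0:
            root.remove(perf)
    return ET.tostring(root, encoding="unicode")


# Hypothetical guest XML, for demonstration only.
sample_xml = """
<domain type='kvm'>
  <name>guest</name>
  <perf>
    <event name='cmt' enabled='yes'/>
    <event name='cpu_cycles' enabled='yes'/>
  </perf>
</domain>
"""
print(drop_cmt_perf_events(sample_xml))
```

Other perf events, such as cpu_cycles in the sample, are left untouched.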

FWIW, I expect the likelihood of anyone ever having enabled this feature is slim. I wouldn't be surprised if the number of users is in the low single figures.

Comment 7 tomesioan 2018-02-22 13:50:52 UTC
Hello, 

We are using OpenStack Newton, Ceilometer and libvirt version >2, and we cannot collect the perf measurements (cmt, cpu_cycles, cache_references, cache_misses) even with enabled_perf_events = cmt, cpu_cycles.
We are running Ubuntu 16.04. Do you think this could be due to a similar cause?

Comment 8 Kashyap Chamarthy 2018-02-22 15:00:01 UTC
(In reply to tomesioan from comment #7)
> Hello, 
> 
> We are using Openstack newton, Ceilometer & Libvirt version >2, and we
> cannot collect the perf measurements (cmt, cpu_cycles, cache_references,
> cache_misses) even with enabled_perf_events= cmt, cpu_cycles, 
> We are having Ubuntu 16.04 OS, do you think this can be due to a similar
> cause?

But what is the exact running kernel version?  Please post the output of the below:

   $ uname -r

This kernel regression only shows up on kernel 4.14 or above (or on a downstream kernel that has backported the change, as the RHEL-7.5 kernel has).  So it's important to know precisely which Linux kernel version you're running.
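
As a rough aid, the upstream version comparison can be sketched in Python as below. The function names are hypothetical, and the check only covers the upstream cut-off: a downstream kernel with the change backported (such as the RHEL-7.5 kernel mentioned in comment #1) would not be detected by a version comparison like this.

```python
import re


def kernel_version_tuple(release):
    """Parse the leading 'major.minor' out of a `uname -r` string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def upstream_perf_cmt_removed(release):
    """True if the release is at or above upstream v4.14."""
    # Tuples compare element-wise, so (4, 4) sorts below (4, 14).
    return kernel_version_tuple(release) >= (4, 14)


print(upstream_perf_cmt_removed("4.4.0-96-generic"))  # False
print(upstream_perf_cmt_removed("4.14.0"))            # True
```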

Comment 9 tomesioan 2018-02-22 15:16:04 UTC
Hi, 
The kernel is 4.4.0-96-generic, 

Also my Processor is XEON 2699, which according to https://software.intel.com/en-us/articles/how-to-use-cache-monitoring-technology-in-openstack supports perf measurements. 

BTW, by following the previous link instructions we have managed to make ceilometer poll for those metrics (also by adding them to the pipeline.yaml) and we get "Skip pollster perf.cpu.cycles, no new resources found in this cycle"  ... 
Any ideas?

Thank you very much!!

Comment 10 Kashyap Chamarthy 2018-04-09 14:55:16 UTC
(In reply to Daniel Berrange from comment #1)

[...]

>  - Although libvirt supports many events, openstack only supports the cmt,
> mbmt and mbml perf events

Just an update on the above point. 

tl;dr -- after auditing the code, Nova supports more than just those three Intel Cache Monitoring Technology based events ('cmt', 'mbmt' and 'mbml'), as the `enabled_perf_events` config attribute takes a list of strings.

Details:

(Looking at Git/master; `git describe`: 17.0.0.0rc1-648-g8b081453c5)

In the nova/virt/libvirt/driver.py, we see:

[...]
PERF_EVENTS_CPU_FLAG_MAPPING = {'cmt': 'cmt',
                                'mbml': 'mbm_local',
                                'mbmt': 'mbm_total',
                               }
[...]

But when you look at the _supported_perf_event() method in libvirt/driver.py,

   4816     def _supported_perf_event(self, event, cpu_features):
   4817 
   4818         libvirt_perf_event_name = LIBVIRT_PERF_EVENT_PREFIX + event.upper()
   4819 
   4820         if not hasattr(libvirt, libvirt_perf_event_name):
   4821             LOG.warning("Libvirt doesn't support event type %s.", event)
   4822             return False
   4823 
   4824         if (event in PERF_EVENTS_CPU_FLAG_MAPPING
   4825             and PERF_EVENTS_CPU_FLAG_MAPPING[event] not in cpu_features):
   4826             LOG.warning("Host does not support event type %s.", event)
   4827             return False
   4828 
   4829         return True


We will skip the `in cpu_features` check (line 4825) if an event is not a key in the PERF_EVENTS_CPU_FLAG_MAPPING dict.

So maybe we can't delete this feature from Nova wholesale.  But as Dan Berrangé  noted on IRC: "of course its a question of whether anyone uses what's left".
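
To see the effect of that check in isolation, here is a standalone Python sketch that mirrors the logic of the excerpt above outside of Nova. The `fake_libvirt` object is a hypothetical stand-in for the libvirt Python module, the value given for LIBVIRT_PERF_EVENT_PREFIX is an assumption, and logging is left out.

```python
from types import SimpleNamespace

# Assumed value of the prefix constant referenced in the excerpt.
LIBVIRT_PERF_EVENT_PREFIX = "VIR_PERF_PARAM_"

PERF_EVENTS_CPU_FLAG_MAPPING = {
    "cmt": "cmt",
    "mbml": "mbm_local",
    "mbmt": "mbm_total",
}

# Hypothetical stand-in for the libvirt module: it only needs to expose
# attributes named after the perf events it "knows".
fake_libvirt = SimpleNamespace(
    VIR_PERF_PARAM_CMT="cmt",
    VIR_PERF_PARAM_CPU_CYCLES="cpu_cycles",
)


def supported_perf_event(event, cpu_features, libvirt_module=fake_libvirt):
    """Mirror of the two checks in the excerpt, without logging."""
    name = LIBVIRT_PERF_EVENT_PREFIX + event.upper()
    if not hasattr(libvirt_module, name):
        return False
    # Only events present in the mapping are checked against CPU flags.
    if (event in PERF_EVENTS_CPU_FLAG_MAPPING
            and PERF_EVENTS_CPU_FLAG_MAPPING[event] not in cpu_features):
        return False
    return True


host_features = set()  # a host CPU reporting no CMT-related flags
print(supported_perf_event("cmt", host_features))         # False
print(supported_perf_event("cpu_cycles", host_features))  # True
print(supported_perf_event("mbm_local", host_features))   # False
```

In this sketch `cpu_cycles` passes even on a host without any CMT flags, because it is not a key in the mapping, while `mbm_local` fails at the first check since the stand-in module has no matching attribute.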

Comment 14 hejianle 2018-05-14 02:21:39 UTC
We want to do a live migration in OpenStack, but hit the following error:

ERROR nova.virt.libvirt.driver [req-624be49e-41e2-43e3-aed0-4d93153f12fb 1377a9399b224d648cc7cbabb73db029 e72dfcd709764
71e965ded63a6adfab8 - - -] [instance: 5b8edb49-f9af-40de-adc8-1600ae40dc02] Live Migration failure: the CPU is incompatible with host CPU: Host CPU does not provide required features: cmt, mbm_total, mbm_loca

And the instance xml is the following:
<cpu mode='custom' match='exact' check='full'>
    <model fallback='allow'>Broadwell</model>
    <vendor>Intel</vendor>
    <topology sockets='2' cores='1' threads='1'/>
    <feature policy='require' name='vme'/>
    ....................
    <feature policy='disable' name='cmt'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='disable' name='mbm_total'/>
    <feature policy='disable' name='mbm_local'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='abm'/>
    <feature policy='require' name='hypervisor'/>
  </cpu>

Is this related to this bug?
Is there any solution to the problem?

Comment 21 Kashyap Chamarthy 2019-06-17 14:54:40 UTC
The bug was fixed in this following commit:

    commit fc4794acc6b13afade1bb72a1ae9f574707d2f0d
    Author: Kashyap Chamarthy <kchamart>
    Date:   Tue May 8 10:52:17 2018 +0200
    
        libvirt: Deprecate support for monitoring Intel CMT `perf` events
        
        Upstream Linux kernel has deleted[*] the `perf` framework integration
        with Intel CMT (Cache Monitoring Technology; or "CQM" in Linux kernel
        parlance), because the feature was broken by design -- an
        incompatibility between Linux's `perf` infrastructure and Intel CMT
        hardware support.  It was removed in upstream kernel version v4.14; but
        bear in mind that downstream Linux distributions with lower kernel
        versions than 4.14 have backported the said change.
        
        Nova supports monitoring of the above mentioned Intel CMT events
        (namely: 'cmt', 'mbm_local', and 'mbm_total') via the configuration
        attribute `[libvirt]/enabled_perf_events`. Given that the underlying
        Linux kernel infrastructure for Intel CMT is removed, we should remove
        support for it in Nova too.  Otherwise enabling them in Nova, and
        updating to a Linux kernel 4.14 (or above) will result in instances
        failing to boot.
        
        To that end, deprecate support for the three Intel CMT events in "Rocky"
        release, with the intention to remove support for it in the upcoming
        "Stein" release.  Note that we cannot deprecate / remove
        `enabled_perf_events` config attribute altogether -- since there are
        other[+] `perf` events besides Intel CMT.  Whether anyone is using those
        other events with Nova is a good question to which we don't have an
        equally good answer for, if at all.
        
        [*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c39a0e2
        [+] https://libvirt.org/formatdomain.html#elementsPerf
        
        Closes-Bug: #1751073
        Change-Id: I7e77f87650d966d605807c7be184e670259a81c1
        Signed-off-by: Kashyap Chamarthy <kchamart>

		- - -

Also adding the "deprecation" release note from the commit:

---
deprecations:
  - |
    Support to monitor performance events for Intel CMT (Cache
    Monitoring Technology, or "CQM" in Linux kernel parlance) -- namely
    ``cmt``, ``mbm_local`` and ``mbm_total`` -- via the config attribute
    ``[libvirt]/enabled_perf_events`` is now *deprecated* from Nova, and
    will be *removed* in the "Stein" release.  Otherwise, if you have
    enabled those events, and upgraded to Linux kernel 4.14 (or suitable
    downstream version), it will result in instances failing to boot.

    That is because the Linux kernel has deleted the `perf` framework
    integration with Intel CMT, as the feature was broken by design --
    an incompatibility between Linux's `perf` infrastructure and Intel
    CMT.  It was removed in upstream Linux version v4.14; but bear in
    mind that downstream Linux distributions with lower kernel versions
    than 4.14 have backported the said change.

Comment 22 Kashyap Chamarthy 2019-06-17 15:09:56 UTC
Steps for test verification
---------------------------

The functional test verification steps are quite trivial (and this is 
already verified by unit tests):

(1) On a compute node, enable the Intel CMT perf events `cmt`,
    `mbm_local` and `mbm_total`:
    
      [libvirt]
      enabled_perf_events = cmt, mbm_local, mbm_total

(2) Start a guest.

(3) In nova-compute.log, ensure that you see a warning similar to:

      "Host does not support event cmt|mbm_local|mbm_total" (or
      something like that)

Comment 23 Joe H. Rahme 2019-06-18 13:10:55 UTC
Following the steps described in #c22, I'm marking the BZ as verified.

From the nova-compute.log


2019-06-18 12:41:45.635 1 WARNING nova.virt.libvirt.driver [-] Monitoring Intel CMT `perf` event(s) cmt is deprecated and will be removed in the "Stein" release.  It was broken by design in the Linux kernel, so support for Intel CMT was removed from Linux 4.14 onwards. Therefore it is recommended to not enable them.
2019-06-18 12:41:45.635 1 WARNING nova.virt.libvirt.driver [-] Host does not support event type cmt.
2019-06-18 12:41:45.636 1 WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_local.
2019-06-18 12:41:45.636 1 WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_total.
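
For anyone repeating the verification, the log check can be sketched in Python along these lines. This is only an illustration: the function name is hypothetical, the patterns are modelled on the four warnings quoted above, and the assumption that several deprecated events would be comma-separated in one message is mine, since only a single event appears in this log.

```python
import re

# Patterns modelled on the warnings quoted in this comment.
DEPRECATION_RE = re.compile(
    r"Monitoring Intel CMT `perf` event\(s\) (?P<events>.+?) is deprecated")
UNSUPPORTED_RE = re.compile(
    r"(?:Host does not|Libvirt doesn't) support event type (?P<event>\w+)\.")


def scan_perf_warnings(lines):
    """Collect deprecated and unsupported event names from log lines."""
    result = {"deprecated": [], "unsupported": []}
    for line in lines:
        match = DEPRECATION_RE.search(line)
        if match:
            # Assumption: multiple events would be comma-separated.
            result["deprecated"].extend(
                name.strip() for name in match.group("events").split(","))
        match = UNSUPPORTED_RE.search(line)
        if match:
            result["unsupported"].append(match.group("event"))
    return result


# Log lines abbreviated from the output above.
sample_log = [
    'WARNING nova.virt.libvirt.driver [-] Monitoring Intel CMT `perf` '
    'event(s) cmt is deprecated and will be removed in the "Stein" release.',
    "WARNING nova.virt.libvirt.driver [-] Host does not support event type cmt.",
    "WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_local.",
    "WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_total.",
]
print(scan_perf_warnings(sample_log))
```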

Comment 25 errata-xmlrpc 2019-07-02 19:45:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1670

