Summary for OpenStack impact:

 - Libvirt supports enabling perf event reporting per guest using the <perf ../> element in the guest XML
   https://libvirt.org/formatdomain.html#elementsPerf
 - OpenStack has the ability to enable this support via the nova.conf setting "enabled_perf_events" in the [libvirt] section
 - Although libvirt supports many events, OpenStack only supports the cmt, mbmt and mbml perf events
 - Upstream kernel decided the perf framework integration with the cmt, mbmt and mbml events was broken by design and entirely deleted it
 - Upstream kernel has provided a new approach to cmt, mbmt and mbml info reporting that does *not* use the perf framework
 - RHEL-7.5 kernel has backported this change
 - There's unlikely to be any way for libvirt to make this functionality magically re-appear, given the kernel changes. The new approach is completely incompatible with what was done before

IOW, if someone has previously set "enabled_perf_events" in nova.conf, they will be unable to start any guest once they upgrade to the RHEL-7.5 kernel.
(In reply to Daniel Berrange from comment #1)
> Summary for OpenStack impact:
> [...]
> IOW, if someone has set "enabled_perf_events" in nova.conf previously, they
> will be unable to start any guest, once they upgrade to the RHEL-7.5 kernel.

So at least the consolation here is that the 'enabled_perf_events' Nova config attribute defaults to an empty list, so out-of-the-box behavior is not affected.

Given this change in RHEL 7.5 kernel behavior, RHOS Nova (and the upstream docs too) should loudly and clearly document this, and tell users _not_ to enable those events via the 'enabled_perf_events' attribute?
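As a rough illustration of what "don't enable those events" could look like as a pre-upgrade check, here is a minimal sketch that scans nova.conf text for the affected events. The helper name `find_cmt_events` and the sample config are hypothetical; only the `[libvirt]` section, the `enabled_perf_events` option, and the event names come from this thread. It uses Python's standard `configparser`, which may not handle every construct a real nova.conf can contain.

```python
import configparser

# Event names taken from this thread: the spellings from comment #1
# (cmt, mbmt, mbml) plus the CPU-flag spellings (mbm_total, mbm_local).
CMT_EVENTS = {"cmt", "mbmt", "mbml", "mbm_total", "mbm_local"}


def find_cmt_events(nova_conf_text):
    """Return the Intel CMT events listed in [libvirt]/enabled_perf_events."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    # fallback="" covers both a missing section and a missing option.
    raw = parser.get("libvirt", "enabled_perf_events", fallback="")
    events = [e.strip() for e in raw.split(",") if e.strip()]
    return [e for e in events if e in CMT_EVENTS]


# Hypothetical nova.conf content, for illustration only.
sample = """
[DEFAULT]
debug = false

[libvirt]
enabled_perf_events = cmt, mbml, cpu_cycles
"""

print(find_cmt_events(sample))
```

An empty result means none of the affected events are configured, which is also the out-of-the-box state since the option defaults to an empty list.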
Yes, at the very least we need to release note and kbase this issue.

Upstream, there are three possible directions:

 1. Extend nova's support for perf events, so it can enable more than just the cmt, mbmt, mbml features, to make it useful again. I'm unclear if there's any real benefit to this though - depends if there are any monitoring apps that actually care about collecting other perf data items
 2. Simply delete the perf events feature code from nova entirely
 3. Change to support whatever new way of reporting cmt/mbmt/mbml info libvirt provides (if any)

I'm leaning towards (2), but before doing that we should wait to see what, if anything, libvirt does wrt the new infrastructure for reporting cmt/mbmt/mbml information, so we can see if (3) is appropriate. It may take a while before this becomes clear.
Dan, how does this affect live migration 7.4->7.5 or vice versa? I'm guessing the target domain will fail to come up if it has perf events, and migration will fail early? I guess we could pre-remove perf events for the target domain (same as we tweak a couple of other bits of target domain xml before starting migration), but that sounds like a huge headache of careful upgrade orchestration. Is there anything we can do at the destination such that a non-upgraded Nova sending domain xml with perf events that we can't support will still work?
Yes, it would impact migration, as the perf events are no longer supported. So Nova would need to drop those events from the config, or require that the admin has disabled them. There's not much we can nicely do at the target host.

FWIW, I expect the likelihood of anyone ever having enabled this feature is slim. I wouldn't be surprised if the number of users is in the low single figures.
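To make "drop those events from the config" concrete, here is a minimal sketch of removing the three events from a guest's domain XML. This is not Nova's migration code; `drop_cmt_perf_events` and the sample XML are hypothetical, and the `<perf>`/`<event name=... enabled=.../>` layout is assumed from the libvirt domain format documentation linked in comment #1.

```python
import xml.etree.ElementTree as ET

# The three events named in comment #1.
CMT_EVENTS = {"cmt", "mbmt", "mbml"}


def drop_cmt_perf_events(domain_xml):
    """Return domain XML with the Intel CMT <event> entries removed."""
    root = ET.fromstring(domain_xml)
    for perf in root.findall("perf"):
        for event in perf.findall("event"):
            if event.get("name") in CMT_EVENTS:
                perf.remove(event)
        # Drop the <perf> element entirely if no events are left in it.
        if len(perf) == 0:
            root.remove(perf)
    return ET.tostring(root, encoding="unicode")


# Hypothetical guest definition, for illustration only.
sample = """<domain type='kvm'>
  <name>guest</name>
  <perf>
    <event name='cmt' enabled='yes'/>
    <event name='cpu_cycles' enabled='yes'/>
  </perf>
</domain>"""

print(drop_cmt_perf_events(sample))
```

Other perf events (here `cpu_cycles`) are left untouched, since only the CMT-based ones were removed from the kernel.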
Hello,

We are using OpenStack Newton, Ceilometer and libvirt version >2, and we cannot collect the perf measurements (cmt, cpu_cycles, cache_references, cache_misses) even with enabled_perf_events= cmt, cpu_cycles,

We are running Ubuntu 16.04; do you think this can be due to a similar cause?
(In reply to tomesioan from comment #7)
> [...]
> We are running Ubuntu 16.04; do you think this can be due to a similar
> cause?

But what is the exact running kernel version? Please post the output of the below:

    $ uname -r

This kernel regression only occurs if you're running kernel 4.14 or above (or a downstream kernel that has backported the change, such as the RHEL-7.5 kernel). So it's important to know precisely which Linux kernel version you're running.
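For what it's worth, the comparison against 4.14 can be sketched as below. The function name `kernel_at_least` and the example release strings are illustrative only. Note the limitation stated in comment #1: downstream kernels with a lower version number may have backported the removal, so a `False` here does not prove the perf integration is still present.

```python
import re


def kernel_at_least(release, major, minor):
    """Compare the leading "major.minor" of a `uname -r` string to a threshold."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


# Example release strings in the format printed by `uname -r`.
print(kernel_at_least("4.4.0-96-generic", 4, 14))   # False
print(kernel_at_least("4.15.0-20-generic", 4, 14))  # True
```

On Linux, `platform.release()` from the standard library returns the same string as `uname -r`, so it could be passed in directly.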
Hi,

The kernel is 4.4.0-96-generic. Also, my processor is a Xeon 2699, which according to https://software.intel.com/en-us/articles/how-to-use-cache-monitoring-technology-in-openstack supports perf measurements.

BTW, by following the instructions at that link we have managed to make Ceilometer poll for those metrics (also by adding them to pipeline.yaml), and we get "Skip pollster perf.cpu.cycles, no new resources found in this cycle" ...

Any ideas? Thank you very much!!
(In reply to Daniel Berrange from comment #1)
[...]
> - Although libvirt supports many events, openstack only supports the cmt,
> mbmt and mbml perf events

Just an update on the above point.

tl;dr -- after auditing the code, Nova supports more than just those three Intel Cache Monitoring Technology based events ('cmt', 'mbmt' and 'mbml'), as the `enabled_perf_events` config attribute takes a string list.

Details: (Looking at Git/master; `git describe`: 17.0.0.0rc1-648-g8b081453c5)

In nova/virt/libvirt/driver.py, we see:

    [...]
    PERF_EVENTS_CPU_FLAG_MAPPING = {'cmt': 'cmt',
                                    'mbml': 'mbm_local',
                                    'mbmt': 'mbm_total',
                                    }
    [...]

But when you look at the _supported_perf_event() method in libvirt/driver.py,

    4816     def _supported_perf_event(self, event, cpu_features):
    4817
    4818         libvirt_perf_event_name = LIBVIRT_PERF_EVENT_PREFIX + event.upper()
    4819
    4820         if not hasattr(libvirt, libvirt_perf_event_name):
    4821             LOG.warning("Libvirt doesn't support event type %s.", event)
    4822             return False
    4823
    4824         if (event in PERF_EVENTS_CPU_FLAG_MAPPING
    4825             and PERF_EVENTS_CPU_FLAG_MAPPING[event] not in cpu_features):
    4826             LOG.warning("Host does not support event type %s.", event)
    4827             return False
    4828
    4829         return True

We will skip the `in cpu_features` check (line 4825) if an event is not in the PERF_EVENTS_CPU_FLAG_MAPPING dict.

So maybe we can't delete this feature from Nova wholesale. But as Dan Berrangé noted on IRC: "of course its a question of whether anyone uses what's left".
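The behaviour described above can be reproduced outside Nova with a small standalone sketch. The function body mirrors the quoted method; everything else is an assumption made for the demonstration: `fake_libvirt` is a hypothetical stand-in for the libvirt Python module, the value of `LIBVIRT_PERF_EVENT_PREFIX` is assumed (it is not shown in the excerpt), and the CPU feature list is made up.

```python
import logging
from types import SimpleNamespace

LOG = logging.getLogger(__name__)

# Assumed value of the prefix constant; it is not shown in the excerpt above.
LIBVIRT_PERF_EVENT_PREFIX = "VIR_PERF_PARAM_"

PERF_EVENTS_CPU_FLAG_MAPPING = {
    "cmt": "cmt",
    "mbml": "mbm_local",
    "mbmt": "mbm_total",
}

# Hypothetical stand-in for the libvirt Python module, exposing only the
# constants this demonstration needs.
fake_libvirt = SimpleNamespace(
    VIR_PERF_PARAM_CMT="cmt",
    VIR_PERF_PARAM_CPU_CYCLES="cpu_cycles",
)


def supported_perf_event(event, cpu_features, libvirt=fake_libvirt):
    """Standalone copy of the check quoted above, with libvirt injectable."""
    libvirt_perf_event_name = LIBVIRT_PERF_EVENT_PREFIX + event.upper()

    if not hasattr(libvirt, libvirt_perf_event_name):
        LOG.warning("Libvirt doesn't support event type %s.", event)
        return False

    # Only events listed in the mapping are checked against the CPU flags.
    if (event in PERF_EVENTS_CPU_FLAG_MAPPING
            and PERF_EVENTS_CPU_FLAG_MAPPING[event] not in cpu_features):
        LOG.warning("Host does not support event type %s.", event)
        return False

    return True


host_cpu_features = ["vme", "xsaveopt"]  # made-up list without cmt / mbm_* flags
print(supported_perf_event("cmt", host_cpu_features))         # False
print(supported_perf_event("cpu_cycles", host_cpu_features))  # True
```

With this stand-in, `cmt` is rejected because the host lacks the CPU flag, while `cpu_cycles` passes because it has no entry in the mapping and so never reaches the CPU feature check.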
We now want to do a live migration in OpenStack, but hit the following error:

    ERROR nova.virt.libvirt.driver [req-624be49e-41e2-43e3-aed0-4d93153f12fb 1377a9399b224d648cc7cbabb73db029 e72dfcd70976471e965ded63a6adfab8 - - -] [instance: 5b8edb49-f9af-40de-adc8-1600ae40dc02] Live Migration failure: the CPU is incompatible with host CPU: Host CPU does not provide required features: cmt, mbm_total, mbm_loca

And the instance XML is the following:

    <cpu mode='custom' match='exact' check='full'>
      <model fallback='allow'>Broadwell</model>
      <vendor>Intel</vendor>
      <topology sockets='2' cores='1' threads='1'/>
      <feature policy='require' name='vme'/>
      ....................
      <feature policy='disable' name='cmt'/>
      <feature policy='require' name='xsaveopt'/>
      <feature policy='disable' name='mbm_total'/>
      <feature policy='disable' name='mbm_local'/>
      <feature policy='require' name='pdpe1gb'/>
      <feature policy='require' name='abm'/>
      <feature policy='require' name='hypervisor'/>
    </cpu>

Is this related to this bug? Is there any solution to the problem?
The bug was fixed in the following commit:

    commit fc4794acc6b13afade1bb72a1ae9f574707d2f0d
    Author: Kashyap Chamarthy <kchamart>
    Date:   Tue May 8 10:52:17 2018 +0200

        libvirt: Deprecate support for monitoring Intel CMT `perf` events

        Upstream Linux kernel has deleted[*] the `perf` framework integration
        with Intel CMT (Cache Monitoring Technology; or "CQM" in Linux kernel
        parlance), because the feature was broken by design -- an
        incompatibility between Linux's `perf` infrastructure and Intel CMT
        hardware support.  It was removed in upstream kernel version v4.14; but
        bear in mind that downstream Linux distributions with lower kernel
        versions than 4.14 have backported the said change.

        Nova supports monitoring of the above mentioned Intel CMT events
        (namely: 'cmt', 'mbm_local', and 'mbm_total') via the configuration
        attribute `[libvirt]/enabled_perf_events`.  Given that the underlying
        Linux kernel infrastructure for Intel CMT is removed, we should remove
        support for it in Nova too.  Otherwise enabling them in Nova, and
        updating to a Linux kernel 4.14 (or above) will result in instances
        failing to boot.

        To that end, deprecate support for the three Intel CMT events in
        "Rocky" release, with the intention to remove support for it in the
        upcoming "Stein" release.  Note that we cannot deprecate / remove
        `enabled_perf_events` config attribute altogether -- since there are
        other[+] `perf` events besides Intel CMT.  Whether anyone is using
        those other events with Nova is a good question to which we don't have
        an equally good answer for, if at all.

        [*] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c39a0e2
        [+] https://libvirt.org/formatdomain.html#elementsPerf

        Closes-Bug: #1751073
        Change-Id: I7e77f87650d966d605807c7be184e670259a81c1
        Signed-off-by: Kashyap Chamarthy <kchamart>

- - -

Also adding the "deprecation" release note from the commit:

    ---
    deprecations:
      - |
        Support to monitor performance events for Intel CMT (Cache Monitoring
        Technology, or "CQM" in Linux kernel parlance) -- namely ``cmt``,
        ``mbm_local`` and ``mbm_total`` -- via the config attribute
        ``[libvirt]/enabled_perf_events`` is now *deprecated* from Nova, and
        will be *removed* in the "Stein" release.  Otherwise, if you have
        enabled those events, and upgraded to Linux kernel 4.14 (or suitable
        downstream version), it will result in instances failing to boot.

        That is because the Linux kernel has deleted the `perf` framework
        integration with Intel CMT, as the feature was broken by design -- an
        incompatibility between Linux's `perf` infrastructure and Intel CMT.
        It was removed in upstream Linux version v4.14; but bear in mind that
        downstream Linux distributions with lower kernel versions than 4.14
        have backported the said change.
Steps for test verification
---------------------------

The functional test verification steps are quite trivial (and this is already verified by unit tests):

(1) On a compute node, enable the Intel "CMT" perf events `cmt`, `mbm_local` and `mbm_total`:

        [libvirt]
        enabled_perf_events = cmt, mbm_local, mbm_total

(2) Start a guest.

(3) In nova-compute.log, ensure that you see a warning similar to:

        "Host does not support event cmt|mbm_local|mbm_total"

    (or something like that)
Following the steps described in #c22, I'm marking the BZ as verified.

From the nova-compute.log:

    2019-06-18 12:41:45.635 1 WARNING nova.virt.libvirt.driver [-] Monitoring Intel CMT `perf` event(s) cmt is deprecated and will be removed in the "Stein" release. It was broken by design in the Linux kernel, so support for Intel CMT was removed from Linux 4.14 onwards. Therefore it is recommended to not enable them.
    2019-06-18 12:41:45.635 1 WARNING nova.virt.libvirt.driver [-] Host does not support event type cmt.
    2019-06-18 12:41:45.636 1 WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_local.
    2019-06-18 12:41:45.636 1 WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_total.
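If step (3) of the verification ever needs to be repeated across several compute nodes, the log check could be scripted roughly as follows. This is only a sketch: `unsupported_events` is a hypothetical helper, and the pattern is derived solely from the warning lines pasted in this comment, so other log formats may not match.

```python
import re

# Matches the two "unsupported event" warnings seen in the pasted log.
WARNING_RE = re.compile(
    r"WARNING nova\.virt\.libvirt\.driver .*?"
    r"(?:Host does not support event type|Libvirt doesn't support event type) "
    r"(?P<event>\w+)\."
)


def unsupported_events(log_text):
    """Return the sorted set of events the driver warned about."""
    return sorted({m.group("event") for m in WARNING_RE.finditer(log_text)})


# Warning lines copied from the nova-compute.log excerpt in this comment.
sample_log = """\
2019-06-18 12:41:45.635 1 WARNING nova.virt.libvirt.driver [-] Host does not support event type cmt.
2019-06-18 12:41:45.636 1 WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_local.
2019-06-18 12:41:45.636 1 WARNING nova.virt.libvirt.driver [-] Libvirt doesn't support event type mbm_total.
"""

print(unsupported_events(sample_log))
```

For verification, the returned list should contain all three events that were enabled in step (1).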
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:1670