Bug 1430987 - No cpu model and feature in capabilities
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: aarch64 Linux
Priority: medium, Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Andrea Bolognani
QA Contact: Virtualization Bugs
Keywords: FutureFeature, Reopened
Depends On:
Blocks: 1173757
Reported: 2017-03-09 22:47 EST by weizhang
Modified: 2018-03-16 02:54 EDT

Last Closed: 2017-03-10 02:16:23 EST
Type: Bug
Description weizhang 2017-03-09 22:47:41 EST
Description of problem:
No cpu model and feature in capabilities

Version-Release number of selected component (if applicable):
libvirt-3.1.0-2.el7.aarch64
qemu-kvm-rhev-2.8.0-6.el7.aarch64
kernel-4.9.0-10.el7.aarch64

How reproducible:
100%

Steps to Reproduce:
1. # virsh capabilities
<capabilities>

  <host>
    <uuid>25ef0280-ec82-42b0-8fb6-000173021e9e</uuid>
    <cpu>
      <arch>aarch64</arch>
      <topology sockets='1' cores='8' threads='1'/>
      <pages unit='KiB' size='64'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='524288'/>
    </cpu>
...

Actual results:
No cpu model and feature in capabilities

Expected results:
The CPU model and features are reported in capabilities
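
The difference between the actual and expected results can be checked mechanically. The following is a minimal sketch, assuming the XML pasted in the steps above is closed by hand where the report truncates it with "..."; the function name host_cpu_summary is made up for illustration and is not part of libvirt.

```python
import xml.etree.ElementTree as ET

# Host capabilities as pasted in the report. The closing </host> and
# </capabilities> tags were added by hand (assumption) so the fragment parses.
CAPS_XML = """
<capabilities>
  <host>
    <uuid>25ef0280-ec82-42b0-8fb6-000173021e9e</uuid>
    <cpu>
      <arch>aarch64</arch>
      <topology sockets='1' cores='8' threads='1'/>
      <pages unit='KiB' size='64'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='524288'/>
    </cpu>
  </host>
</capabilities>
"""


def host_cpu_summary(caps_xml):
    """Return (arch, model, features) from <capabilities>/<host>/<cpu>."""
    cpu = ET.fromstring(caps_xml).find("./host/cpu")
    arch = cpu.findtext("arch")
    model = cpu.findtext("model")  # None when no <model> element is reported
    features = [f.get("name") for f in cpu.findall("feature")]
    return arch, model, features


arch, model, features = host_cpu_summary(CAPS_XML)
print(arch, model, features)
```

On this input the model comes back as None and the feature list is empty, which is exactly the behaviour being reported.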

Additional info:
Comment 2 Jiri Denemark 2017-03-10 02:16:23 EST
This is expected. Libvirt doesn't know how to detect the host CPU model on AArch64. In the past "host" CPU model was wrongly reported there, which was a bug in our code. You can use virsh domcapabilities and look for the <cpu> element there. All supported CPU modes and models will be listed there. Specifically, host-passthrough mode should be supported, while host-model should not be supported. And a bunch of CPU models should be listed for custom mode.
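
As a rough illustration of what to look for in the domcapabilities output, here is a minimal sketch that lists the CPU modes and the models offered for custom mode. The XML fragment is hand-written for illustration (an assumption, not output captured from a real host), and cpu_modes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical domcapabilities fragment, written by hand for illustration;
# real output contains many more elements and CPU models.
DOMCAPS_XML = """
<domainCapabilities>
  <arch>aarch64</arch>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='no'/>
    <mode name='custom' supported='yes'>
      <model usable='unknown'>cortex-a53</model>
      <model usable='unknown'>cortex-a57</model>
    </mode>
  </cpu>
</domainCapabilities>
"""


def cpu_modes(domcaps_xml):
    """Map each CPU mode name to (supported, [model names])."""
    root = ET.fromstring(domcaps_xml)
    modes = {}
    for mode in root.findall("./cpu/mode"):
        supported = mode.get("supported") == "yes"
        models = [m.text for m in mode.findall("model")]
        modes[mode.get("name")] = (supported, models)
    return modes


for name, (supported, models) in cpu_modes(DOMCAPS_XML).items():
    print(name, "supported" if supported else "not supported", models)
```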
Comment 3 Andrew Jones 2017-03-10 08:46:38 EST
(In reply to Jiri Denemark from comment #2)
> This is expected. Libvirt doesn't know how to detect the host CPU model on
> AArch64. In the past "host" CPU model was wrongly reported there, which was
> a bug in our code. You can use virsh domcapabilities and look for the <cpu>
> element there. All supported CPU modes and models will be listed there.
> Specifically, host-passthrough mode should be supported, while host-model
> should not be supported. And a bunch of CPU models should be listed for
> custom mode.

Hmm, shouldn't libvirt be able to list the type of cpu and features the host has, whether or not the guest will use host-passthrough? Those seem independent to me, and it seems like libvirt should learn how to probe AArch64 machines for the cpu name and features (likely from a combo of proc and SMBIOS info). IOW, unless I don't understand the point of <capabilities>/<host> (which is quite possible), then I think this BZ should be reopened and flagged as FutureFeature.

Thanks,
drew
Comment 4 Daniel Berrange 2017-03-10 08:54:30 EST
@drew: the 'virsh capabilities' output only cares about reporting the host CPU model & features. My understanding is that aarch64 doesn't have CPU features in the way x86 has named CPUID features. If there is a useful vendor model name for aarch64 though, we should at least expose that.

The second point, about reporting whether a guest type can do host-passthrough/host-model/etc, is dealt with by a separate libvirt command/API: see 'virsh domcapabilities'. If that doesn't report host-passthrough for aarch64, that'd be a bug.
Comment 5 Jiri Denemark 2017-03-10 09:19:13 EST
I remember discussing this with Andrea a few months ago; we looked at an AArch64 host and even /proc/cpuinfo didn't really show anything about the CPU model.

Since I know we were reporting a useless "host" CPU in the capabilities, I thought about this bug as "start reporting it again", which may actually be a wrong interpretation :-) The only thing affected by this is the host-model mode. Both host-passthrough and custom CPU modes are supported even if libvirt is not able to detect what the host CPU is.

Anyway, we could certainly start reporting something useful there, but it should be done through QEMU using the new query-cpu-model-expansion command on the "host" and "max" models. This interface is currently implemented for x86_64 and s390. This looks like upstream material though.
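
For reference, the QMP messages involved would look roughly like the ones built below. This is only a sketch of the request side: it does not talk to a QEMU process, the helper name cpu_model_expansion_cmd is hypothetical, and the choice of "static" as the default expansion type is an assumption made for the example.

```python
import json


def cpu_model_expansion_cmd(model_name, expansion_type="static"):
    """Build the QMP request for query-cpu-model-expansion as a JSON string."""
    return json.dumps({
        "execute": "query-cpu-model-expansion",
        "arguments": {
            "type": expansion_type,          # "static" or "full"
            "model": {"name": model_name},   # e.g. "host" or "max"
        },
    })


for name in ("host", "max"):
    print(cpu_model_expansion_cmd(name))
```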
Comment 6 Daniel Berrange 2017-03-10 09:27:50 EST
FYI, while /proc/cpuinfo doesn't show anything useful, CPUs do have names in SMBIOS. E.g. my Mustang has:

Handle 0x0004, DMI type 4, 48 bytes
Processor Information
        Socket Designation: XGene1
        Type: Other
        Family: ARM
        Manufacturer: AppliedMicro
        ID: 00 00 00 00 01 00 40 00
        Version: A3
        Voltage: Unknown
        External Clock: Unknown
        Max Speed: 2400 MHz
        Current Speed: Unknown
        Status: Populated, Enabled
        Upgrade: None
        L1 Cache Handle: Not Provided
        L2 Cache Handle: Not Provided
        L3 Cache Handle: Not Provided
        Serial Number: Unknown
        Asset Tag: Unknown
        Part Number: APM88XXXX
        Core Count: 8
        Core Enabled: 8
        Characteristics:
                64-bit capable
                Multi-Core
                Execute Protection
                Enhanced Virtualization
                Power/Performance Control


Where 'XGene1' is the processor model name. We could expose this as the CPU model name in the host capabilities even though it's not a direct analogy of what we do on x86; it's closer to the model name "i7-6820HQ" on x86. This would however match what we do on PPC, where we report names like "POWER7_v2.1".
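
Pulling the name out of that record is straightforward. A minimal sketch, assuming the text above has already been captured from the SMBIOS dump (only a subset of the fields is repeated here, and parse_processor_info is a hypothetical helper):

```python
# Subset of the "Processor Information" record quoted above.
DMI_TEXT = """\
Handle 0x0004, DMI type 4, 48 bytes
Processor Information
        Socket Designation: XGene1
        Type: Other
        Family: ARM
        Manufacturer: AppliedMicro
        Version: A3
        Part Number: APM88XXXX
        Core Count: 8
"""


def parse_processor_info(text):
    """Collect the 'Key: value' lines of one DMI record into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # lines without 'Key: value' shape are skipped
            fields[key] = value
    return fields


info = parse_processor_info(DMI_TEXT)
print(info["Socket Designation"], info["Manufacturer"])
```

Note that this relies on the firmware putting something meaningful in "Socket Designation", which is true for the board shown here but is not guaranteed in general.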
Comment 7 Jiri Denemark 2017-03-10 09:37:09 EST
However, we always report something that is possible to use as a guest CPU model to start a new domain with the same CPU. So is XGene1 something we can use this way?
Comment 8 Daniel Berrange 2017-03-10 09:46:11 EST
No, not at this time. QEMU has a fairly limited set of CPU model names accepted and those are only usable with TCG, not KVM - eg things like 'cortex-a57' - 'xgene1' is not on that list. KVM always requires use of host-passthrough though.

The idea that a CPU model listed in the host capabilities can be used for all guests is broken, even though that's what we told people in the past. If an app wants accurate info they must look at the domcapabilities instead.  So I think it would be valid to be explicit about the limitations of host capabilities. IOW, I think it is desirable to be able to list 'xgene1' in the host capabilities and apps just have to switch to dom capabilities if they want accurate info.
Comment 9 Jiri Denemark 2017-03-10 10:18:53 EST
I see, we can break the relation between the CPUs in host capabilities and domain capabilities even further and make them completely separate rather than just slightly different. Especially for AArch64 which never reported anything useful in domain capabilities.

What I still don't understand is the use case for this. I agree it's desirable to report the host CPU in capabilities, but is it only in a "nice to have" category or is this actually required for anything?
Comment 10 Daniel Berrange 2017-03-10 10:22:49 EST
I can see it being useful for host scheduling in OpenStack / oVirt. E.g. while you can't use '-cpu xgene1' directly, you may still want to ensure your VM runs on an xgene1 host. So you would set up an OpenStack scheduler filter that matches on a reported host CPU of 'xgene1' and then launch the VM in host-passthrough mode (i.e. '-cpu host' on the QEMU command line).
Comment 12 Andrew Jones 2017-03-10 10:39:00 EST
(In reply to Daniel Berrange from comment #4)
> @drew: the 'virsh capabilities' output only cares about reporting the host
> CPU model & features. My understanding is that aarch64 doesn't have CPU
> features in the way x86 has named CPUID features. If there is a useful
> vendor model name for aarch64 though, we should at least expose that.

Unfortunately AArch64 cpu features aren't as easily probed as on x86. Instead, they're supposed to be implied by the version of the ARM spec the processor implements. It might still be nice to display some features derived from the cpu name -> cpu spec version -> implied features list though. For example, a cpu that implements v8.1 of the ARM spec will have VHE (Virtualization Host Extensions). Something like that might be nice for higher level management to know about by simply checking host capabilities.
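
The cpu name -> cpu spec version -> implied features chain could be sketched as below. Only the v8.1 -> VHE entry comes from this comment; the tables are deliberately tiny, "example-v8.1-cpu" is a placeholder rather than a real part, and all names are hypothetical.

```python
# Features introduced by each revision of the ARM architecture spec.
# Deliberately incomplete: only v8.1 -> VHE is taken from the discussion.
FEATURES_BY_REVISION = {
    (8, 0): [],
    (8, 1): ["vhe"],
}

# Hypothetical CPU-name table. "example-v8.1-cpu" is a made-up placeholder.
REVISION_BY_CPU = {
    "cortex-a57": (8, 0),
    "example-v8.1-cpu": (8, 1),
}


def implied_features(cpu_name):
    """Return the features implied by the spec revision a CPU implements."""
    revision = REVISION_BY_CPU[cpu_name]
    features = []
    # A CPU is assumed to inherit the features of all earlier revisions.
    for rev in sorted(FEATURES_BY_REVISION):
        if rev <= revision:
            features.extend(FEATURES_BY_REVISION[rev])
    return features


print(implied_features("cortex-a57"))
print(implied_features("example-v8.1-cpu"))
```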

This is all upstream work, of course, but this BZ (being a feature request) could remain open to track it.

Thanks,
drew
Comment 20 Kevin Zhao 2017-10-09 03:47:31 EDT
How is this bug going?

AArch64 does have specific CPU features, though the feature names differ from those on x86. Some of the CPU features can be read from /proc/cpuinfo, but they are not available in the libvirt capabilities.


processor	: 7
BogoMIPS	: 500.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd07
CPU revision	: 2


In OpenStack we are aiming to add AArch64 to os-traits, which will use cpu features.
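
Until libvirt reports them, a management layer could read the features directly. A minimal sketch, using the /proc/cpuinfo block above as input; cpu_features is a hypothetical helper, and how the names would map onto os-traits is left open.

```python
# The /proc/cpuinfo block quoted above (tabs written as \t).
CPUINFO = """\
processor\t: 7
BogoMIPS\t: 500.00
Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer\t: 0x41
CPU architecture: 8
CPU variant\t: 0x1
CPU part\t: 0xd07
CPU revision\t: 2
"""


def cpu_features(cpuinfo_text):
    """Return the flags from the first 'Features' line, or [] if absent."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return value.split()
    return []


print(cpu_features(CPUINFO))
```

On a live host the text would come from open("/proc/cpuinfo").read() instead of the embedded string.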
Comment 21 Dan Zheng 2017-12-06 00:53:03 EST
This problem also exists on s390x.

# virsh capabilities
<capabilities>

  <host>
    <uuid>c1af783c-569a-41ce-9f8e-bb4e542e7d8a</uuid>
    <cpu>
      <arch>s390x</arch>
      <topology sockets='2' cores='1' threads='1'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='1024'/>
    </cpu>

Andrea, do you need a new bug for s390x?
Comment 22 Andrea Bolognani 2017-12-06 05:19:50 EST
(In reply to Dan Zheng from comment #21)
> This problem also exists on s390x.
> 
> # virsh capabilities
> <capabilities>
> 
>   <host>
>     <uuid>c1af783c-569a-41ce-9f8e-bb4e542e7d8a</uuid>
>     <cpu>
>       <arch>s390x</arch>
>       <topology sockets='2' cores='1' threads='1'/>
>       <pages unit='KiB' size='4'/>
>       <pages unit='KiB' size='1024'/>
>     </cpu>
> 
> Andrea, do you need a new bug for s390x?

Yes, please: the details are likely to be fairly different from
aarch64, plus I'm not the one taking care of that architecture :)
Comment 23 Marcin Juszkiewicz 2018-03-12 10:05:08 EDT
Can't libvirt make use of /proc/cpuinfo and keep a table of known aarch64 CPUs like 'lscpu' from util-linux does [1]?

1. https://github.com/karelzak/util-linux/commit/744d62ee0c54963539832ec5943f3d25e0fccfbd

There are not that many CPUs in the aarch64 ecosystem, so this would at least allow us to have live migration between identical CPUs.

/proc/cpuinfo tells:

processor	: 7
BogoMIPS	: 500.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd07
CPU revision	: 2

The Features field is useful. CPU implementer/variant/part/revision could be used to identify identical CPUs. There would still be no compatibility between different CPU vendors, but at least identical processors would be identified.
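
To make the idea concrete, here is a minimal sketch that turns the block above into an identity tuple and looks the part up in a table. The one-entry table is only an excerpt in the spirit of the lscpu one, the helper names are hypothetical, and the code assumes a single block (on a real host every core has its own block and they may differ).

```python
# The /proc/cpuinfo block quoted above (tabs written as \t).
CPUINFO = """\
processor\t: 7
BogoMIPS\t: 500.00
Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer\t: 0x41
CPU architecture: 8
CPU variant\t: 0x1
CPU part\t: 0xd07
CPU revision\t: 2
"""

ID_KEYS = ("CPU implementer", "CPU variant", "CPU part", "CPU revision")

# One-entry excerpt for illustration: implementer 0x41 (ARM), part 0xd07.
KNOWN_PARTS = {(0x41, 0xD07): "Cortex-A57"}


def cpu_identity(cpuinfo_text):
    """Return (implementer, variant, part, revision) as integers."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ID_KEYS:
            fields[key] = int(value.strip(), 0)  # base 0 accepts "0x41" and "2"
    return tuple(fields[k] for k in ID_KEYS)


def same_cpu(cpuinfo_a, cpuinfo_b):
    """True when both hosts report the same identity tuple."""
    return cpu_identity(cpuinfo_a) == cpu_identity(cpuinfo_b)


implementer, variant, part, revision = cpu_identity(CPUINFO)
print(KNOWN_PARTS.get((implementer, part), "unknown"), variant, revision)
```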
Comment 24 Andrea Bolognani 2018-03-14 07:42:51 EDT
(In reply to Marcin Juszkiewicz from comment #23)
> Can not libvirt make a use of /proc/cpuinfo and keep table of known aarch64
> cpus like 'lscpu' from util-linux does [1]?
> 
> 1.
> https://github.com/karelzak/util-linux/commit/744d62ee0c54963539832ec5943f3d25e0fccfbd
> 
> There are not so many cpus in aarch64 ecosystem so this would at least allow
> us to have live migration between same cpus.

Live migration between identical hardware should already work.

I don't think we want to get in the business of decoding this
information ourselves, especially not when one of the sources for
the database used by lscpu is literally "ancient wisdom" :)

> /proc/cpuinfo tells:
> 
> processor	: 7
> BogoMIPS	: 500.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32
> CPU implementer	: 0x41
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0xd07
> CPU revision	: 2
> 
> Features field is useful. CPU implementer/variant/part/revision could be
> used to identify same cpus. Still no compatibility between different cpu
> vendors but at least same processors will be identified.

The features listed here, unlike x86 features such as eg. sse,
can't be toggled on or off so we can't expose them the same way.

IIUC the plan is to define generic vCPU models that are not tied
to any specific real CPU and can run on any host that supports
the required features, including migration. Once that's in place,
we can expose such vCPU models in libvirt.
Comment 25 Marcin Juszkiewicz 2018-03-14 08:03:53 EDT
(In reply to Andrea Bolognani from comment #24)

> Live migration between identical hardware should already work.

Last time we checked (libvirt 3.10, recent Nova, qemu 2.10) it did not:

------------------------------------------------------------------------------------------------------------------
estuary@ref-compute-2:~$ openstack server migrate --live ref-compute-1 --wait 38da6986-2f76-486d-877b-438560d7aa05
Migration pre-check error: CPU doesn't have compatibility.

XML error: Missing CPU model name

Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult (HTTP 400) (Request-ID: req-cea6f860-e78b-469b-9527-a703114b372b)
------------------------------------------------------------------------------------------------------------------

Once I regain access to my test setup at Linaro I will provide more details.

> I don't think we want to get in the business of decoding this
> information ourselves, especially not when one of the sources for
> the database used by lscpu is literally "ancient wisdom" :)

It is still based on data from /proc/cpuinfo, which nowadays allows identifying which CPU we have (not which board).

 
> The features listed here, unlike x86 features such as eg. sse,
> can't be toggled on or off so we can't expose them the same way.

They are still useful when you want to migrate from a CPU with crypto extensions to one which lacks them.
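
A pre-migration check of that kind could be as simple as a set difference. A minimal sketch: the source list is the Features line quoted earlier, while the destination list is a made-up host without the crypto extensions, and missing_on_destination is a hypothetical helper.

```python
def missing_on_destination(src_features, dst_features):
    """Return the features the source has but the destination lacks."""
    return sorted(set(src_features) - set(dst_features))


# Source: the Features line quoted earlier in this bug.
src = "fp asimd evtstrm aes pmull sha1 sha2 crc32".split()
# Destination: hypothetical host without the crypto extensions.
dst = "fp asimd evtstrm crc32".split()

missing = missing_on_destination(src, dst)
print("safe to migrate" if not missing else "missing: " + " ".join(missing))
```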

> IIUC the plan is to define generic vCPU models that are not tied
> to any specific real CPU and can run on any host that supports
> the required features, including migration. Once that's in place,
> we can expose such vCPU models in libvirt.

Would be good.

Right now, if you set cpu_mode = 'custom' + cpu_model = 'cortex-a53' (the lowest common denominator), libvirt (3.10) refuses to work.
Comment 26 Daniel Berrange 2018-03-14 08:13:07 EDT
(In reply to Marcin Juszkiewicz from comment #25)
> (In reply to Andrea Bolognani from comment #24)
> 
> > Live migration between identical hardware should already work.
> 
> Last time we checked (libvirt 3.10, recent Nova, qemu 2.10) it was not:
> 
> -----------------------------------------------------------------------------
> -------------------------------------
> estuary@ref-compute-2:~$ openstack server migrate --live ref-compute-1
> --wait 38da6986-2f76-486d-877b-438560d7aa05
> Migration pre-check error: CPU doesn't have compatibility.
> 
> XML error: Missing CPU model name
> 
> Refer to
> http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult (HTTP
> 400) (Request-ID: req-cea6f860-e78b-469b-9527-a703114b372b)
> -----------------------------------------------------------------------------
> -------------------------------------
> 
> Once I get back access to my test setup at Linaro will provide more details.

FYI, Nova does a bunch of checks before even telling libvirt to try live migration, so it is entirely possible that Nova needs fixing for this situation, rather than libvirt

To check whether libvirt is at fault or not, please try live migration using 'virsh' instead. If it is not, then best to file a bug against Nova.
Comment 27 Marcin Juszkiewicz 2018-03-14 08:43:25 EDT
Will do once I can reach my test setup. Thanks for the pointers.
Comment 28 Marcin Juszkiewicz 2018-03-15 11:20:23 EDT
Nova. ARGH.

root@cb-r1-m1-c1n1:/var/log/libvirt# virsh migrate --copy-storage-all --live debian-cloud-image qemu+ssh://root@10.101.3.103/system tcp://10.101.3.103

2018-03-15 15:15:23.217+0000: initiating migration
2018-03-15 15:15:25.740+0000: shutting down, reason=migrated
2018-03-15T15:15:25.741113Z qemu-system-aarch64: terminating on signal 15 from pid 573 (/usr/sbin/libvirtd)

root@debian:/var/lib/libvirt/images# virsh list --all
 Id    Name                           State
----------------------------------------------------
 9     debian-cloud-image             running


virsh migrated the guest between two XGene1 systems. Will dig into Nova next week.

Thanks again!
Comment 29 Marcin Juszkiewicz 2018-03-16 02:54:15 EDT
Filed https://bugs.launchpad.net/nova/+bug/1756118 - will continue there with Nova changes.
