Bug 1779078 - RHVH 4.4: Failed to run VM on 4.3/4.4 engine (Exit message: the CPU is incompatible with host CPU: Host CPU does not provide required features: hle, rtm)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.2
Assignee: Eduardo Habkost
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On:
Blocks: 1787291 1788122
 
Reported: 2019-12-03 09:02 UTC by cshao
Modified: 2021-09-08 02:48 UTC (History)

Fixed In Version: qemu-kvm-4.2.0-10.module+el8.2.0+5740+c3dff59e
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1787291 1788122 (view as bug list)
Environment:
Last Closed: 2020-05-05 09:52:05 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
all log info (4.26 MB, application/gzip)
2019-12-03 09:02 UTC, cshao
cpuinfo (28.07 KB, text/plain)
2019-12-05 01:30 UTC, cshao


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHELPLAN-30722  0        None      None    None     2021-09-08 02:48:01 UTC
Red Hat Product Errata  RHBA-2020:2017  0        None      None    None     2020-05-05 09:54:15 UTC

Description cshao 2019-12-03 09:02:40 UTC
Created attachment 1641619 [details]
all log info

Description of problem:
RHVH 4.4: Failed to run VM on 4.3 engine (Exit message: the CPU is incompatible with host CPU: Host CPU does not provide required features: hle, rtm)

Version-Release number of selected component (if applicable):
redhat-virtualization-host-4.4.0-20191201.0.el8_1
imgbased-1.2.6-0.1.el8ev.noarch
vdsm-4.40.0-154.git4e13ea9.el8ev.x86_64
ovirt-engine-4.3.7.2-0.1.el7.noarch


How reproducible:
100%

Steps to Reproduce:
1. Install RHVH-4.4-20191201.7-RHVH-x86_64-dvd1.iso via Anaconda GUI.
2. Register RHVH to 4.3 engine
3. Create VM.

Actual results:
RHVH 4.4: Failed to run VM on 4.3 engine:

VM is down with error. Exit message: the CPU is incompatible with host CPU: Host CPU does not provide required features: hle, rtm.

Expected results:
The VM runs successfully on the 4.3 engine.

Additional info:

Comment 1 cshao 2019-12-04 07:05:25 UTC
This also reproduces on the 4.4 engine:
ovirt-engine-4.4.0-0.6.master.el7.noarch

Comment 12 cshao 2019-12-05 01:30:39 UTC
Created attachment 1642259 [details]
cpuinfo

Comment 22 Michal Skrivanek 2019-12-05 09:25:18 UTC
Thank you. It's clearer now.
Apparently domcaps reports that Haswell (the one with TSX) is a supported guest CPU. Jiri, any idea why? caps says the host is noTSX, and the CPU flags do not include hle or rtm, so why does libvirt say "Haswell" is a valid type?
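
For anyone checking their own host, the flag check mentioned here can be sketched in Python as below. This is only an illustration: the function name `missing_cpu_flags` and the sample text are hypothetical and not taken from the attached cpuinfo. On a real host the input would be the contents of /proc/cpuinfo.

```python
def missing_cpu_flags(cpuinfo_text, required=("hle", "rtm")):
    """Return the required flags absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in required if flag not in present]
    # No 'flags' line at all: report every required flag as missing.
    return list(required)


# Hypothetical, heavily shortened cpuinfo text (not the attachment from this bug).
SAMPLE_CPUINFO = """\
processor\t: 0
model name\t: example CPU
flags\t\t: fpu vme sse2 avx2 erms invpcid
"""

print(missing_cpu_flags(SAMPLE_CPUINFO))
```

An empty list would mean the host advertises both TSX features, in which case the plain "Haswell" model should be runnable.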

Comment 25 Michal Skrivanek 2019-12-05 09:26:58 UTC
@cshao, as a workaround you can easily select Haswell-noTSX as a Cluster CPU type and VMs should work just fine. It's just the autodetection that selects a non-working model by default.

Comment 28 cshao 2019-12-05 10:34:28 UTC
(In reply to Michal Skrivanek from comment #25)
> @cshao, as a workaround you can easily select Haswell-noTSX as a Cluster CPU
> type and VMs should work just fine. It's just the autodetection that selects
> a non-working model by default.

You are right.
The VM runs successfully after selecting Haswell-noTSX as the Cluster CPU type.

Thanks.

Comment 31 Jiri Denemark 2019-12-05 10:39:30 UTC
Sigh, this mess is a result of versioned CPU models introduced in QEMU 4.1.0
without proper support for introspection (see bug 1697663, mainly in comments
4, 5, and 6) and thus no support for this in libvirt. Unfortunately, it seems
the needed machine type parameter for query-cpu-definitions QMP command is not
present even in QEMU v4.2.0-rc4.

Anyway, libvirt probes for supported CPU models and their usability on the
current host by probing QEMU with machine type "none". In this case, QEMU
reports that the "Haswell" CPU model is in fact an alias of "Haswell-v4"
("Haswell-noTSX-IBRS" using the original naming) and thus it is marked as
runnable as it does not require the TSX features. See below for the QMP log.

If I try to probe QEMU with machine type pc-i440fx-rhel7.6.0, which is the
default machine type used when starting a domain, I get completely different
results (see the second QMP log). The "Haswell" CPU model is not runnable
anymore because of missing "hle" and "rtm" features. Also I don't see any
"alias-of" fields there (currently unused by libvirt, though).

I don't know what the right fix should be here, but until bug 1697663 is fully
implemented and libvirt starts using it, the CPU model introspection is
completely broken.


# /usr/libexec/qemu-kvm -machine none -nodefaults -nographic -qmp unix:/tmp/ble,server &
# socat STDIO UNIX-CONNECT:/tmp/ble
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 4}, "package": "qemu-kvm-4.1.0-14.module+el8.1.0+4548+ed1300f4"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "query-cpu-definitions"}
{
    "return": [
        ...
        {
            "name": "Haswell-v4",
            "typename": "Haswell-v4-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-v3",
            "typename": "Haswell-v3-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-v2",
            "typename": "Haswell-v2-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-v1",
            "typename": "Haswell-v1-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-noTSX-IBRS",
            "typename": "Haswell-noTSX-IBRS-x86_64-cpu",
            "unavailable-features": [
            ],
            "alias-of": "Haswell-v4",
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-noTSX",
            "typename": "Haswell-noTSX-x86_64-cpu",
            "unavailable-features": [
            ],
            "alias-of": "Haswell-v2",
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-IBRS",
            "typename": "Haswell-IBRS-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "alias-of": "Haswell-v3",
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell",
            "typename": "Haswell-x86_64-cpu",
            "unavailable-features": [
            ],
            "alias-of": "Haswell-v4",
            "static": false,
            "migration-safe": true
        },
        ...
    ]
}


# /usr/libexec/qemu-kvm -machine pc-i440fx-rhel7.6.0 -nodefaults -nographic -qmp unix:/tmp/ble,server &
# socat STDIO UNIX-CONNECT:/tmp/ble
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 4}, "package": "qemu-kvm-4.1.0-14.module+el8.1.0+4548+ed1300f4"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "query-cpu-definitions"}
{
    "return": [
        ...
        {
            "name": "Haswell-v4",
            "typename": "Haswell-v4-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-v3",
            "typename": "Haswell-v3-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-v2",
            "typename": "Haswell-v2-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-v1",
            "typename": "Haswell-v1-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-noTSX-IBRS",
            "typename": "Haswell-noTSX-IBRS-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-noTSX",
            "typename": "Haswell-noTSX-x86_64-cpu",
            "unavailable-features": [
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell-IBRS",
            "typename": "Haswell-IBRS-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },
        {
            "name": "Haswell",
            "typename": "Haswell-x86_64-cpu",
            "unavailable-features": [
                "hle",
                "rtm"
            ],
            "static": false,
            "migration-safe": true
        },
        ...
    ]
}
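
The difference between the two logs above can be found mechanically. The following is a minimal Python sketch, assuming the two `query-cpu-definitions` replies have already been captured as JSON text. The helper names `runnable_models` and `probe_mismatches` are hypothetical, and the sample replies are trimmed to two of the models shown in the logs.

```python
import json


def runnable_models(reply_json):
    """Map CPU model name -> True if no unavailable features are listed.

    Assumption: a missing "unavailable-features" field is treated the same
    as an empty list.
    """
    reply = json.loads(reply_json)
    return {
        model["name"]: not model.get("unavailable-features")
        for model in reply["return"]
    }


def probe_mismatches(none_reply, machine_reply):
    """Names of models whose runnability differs between the two probes."""
    with_none = runnable_models(none_reply)
    with_machine = runnable_models(machine_reply)
    return sorted(
        name
        for name in with_none.keys() & with_machine.keys()
        if with_none[name] != with_machine[name]
    )


# Trimmed replies: machine type "none" first, then pc-i440fx-rhel7.6.0.
NONE_REPLY = """{"return": [
  {"name": "Haswell", "unavailable-features": [], "alias-of": "Haswell-v4"},
  {"name": "Haswell-IBRS", "unavailable-features": ["hle", "rtm"],
   "alias-of": "Haswell-v3"}
]}"""
MACHINE_REPLY = """{"return": [
  {"name": "Haswell", "unavailable-features": ["hle", "rtm"]},
  {"name": "Haswell-IBRS", "unavailable-features": ["hle", "rtm"]}
]}"""

print(probe_mismatches(NONE_REPLY, MACHINE_REPLY))  # ['Haswell']
```

Any name in the result is a model that the "none" probe reports as runnable but the real machine type does not (or the other way around), which is exactly the case that misleads the autodetection here.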

Comment 37 Eduardo Habkost 2019-12-05 22:36:20 UTC
Fix submitted upstream:
https://lore.kernel.org/qemu-devel/20191205223339.764534-1-ehabkost@redhat.com/

Comment 44 Michal Skrivanek 2019-12-06 11:15:07 UTC
      <model usable='yes'>Haswell-noTSX-IBRS</model>
      <model usable='yes'>Haswell-noTSX</model>
      <model usable='no'>Haswell-IBRS</model>
      <model usable='no'>Haswell</model>

seems to work well. Thanks Eduardo!

Comment 45 Danilo de Paula 2019-12-11 14:44:50 UTC
QA_ACK, please?

Comment 64 Ademar Reis 2020-02-05 23:09:26 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 69 Yumei Huang 2020-02-26 04:28:07 UTC
Reproduce:
qemu-kvm-4.2.0-9.module+el8.2.0+5699+b5331ee5
libvirt-client-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64
kernel-4.18.0-179.el8.x86_64

host: dell-per730-28.lab.eng.pek2.redhat.com (Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz)

Haswell is reported as usable by `virsh domcapabilities` even though the host doesn't support hle and rtm.

# virsh domcapabilities
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Haswell-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      ...
    </mode>
    <mode name='custom' supported='yes'>
      ...
      <model usable='yes'>Haswell-noTSX-IBRS</model>
      <model usable='yes'>Haswell-noTSX</model>
      <model usable='no'>Haswell-IBRS</model>
      <model usable='yes'>Haswell</model>             ---------> Haswell is usable
      ...
    </mode>
  </cpu>

# /usr/libexec/qemu-kvm -cpu Haswell
qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EBX.hle [bit 4]
qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11]
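
The warnings above can also be collected by a script, for example when checking many CPU models in a loop. This is a rough Python sketch; the regular expression is written against the two warning lines shown here and is an assumption about the message format, not something taken from QEMU's sources.

```python
import re

# Matches lines such as:
#   qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EBX.hle [bit 4]
WARNING_RE = re.compile(
    r"host doesn't support requested feature: "
    r"CPUID\.(?P<leaf>[0-9A-Fa-f]+H):(?P<reg>[A-Z]+)\.(?P<name>[\w.-]+) "
    r"\[bit (?P<bit>\d+)\]"
)


def missing_features(stderr_text):
    """Return (feature, register, bit) tuples for each matching warning."""
    return [
        (m["name"], m["reg"], int(m["bit"]))
        for m in WARNING_RE.finditer(stderr_text)
    ]


QEMU_STDERR = (
    "qemu-kvm: warning: host doesn't support requested feature: "
    "CPUID.07H:EBX.hle [bit 4]\n"
    "qemu-kvm: warning: host doesn't support requested feature: "
    "CPUID.07H:EBX.rtm [bit 11]\n"
)

print(missing_features(QEMU_STDERR))  # [('hle', 'EBX', 4), ('rtm', 'EBX', 11)]
```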



Verify:
qemu-kvm-4.2.0-12.module+el8.2.0+5858+afd073bc
libvirt-client-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64
kernel-4.18.0-179.el8.x86_64

On the same host, Haswell is no longer usable.

# virsh domcapabilities | grep -A80 '<cpu'
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Haswell-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      ...
    </mode>
    <mode name='custom' supported='yes'>
      ...
      <model usable='yes'>Haswell-noTSX-IBRS</model>
      <model usable='yes'>Haswell-noTSX</model>
      <model usable='no'>Haswell-IBRS</model>
      <model usable='no'>Haswell</model>      ----------> Haswell is not usable.
      ...
    </mode>
  </cpu>
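
The same check can be automated instead of reading the XML by eye. Below is a minimal Python sketch, assuming the full `virsh domcapabilities` output (rooted at `<domainCapabilities>`) is available as a string. The function name `custom_model_usability` is hypothetical, and the sample XML is reduced to the four models listed above.

```python
import xml.etree.ElementTree as ET


def custom_model_usability(domcaps_xml):
    """Map each custom-mode CPU model name to True when usable='yes'."""
    root = ET.fromstring(domcaps_xml)
    return {
        model.text: model.get("usable") == "yes"
        for model in root.findall("./cpu/mode[@name='custom']/model")
    }


# Reduced sample matching the verified output above.
SAMPLE_DOMCAPS = """<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='custom' supported='yes'>
      <model usable='yes'>Haswell-noTSX-IBRS</model>
      <model usable='yes'>Haswell-noTSX</model>
      <model usable='no'>Haswell-IBRS</model>
      <model usable='no'>Haswell</model>
    </mode>
  </cpu>
</domainCapabilities>"""

usability = custom_model_usability(SAMPLE_DOMCAPS)
print(usability["Haswell"], usability["Haswell-noTSX"])  # False True
```

On a host without hle and rtm, a fixed qemu-kvm should give False for "Haswell" and "Haswell-IBRS" and True for the noTSX variants.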

Comment 71 errata-xmlrpc 2020-05-05 09:52:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

