Bug 1523414 - [POWER guests] Verify compatible CPU & hypervisor capabilities across migration
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.5
Hardware: ppc64le Linux
Priority: urgent  Severity: urgent
Target Milestone: rc
Target Release: 7.5
Assigned To: David Gibson
QA Contact: xianwang
Keywords: Patch
Duplicates: 1526266
Depends On:
Blocks: 1399177 1476742 1527213 1396114 1517546 1525303 1525599 1526266 1532050
Reported: 2017-12-07 17:58 EST by David Gibson
Modified: 2018-04-10 20:54 EDT
CC List: 17 users

See Also:
Fixed In Version: qemu-kvm-rhev-2.10.0-18.el7
Last Closed: 2018-04-10 20:52:14 EDT
Type: Bug
External Trackers
Tracker                       ID              Priority  Status  Summary  Last Updated
IBM Linux Technology Center   162805          None      None    None     2017-12-21 09:50 EST
Red Hat Product Errata        RHSA-2018:1104  None      None    None     2018-04-10 20:54 EDT

Description David Gibson 2017-12-07 17:58:35 EST
Description of problem:

At the moment there are some capabilities that qemu conditionally presents to guests based on host CPU and/or hypervisor capabilities.  That's bad, but is there for historical reasons.  This means that if two hosts have different capabilities, it can break migration between them.

So far we've more or less gotten away with this because the capabilities have always been consistent for all supported hosts.  However, with POWER9 there are limitations in Transactional Memory (TM) which means it's not quite compatible with POWER8 (even in allegedly POWER8 compatible mode).  This means we effectively have a different cpu capability depending on the host, so we now need to care about verifying this.

Bug 1517546 is trying to remove the incompatibility.  But it's not clear that we'll be able to do that in time (it can't be tested until DD2.2 chips become available).  So we need the qemu side checking as a back-up plan.  It also fixes some similar potential problems.
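
The kind of check described above can be illustrated with a toy model. This is only a sketch of the idea, not qemu's code: the function names, the dictionary layout and the "htm" key are hypothetical, and the host capability values simply restate what this report says about POWER8 and POWER9.

```python
# Toy model of a capability check; NOT qemu's implementation.
# All names here (missing_caps, can_run, "htm") are hypothetical.

def missing_caps(guest_caps: dict, host_caps: dict) -> list:
    """Return the capabilities the guest expects but the host cannot provide."""
    return sorted(
        name
        for name, wanted in guest_caps.items()
        if wanted and not host_caps.get(name, False)
    )


def can_run(guest_caps: dict, host_caps: dict) -> bool:
    """A guest may start on, or migrate to, a host only if nothing is missing."""
    return not missing_caps(guest_caps, host_caps)


# Assumption taken from this report: a POWER9 host cannot supply
# POWER8-compatible Transactional Memory (HTM), a POWER8 host can.
POWER8_HOST = {"htm": True}
POWER9_HOST = {"htm": False}
GUEST_WITH_HTM = {"htm": True}

if __name__ == "__main__":
    for label, host in (("POWER8", POWER8_HOST), ("POWER9", POWER9_HOST)):
        verdict = "accept" if can_run(GUEST_WITH_HTM, host) else "refuse"
        print(label, verdict, missing_caps(GUEST_WITH_HTM, host))
```

The point of the model is that the guest's expected capabilities are compared explicitly against the destination, so a mismatch is refused up front instead of silently changing what the guest sees.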
Comment 2 David Gibson 2017-12-12 22:45:44 EST
I have posted a first draft upstream, and I'm working on a second spin.
Comment 3 Qunfang Zhang 2017-12-20 21:00:05 EST
Hi, David

Any suggestion on how to verify this? 

Thanks,
Qunfang
Comment 4 David Gibson 2017-12-20 21:45:52 EST
Qunfang,

Here are some checks that should verify this.

A. It should no longer be possible to start a POWER8 compatibility mode guest on POWER9 with the latest machine type

   1. Start guest with "-M pseries-rhel7.4.0,max-cpu-compat=power8" on a POWER9 host

    before fix: guest will start, but HTM will be disabled unexpectedly
    after fix: guest will fail to start, because POWER9 can't supply HTM

B. A migration from a POWER8 guest with pseries-rhel7.4.0 or earlier machine type to a POWER9 host should fail (leaving the source guest running)

  1. Start a guest with "-M pseries-rhel7.4.0,max-cpu-compat=power8" on a POWER8 host
  2. Migrate the guest to a POWER9 host (same parameters)

    before fix: guest will appear to migrate ok, but HTM will no longer work properly which could cause problems later
    after fix: migration will fail, leaving guest intact on the source
Comment 5 Qunfang Zhang 2017-12-21 00:37:08 EST
Thanks David.  After some confirmation on IRC, we'll:

(1) Test P8<->P9 migration with the latest rhel7.5.0 machine type. This should succeed, since the 7.5 machine type disables HTM on both P8 and P9.

(2) Test rhel7.4.0 and earlier machine types according to comment 4; migration from P8 to P9 should fail with these.
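
The test plan above can be written down as a small expectation table. This is a sketch for illustration only: the table layout and function name are hypothetical, and the HTM default for rhel7.4.0 is inferred from comment 4 rather than stated outright.

```python
# Expected P8 -> P9 migration results after the fix, per comments 4 and 5.
# The table and function names are illustrative, not part of any test suite.

HTM_ON_BY_DEFAULT = {
    "pseries-rhel7.5.0": False,  # comment 5: the 7.5 machine type disables HTM
    "pseries-rhel7.4.0": True,   # inferred from comment 4: the guest expects HTM
}


def expected_p8_to_p9_migration(machine_type: str) -> str:
    """Return the expected result of migrating a guest from POWER8 to POWER9."""
    if HTM_ON_BY_DEFAULT[machine_type]:
        # POWER9 cannot supply HTM, so the migration must be refused.
        return "fail, guest stays on source"
    return "succeed"


if __name__ == "__main__":
    for machine_type in HTM_ON_BY_DEFAULT:
        print(machine_type, "->", expected_p8_to_p9_migration(machine_type))
```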
Comment 8 David Gibson 2018-01-17 19:11:52 EST
Pull request sent upstream, waiting for merge.
Comment 9 David Gibson 2018-01-18 20:23:19 EST
Draft backport made, brewing at:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=15049440
Comment 10 David Gibson 2018-01-18 21:28:23 EST
Had to fix a few problems with backport, successful brew at:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=15049550
Comment 12 Miroslav Rezanina 2018-01-23 08:00:05 EST
Fix included in qemu-kvm-rhev-2.10.0-18.el7
Comment 14 David Gibson 2018-01-28 17:47:55 EST
*** Bug 1526266 has been marked as a duplicate of this bug. ***
Comment 15 xianwang 2018-01-30 02:23:58 EST
Bug verification:
version:
4.14.0-33.el7a.ppc64le
qemu-kvm-rhev-2.10.0-18.el7.ppc64le
SLOF-20170724-2.git89f519f.el7.noarch
# ppc64_cpu --smt=off
# echo N > /sys/module/kvm_hv/parameters/indep_threads_mode

machine type:rhel7.5.0
# /usr/libexec/qemu-kvm -M pseries-rhel7.5.0,max-cpu-compat=power8
VNC server running on ::1:5900

machine type:rhel7.4.0
# /usr/libexec/qemu-kvm -M pseries-rhel7.4.0,max-cpu-compat=power8
VNC server running on ::1:5900
qemu-kvm: KVM implementation does not support Transactional Memory, try cap-htm=off

machine type:rhel7.3.0
# /usr/libexec/qemu-kvm -M pseries-rhel7.3.0,max-cpu-compat=power8
VNC server running on ::1:5900
qemu-kvm: KVM implementation does not support Transactional Memory, try cap-htm=off

machine type:rhel7.2.0
# /usr/libexec/qemu-kvm -M pseries-rhel7.2.0,max-cpu-compat=power8
VNC server running on ::1:5900
qemu-kvm: KVM implementation does not support Transactional Memory, try cap-htm=off

So, this bug is fixed.
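
For repeating the checks above across machine types, a helper along these lines could build the command line and recognise the HTM error. It is a minimal sketch: the binary path, the -M options and the error text are copied from the output above, while the function names are hypothetical. It only constructs the argument list and inspects captured output; it does not launch qemu itself.

```python
# Helper sketch for the verification runs above. Function names are
# hypothetical; the path, options and error text come from this comment.

QEMU = "/usr/libexec/qemu-kvm"
HTM_ERROR = "does not support Transactional Memory"


def qemu_argv(machine_type: str, compat: str = "power8", extra_machine_opts=()) -> list:
    """Build the argument list for one run, e.g. -M pseries-rhel7.4.0,max-cpu-compat=power8."""
    machine = ",".join([machine_type, f"max-cpu-compat={compat}", *extra_machine_opts])
    return [QEMU, "-M", machine]


def htm_rejected(output: str) -> bool:
    """True if the captured qemu output contains the HTM capability error."""
    return any(HTM_ERROR in line for line in output.splitlines())


if __name__ == "__main__":
    for release in ("7.5.0", "7.4.0", "7.3.0", "7.2.0"):
        print(" ".join(qemu_argv(f"pseries-rhel{release}")))
```

Passing extra_machine_opts=("cap-htm=off",) appends the option that the error message itself suggests.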
Comment 17 errata-xmlrpc 2018-04-10 20:52:14 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104
