Bug 1525599

Summary: Support pseries cap-htm qemu option in libvirt
Product: Red Hat Enterprise Linux 7
Reporter: Andrea Bolognani <abologna>
Component: libvirt
Assignee: Andrea Bolognani <abologna>
Status: CLOSED ERRATA
QA Contact: Dan Zheng <dzheng>
Severity: high
Docs Contact: Jiri Herrmann <jherrman>
Priority: unspecified
Version: 7.6
CC: abologna, bugproxy, dgibson, dzheng, haizhao, hannsj_uhl, jherrman, junli, sursingh
Target Milestone: rc
Target Release: 7.6
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: libvirt-4.5.0-2.el7
Doc Type: Known Issue
Doc Text:
HTM is disabled for guests running on IBM POWER systems

The Hardware Transactional Memory (HTM) feature currently prevents migrating guest virtual machines from IBM POWER8 to IBM POWER9 hosts, and has therefore been disabled by default. As a consequence, guest virtual machines running on IBM POWER8 and IBM POWER9 hosts cannot use HTM, unless the feature is manually enabled. To do so, change the default `pseries-rhel7.5` machine type of these guests to `pseries-rhel7.4`. Note that guests configured this way cannot be migrated from an IBM POWER8 host to an IBM POWER9 host.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-30 09:52:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1523414    
Bug Blocks: 1513404, 1558351    

Description Andrea Bolognani 2017-12-13 16:24:12 UTC
QEMU will soon give users the ability to control certain CPU and
hypervisor capabilities. libvirt should expose (a subset of) the
relevant knobs.

Comment 2 Andrea Bolognani 2018-01-23 15:47:41 UTC
Initial patches posted upstream. They don't cover the entire feature.

  https://www.redhat.com/archives/libvir-list/2018-January/msg00779.html

Comment 3 Andrea Bolognani 2018-01-23 16:33:44 UTC
David, which of the optional capabilities do you think should be
exposed through libvirt? The ones I'm aware of are

  cap-vsx=bool (Allow Vector Scalar Extensions (VSX))
  cap-htm=bool (Allow Hardware Transactional Memory (HTM))
  cap-dfp=bool (Allow Decimal Floating Point (DFP))
  cap-sbbc=string (Speculation Barrier Bounds Checking (broken, workaround, fixed))
  cap-cfpc=string (Cache Flush on Privilege Change (broken, workaround, fixed))
  cap-ibs=string (Indirect Branch Serialisation (broken, workaround, fixed))

HTM definitely needs to be in, as it's going to be disabled by
default and people might reasonably want to enable it.

SBBC, CFPC and IBS should probably also be exposed? I'm actually
not clear on how they would be used in practice.

VSX and DFP look like they could be skipped, since IIUC they're
going to be enabled by default and there aren't any compelling
reasons to ever disable them.

Comment 4 David Gibson 2018-01-28 22:55:43 UTC
I concur.  I guess you want HTM so it can be enabled (at least until we get HTM-without-suspend sorted out, which could be a while).

You need SBBC, CFPC and IBS for taking explicit control of spectre/workaround mitigation.  The only likely use case I see is that if you definitely want the mitigations in your VM you should set them all to "workaround", as well as update your guest kernel.
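
As a rough illustration of the suggestion above, the following sketch builds a -machine value with all three mitigation capabilities set to "workaround". The helper name build_machine_arg and the bare "pseries" machine type are assumptions made for this example; only the capability names and their allowed values are taken from comment 3. This is not QEMU or libvirt code.

```python
# Illustrative only: capability names and values come from comment 3,
# everything else (helper name, machine type) is hypothetical.
BOOL_CAPS = {"cap-vsx", "cap-htm", "cap-dfp"}
LEVEL_CAPS = {"cap-sbbc", "cap-cfpc", "cap-ibs"}
LEVELS = ("broken", "workaround", "fixed")


def build_machine_arg(machine_type, caps):
    """Return a value for QEMU's -machine option with the given caps appended."""
    parts = [machine_type]
    for name, value in caps.items():
        if name in BOOL_CAPS:
            # Boolean capabilities are rendered as on/off.
            if not isinstance(value, bool):
                raise ValueError(f"{name} expects a bool, got {value!r}")
            parts.append(f"{name}={'on' if value else 'off'}")
        elif name in LEVEL_CAPS:
            # Mitigation capabilities take one of three levels.
            if value not in LEVELS:
                raise ValueError(f"{name} expects one of {LEVELS}, got {value!r}")
            parts.append(f"{name}={value}")
        else:
            raise ValueError(f"unknown capability: {name}")
    return ",".join(parts)


# Request every mitigation explicitly, as suggested above.
mitigations = {cap: "workaround" for cap in ("cap-sbbc", "cap-cfpc", "cap-ibs")}
print(build_machine_arg("pseries", mitigations))
```

The guest kernel still has to be updated separately for the mitigations to take effect, as noted above.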

Comment 5 Andrea Bolognani 2018-02-06 16:57:27 UTC
Second iteration posted upstream.

  https://www.redhat.com/archives/libvir-list/2018-February/msg00310.html

Comment 11 Andrea Bolognani 2018-03-09 15:22:43 UTC
First non-RFC version posted upstream. Only support for the HTM
feature is implemented.

  https://www.redhat.com/archives/libvir-list/2018-March/msg00474.html

Comment 12 David Gibson 2018-04-24 00:30:27 UTC
I've discussed this with Andrea, and the current scope of this BZ is pretty broad: there are a bunch of capabilities flags that have been added to pseries on the qemu side, and we probably don't need support for all (or even most) of them in libvirt.

Refocussing this bug on the HTM capability.  We're not sure if we'll need that one in libvirt either, but it's the most likely.

Comment 13 David Gibson 2018-06-12 05:01:15 UTC
Hi Andrea,

I talked about the HTM emulation stuff to Suraj today.  The kernel emulation is now merged upstream, and Suraj is going to do a backport shortly (see bug 1517546).

We still have some decisions to make about how to handle this at the higher levels, though.

The HTM capability is turned off in qemu by default for the released 2.12 upstream and rhel7.5.0 downstream machine types.  So, I think we basically have two options:

A)
    * Change qemu default back to cap_htm=on for the 3.0 and rhel7.6.0 machine types
    * No support for setting the flag for libvirt
    * The new machine type won't work on POWER9 unless you have a newer kernel with the emulation code in (this is already true of the <=2.11/rhel7.4.0 machine types)
        * To run with POWER9 on an older kernel you'd need to either explicitly set the qemu option or use the 2.12 / rhel7.5.0 machine type.

B)
    * Leave the qemu default as it is - htm available by default only on <=2.11/rhel7.4.0 machine types
    * libvirt adds support for the cap_htm flag and decides on some policy for setting it


Opinions?

Comment 14 Andrea Bolognani 2018-06-12 12:35:38 UTC
(In reply to David Gibson from comment #13)
> Hi Andrea,
> 
> I talked about the HTM emulation stuff to Suraj today.  The kernel emulation
> is now merged upstream, and Suraj is going to do a backport shortly (see bug
> 1517546).
> 
> We still have some decisions to make about how to handle this at the higher
> levels, though.
> 
> The HTM capability is turned off in qemu by default for the released 2.12
> upstream and rhel7.5.0 downstream machine types.  So, I think we basically
> have two options:
> 
> A)
>     * Change qemu default back to cap_htm=on for the 3.0 and rhel7.6.0 machine
> types
>     * No support for setting the flag for libvirt
>     * The new machine type won't work on POWER9 unless you have a newer
> kernel with the emulation code in (this is already true of the
> <=2.11/rhel7.4.0 machine types)
>         * To run with POWER9 on an older kernel you'd need to either
> explicitly set the qemu option or use the 2.12 / rhel7.5.0 machine type.

I think it's fair to expect POWER9 machines to run a recent kernel:
it's recent hardware after all :)

I also expect that, if someone tried to migrate a guest with HTM
enabled to a POWER9 host that is running an older kernel, they
would get back a reasonable error message, since that's a situation
you can already find yourself in by trying to migrate a RHEL 7.4
guest to a RHEL 7.5 POWER9 host.

Overall, choosing this path would keep things slightly awkward in
the short term, with the corresponding workaround mirroring the
one we already provided and documented for RHEL 7.5 (use a
different machine type), but make sure they Just Work™ in the
longer run.

> B)
>     * Leave the qemu default as it is - htm available by default only on
> <=2.11/rhel7.4.0 machine types
>     * libvirt adds support for the cap_htm flag and decides on some policy
> for setting it

libvirt only provides mechanisms, not policies, so it would be up
to the user or the management application to enable the correct
knob; in other words, going down this road would mean anyone
needing HTM will have to explicitly opt-in to it.

On the other hand, part of the reason we disabled HTM by default
in RHEL 7.5 is that we had a hunch not many people actually used
it in practice, and the lack of loud complaints so far seems to
validate the theory, so perhaps it's not that big of a deal if
opt-in is required? Plus it would avoid flip-flopping on the QEMU
default, which can certainly cause confusion.


At the end of the day, neither solution is completely free of
drawbacks. Perhaps the first one would be preferable since it is
easier to understand: "HTM is enabled by default, except for RHEL
7.5", as opposed to "HTM is enabled by default until RHEL 7.4,
disabled by default with no way to turn it on in RHEL 7.5, and
disabled by default but can be turned on with a knob in RHEL 7.6
and forward".

Comment 15 Andrea Bolognani 2018-06-18 07:57:20 UTC
(In reply to David Gibson from comment #13)
> The HTM capability is turned off in qemu by default for the released 2.12
> upstream and rhel7.5.0 downstream machine types.  So, I think we basically
> have two options:
> 
> A)
>     * Change qemu default back to cap_htm=on for the 3.0 and rhel7.6.0 machine
> types
>     * No support for setting the flag for libvirt
>     * The new machine type won't work on POWER9 unless you have a newer
> kernel with the emulation code in (this is already true of the
> <=2.11/rhel7.4.0 machine types)
>         * To run with POWER9 on an older kernel you'd need to either
> explicitly set the qemu option or use the 2.12 / rhel7.5.0 machine type.

IIUC from today's meeting emulated TM needs not only a recent
kernel, but also DD2.2 hardware with up-to-date firmware; given how
much pre-DD2.2 hardware is still around and actively used at least
for development, that would tip the scale towards leaving HTM
disabled for the time being.

Comment 17 David Gibson 2018-06-18 23:43:46 UTC
Yeah, that makes sense to me.  Especially since we don't seem to have heard any screams for HTM support.

This is the do-nothing option from the qemu perspective, what does it mean from the libvirt perspective?

Comment 18 Andrea Bolognani 2018-06-19 07:47:52 UTC
(In reply to David Gibson from comment #17)
> Yeah, that makes sense to me.  Especially since we don't seem to have heard
> any screams for HTM support.
> 
> This is the do-nothing option from the qemu perspective, what does it mean
> from the libvirt perspective?

Implementing a knob that allows users to enable HTM :)

I have code for that that I posted as RFC a while ago, I'll brush
it up and post it in earnest this time around.

Comment 19 Andrea Bolognani 2018-06-19 13:31:40 UTC
v2 series posted upstream.

  https://www.redhat.com/archives/libvir-list/2018-June/msg01374.html

Comment 20 Andrea Bolognani 2018-06-26 10:21:43 UTC
v3 series posted upstream.

  https://www.redhat.com/archives/libvir-list/2018-June/msg01655.html

Comment 21 Andrea Bolognani 2018-07-03 08:09:59 UTC
Patches merged upstream.

commit d4c11171076edfb2e603804e79edf7ccc3cce5dc
Author: Andrea Bolognani <abologna>
Date:   Mon Jul 2 10:37:09 2018 +0200

    qemu: Format the HTM pSeries feature
    
    This makes the feature fully operational.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1525599
    
    Signed-off-by: Andrea Bolognani <abologna>
    Reviewed-by: John Ferlan <jferlan>

v4.5.0-19-gd4c1117107

Comment 24 Dan Zheng 2018-08-10 06:27:12 UTC
Package:
libvirt-4.5.0-6.el7.ppc64le
qemu-kvm-rhev-2.12.0-9.el7.ppc64le
kernel-4.14.0-101.el7a.ppc64le


Case 1: Set 'htm' feature to 'on' and 'off' respectively on ppc64le host and start guest  --- PASS
XML:
  <features>
    <htm state='on'/>
  </features>
Check qemu command line:
-machine pseries-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,cap-htm=on

XML:
  <features>
    <htm state='off'/>
  </features>
Check qemu command line:
-machine pseries-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,cap-htm=off
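
The mapping observed in Case 1 can be sketched as follows. This is a minimal illustration of how the <htm> element corresponds to the cap-htm machine property, not libvirt's actual implementation; the function name htm_machine_suffix and the stripped-down domain XML are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def htm_machine_suffix(domain_xml):
    """Return 'cap-htm=on' or 'cap-htm=off', or None if <htm> is absent."""
    root = ET.fromstring(domain_xml)
    htm = root.find("./features/htm")
    if htm is None:
        # No explicit setting: the machine type default applies.
        return None
    state = htm.get("state")
    if state not in ("on", "off"):
        raise ValueError(f"invalid htm state: {state!r}")
    return f"cap-htm={state}"


# Hypothetical, heavily trimmed domain definition.
example = "<domain type='kvm'><features><htm state='on'/></features></domain>"
print(htm_machine_suffix(example))
```

When the element is missing the sketch returns None, which mirrors the fact that no cap-htm property would need to be appended and the machine type default is used.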

Case 2: Set 'htm' feature to 'on' on x86_64 host  --- PASS
# virsh edit guest
unsupported configuration: The 'htm' feature is not supported for architecture 'x86_64' or machine type 'pc-i440fx-rhel7.6.0'

Case 3: Set invalid syntax for htm feature        --- PASS
Replace parts of <htm state='on'/> in the guest XML with the misspellings 'htmm', 'offf' and 'sstate'; each invalid variant is rejected:

error: XML document failed to validate against schema: Unable to validate doc against /usr/share/libvirt/schemas/domain.rng
Extra element features in interleave
Element domain failed to validate content


Case 4: Migrate guest with htm enabled
Prepare nfs shared storage environment
Edit guest with   <features> <htm state='on'/> </features>
Start guest and do migration
# virsh migrate --live --p2p guest qemu+ssh://<target_host>/system --verbose

The migration should be successful
Check qemu command line on target host, which should include 'cap-htm=on'
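
The final check in Case 4 could be scripted along these lines. This is a sketch under assumptions: the helper name machine_props and the "qemu-kvm" binary name in the sample are hypothetical, and the sample command line is shortened from the one shown in Case 1. On a real host the string would have to be taken from the running QEMU process.

```python
import shlex


def machine_props(cmdline):
    """Parse the -machine argument of a QEMU command line into a dict."""
    args = shlex.split(cmdline)
    for i, arg in enumerate(args[:-1]):
        if arg == "-machine":
            # First field is the machine type, the rest are key=value pairs.
            machine_type, *props = args[i + 1].split(",")
            parsed = {k: v for k, _, v in (p.partition("=") for p in props)}
            parsed["type"] = machine_type
            return parsed
    return {}


# Sample command line, shortened; the binary name is an assumption.
sample = ("qemu-kvm -machine "
          "pseries-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,cap-htm=on")
props = machine_props(sample)
print(props.get("cap-htm") == "on")
```

Parsing the option rather than searching the whole command line for the substring avoids false matches elsewhere in the arguments.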

Comment 25 Dan Zheng 2018-08-10 06:37:31 UTC
For Case 4, the test result will be updated after we get two Power machines. @junli will help with it while I am on vacation.

Andrea, what other in-depth checks or test scenarios do you think are necessary?

Comment 26 Andrea Bolognani 2018-08-10 08:11:15 UTC
(In reply to Dan Zheng from comment #25)
> Andrea, what other in-depth checks or test scenarios do you think are necessary?

Not really, once you have tested migration I'd say we're all set :)

Comment 27 Junxiang Li 2018-08-28 09:27:29 UTC
Case 4 passes with the following environment:
# rpm -q kernel libvirt qemu-kvm-rhev
kernel-3.10.0-933.el7.ppc64le
libvirt-4.5.0-7.el7.ppc64le
qemu-kvm-rhev-2.12.0-11.el7.ppc64le

Comment 28 Dan Zheng 2018-08-28 09:59:34 UTC
Based on comments 24 to 27, I am marking this bug as verified.

Comment 30 errata-xmlrpc 2018-10-30 09:52:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113