Bug 1614948 - qemu-system-ppc64: No Transactional Memory support in TCG, try cap-htm=off
Summary: qemu-system-ppc64: No Transactional Memory support in TCG, try cap-htm=off
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 29
Hardware: ppc64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs PPCTracker
Reported: 2018-08-10 20:21 UTC by Richard W.M. Jones
Modified: 2018-09-27 17:28 UTC (History)

Fixed In Version: qemu-2.11.2-4.fc28
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-27 17:28:06 UTC


Attachments
root.log (210.99 KB, text/plain)
2018-08-10 20:22 UTC, Richard W.M. Jones
build.log (6.06 MB, text/plain)
2018-08-10 20:23 UTC, Richard W.M. Jones

Description Richard W.M. Jones 2018-08-10 20:21:28 UTC
Description of problem:

qemu-system-ppc64 can't run guests under TCG by default. It says:

  qemu-system-ppc64: No Transactional Memory support in TCG, try cap-htm=off

If qemu knows there's no transactional memory support in TCG
why doesn't it add the option itself instead of giving this
error message?  In any case I suspect the actual problem might
be libvirt which is adding the option even though it (presumably)
knows it's trying to run a TCG guest.

Version-Release number of selected component (if applicable):

qemu 2:2.11.2-1.fc28
(other versions of things can be found in the attached root.log)

How reproducible:

100%

Steps to Reproduce:
1. Run libguestfs-test-tool in a ppc64 guest.

Additional info:

https://bugzilla.redhat.com/show_bug.cgi?id=1525599

Comment 1 Richard W.M. Jones 2018-08-10 20:22:26 UTC
Created attachment 1475156
root.log

Comment 2 Richard W.M. Jones 2018-08-10 20:23:42 UTC
Created attachment 1475157
build.log

Comment 3 Richard W.M. Jones 2018-08-10 21:24:21 UTC
Fails differently in Rawhide, it hangs.

(qemu 2:3.0.0-0.1.rc3.fc29)

Comment 4 Cole Robinson 2018-08-21 19:19:16 UTC
Hitting this too. That message comes from:

commit ee76a09fc72cfbfab2bb5529320ef7e460adffd8
Author: David Gibson <david@gibson.dropbear.id.au>
Date:   Mon Dec 11 13:10:44 2017 +1100

    spapr: Treat Hardware Transactional Memory (HTM) as an optional capability

Libvirt has XML support for setting the cap-htm flag, but this really should do the right thing out of the box for TCG guests. That qemu commit message makes it sound like cap-htm=off shouldn't be explicitly required, but maybe I'm misreading.

David or Andrea do you know what the solution is?

Comment 5 David Gibson 2018-08-22 00:36:14 UTC
There used to be a bunch of places, including this, where we advertised different capabilities depending on what the accelerator could support.  The problem is that this means we're presenting an (incompatibly) different environment to the guest depending on the accelerator.  That's caused us gradually increasing amounts of grief.

So the new approach we're trying to take is to always present exactly the requested environment to the guest, and simply fail if the accelerator (or whatever) can't support it.  It leads to oddities like this, but overall it's the less bad option.

That said, in practice, this particular one should go away with qemu-2.12.  It turns out that hardly anyone actually uses HTM in guests anyway, and there were also complications with KVM on POWER9, and so we've disabled it by default in the pseries-2.12 and later machine types.
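
The difference between the old and new approach can be illustrated with a toy shell model. This is only a sketch of the policy described above: the function names `old_policy` and `new_policy` are hypothetical, and nothing here corresponds to actual QEMU code.

```shell
# Toy model only: these functions are hypothetical, not QEMU code.

# Old approach: advertise HTM to the guest only if the accelerator
# can support it, so the guest environment varies with the accelerator.
old_policy() {
    accel_supports_htm="$1"   # "yes" or "no"
    if [ "$accel_supports_htm" = "yes" ]; then
        echo "guest sees htm=on"
    else
        echo "guest sees htm=off"
    fi
}

# New approach: present exactly the requested setting, and fail at
# startup if the accelerator cannot provide it.
new_policy() {
    requested_htm="$1"        # "on" or "off"
    accel_supports_htm="$2"   # "yes" or "no"
    if [ "$requested_htm" = "on" ] && [ "$accel_supports_htm" = "no" ]; then
        echo "error: HTM requested but unsupported by accelerator" >&2
        return 1
    fi
    echo "guest sees htm=$requested_htm"
}
```

In this model, `new_policy on no` fails, which mirrors what TCG users hit when the machine type defaults to HTM on, while `new_policy off no` succeeds regardless of the accelerator.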

Comment 6 Cole Robinson 2018-08-22 13:38:20 UTC
Thanks for the explanation, David. Do you have a suggestion for how to 'fix' this for 2.11? That's the version we are stuck with for the life of f28. Could be a backport, revert, custom patch, etc.

Comment 7 David Gibson 2018-08-23 04:58:24 UTC
Ugh.. that's pretty tricky.  We can't really apply the machine type change to Fedora, because that would mean Fedora's pseries-2.11 behaves differently from an upstream one, which I'm pretty sure we don't want.

AFAICT ee76a09fc72cfbfab2bb5529320ef7e460adffd8 was only in 2.12, not 2.11 upstream.  Is there a reason you need it in Fedora, or can you revert?

Comment 8 Cole Robinson 2018-08-23 14:18:39 UTC
Looks like it was pulled into the v2.11.1 upstream stable release. If it's safe to revert I can give that a try

Comment 9 Kevin Fenzi 2018-08-30 01:25:15 UTC
Looks like this is causing our composes to fail as well (our image builder ppc buildvms were updated to this f28 version). See:

https://pagure.io/releng/issue/7703

so yes, a revert would be welcome. ;)

Comment 10 David Gibson 2018-08-30 03:49:48 UTC
Ok, so, I was wondering why this change was included in the v2.11 stable series at all, but having had a look through the git history, I think I see what's happened.

The mechanism for the new behaviour is these "machine capabilities" flags we've added (these are what's used to explicitly control available features, rather than just advertising based on what we can support).

We wanted the backport of those, because they're necessary for the handling of the spectre/meltdown mitigations, and the HTM change came along for the ride.

Because of the way the patches interrelate, I think a simple revert will cause a bunch of conflicts, and resolving them carelessly might also break the spectre/meltdown fixes, which we don't want (although they already aren't enabled by default).


So, options,

1) Change to HTM=no by default in Fedora.  As in comment 7 this kinda makes Fedora migration-incompatible with other qemu-2.11 builds.   But... since v2.11.0 didn't have these changes, it's already kind of a mess.  Setting HTM=no would make Fedora migration incompatible with external v2.11.1 builds, and external v2.11.0 builds running on KVM HV, but make it migration compatible (when it wasn't previously) with other v2.11.0 builds running on TCG (or KVM PR).

[This mess is exactly why we made these changes in the first place, but by its nature there is no way to get compatibility perfect, we just have to pick a poison]

2) Change Fedora and/or upstream to choose the setting based on whether it's supported in the available KVM.  This improves out of the box "will this run" compatibility with earlier versions longer term, but it means whether a "pseries-2.11" machine is migratable to another depends on non-obvious factors, like the KVM version (if any) each end is running on.  That's ugly, but no uglier than v2.11.0 was to begin with.
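
The compatibility trade-off described under option 1 can be sketched as a small shell model. It considers only the default HTM setting of the pseries-2.11 machine type as characterised in this comment; the build labels (`upstream-2.11.0`, `upstream-2.11.1`, `fedora-htm-off`) and function names are hypothetical, and real migration compatibility depends on more than this one flag.

```shell
# Toy model of option 1 for the pseries-2.11 machine type.
# Build labels and function names are hypothetical.

# Print the HTM setting a guest would get by default.
htm_default() {
    build="$1"   # upstream-2.11.0 | upstream-2.11.1 | fedora-htm-off
    accel="$2"   # kvm-hv | kvm-pr | tcg
    case "$build" in
        upstream-2.11.0)
            # v2.11.0 advertised HTM based on the accelerator.
            if [ "$accel" = "kvm-hv" ]; then echo on; else echo off; fi ;;
        upstream-2.11.1)
            # v2.11.1 defaults to on (and fails where unsupported).
            echo on ;;
        fedora-htm-off)
            # Option 1: Fedora changes the default to off.
            echo off ;;
        *)
            echo "unknown build: $build" >&2
            return 1 ;;
    esac
}

# Two ends agree on HTM only if their defaults match.
htm_compatible() {
    a=$(htm_default "$1" "$2") || return 1
    b=$(htm_default "$3" "$4") || return 1
    if [ "$a" = "$b" ]; then echo yes; else echo no; fi
}
```

For example, `htm_compatible fedora-htm-off tcg upstream-2.11.0 tcg` prints `yes`, while pairing `fedora-htm-off` with `upstream-2.11.1`, or with `upstream-2.11.0` on `kvm-hv`, prints `no`.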


Thoughts?

Comment 11 Kevin Fenzi 2018-08-30 19:37:19 UTC
I guess I'd say option 1 sounds easier to me, but either would work. 

Is there a config option/change while this is sorted out we can use to get back working?

Comment 12 Cole Robinson 2018-08-30 21:49:10 UTC
(In reply to Kevin Fenzi from comment #11)
> I guess I'd say option 1 sounds easier to me, but either would work. 
> 
> Is there a config option/change while this is sorted out we can use to get
> back working?

Downgrade qemu. There's probably a way to inject a '-global' qemu option to override cap-htm but I couldn't figure out the invocation.

Regardless I've submitted a build now with this patch:

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a74eb2dc68..003d522f0e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3843,7 +3843,10 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
     sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
 
     spapr_machine_2_12_class_options(mc);
-    smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_ON;
+    /* Custom Fedora change to fix qemu 2.11.2 + TCG:
+     * https://bugzilla.redhat.com/show_bug.cgi?id=1614948
+     */
+    // smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_ON;
     SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
 }


Which seems to cover option #1.

Generally in Fedora I don't think migration compatibility is super important. The only case I expect it to impact is: user migrates to file (virsh save), updates to next major fedora release, attempts to restore VM, restore fails. Actual live migration across hosts is not a common thing.

What's worse in this case is: user creates working VM config with pseries-2.12 (with < qemu 2.11.2 or after this patch), updates to next Fedora release with qemu 3.0, but now their VMs don't start with this same cap-htm error. The fix is just changing the machine type but that is quite unfriendly to users.

So I think there should be some qemu upstream work to avoid the error in that case, which sounds like David's #2 option

Comment 13 David Gibson 2018-08-31 01:56:14 UTC
The option you're looking for here is not -global, but:
     -machine cap-htm=off

Comment 14 Richard W.M. Jones 2018-08-31 08:06:47 UTC
But we'd have to specify that only when we know TCG is going to be used
(which only qemu knows).

In any case I believe this is fixed in qemu 2.12, so I'd be inclined to
resolve this bug as closed -> rawhide (or whatever) and tell people who
encounter this bug that they must pull in the Fedora n+1 release of
qemu to fix this problem.

Comment 15 Andrea Bolognani 2018-08-31 08:22:14 UTC
(In reply to Cole Robinson from comment #12)
> Generally in Fedora I don't think migration compatibility is super
> important. The only case I expect it to impact is: user migrates to file
> (virsh save), updates to next major fedora release, attempts to restore VM,
> restore fails. Actual live migration across hosts is not a common thing.

Unless you just carry the pseries-2.11 tweak forever downstream:
it's literally a one-liner if you exclude the comments, and since
you're already diverging from upstream might as well go the extra
mile. If that makes migration between downstream and upstream
builds break, so be it: people trying that are already basically
asking for trouble anyway :)

> What's worse in this case is: user creates working VM config with
> pseries-2.12 (with < qemu 2.11.2 or after this patch), updates to next
> Fedora release with qemu 3.0, but now their VMs don't start with this same
> cap-htm error. The fix is just changing the machine type but that is quite
> unfriendly to users.

Wait, pseries-2.12 has htm=off by default, no? So a guest using
that machine type instead of the default pseries-2.11 would be
able to run under TCG already, and would also migrate to later
QEMUs just fine. (I just tried the former, and it works.)

That is, assuming HTM configuration was the only thing preventing
migration: since the pseries-2.12 we're talking about is still
out of a patched 2.11 release, I'm not sure it's realistic to
expect it to behave like the pseries-2.12 you'd get from a proper
2.12 build, so quite possibly migration would fail anyway.

You'll probably be able to start the guest with a later QEMU,
though, so the concern you raise above is most likely not going
to be a problem in practice.

> So I think there should be some qemu upstream work to avoid the error in
> that case, which sounds like David's #2 option

I think that would be a big step backwards, and would almost
certainly cause other startup / migration scenarios to break.
We should not go down that route unless it proves to be
absolutely necessary.

Comment 16 Cole Robinson 2018-08-31 13:38:38 UTC
(In reply to David Gibson from comment #13)
> The option you're looking for here is not -global, but:
>      -machine cap-htm=off

Ah okay, I was thinking about this from the libvirt qemu-commandline perspective which can only append options. I figured tacking on an extra -machine would overwrite libvirt's values there but it seems to merge them, so indeed sticking that on the end of a qemu-system-ppc64 invocation will work
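
A minimal sketch of that workaround follows. It assumes, as reported above, that QEMU merges a second `-machine` option into the first; the helper name `append_htm_off` is hypothetical, the `accel=tcg` property in the example is an assumption about the original command line, and the helper only prints the resulting command rather than running QEMU.

```shell
# Hypothetical helper: append an extra -machine option to an existing
# command line, relying on QEMU merging repeated -machine options as
# described above. It only prints the command; it does not run QEMU.
append_htm_off() {
    printf '%s -machine cap-htm=off\n' "$*"
}

# Prints:
#   qemu-system-ppc64 -machine pseries-2.11,accel=tcg -machine cap-htm=off
append_htm_off qemu-system-ppc64 -machine pseries-2.11,accel=tcg
```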

Comment 17 Cole Robinson 2018-08-31 13:48:28 UTC
(In reply to Andrea Bolognani from comment #15)
> (In reply to Cole Robinson from comment #12)
> > Generally in Fedora I don't think migration compatibility is super
> > important. The only case I expect it to impact is: user migrates to file
> > (virsh save), updates to next major fedora release, attempts to restore VM,
> > restore fails. Actual live migration across hosts is not a common thing.
> 
> Unless you just carry the pseries-2.11 tweak forever downstream:
> it's literally a one-liner if you exclude the comments, and since
> you're already diverging from upstream might as well go the extra
> mile. If that makes migration between downstream and upstream
> builds break, so be it: people trying that are already basically
> asking for trouble anyway :)
> 

IMO It's a maintenance nightmare to carry downstream-forever patches. But again I'm not too concerned about this issue in practice, we dealt with this with x86 in the past and it was doable;

http://blog.wikichoon.com/2013/12/kvm-migration-from-fedora-17-to-fedora.html

> > What's worse in this case is: user creates working VM config with
> > pseries-2.12 (with < qemu 2.11.2 or after this patch), updates to next
> > Fedora release with qemu 3.0, but now their VMs don't start with this same
> > cap-htm error. The fix is just changing the machine type but that is quite
> > unfriendly to users.
> 
> Wait, pseries-2.12 has htm=off by default, no? So a guest using
> that machine type instead of the default pseries-2.11 would be
> able to run under TCG already, and would also migrate to later
> QEMUs just fine. (I just tried the former, and it works.)
> 
> That is, assuming HTM configuration was the only thing preventing
> migration: since the pseries-2.12 we're talking about is still
> out of a patched 2.11 release, I'm not sure it's realistic to
> expect it to behave like the pseries-2.12 you'd get from a proper
> 2.12 build, so quite possibly migration would fail anyway.
> 
> You'll probably be able to start the guest with a later QEMU,
> though, so the concern you raise above is most likely not going
> to be a problem in practice.
> 
> > So I think there should be some qemu upstream work to avoid the error in
> > that case, which sounds like David's #2 option
> 
> I think that would be a big step backwards, and would almost
> certainly cause other startup / migration scenarios to break.
> We should not go down that route unless it proves to be
> absolutely necessary.

I didn't fully follow David's suggestion so maybe I got it wrong. Here's what I'm trying to avoid:

If libvirt has encoded -machine pseries-2.11 in VM XML, that was working with 2.11.0 in f28, it's now failing to start ('fixed' with the pending qemu update). It also will fail to start on qemu.git, pseries-2.11 has the same behavior there. That stinks, because it forces users to change their VM configs on the next Fedora upgrade.

So is there an upstream way to convert pseries-2.11 back to the 2.11.0 behavior, maybe even specific to tcg? If tcg migration compat has to suffer I say so be it, causing startup to fail for previously working configs is a worse sin IMO

Comment 18 Andrea Bolognani 2018-08-31 14:51:32 UTC
(In reply to Cole Robinson from comment #17)
> (In reply to Andrea Bolognani from comment #15)
> > (In reply to Cole Robinson from comment #12)
> > > Generally in Fedora I don't think migration compatibility is super
> > > important. The only case I expect it to impact is: user migrates to file
> > > (virsh save), updates to next major fedora release, attempts to restore VM,
> > > restore fails. Actual live migration across hosts is not a common thing.
> > 
> > Unless you just carry the pseries-2.11 tweak forever downstream:
> > it's literally a one-liner if you exclude the comments, and since
> > you're already diverging from upstream might as well go the extra
> > mile. If that makes migration between downstream and upstream
> > builds break, so be it: people trying that are already basically
> > asking for trouble anyway :)
> 
> IMO It's a maintenance nightmare to carry downstream-forever patches.

I agree in general, but we're talking about a literal one-liner
here so I have trouble imagining that forward-porting it is going
to be anything approximating a nightmare :)

> But
> again I'm not too concerned about this issue in practice, we dealt with this
> with x86 in the past and it was doable;
> 
> http://blog.wikichoon.com/2013/12/kvm-migration-from-fedora-17-to-fedora.html

Yeah, that looks workable too: wait long enough that it won't be
affecting pretty much anyone, then get rid of it. The amount of
people running ppc64 TCG guests is going to be much smaller, too,
so you can have a shorter grace period.

> > > So I think there should be some qemu upstream work to avoid the error in
> > > that case, which sounds like David's #2 option
> > 
> > I think that would be a big step backwards, and would almost
> > certainly cause other startup / migration scenarios to break.
> > We should not go down that route unless it proves to be
> > absolutely necessary.
> 
> I didn't fully follow David's suggestion so maybe I got it wrong. Here's
> what I'm trying to avoid:
> 
> If libvirt has encoded -machine pseries-2.11 in VM XML, that was working
> with 2.11.0 in f28, it's now failing to start ('fixed' with the pending qemu
> update). It also will fail to start on qemu.git, pseries-2.11 has the same
> behavior there. That stinks, because it forces users to change their VM
> configs on the next Fedora upgrade.
> 
> So is there an upstream way to convert pseries-2.11 back to the 2.11.0
> behavior, maybe even specific to tcg? If tcg migration compat has to suffer
> I say so be it, causing startup to fail for previously working configs is a
> worse sin IMO

We're trying to get rid of all differences between KVM and TCG
guests, that's why I said that doing something TCG-specific
would be a step backwards.

Unfortunately that causes a few issues like this one, but that
can't really be avoided if we want to have a sane situation in
the longer term. I think carrying a trivial downstream patch
for a few releases is a reasonable price to pay.

Plus the current behavior has been in place for two upstream
QEMU releases now, so backtracking at this point would be
difficult to justify and almost certainly introduce more
migration fun down the road due to having to take into account
the now *many* possible combinations. Let's just not go there.

Comment 19 Cole Robinson 2018-08-31 16:47:56 UTC
Okay, I'll leave the current state as is and deal with any fallout in f29 and beyond. qemu-2.11.2-3.fc28 will fix this for f28.

Comment 20 Fedora Update System 2018-08-31 16:49:25 UTC
qemu-2.11.2-3.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-5cf65abded

Comment 21 Fedora Update System 2018-08-31 22:28:05 UTC
qemu-2.11.2-3.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-5cf65abded

Comment 22 David Gibson 2018-09-03 01:13:22 UTC
Ugh.. I realised there's a nastier problem here than breaking migration with upstream builds of qemu.

Unless you do carry the downstream change forever (and in fact, a slightly more complex change than the current one), then you're going to break migration between pseries-2.11 machine type started on F28 and pseries-2.11 machine type running on F29+ with qemu-2.12+.  That's because even though the newer qemus in newer Fedoras will default to no HTM *with the then current machine type*, they'll revert to HTM being available with pseries-2.11 which is what that downstream change altered.

I'm starting to be more inclined to make an upstream change to revert the behaviour on pseries-2.11 and earlier to detecting the KVM capability, problematic as that is/was.

Comment 23 David Gibson 2018-09-03 01:15:47 UTC
Richard, I'm not sure what you're getting at with:

> But we'd have to specify that only when we know TCG is going to be used
> (which only qemu knows).

-machine cap-htm=off is safe to specify with KVM as well as TCG.  In fact, it will also be necessary to specify that if you're running with KVM on POWER9 and a not-quite-recent-enough kernel.  (The POWER9 issue is one reason we made the change to defaulting to HTM off).

The only exception would be if you expect the guest to actively use the transactional memory features, which doesn't seem to be common at all.

Comment 24 Richard W.M. Jones 2018-09-03 08:09:32 UTC
Is it safe to add this option on every ppc64le machine?
I'm not sure how to detect that the host is specifically POWER9.

Anyway, this kind of patch:

https://www.redhat.com/archives/libguestfs/2018-September/msg00001.html ?

Comment 25 David Gibson 2018-09-04 03:32:44 UTC
> Is it safe to add this option on every ppc64le machine?

Not sure in what context you mean "machine" here.  It's safe for any type of host, but it's only implemented for pseries* guest machine types (that's why it's a machine option, rather than a cpu option).
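
As a rough illustration of that rule, a caller could add the option only when the guest machine type is a pseries one. The helper name `with_htm_off` is hypothetical, and the sketch does not check whether the QEMU in use recognises the cap-htm property at all.

```shell
# Hypothetical helper: add cap-htm=off only for pseries* guest machine
# types, since the option is a property of those machine types.
# Does not check that the installed QEMU knows the cap-htm property.
with_htm_off() {
    case "$1" in
        pseries*) printf '%s,cap-htm=off\n' "$1" ;;
        *)        printf '%s\n' "$1" ;;
    esac
}
```

The result would be passed as the value of `-machine`, for example `-machine "$(with_htm_off pseries-2.11)"`; any other machine type is passed through unchanged.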

> I'm not sure how to detect that the host is specifically POWER9.

I don't think you need to.

> Anyway, this kind of patch:

As Andrew pointed out, testing qemu and/or libvirt versions here is very fragile :(.

Comment 26 David Gibson 2018-09-04 03:34:58 UTC
s/Andrew/Andrea/, sorry.

Comment 27 Fedora Update System 2018-09-04 19:11:34 UTC
qemu-2.11.2-4.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-f06af0ec34

Comment 28 Fedora Update System 2018-09-07 00:06:59 UTC
qemu-2.11.2-4.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-f06af0ec34

Comment 29 Fedora Update System 2018-09-07 00:07:29 UTC
qemu-2.11.2-4.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-f06af0ec34

Comment 30 Fedora Update System 2018-09-27 17:28:06 UTC
qemu-2.11.2-4.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

