Description of problem:
qemu-system-ppc64 can't run guests under TCG by default; it says:
  qemu-system-ppc64: No Transactional Memory support in TCG, try cap-htm=off
If qemu knows there's no transactional memory support in TCG, why doesn't it add the option itself instead of giving this error message? In any case I suspect the actual problem might be libvirt, which is adding the option even though it (presumably) knows it's trying to run a TCG guest.
Version-Release number of selected component (if applicable):
qemu 2:2.11.2-1.fc28 (other versions of things can be found in the attached root.log)
How reproducible:
100%
Steps to Reproduce:
1. Run libguestfs-test-tool in a ppc64 guest.
Additional info:
https://bugzilla.redhat.com/show_bug.cgi?id=1525599
Created attachment 1475156 [details] root.log
Created attachment 1475157 [details] build.log
Fails differently in Rawhide, it hangs. (qemu 2:3.0.0-0.1.rc3.fc29)
Hitting this too. That message comes from:
commit ee76a09fc72cfbfab2bb5529320ef7e460adffd8
Author: David Gibson <david.id.au>
Date: Mon Dec 11 13:10:44 2017 +1100
    spapr: Treat Hardware Transactional Memory (HTM) as an optional capability
Libvirt has XML support for setting the cap-htm flag, but this really should do the right thing out of the box for tcg guests. That qemu commit message makes it sound like cap-htm=off shouldn't be explicitly required, but maybe I'm misreading. David or Andrea, do you know what the solution is?
There used to be a bunch of places, including this, where we advertised different capabilities depending on what the accelerator could support. The problem is that this means we're presenting an (incompatibly) different environment to the guest depending on the accelerator. That's caused us gradually increasing amounts of grief. So the new approach we're trying to take is to always present exactly the requested environment to the guest, and simply fail if the accelerator (or whatever) can't support it. It leads to oddities like this, but overall it's the less bad option. That said, in practice, this particular one should go away with qemu-2.12. It turns out that hardly anyone actually uses HTM in guests anyway, and there were also complications with KVM on POWER9, and so we've disabled it by default in the pseries-2.12 and later machine types.
Thanks for the explanation David. Do you have a suggestion how to 'fix' this for 2.11? That's the version we are stuck with for the life of f28. Could be a backport, revert, custom patch, etc.
Ugh.. that's pretty tricky. We can't really apply the machine type change to Fedora, because that would mean Fedora's pseries-2.11 behaves differently from an upstream one, which I'm pretty sure we don't want. AFAICT ee76a09fc72cfbfab2bb5529320ef7e460adffd8 was only in 2.12, not 2.11 upstream. Is there a reason you need it in Fedora, or can you revert?
Looks like it was pulled into the v2.11.1 upstream stable release. If it's safe to revert I can give that a try
Looks like this is causing our composes to fail as well (our image builder ppc buildvm's updated to this f28 version). See: https://pagure.io/releng/issue/7703 so yes, a revert would be welcome. ;)
Ok, so, I was wondering why this change was included in the v2.11 stable series at all, but having had a look through the git history, I think I see what's happened. The mechanism for the new behaviour is these "machine capabilities" flags we've added (these are what's used to explicitly control available features, rather than just advertising based on what we can support). We wanted the backport of those, because they're necessary for the handling of the spectre/meltdown mitigations, and the HTM change came along for the ride.
Because of the way the patches interrelate, I think a simple revert will cause a bunch of conflicts, and resolving them carelessly might also break the spectre/meltdown fixes, which we don't want (although they already aren't enabled by default).
So, options:
1) Change to HTM=no by default in Fedora. As in comment 7 this kinda makes Fedora migration-incompatible with other qemu-2.11 builds. But... since v2.11.0 didn't have these changes, it's already kind of a mess. Setting HTM=no would make Fedora migration-incompatible with external v2.11.1 builds, and external v2.11.0 builds running on KVM HV, but make it migration-compatible (when it wasn't previously) with other v2.11.0 builds running on TCG (or KVM PR). [This mess is exactly why we made these changes in the first place, but by its nature there is no way to get compatibility perfect, we just have to pick a poison]
2) Change Fedora and/or upstream to change the setting based on whether it's supported in the available KVM. This improves out of the box "will this run" compatibility with earlier versions longer term, but it means whether a "pseries-2.11" machine is migratable to another depends on non-obvious factors, like the KVM version (if any) each end is running on. That's ugly, but no uglier than v2.11.0 was to begin with.
Thoughts?
I guess I'd say option 1 sounds easier to me, but either would work. Is there a config option/change while this is sorted out we can use to get back working?
(In reply to Kevin Fenzi from comment #11)
> I guess I'd say option 1 sounds easier to me, but either would work.
>
> Is there a config option/change while this is sorted out we can use to get
> back working?
Downgrade qemu. There's probably a way to inject a '-global' qemu option to override cap-htm but I couldn't figure out the invocation. Regardless I've submitted a build now with this patch:
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a74eb2dc68..003d522f0e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3843,7 +3843,10 @@ static void spapr_machine_2_11_class_options(MachineClass *mc)
     sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
     spapr_machine_2_12_class_options(mc);
-    smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_ON;
+    /* Custom Fedora change to fix qemu 2.11.2 + TCG:
+     * https://bugzilla.redhat.com/show_bug.cgi?id=1614948
+     */
+    // smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_ON;
     SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
 }
Which seems to cover option #1.
Generally in Fedora I don't think migration compatibility is super important. The only case I expect it to impact is: user migrates to file (virsh save), updates to next major fedora release, attempts to restore VM, restore fails. Actual live migration across hosts is not a common thing.
What's worse in this case is: user creates working VM config with pseries-2.12 (with < qemu 2.11.2 or after this patch), updates to next Fedora release with qemu 3.0, but now their VMs don't start with this same cap-htm error. The fix is just changing the machine type but that is quite unfriendly to users.
So I think there should be some qemu upstream work to avoid the error in that case, which sounds like David's #2 option
The option you're looking for here is not -global, but: -machine cap-htm=off
But we'd have to specify that only when we know TCG is going to be used (which only qemu knows). In any case I believe this is fixed in qemu 2.12, so I'd be inclined to close this bug as CLOSED -> RAWHIDE (or whatever) and tell people who encounter it that they must pull in the Fedora n+1 release of qemu to fix the problem.
(In reply to Cole Robinson from comment #12)
> Generally in Fedora I don't think migration compatibility is super
> important. The only case I expect it to impact is: user migrates to file
> (virsh save), updates to next major fedora release, attempts to restore VM,
> restore fails. Actual live migration across hosts is not a common thing.
Unless you just carry the pseries-2.11 tweak forever downstream: it's literally a one-liner if you exclude the comments, and since you're already diverging from upstream might as well go the extra mile. If that makes migration between downstream and upstream builds break, so be it: people trying that are already basically asking for trouble anyway :)
> What's worse in this case is: user creates working VM config with
> pseries-2.12 (with < qemu 2.11.2 or after this patch), updates to next
> Fedora release with qemu 3.0, but now their VMs don't start with this same
> cap-htm error. The fix is just changing the machine type but that is quite
> unfriendly to users.
Wait, pseries-2.12 has htm=off by default, no? So a guest using that machine type instead of the default pseries-2.11 would be able to run under TCG already, and would also migrate to later QEMUs just fine. (I just tried the former, and it works.)
That is, assuming HTM configuration was the only thing preventing migration: since the pseries-2.12 we're talking about is still out of a patched 2.11 release, I'm not sure it's realistic to expect it to behave like the pseries-2.12 you'd get from a proper 2.12 build, so quite possibly migration would fail anyway.
You'll probably be able to start the guest with a later QEMU, though, so the concern you raise above is most likely not going to be a problem in practice.
> So I think there should be some qemu upstream work to avoid the error in
> that case, which sounds like David's #2 option
I think that would be a big step backwards, and would almost certainly cause other startup / migration scenarios to break. We should not go down that route unless it proves to be absolutely necessary.
(In reply to David Gibson from comment #13)
> The option you're looking for here is not -global, but:
> -machine cap-htm=off
Ah okay, I was thinking about this from the libvirt qemu-commandline perspective, which can only append options. I figured tacking on an extra -machine would overwrite libvirt's values there, but it seems to merge them, so indeed sticking that on the end of a qemu-system-ppc64 invocation will work.
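A minimal sketch of what that appended option looks like on the command line, assuming a TCG guest on the pseries-2.11 machine type. The memory size and -nographic are illustrative placeholders, not something libvirt is known to generate; only the trailing -machine cap-htm=off comes from this thread. The script only prints the command, so it can be inspected before being run by hand.

```shell
#!/bin/sh
# Hypothetical base arguments standing in for whatever the management layer
# generated; only the machine type and accelerator are taken from this bug.
base_args="-machine pseries-2.11,accel=tcg -m 1024 -nographic"

# Append the override last. Per the comment above, a second -machine option
# is merged with the first rather than replacing it.
qemu_cmd="qemu-system-ppc64 $base_args -machine cap-htm=off"

# Print rather than execute, so the sketch does not require QEMU to be installed.
echo "$qemu_cmd"
```

Once the printed line looks right it can be pasted into a shell, with real disk and boot options in place of the placeholders.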
(In reply to Andrea Bolognani from comment #15)
> (In reply to Cole Robinson from comment #12)
> > Generally in Fedora I don't think migration compatibility is super
> > important. The only case I expect it to impact is: user migrates to file
> > (virsh save), updates to next major fedora release, attempts to restore VM,
> > restore fails. Actual live migration across hosts is not a common thing.
>
> Unless you just carry the pseries-2.11 tweak forever downstream:
> it's literally a one-liner if you exclude the comments, and since
> you're already diverging from upstream might as well go the extra
> mile. If that makes migration between downstream and upstream
> builds break, so be it: people trying that are already basically
> asking for trouble anyway :)
>
IMO It's a maintenance nightmare to carry downstream-forever patches. But again I'm not too concerned about this issue in practice, we dealt with this with x86 in the past and it was doable; http://blog.wikichoon.com/2013/12/kvm-migration-from-fedora-17-to-fedora.html
> > What's worse in this case is: user creates working VM config with
> > pseries-2.12 (with < qemu 2.11.2 or after this patch), updates to next
> > Fedora release with qemu 3.0, but now their VMs don't start with this same
> > cap-htm error. The fix is just changing the machine type but that is quite
> > unfriendly to users.
>
> Wait, pseries-2.12 has htm=off by default, no? So a guest using
> that machine type instead of the default pseries-2.11 would be
> able to run under TCG already, and would also migrate to later
> QEMUs just fine. (I just tried the former, and it works.)
>
> That is, assuming HTM configuration was the only thing preventing
> migration: since the pseries-2.12 we're talking about is still
> out of a patched 2.11 release, I'm not sure it's realistic to
> expect it to behave like the pseries-2.12 you'd get from a proper
> 2.12 build, so quite possibly migration would fail anyway.
>
> You'll probably be able to start the guest with a later QEMU,
> though, so the concern you raise above is most likely not going
> to be a problem in practice.
>
> > So I think there should be some qemu upstream work to avoid the error in
> > that case, which sounds like David's #2 option
>
> I think that would be a big step backwards, and would almost
> certainly cause other startup / migration scenarios to break.
> We should not go down that route unless it proves to be
> absolutely necessary.
I didn't fully follow David's suggestion so maybe I got it wrong. Here's what I'm trying to avoid:
If libvirt has encoded -machine pseries-2.11 in VM XML, that was working with 2.11.0 in f28, it's now failing to start ('fixed' with the pending qemu update). It also will fail to start on qemu.git, pseries-2.11 has the same behavior there. That stinks, because it forces users to change their VM configs on the next Fedora upgrade.
So is there an upstream way to convert pseries-2.11 back to the 2.11.0 behavior, maybe even specific to tcg? If tcg migration compat has to suffer I say so be it, causing startup to fail for previously working configs is a worse sin IMO
(In reply to Cole Robinson from comment #17)
> (In reply to Andrea Bolognani from comment #15)
> > (In reply to Cole Robinson from comment #12)
> > > Generally in Fedora I don't think migration compatibility is super
> > > important. The only case I expect it to impact is: user migrates to file
> > > (virsh save), updates to next major fedora release, attempts to restore VM,
> > > restore fails. Actual live migration across hosts is not a common thing.
> >
> > Unless you just carry the pseries-2.11 tweak forever downstream:
> > it's literally a one-liner if you exclude the comments, and since
> > you're already diverging from upstream might as well go the extra
> > mile. If that makes migration between downstream and upstream
> > builds break, so be it: people trying that are already basically
> > asking for trouble anyway :)
>
> IMO It's a maintenance nightmare to carry downstream-forever patches.
I agree in general, but we're talking about a literal one-liner here so I have trouble imagining that forward-porting it is going to be anything approximating a nightmare :)
> But
> again I'm not too concerned about this issue in practice, we dealt with this
> with x86 in the past and it was doable;
>
> http://blog.wikichoon.com/2013/12/kvm-migration-from-fedora-17-to-fedora.html
Yeah, that looks workable too: wait long enough that it won't be affecting pretty much anyone, then get rid of it. The amount of people running ppc64 TCG guests is going to be much smaller, too, so you can have a shorter grace period.
> > > So I think there should be some qemu upstream work to avoid the error in
> > > that case, which sounds like David's #2 option
> >
> > I think that would be a big step backwards, and would almost
> > certainly cause other startup / migration scenarios to break.
> > We should not go down that route unless it proves to be
> > absolutely necessary.
>
> I didn't fully follow David's suggestion so maybe I got it wrong. Here's
> what I'm trying to avoid:
>
> If libvirt has encoded -machine pseries-2.11 in VM XML, that was working
> with 2.11.0 in f28, it's now failing to start ('fixed' with the pending qemu
> update). It also will fail to start on qemu.git, pseries-2.11 has the same
> behavior there. That stinks, because it forces users to change their VM
> configs on the next Fedora upgrade.
>
> So is there an upstream way to convert pseries-2.11 back to the 2.11.0
> behavior, maybe even specific to tcg? If tcg migration compat has to suffer
> I say so be it, causing startup to fail for previously working configs is a
> worse sin IMO
We're trying to get rid of all differences between KVM and TCG guests, that's why I said that doing something TCG-specific would be a step backwards. Unfortunately that causes a few issues like this one, but that can't really be avoided if we want to have a sane situation in the longer term. I think carrying a trivial downstream patch for a few releases is a reasonable price to pay.
Plus the current behavior has been in place for two upstream QEMU releases now, so backtracking at this point would be difficult to justify and almost certainly introduce more migration fun down the road due to having to take into account the now *many* possible combinations. Let's just not go there.
Okay I'll leave the current state as is and deal with any fallout in f29 and beyond. qemu-2.11.2-3.fc28 will fix this for f28
qemu-2.11.2-3.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-5cf65abded
qemu-2.11.2-3.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-5cf65abded
Ugh.. I realised there's a nastier problem here than breaking migration with upstream builds of qemu. Unless you do carry the downstream change forever (and in fact, a slightly more complex change than the current one), then you're going to break migration between pseries-2.11 machine type started on F28 and pseries-2.11 machine type running on F29+ with qemu-2.12+. That's because even though the newer qemus in newer Fedoras will default to no HTM *with the then current machine type*, they'll revert to HTM being available with pseries-2.11 which is what that downstream change altered. I'm starting to be more inclined to making an upstream change to revert the behaviour on pseries-2.11 and earlier to detecting the KVM capability, problematic as that is/was.
Richard, I'm not sure what you're getting at with:
> But we'd have to specify that only when we know TCG is going to be used
> (which only qemu knows).
-machine cap-htm=off is safe to specify with KVM as well as TCG. In fact, it will also be necessary to specify that if you're running with KVM on POWER9 and a not-quite-recent-enough kernel. (The POWER9 issue is one reason we made the change to defaulting to HTM off). The only exception would be if you expect the guest to actively use the transactional memory features, which doesn't seem to be common at all.
Is it safe to add this option on every ppc64le machine? I'm not sure how to detect that the host is specifically POWER9. Anyway, this kind of patch: https://www.redhat.com/archives/libguestfs/2018-September/msg00001.html ?
> Is it safe to add this option on every ppc64le machine?
Not sure in what context you mean "machine" here. It's safe for any type of host, but it's only implemented for pseries* guest machine types (that's why it's a machine option, rather than a cpu option).
> I'm not sure how to detect that the host is specifically POWER9.
I don't think you need to.
> Anyway, this kind of patch:
As Andrew pointed out, testing qemu and/or libvirt versions here is very fragile :(.
s/Andrew/Andrea/, sorry.
qemu-2.11.2-4.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-f06af0ec34
qemu-2.11.2-4.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-f06af0ec34
qemu-2.11.2-4.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.