Bug 1697663

Summary:	Safe mechanism for updating CPU models
Product:	Red Hat Enterprise Linux 9	Reporter:	Eduardo Habkost <ehabkost>
Component:	qemu-kvm	Assignee:	Markus Armbruster <armbru>
qemu-kvm sub component:	QMP Monitor and CLI	QA Contact:	liunana <nanliu>
Status:	CLOSED DEFERRED	Docs Contact:
Severity:	unspecified
Priority:	high	CC:	ailan, berrange, chayang, chhu, dgilbert, dyuan, hpopal, jdenemar, jferlan, jinzhao, jsuchane, juzhang, kchamart, knoel, lhuang, ljelinko, lmen, michal.skrivanek, mvanderw, nanliu, nilal, virt-maint, xuzhang, yfu, yuhuang, zhguo
Version:	9.0	Keywords:	FutureFeature, Reopened, Triaged
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2023-02-13 21:00:44 UTC	Type:	Feature Request
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:	5.2
Embargoed:
Bug Depends On:
Bug Blocks:	1727122, 1731918, 1856522

Description Eduardo Habkost 2019-04-08 23:13:56 UTC

Today, we have two limitations in the CPU model system in QEMU and libvirt:

* Updates to CPU models need to be coupled to machine-types, making CPU security updates more risky because they may bring unexpected ABI changes to VMs.

* New features may require updates to host hardware or software (e.g. CPU microcode updates and/or KVM module updates) and can't be added unconditionally to existing CPU models (otherwise existing VM configurations that are runnable may become not runnable).

We need management software to update CPU model configuration in VMs in a convenient way without the limitations above.


Below is a proposal sent to qemu-devel and libvir-list by Daniel P. Berrangé:

> This post is to raise a question about improving the use of named CPU models
> with KVM, i.e. any case not using -cpu host.
> 
> In the old days (ie before 2018), the world was innocent and we had a nice
> set of named CPU models that corresponded to different Intel/AMD physical
> CPU families/generations (let's temporarily ignore the -noTSX fiasco).
> 
> An application could query libvirt to determine what the host CPU model
> was/is and use that model name in the guest XML and be fairly happy. If
> they wanted to, they could explicitly include the extra features listed
> by capabilities XML, or just rely on the host-model.
> 
> Then Spectre happened, and QEMU took the decision to almost double the
> number of x86 models, adding in -IBRS / -IBPB variants for most CPU models,
> so that applications could get the spec_ctrl / ibpb flags set without
> having to manually list them.
> 
> In retrospect this was somewhat pointless, at least at the QEMU level,
> because there is little difference in complexity between the two approaches:
> 
>    -cpu Westmere,+spec-ctrl
>    -cpu Westmere-IBRS
> 
> At a higher level the extra named CPU models were slightly useful in so much
> as many application developers had taken a lazy approach and not provided
> users any way to explicitly turn on extra flags. This affected oVirt,
> OpenStack and virt-manager, and probably more. Though OpenStack since added
> ability to turn on arbitrary flags in response to the Spectre flaw, others
> have not.
> 
> Then recently along came the Speculative Store Bypass hardware vulnerability
> requiring addition of yet another CPU flag to guest configs. This required use
> of 'ssbd' on Intel and 'virt-ssbd' on AMD. While QEMU could have now added yet
> more CPU models, eg Westmere-SSBD, this does not feel like a winning strategy
> long term. Looking at the models how would a user have any clue whether the
> -IBRS or -SSBD or -NEXT-FLAW or -YET-ANOTHER-FLAW suffix is "better" ? So QEMU
> and libvirt took the joint decision to stop adding new named CPU models when
> CPU vulnerabilities are discovered from this point forwards. Applications /
> users would be expected to turn on CPU features explicitly as needed and are
> considered broken if they don't provide this functionality.
> 
> As briefly mentioned above though, even before Spectre we had the pain of
> dealing with the -noTSX CPU models working around brokenness in the Intel TSX
> impl where they had to delete a CPU feature during microcode updates. This was
> rather painful to roll out at the time.
> 
> An alternative to adding CPU models is to change the meaning of existing CPU
> models. QEMU has a way to do this by tying the change to machine types, and
> it has in fact been used to correct mistakes in the specification of CPU
> models in the past, when those mistakes have not had dependencies on microcode
> changes. This is not a particularly attractive way to deal with the errata.
> Short life distros tend to stick with upstream QEMU machine types and won't
> want to diverge by adding their own machine types. This gates them on having
> upstream define the extra machine types which is tricky under embargo. Long
> life distros do typically take on the burden of defining custom machine types,
> but usually only add them when doing major updates.
> 
> The pain point with machine types is that the testing matrix grows at O(n^2).
> Using machine types for CPU security errata would significantly increase the
> number of machine types and thus the testing matrix. eg if a security fix
> is needed in rhel-7.3, 7.4, 7.5 we can't just add a pc-rhel-7.5.1 machine
> with the fix, as it would not be possible to implement that in 7.3. So we
> would need pc-rhel-7.3.1, pc-rhel-7.4.1 and pc-rhel-7.5.1 machine
> types, with 7.5 gaining all three. Finally CPU model changes have host
> hardware dependencies and machine types need to be independent of the host,
> since they are decided statically at build time. The only nice thing about
> machine type is that it is reasonably obvious what the "best" machine type
> is as they include a version number in the name, and users automatically get
> the best if they use an unversioned name.
> 
> 
> What if we can borrow the concept of versioning from machine types and apply
> it to CPU models directly. For example, considering the history of "Haswell"
> in QEMU, if we had versioned things, we would by now have:
> 
>      Haswell-1.3.0  - first version (37507094f350b75c62dc059f998e7185de3ab60a)
>      Haswell-2.2.0  - added 'rdrand' (78a611f1936b3eac8ed78a2be2146a742a85212c)
>      Haswell-2.3.0  - removed 'hle' & 'rtm' (a356850b80b3d13b2ef737dad2acb05e6da03753)
>      Haswell-2.5.0  - added 'abm' (becb66673ec30cb604926d247ab9449a60ad8b11)
>      Haswell-2.12.0 - added 'spec-ctrl' (ac96c41354b7e4c70b756342d9b686e31ab87458)
>      Haswell-3.0.0  - added 'ssbd' (never done)
> 
> If we followed the machine type approach, then a bare "Haswell" would
> statically resolve at build time to the most recent Haswell-X.X.X version
> associated with the QEMU release. This is unhelpful as we have a direct
> dependency on the host hardware features. Better would be for a bare
> "Haswell" to be dynamically resolved at runtime, picking the most recent
> version that is capable of launching given the current hardware, KVM/TCG impl
> and QEMU version.
> 
>   ie -cpu  Haswell
> 
> should use Haswell-2.5.0  if on silicon with the TSX errata applied,
> but use Haswell-2.12.0 if the Spectre errata is applied in microcode,
> and use Haswell-3.0.0 once Intel finally releases SSBD microcode errata.
> 
> Versioning of CPU models as opposed to using arbitrary string suffixes
> (-noTSX, -IBRS) has a number of usability improvements that we would
> gain with versioned machine types, while avoiding exploding the machine
> type matrix. With versioned CPU models we can
> 
>  - Automatically tailor the best model based on hardware support
> 
>  - Users always get the best model if they use the bare CPU name
> 
>  - It is obvious to users which is the "best" / "newest" CPU model
> 
>  - Avoid combinatorial expansion of machines since same CPU model
>    version can be added to all releases without adding machine types.
> 
>  - Users can still force a specific downgraded model by using the
>    fully versioned name.
> 
> Such versioning of CPU models would largely "just work" with existing
> libvirt versions, but libvirt would really want to expand the bare
> CPU name to a versioned CPU name when recording new guest XML, so the
> ABI is preserved long term.
> 
> An application like virt-manager which wants a simple UI can forever be
> happy simply giving users a list of bare CPU model names, and allowing
> libvirt / QEMU to automatically expand to the best versioned model for
> their host.
> 
> An application like oVirt/OpenStack which wants direct control can allow
> the admin to choose between a bare name and an explicitly versioned name,
> if they need to cope with the possibility of outdated hosts.
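
The runtime resolution described in the proposal can be sketched roughly as follows. This is only an illustration of the idea, not QEMU's implementation: the function name resolve_bare_model, the version table, and the tiny base feature set are assumptions made up for this example, following the Haswell history listed above.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

# Hypothetical, heavily simplified base feature set for illustration only;
# a real Haswell model definition contains many more features.
_BASE = frozenset({"avx2", "hle", "rtm"})

# Versions ordered oldest to newest, following the history in the proposal.
HASWELL_VERSIONS: List[Tuple[str, FrozenSet[str]]] = [
    ("Haswell-1.3.0", _BASE),
    ("Haswell-2.2.0", _BASE | {"rdrand"}),
    ("Haswell-2.3.0", (_BASE | {"rdrand"}) - {"hle", "rtm"}),
    ("Haswell-2.5.0", (_BASE | {"rdrand", "abm"}) - {"hle", "rtm"}),
    ("Haswell-2.12.0",
     (_BASE | {"rdrand", "abm", "spec-ctrl"}) - {"hle", "rtm"}),
    ("Haswell-3.0.0",
     (_BASE | {"rdrand", "abm", "spec-ctrl", "ssbd"}) - {"hle", "rtm"}),
]


def resolve_bare_model(
    versions: List[Tuple[str, FrozenSet[str]]],
    host_features: Iterable[str],
) -> Optional[str]:
    """Return the newest version whose features the host can all provide.

    Returns None if no version of the model is runnable on this host.
    """
    available = frozenset(host_features)
    # Walk from newest to oldest and stop at the first runnable version.
    for name, required in reversed(versions):
        if required <= available:
            return name
    return None
```

With this table, a host that lost 'hle' and 'rtm' to the TSX microcode update and has no 'spec-ctrl' would resolve a bare "Haswell" to Haswell-2.5.0, and the same host with 'spec-ctrl' would resolve to Haswell-2.12.0, matching the behaviour the proposal asks for.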

Comment 3 Eduardo Habkost 2019-07-08 21:00:36 UTC

Merged upstream, at:

commit 3a1acf5d47295d22ffdae0982a2fd808b802a7da
Merge: d2c5f91ca9 af135030e3
Author: Peter Maydell <peter.maydell>
Date:   Mon Jul 8 09:46:19 2019 +0100

    Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
    
    Machine and x86 queue, 2019-07-05
    
    * CPU die topology support (Like Xu)
    * Deprecation of features (Igor Mammedov):
      * 'mem' parameter of '-numa node' option
      * implict memory distribution between NUMA nodes
      * deprecate -mem-path fallback to anonymous RAM
    * x86 versioned CPU models (Eduardo Habkost)
    * SnowRidge CPU model (Paul Lai)
    * Add deprecation information to query-machines (Eduardo Habkost)
    * Other i386 fixes
    
    # gpg: Signature made Fri 05 Jul 2019 23:12:09 BST
    # gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
    # gpg:                issuer "ehabkost"
    # gpg: Good signature from "Eduardo Habkost <ehabkost>" [full]
    # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6
    
    * remotes/ehabkost/tags/machine-next-pull-request: (42 commits)
      tests: use -numa memdev option in tests instead of legacy 'mem' option
      numa: allow memory-less nodes when using memdev as backend
      numa: Make deprecation warnings conditional on !qtest_enabled()
      i386: Add Cascadelake-Server-v2 CPU model
      docs: Deprecate CPU model runnability guarantees
      i386: Make unversioned CPU models be aliases
      i386: Replace -noTSX, -IBRS, -IBPB CPU models with aliases
      i386: Define -IBRS, -noTSX, -IBRS versions of CPU models
      i386: Register versioned CPU models
      i386: Get model-id from CPU object on "-cpu help"
      i386: Add x-force-features option for testing
      qmp: Add "alias-of" field to query-cpu-definitions
      i386: Introduce SnowRidge CPU model
      qmp: Add deprecation information to query-machines
      vl.c: Add -smp, dies=* command line support and update doc
      machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
      target/i386: Add CPUID.1F generation support for multi-dies PCMachine
      i386: Remove unused host_cpudef variable
      x86/cpu: use FeatureWordArray to define filtered_features
      i386: make 'hv-spinlocks' a regular uint32 property
      ...
    
    Signed-off-by: Peter Maydell <peter.maydell>

Comment 4 Jiri Denemark 2019-07-19 09:36:03 UTC

As discussed via IRC yesterday, the versioned CPU model to which a
corresponding non-versioned model resolves may differ with machine type.
Mainly, old machine types should keep non-versioned CPU models unchanged while
new machine types will resolve them to a newer version of the CPU models. All
this to make migration from new QEMU using old machine type to old QEMU (which
knows nothing about versioned CPU models) possible.

And since libvirt will want to resolve CPU models to their alias-of versions
when a domain is defined (to match what we do with machine types), we need to
get the (CPU_model, machine_type) -> CPU_model_with_version mapping when we
probe QEMU with machine type "none".

Apparently query-cpu-definitions will fill the alias-of fields according to
the currently used machine type. To be able to probe for the mapping for all
existing machine types we either need a machine type parameter for
query-cpu-definitions so that we can call it for each machine type or
query-machines needs to be enhanced to provide the alias-of for non-versioned
CPU models. Or I guess even a completely new interface for getting this info
may be introduced if this makes more sense for QEMU.

I think libvirt would prefer the addition to query-machines as it could be
more compact (each machine type would only list CPU models with alias-of set)
and we wouldn't need to call query-cpu-definitions several times. But we can
use whatever interface QEMU comes up with as long as it allows us to get the
required mapping.

Comment 5 Jiri Denemark 2019-07-19 14:50:00 UTC

Actually after thinking about it a bit more, the machine type parameter for
query-cpu-definitions would likely be the best option since we'd want to use
more than just alias-of for each CPU model for a given machine type. We'd also
want to present runnability of specific CPU models to users and this I believe
now depends on machine type too.
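
A rough sketch of how the (CPU_model, machine_type) -> CPU_model_with_version mapping could be assembled on the management side, assuming one query-cpu-definitions reply is available per machine type. The per-machine-type query is the interface being requested here, not something that exists; the helper name build_alias_map, the machine type names, and the sample replies below are made up for illustration. Only the "name" and "alias-of" fields are taken from the QMP reply format.

```python
from typing import Dict, List, Tuple

# One list of CPU definition entries per machine type, shaped like the
# "return" value of query-cpu-definitions. The data here is invented.
Reply = List[Dict[str, object]]


def build_alias_map(
    replies_by_machine: Dict[str, Reply],
) -> Dict[Tuple[str, str], str]:
    """Map (cpu_model, machine_type) to the versioned model it aliases.

    Entries without an "alias-of" field are already concrete models
    (or unchanged legacy models) and are left out of the mapping.
    """
    mapping: Dict[Tuple[str, str], str] = {}
    for machine, cpu_defs in replies_by_machine.items():
        for cpu in cpu_defs:
            target = cpu.get("alias-of")
            if isinstance(target, str):
                mapping[(str(cpu["name"]), machine)] = target
    return mapping


# Hypothetical replies: the old machine type keeps the bare name unchanged,
# the new one resolves it to a newer version.
SAMPLE_REPLIES: Dict[str, Reply] = {
    "machine-old": [
        {"name": "Haswell"},
        {"name": "Haswell-v1"},
    ],
    "machine-new": [
        {"name": "Haswell", "alias-of": "Haswell-v4"},
        {"name": "Haswell-v4"},
    ],
}
```

Keeping the machine type in the key is what lets a domain defined with an old machine type stay on the unchanged model while a new machine type picks up the newer version.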

Comment 6 Eduardo Habkost 2019-07-19 23:33:27 UTC

(In reply to Jiri Denemark from comment #5)
> Actually after thinking about it a bit more, the machine type parameter for
> query-cpu-definitions would likely be the best option since we'd want to use
> more than just alias-of for each CPU model for a given machine type. We'd
> also want to present runnability of specific CPU models to users and this I
> believe now depends on machine type too.

Agreed.  I will propose this upstream soon, targeting QEMU 4.2.

Comment 14 Eduardo Habkost 2020-01-27 16:43:30 UTC

Needinfo answered in comment #12.

Comment 15 Ademar Reis 2020-02-05 22:56:19 UTC

QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 20 RHEL Program Management 2021-03-15 07:34:57 UTC

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 21 Daniel Berrangé 2021-03-15 09:56:30 UTC

We still need a solution for this problem in QEMU and libvirt.

Comment 23 liunana 2021-06-29 07:44:39 UTC

I set ITM to 22 now and we can change it if the code is still not ready at that time. Thanks!



Best regards
Liu Nana

Comment 29 John Ferlan 2021-09-08 19:07:04 UTC

Bulk update: Move RHEL-AV bugs to RHEL8

Comment 30 RHEL Program Management 2021-09-15 08:26:12 UTC

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 31 liunana 2021-09-15 09:35:42 UTC

Hi Eduardo,


Do we have a plan to support this feature?
If so, please help to re-open this bug, thanks.



Best regards
Liu Nana

Comment 32 Eduardo Habkost 2021-09-15 18:22:34 UTC

BZ was incorrectly closed, reopening.

Comment 34 Eduardo Habkost 2021-11-10 22:32:06 UTC

Moving back to virt-maint, as I won't be able to work on this.

For reference, the upstream work for this was at:
https://lore.kernel.org/qemu-devel/20201013230457.150630-1-ehabkost@redhat.com

The series above got zero replies, but the original series had received some objections at:
https://lore.kernel.org/qemu-devel/20191025022553.25298-1-ehabkost@redhat.com/

The next step is to resubmit the series and insist the QMP/QAPI maintainers ACK it, and hope that they won't have any additional objections to the series.

Comment 36 Dr. David Alan Gilbert 2022-02-14 11:28:44 UTC

We really do need this; can't close yet

Comment 37 liunana 2022-05-17 10:30:59 UTC

(In reply to Dr. David Alan Gilbert from comment #36)
> We really do need this; can't close yet

Do we have a plan to support this on RHEL.8.7.0?
If so can you please help to set the TargetRelease or ITR?

Thanks.


Best regards
Liu Nana

Comment 39 Nitesh Narayan Lal 2022-05-17 12:12:17 UTC

(In reply to liunana from comment #37)
> (In reply to Dr. David Alan Gilbert from comment #36)
> > We really do need this; can't close yet
> 
> Do we have a plan to support this on RHEL.8.7.0?
> If so can you please help to set the TargetRelease or ITR?
> 

Daniel has moved the BZ to rhel9.
Also, since we don't have a target release or an assignee I think it should be in the backlog.
This is assigned to sst_virt, hence, I would assume John or Yash will be looking at this.

Since there is no assignee, move it back to NEW.

Comment 40 John Ferlan 2022-07-20 11:08:30 UTC

NB: Removed RHV as dependent product since this is a RHEL9 issue

Comment 41 John Ferlan 2022-09-02 19:58:57 UTC

Markus - is there a "safe" way to resurrect the changes that Eduardo refers to or does this need a complete redo? 

I've added ITR=9.2.0 so that we attempt to come to a resolution as this feature request has been out there for a while. 

There are some bugs being blocked by this with more info and other related bugs.

Comment 42 RHEL Program Management 2022-10-03 07:27:42 UTC

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 43 John Ferlan 2022-10-04 12:09:43 UTC

Resurrecting from the aging list since there's a request from Kubevirt/CNV in order to implement something within this space - this may turn into other bz's, but let's just keep it in the discussion until 9.2 closes.

Comment 46 John Ferlan 2023-02-09 11:41:49 UTC

(In reply to liunana from comment #45)
> (In reply to John Ferlan from comment #43)
> > Resurrecting from the aging list since there's a request from Kubevirt/CNV
> > in order to implement something within this space - this may turn into other
> > bz's, but let's just keep it in the discussion until 9.2 closes.
> 
> Hi John,
> 
> Would you please help to check the patches' status for this bug?
> Can we still catch this on the 9.2.0?
> 
> 
> Thanks.

(nothing private, so I unclicked that for my response)

Long story short is we need to take more time to work with the CNV team to develop a "more actionable" feature request for this. The concept is an old idea that Eduardo was promoting soon after Spectre/Meltdown when a proliferation of CPU Models with various features on and off were created which was very confusing.

We cannot say with certainty something like that wouldn't happen again, but Eduardo's idea and thoughts were not completely accepted upstream and once he left Red Hat no one had the same drive as him to resolve the core issue.

I suspect this bug will be closed as deferred shortly. Yash and I are going through all the aging bugs that are currently "in plan", and some bugs or features have been open for so long, with no one finding the time or priority to resolve them, that it's time to move on from them.

Comment 47 John Ferlan 2023-02-13 21:00:44 UTC

This aging bug was planned to be resolved within the 9.2.0 release timeframe; however, since it has not been addressed by ITM 24, we are closing as deferred. 

If this bug is reopened, there must be a downstream commit ready to be posted for merge or the bug must be part of some rebase. When reopening, add the release ITR and appropriate DTM.