1585798 – [RFE] A single default (QEMU) machine type won't work for all guest images

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-nova
Sub Component:
Version:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	OSP DFG:Compute
QA Contact:	OSP DFG:Compute
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1340726
Depends On:
Blocks:

Reported:	2018-06-04 18:52 UTC by Eduardo Habkost
Modified:	2023-03-21 18:51 UTC
CC List:	15 users
Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-09-29 09:35:05 UTC
Target Upstream Version:
Embargoed:

Description Eduardo Habkost 2018-06-04 18:52:25 UTC

Description of problem:

1) When importing preexisting guest images that expect a "pc" machine, OpenStack can't use "q35" by default.

2) When importing new guest images that expect a "q35" machine, OpenStack can't use "pc" by default.

This means we can't assume a single default will be good for everybody.


Possible solutions:

There are multiple ways this can be addressed in the virtualization stack.  The ones that are being discussed are:

1) Doing nothing, and requiring the user to select the machine-type explicitly in at least one of the cases above.

2) Encoding a recommended machine-type family inside the guest image file (e.g. using another container format for guest images, or encoding additional data in qcow2 images). See the qemu-devel discussion: <https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg04494.html>

3) Choosing a smarter default based on analysis of the guest image (e.g. using virt-inspector and a database similar to libosinfo).

The implementation of these solutions will necessarily involve multiple components and require additional BZs.  I'm creating this BZ so we can track the tasks related to the problem.

Comment 1 Kashyap Chamarthy 2018-06-07 15:56:28 UTC

For context, this bug from Eduardo builds on top of this other one he 
filed:

    https://bugzilla.redhat.com/show_bug.cgi?id=1581414#c9 -- OpenStack
    shouldn't break if the default machine-type in QEMU is "q35"  


I am following the 15 KM-long 'qemu-devel' thread ("storing machine data 
in qcow images").  The emerging consensus appears[*] to be to _not_
commit to a QCOW2 specification straight away, but to first come up with
a VM description format that can work with existing formats such as "tar".

[*] Based on your proposal here:
<https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01223.html>.
And DanPB's response here:
<https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01231.html>.
Quoting that thread below for convenience:
--------------------------------------------------------------------------------
On Wed, Jun 06, 2018 at 03:24:50PM +0100, Daniel P. Berrangé wrote:
> On Wed, Jun 06, 2018 at 11:14:32AM -0300, Eduardo Habkost wrote:
> > On Wed, Jun 06, 2018 at 02:50:10PM +0100, Daniel P. Berrangé wrote:
> > > On Wed, Jun 06, 2018 at 03:45:10PM +0200, Michal Suchánek wrote:
> > > > 
> > > > I think that *if* we want an 'appliance' format that stores a whole VM
> > > > in a single file to ease VM distribution then the logical place to look
> > > > in qemu is qcow. The reason have been explained at length.
> > > 
> > > I rather disagree. This is a common problem beyond just QEMU and everyone
> > > just uses an existing archive format (TAR, ZIP) for bundling together
> > > one or more disk images, metadata for config, and whatever other resources
> > > are applicable for the vendor.  This works with any disk format (raw,
> > > qcow2, vmdk, vpc, etc) so is preferable to inventing something that is
> > > specific to qcow2 IMHO.
> > 
> > Now we have N+1 appliance file formats.  :)
> > 
> > (We like it or not, qcow2 is already used as an appliance format
> > for single-disk VMs in practice.)
> > 
> > But I agree this must not be specific to qcow2.  The same VM
> > description format we agree upon should work with other disk
> > formats or with multi-disk appliances.
> > 
> > If we specify a reasonable VM description format for appliances
> > and make it work inside (e.g.) tar files, we will still have the
> > option of allowing the description be placed inside qcow2 if we
> > really want to.  I don't think we need to finish this qcow2
> > bikeshedding exercise right now.
> 
> Yes, I think that is sensible, as once we actually try it out in real
> world cases, we might then find a tar/zip is sufficient after all and
> we don't need to do something extra for qcow2. Also means we can do
> experiments without committing to a qcow2 format spec change right
> away.
--------------------------------------------------------------------------------

Comment 2 Kashyap Chamarthy 2018-08-28 17:12:16 UTC

[Based on an IRC discussion with Dan Berrangé and Matt Booth.]

The preferred order in which to select an appropriate machine type for Nova 
instances:

(1) Use the Nova metadata property: 'hw_machine_type' to set the machine
    type on the guest. 

(2) Ask libosinfo, and pick 'q35' if it says the guest can do both 'pc'
    and 'q35'.

(3) Use 'q35' (this doesn't necessarily need code changes, and it can be
    forced via nova.conf if desired).
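
A rough sketch of that order in Python, purely for illustration.  The
function name and its arguments are hypothetical and not Nova code;
'osinfo_supported' is a stand-in for whatever libosinfo would report, not
a real libosinfo call.  It also assumes that when libosinfo reports only
one machine type, that one is used, which the list above implies but does
not state.

```python
from typing import Iterable, Mapping, Optional


def select_machine_type(
    image_properties: Mapping[str, str],
    osinfo_supported: Optional[Iterable[str]] = None,
    fallback: str = "q35",
) -> str:
    """Hypothetical helper: pick a machine type in the order (1), (2), (3)."""
    # (1) An explicit 'hw_machine_type' image property always wins.
    explicit = image_properties.get("hw_machine_type")
    if explicit:
        return explicit

    # (2) Otherwise consult the (stand-in) libosinfo answer: prefer 'q35'
    #     when it is listed.  Assumption: if 'q35' is not listed, use the
    #     first machine type that is.
    if osinfo_supported is not None:
        supported = list(osinfo_supported)
        if "q35" in supported:
            return "q35"
        if supported:
            return supported[0]

    # (3) Nothing else to go on: use the configured fallback ('q35' here).
    return fallback
```

For example, an image with no properties whose OS is reported to support
only 'pc' (such as the RHEL6 case mentioned below) would end up on 'pc',
while an image with no properties and no libosinfo data would get 'q35'.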

Related info
------------

(a) Note that upstream libvirt completely ignores QEMU's default:

        https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=26cfb1a
        "qemu: ensure default machine types don't change if QEMU
        changes"

    Where the commit message says:

        [...] 
        "Libvirt promises to isolate applications from hypervisor
        changes that may cause incompatibilities, so we must ensure that
        we always use the "pc" machine type if it is available. Only use
        QEMU's own reported default machine type if "pc" does not exist.

        "This issue is not x86-only, other arches are liable to change
        their default machine, while some arches don't report any
        default at all causing libvirt to pick the first machine in the
        list. Thus to guarantee stability to applications, declare a
        preferred default machine for all architectures we currently
        support with QEMU."
        [...]

(b) Dan Berrangé writes: libosinfo would only ever report 'q35' if we
    knew it to work -- so for example, we wouldn't report 'q35' support
    for RHEL6.  I bet it would impact NFV folks in particular, because
    they need to figure out which device to use from the role device
    tagging metadata Nova exposes, and with q35 the PCI topology they
    need to traverse is totally different.
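
To make the libvirt behaviour quoted in (a) concrete, here is a minimal
Python sketch of the selection rule as the commit message describes it.
This is not libvirt's actual code: the function name and the
'PREFERRED_DEFAULT' table are hypothetical, and only the x86 "pc" entry is
taken from the quoted text.

```python
from typing import Optional, Sequence

# Hypothetical per-architecture preference table.  Only the "pc" choice
# for x86 comes from the quoted commit message.
PREFERRED_DEFAULT = {"x86_64": "pc"}


def default_machine(
    arch: str,
    machines: Sequence[str],
    qemu_default: Optional[str] = None,
) -> Optional[str]:
    """Pick a default machine type without following QEMU's default blindly."""
    # Use the declared preference for this architecture if QEMU offers it.
    preferred = PREFERRED_DEFAULT.get(arch)
    if preferred in machines:
        return preferred
    # Only fall back to QEMU's own reported default if the preferred
    # machine does not exist.
    if qemu_default in machines:
        return qemu_default
    # Some arches report no default at all; take the first machine listed.
    return machines[0] if machines else None
```

With this rule, a QEMU that lists both machine types and reports "q35" as
its default would still yield "pc" on x86_64, which is why OpenStack has
to choose 'q35' itself rather than wait for the default to change.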

Comment 3 Lee Yarwood 2018-09-04 11:39:16 UTC

*** Bug 1340726 has been marked as a duplicate of this bug. ***
