
Bug 725080

Summary: [RH Engineering 6.5 Feature] virt-manager should default CPU type to host's level for maximum performance
Product: Red Hat Enterprise Linux 6
Reporter: Andrew Cathrow <acathrow>
Component: virt-manager
Assignee: Gunannan Ren <gren>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: high
Version: 6.3
CC: berrange, bsarathy, chorn, dallan, dyuan, hyao, ichute, jdenemar, knoel, lcui, ltroan, mkletzan, mzhan, perfbz, rbalakri, rwu, tburke, tzheng
Target Milestone: beta
Keywords: FutureFeature
Target Release: 6.5
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Clone Of:
: 853338 922719
Last Closed: 2013-06-24 10:09:47 EDT
Bug Depends On: 824989
Bug Blocks: 835616, 848463, 896704, 830946, 833603, 840699, 853338, 922719

Description Andrew Cathrow 2011-07-22 14:58:17 EDT
We want to ensure that, out of the box and without tuning, KVM guests run at the optimum speed.

Currently we default to the lowest (most compatible) option.

We should either match the host (e.g. Nehalem, Westmere, etc.) or, as another option, follow the suggestion in bz 700272.
Comment 1 Cole Robinson 2011-07-22 16:18:37 EDT
This isn't what Dor suggested in that thread. CCing him

AIUI he said a reasonable default would be to pick some baseline CPU (possibly 2 baseline CPUs, one Intel, one AMD) that would give some performance improvements over the default qemu64-whatever model, and at the same time be nothing too new + fancy, so it's very likely our users all have HW that matches the feature flags. This way we get a performance boost, but in the majority of cases users can still migrate guests between their non-uniform hardware.

I'm not saying that's what needs to be done, just that's what Dor seemed to be recommending. For this case I would need input from KVM guys as to what exactly the default CPU model(s) should be.

But whether we use that idea or -cpu host or something else needs to be decided by virt-devel + PM + marketing + performance guys IMO

The main problem here is that any model we choose which isn't the default qemu model is probably going to cause migration issues for someone. Which means someone will eventually file a BZ my way. Before we make any change this stuff needs to be thoroughly documented:

- What the previous default was, why it was the default, its impact on migration and performance
- What the new default is, why it changed, its impact on migration and performance
- What if a migration that worked with a VM created on 6.1 doesn't work with a VM created on 6.2? What is the error the user will see? Are there solutions for this problem, and if so are they different depending on guest OS? How can they avoid this problem for future migrations?
- What do we support for migration? Machines must be same exact hw? same CPU vendor?

Maybe some/all of this is already available.

When all that info is in place, I have no problem changing the virt-manager default to whatever is decided. It's just that eventually the default change will cause migration issues for someone, and when they file a bug I want to be able to point them at the docs and close the bug (provided it's not a _real_ bug :) )

Also, it will probably be a bit more work than just changing the default in virt-manager: we will need to provide some way for users to override the default if migration compatibility (or improved performance) is more important to them. The simplest way would be to allow users to manually choose what CPU model to use by default. A more user-friendly UI would probably be some prechosen list of CPU options that range from maximum migration compatibility (qemu64) to maximum performance (-cpu host). This doesn't need to be perfect for 6.2 as long as things are documented clearly.
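
To make the "prechosen list" idea concrete, here is a rough Python sketch of how such a default with a user override might be modelled. Only the two endpoints (qemu64 and -cpu host) come from the comment above; the preset keys and the function name are hypothetical and are not virt-manager code.

```python
# Presets ordered from maximum migration compatibility to maximum performance.
# Only the two endpoint models come from the comment above; the preset keys
# ("compatibility", "performance") are hypothetical names for illustration.
CPU_PRESETS = [
    ("compatibility", "qemu64"),
    ("performance", "host"),
]


def choose_cpu_model(preset="compatibility", override=None):
    """Return the CPU model for a preset, unless the user named one explicitly."""
    if override:
        # A manually chosen model always wins over the preset.
        return override
    models = dict(CPU_PRESETS)
    if preset not in models:
        raise ValueError("unknown CPU preset: %s" % preset)
    return models[preset]


print(choose_cpu_model())
print(choose_cpu_model("performance"))
print(choose_cpu_model(override="Nehalem"))
```

Keeping the presets in an ordered list would let a UI render them as a simple range, while the override covers users who want to pick an exact model.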
Comment 2 Andrew Cathrow 2011-07-22 17:46:08 EDT
We should optimize for the most common case.
Right now that's a user who wants to run a VM locally, in which case they will get sub-par performance from KVM.

If a user wants to have migration there's more to do than match the CPU: they have to set up shared LUNs/clustered file systems, ensure they have consistent network topology, etc. Checking the CPU level as part of that checklist is a very small price to pay for the smaller number of users who'll need it.
Also the most common scenario for clustered VMs is to have a cluster of identical machines so the chances are selecting Nehalem as the cpu would work in the whole cluster.

I DON'T think we should use -cpu host, the suggested change in 700272 is different from the qemu-kvm "-cpu host" option. 

Answering your specific questions

- What the previous default was, why it was the default, its impact on
migration and performance

The previous default is qemu64, which basically lobotomizes a modern CPU to look like a Pentium 4. Bad for performance, but it means that a virtual machine can migrate from a Core2Duo from 2008 to a modern Westmere machine and still get the same poor performance.

- What the new default is, why it changed, its impact on migration and
performance

New default should be the maximum supported by the machine - either based on copying the CPU type or using the #700272 approach.
For migration it means that *IF* the user wants to migrate the VM then they'll need to either make sure the hosts are "compatible" or lower the cpu type.

- What if a migration that worked with a VM created on 6.1 doesn't work with a
VM created on 6.2? What is the error the user will see? Are there solutions for
this problem, and if so are they different depending on guest OS? How can they
avoid this problem for future migrations?

The CPUs won't match. Same kind of errors they'd get if they mismatched other resources.

- What do we support for migration? Machines must be same exact hw? same CPU
vendor?
 
No, they'd have to be in the same family and of the same class or higher
e.g. an Opteron G2 could migrate to a G3 but not to a G1
e.g. an Intel Nehalem could migrate to a Westmere but not to a Core2Duo
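
The "same family, same class or higher" rule can be sketched in a few lines of Python. This is a simplification for illustration only: the table lists just the models named in this comment, its identifiers are illustrative, and a real check would presumably look at CPU feature flags rather than model names.

```python
# Hypothetical ordering, oldest to newest, limited to the models named above.
CPU_FAMILIES = {
    "AMD": ["Opteron_G1", "Opteron_G2", "Opteron_G3"],
    "Intel": ["Core2Duo", "Nehalem", "Westmere"],
}


def find_family(model):
    """Return (vendor, generation index) for a known CPU model."""
    for vendor, models in CPU_FAMILIES.items():
        if model in models:
            return vendor, models.index(model)
    raise ValueError("unknown CPU model: %s" % model)


def can_migrate(source_model, target_model):
    """True if the target is in the same family and of the same class or higher."""
    src_vendor, src_level = find_family(source_model)
    dst_vendor, dst_level = find_family(target_model)
    return src_vendor == dst_vendor and dst_level >= src_level


print(can_migrate("Opteron_G2", "Opteron_G3"))
print(can_migrate("Opteron_G2", "Opteron_G1"))
print(can_migrate("Nehalem", "Westmere"))
print(can_migrate("Nehalem", "Core2Duo"))
```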

Virt-Manager should 'just work' out of the box as fast as possible with little to no tuning.
Comment 3 Cole Robinson 2011-07-22 18:38:40 EDT
(In reply to comment #2)
> We should optimize for the most common case.
> Right now that's a user who wants to run a VM locally, in which case they will
> get sub-par performance from KVM.
> 

Not sure if the implication here is that I was implying otherwise, because I wasn't trying to convince one way or the other. I'm just saying all the relevant people need to get together, agree on what our story is, and document the hell out of it.

Personally, I think it's pretty reasonable to tell people that if they want to use migration by default they need to a) have uniform hardware or b) plan ahead of time before installing their guests and set them up accordingly.

> If a user wants to have migration there's more to do than match the CPU: they
> have to set up shared LUNs/clustered file systems, ensure they have consistent
> network topology, etc, etc. Checking the CPU level as part of that checklist is
> a very small price to pay for a smaller number of users who'll need it. 
> Also the most common scenario for clustered VMs is to have a cluster of
> identical machines so the chances are selecting Nehalem as the cpu would work
> in the whole cluster.
> 

There is a difference here, though, between the rest of the migration config and a VM's CPU. You can always move storage and network config around transparently to a VM. But you CAN'T change the CPU after initial guest creation, correct? There's a big difference between "that won't work, move the disk image to shared storage" and "that won't work, you will need to recreate the VM with a different CPU".

> I DON'T think we should use -cpu host, the suggested change in 700272 is
> different from the qemu-kvm "-cpu host" option. 
> 
> Answering your specific questions
<snip>

I appreciate the answers, but the motivation wasn't really my own edification. I was saying that's what our documentation would need to adequately cover if we are going to change the default. So things like detailed copies of error messages, etc. Your answers are a start but understandably the end user docs should be much more thorough. I'm pretty sure changing the default will piss off some release-note-ignoring user, so we need docs here for CYA at least.
Comment 4 Cole Robinson 2011-07-23 20:43:40 EDT
*** Bug 724866 has been marked as a duplicate of this bug. ***
Comment 5 Dave Allan 2011-07-24 22:56:49 EDT
(In reply to comment #2)
> I DON'T think we should use -cpu host, the suggested change in 700272 is
> different from the qemu-kvm "-cpu host" option. 

Agreed.  The 700272 approach should provide the highest performance on the host on which the guest was started as well as allowing libvirt to validate migration targets and fail the migration if the target host has an unsuitable CPU.  We must, of course, also provide a very concrete error message explaining what's wrong if the user attempts to migrate to a less capable CPU and we reject the migration for that reason.

> New default should be the maximum supported by the machine - either based on
> copying the CPU type or using the #700272 approach.
> For migration it means that *IF* the user wants to migrate the VM then they'll
> need to either make sure the hosts are "compatible" or lower the cpu type.

I agree with this logic.  IMO, it's a lot easier to throw an error saying, sorry, we can't migrate your VM to that nasty old box in the corner because it's got an ancient CPU than it is to try to ask users up front whether they want maximum performance or the broadest possible set of migration targets, although I could be convinced that it's possible to do UI presenting the user with that choice in a desirable way.
Comment 6 RHEL Product and Program Management 2011-10-07 12:11:09 EDT
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.
Comment 7 Cole Robinson 2012-02-07 19:51:52 EST
*** Bug 725084 has been marked as a duplicate of this bug. ***
Comment 10 RHEL Product and Program Management 2012-07-10 03:46:19 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 11 RHEL Product and Program Management 2012-07-10 21:56:11 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 14 Dave Allan 2012-09-04 13:41:59 EDT
Jiri, what's the domain XML to copy the host's capabilities?
Comment 15 Jiri Denemark 2012-09-06 05:45:36 EDT
The XML element which tells libvirt to use a CPU model as close as possible to
host CPU and which allows libvirt to check the CPU during migration is

    <cpu mode='host-model'/>
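
For illustration, a small Python sketch of placing that element into a domain definition with the standard library's ElementTree. The domain XML below is a hypothetical, heavily trimmed example and the helper name is made up; this is not how virt-manager itself builds guest XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed domain XML used only for illustration.
DOMAIN_XML = """
<domain type='kvm'>
  <name>demo-guest</name>
  <memory unit='KiB'>1048576</memory>
  <vcpu>1</vcpu>
</domain>
"""


def set_host_model_cpu(domain_xml):
    """Return domain XML whose only <cpu> element is <cpu mode='host-model'/>."""
    root = ET.fromstring(domain_xml)
    # Drop any existing <cpu> element so the result has exactly one.
    for old in root.findall("cpu"):
        root.remove(old)
    ET.SubElement(root, "cpu", {"mode": "host-model"})
    return ET.tostring(root, encoding="unicode")


updated = set_host_model_cpu(DOMAIN_XML)
print(updated)
```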
Comment 18 Larry Troan 2012-10-11 13:54:08 EDT
Dave Allan has commented that this request is more difficult than he'd initially scoped and has pushed it to 6.5 for consideration.
Comment 20 Martin Kletzander 2013-03-28 05:57:15 EDT
One thing occurred to me while reading this BZ.
How about letting the user decide with a simple radio button?

  Set CPU of the guest to maximize:
   (a) migration compatibility [default]
   (b) raw performance

There would also be help strings describing what each of these options does and explaining that, for a detailed CPU specification, the user can customize the configuration before install.  (a) would be what virt-manager has done until now; (b) would do the same as clicking 'Copy host CPU specification' in the Processor settings.

Just my $0.02.
Comment 22 Gunannan Ren 2013-04-17 10:33:31 EDT
https://www.redhat.com/archives/virt-tools-list/2013-April/msg00178.html
The patches have been sent out
Comment 23 Gunannan Ren 2013-04-26 03:43:13 EDT
'host-passthrough' is not an option. libvirt 'taints' the VM when this mode is used, so virt-manager is not going to expose it in the UI.

After a discussion upstream
https://www.redhat.com/archives/virt-tools-list/2013-April/msg00207.html

'host-model' is not mature enough to be used in the UI currently. A patch has been sent out to revert the above patches.

So we keep the original behavior and will re-evaluate the possibility of a change later.
Thanks to Cole, Martin and Jiri for their suggestions.
Comment 24 Dave Allan 2013-06-24 10:09:47 EDT
(In reply to Gunannan Ren from comment #23)
> 'host-passthrough' is not an option. libvirt 'taints' the VM when this mode
> is used, so virt-manager is not going to expose it in the UI.
> 
> After a discussion upstream
> https://www.redhat.com/archives/virt-tools-list/2013-April/msg00207.html
> 
> 'host-model' is not mature enough to be used in the UI currently. A patch
> has been sent out to revert the above patches.
> 
> So we keep the original behavior and will re-evaluate the possibility of a
> change later.
> Thanks to Cole, Martin and Jiri for their suggestions.

Given that, I think that we should not change this behavior in RHEL6, but I agree that this is a useful RFE and should be continued upstream and in future RHEL versions.