Bug 526862 - [RHEL5 Xen]: Mask out CPU features by default
Summary: [RHEL5 Xen]: Mask out CPU features by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Andrew Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 711322
Depends On:
Blocks: 514489 514490 711070
Reported: 2009-10-02 07:08 UTC by Chris Lalancette
Modified: 2013-01-08 13:38 UTC (History)

Fixed In Version: kernel-2.6.18-294.el5
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-21 03:26:24 UTC
Target Upstream Version:


Attachments
cpuid whitelist function (8.21 KB, patch)
2011-10-07 15:57 UTC, Andrew Jones
Do not expose X86_FEATURE_POPCNT feature to avoid crash on migration to a host that doesn't have it (706 bytes, patch)
2011-10-18 13:11 UTC, Igor Mammedov


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:0150 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Linux 5.8 kernel update 2012-02-21 07:35:24 UTC

Description Chris Lalancette 2009-10-02 07:08:15 UTC
Description of problem:
Trying to boot Xen guests on newer hardware is always an adventure.  One of the reasons for this is that, by default, the Xen hypervisor only masks out features it knows it can't support.  However, it can't know about support for newer features until the features are there.  So what it should instead do is to mask out *all* features, and then selectively enable the ones that it knows it can support.  Upstream currently does not work like this (nor does RHEL-5), so we will have to submit patches there.
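To illustrate the difference, here is a minimal sketch of the two masking policies applied to one CPUID feature word. This is not the hypervisor's code: the function names and the FEAT_EDX_FUTURE bit are made up for the example, and only the FPU, SSE2 and HTT bit positions are the architectural ones for CPUID leaf 1 EDX.

```c
#include <stdint.h>

/* Architectural bit positions in CPUID leaf 1, EDX. */
#define FEAT_EDX_FPU    (1u << 0)
#define FEAT_EDX_SSE2   (1u << 26)
#define FEAT_EDX_HTT    (1u << 28)

/* Hypothetical stand-in for a feature the hypervisor has never heard of.
 * The bit position is arbitrary and chosen only for illustration. */
#define FEAT_EDX_FUTURE (1u << 20)

/* Blacklist policy: clear only the bits known to be unsupported.
 * Any bit the hypervisor does not know about passes through to the guest. */
static inline uint32_t mask_blacklist(uint32_t host_bits, uint32_t known_bad)
{
    return host_bits & ~known_bad;
}

/* Whitelist policy: start from nothing and keep only the bits known to be
 * supported. Unknown bits are hidden from the guest by default. */
static inline uint32_t mask_whitelist(uint32_t host_bits, uint32_t known_good)
{
    return host_bits & known_good;
}
```

With a host that reports FPU, SSE2 and the made-up future bit, the blacklist version hands the future bit to the guest, while the whitelist version drops it unless it has been explicitly added to known_good.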

Comment 1 Paolo Bonzini 2010-06-22 16:58:36 UTC
Chris,

considering upstream's handling of CPUID is completely different and done in the tools, the solution to this issue would likely have to be done twice, once for RHEL-5 and once for upstream.

It seems more palatable (if anything) to backport upstream's userspace handling of CPUID.  The patches are relatively large but quite self-contained, and it would make it easier to tweak the defaults without requiring kernel upgrades.  What do you think?

Comment 2 Chris Lalancette 2010-06-29 20:43:12 UTC
Hey Paolo,
     I'm OK with going with upstream's userspace implementation, though I'm not quite sure how it works.  In particular, does it *always* send a list of supported flags down to the hypervisor when starting a guest?  As long as there is always a whitelist (that will mask out things like GB pages, etc), then I think doing the userspace version would be just fine.  The only thing we'll have to be careful of is that since this is (probably?) a change to the hypervisor/tools ABI, we'll have to have a compat mode so that a new userspace could run on an older hypervisor.

Chris Lalancette

Comment 4 Andrew Jones 2011-06-14 07:52:08 UTC
Some features I'd currently like to mask out haven't necessarily caused problems, but one never knows what will happen in the future, and of course the idea behind this bug is to guard against features that don't exist yet.

One current feature I'd like to mask is X86_FEATURE_HT. This hasn't caused problems yet, but it does cause a warning to be output on every boot of RHEL6 PV guests.

CPU: Unsupported number of siblings

This is output from detect_ht(). After that warning, the guest kernel decides to forget the whole thing and is fine. The warning could be avoided by simply masking the HT feature, though.
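As a rough illustration of why masking helps, the sketch below models the check in a very simplified way. It is not the guest kernel's detect_ht() and not the attached patch: the function names are hypothetical, and comparing the sibling count against the number of CPUs the guest has is an assumption about what triggers the warning. The HTT bit (CPUID.1:EDX bit 28) and the logical processor count field (CPUID.1:EBX bits 23:16) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_EDX_HTT (1u << 28)   /* CPUID.1:EDX bit 28 (X86_FEATURE_HT) */

/* Logical processor count reported in CPUID.1:EBX bits 23:16. */
static inline unsigned int sibling_count(uint32_t ebx)
{
    return (ebx >> 16) & 0xffu;
}

/* Hypothetical, simplified model of the guest-side check: the sibling
 * count is only looked at when the HT feature bit is visible, and the
 * warning is assumed to fire when that count exceeds the guest's CPUs. */
static inline bool would_warn_unsupported_siblings(uint32_t edx, uint32_t ebx,
                                                   unsigned int guest_cpus)
{
    if (!(edx & CPUID1_EDX_HTT))
        return false;               /* HT hidden: the check is skipped */
    return sibling_count(ebx) > guest_cpus;
}

/* What masking the feature amounts to for this one bit. */
static inline uint32_t hide_ht(uint32_t edx)
{
    return edx & ~CPUID1_EDX_HTT;
}
```

In this model, a single-vCPU guest that sees the host's sibling count of 16 together with the HT bit would warn, and the same guest with the HT bit hidden would not.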

Comment 5 Andrew Jones 2011-10-07 15:57:23 UTC
Created attachment 526923 [details]
cpuid whitelist function

Comment 7 Konrad Rzeszutek Wilk 2011-10-07 18:35:58 UTC
It looks like this could be quite useful in upstream Xen. Why not post it there as well?

Comment 8 Laszlo Ersek 2011-10-07 19:48:44 UTC
Hello Konrad,

it was our understanding (... any inaccuracy in representing my colleagues' understanding is my fault ...) that upstream Xen "has a mix of white and black listing depending on guest type and does its cpuid management in userspace".

The set of whitelisted features might be useful for upstream, but then it should be specified somewhere in the vm configs or another default setting in userspace, shouldn't it? (Eg. tools/libxc/xc_cpuid_x86.c, amd_xc_cpuid_policy() / intel_xc_cpuid_policy().)

Comment 9 Igor Mammedov 2011-10-10 12:50:52 UTC
*** Bug 711070 has been marked as a duplicate of this bug. ***

Comment 10 Igor Mammedov 2011-10-18 13:11:57 UTC
Created attachment 528802 [details]
Do not expose X86_FEATURE_POPCNT feature to avoid crash on migration to a host that doesn't have it

An FC16 HVM guest will crash with an invalid opcode after migration if it was started on a host with the X86_FEATURE_POPCNT feature and then migrated to a host without it.

Attached patch, applied on top of white-listing-V2, fixes this issue.
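For illustration only, the condition behind this crash can be sketched as a subset check on feature words. This is not the attached patch: the helper names are hypothetical, and only the SSE3 and POPCNT bit positions (CPUID.1:ECX bits 0 and 23) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_SSE3   (1u << 0)    /* CPUID.1:ECX bit 0  */
#define CPUID1_ECX_POPCNT (1u << 23)   /* CPUID.1:ECX bit 23 */

/* A guest can only migrate safely if every feature it was shown at boot
 * also exists on the destination host, i.e. the guest-visible set is a
 * subset of the destination's set. */
static inline bool migration_safe(uint32_t guest_visible, uint32_t dest_host)
{
    return (guest_visible & ~dest_host) == 0;
}

/* Hiding POPCNT from the guest, so it never starts using the instruction. */
static inline uint32_t hide_popcnt(uint32_t ecx)
{
    return ecx & ~CPUID1_ECX_POPCNT;
}
```

A guest that was shown SSE3 and POPCNT fails the check against a destination with only SSE3. Once POPCNT is hidden at boot, the same migration passes.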

Comment 14 Jarod Wilson 2011-10-27 13:09:32 UTC
Patch(es) available in kernel-2.6.18-294.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 16 Andrew Jones 2011-11-06 10:27:34 UTC
*** Bug 711322 has been marked as a duplicate of this bug. ***

Comment 22 Qin Guan 2012-01-18 08:24:06 UTC
Testing of this problem is covered by running acceptance/functional tests with
several Snapshot builds (from Snapshot1 to Snapshot4) on different CPU models.

No problems were found during testing, so this is marked as Verified:SanityOnly.

Comment 24 errata-xmlrpc 2012-02-21 03:26:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html

