Bug 526862 - [RHEL5 Xen]: Mask out CPU features by default
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Andrew Jones
QA Contact: Virtualization Bugs
Keywords: FutureFeature
Duplicates: 711322
Depends On:
Blocks: 514489 514490 711070
Reported: 2009-10-02 03:08 EDT by Chris Lalancette
Modified: 2013-01-08 08:38 EST

Fixed In Version: kernel-2.6.18-294.el5
Doc Type: Enhancement
Last Closed: 2012-02-20 22:26:24 EST


Attachments
cpuid whitelist function (8.21 KB, patch)
2011-10-07 11:57 EDT, Andrew Jones
Do not expose X86_FEATURE_POPCNT feature to avoid crash on migration to a host that doesn't have it (706 bytes, patch)
2011-10-18 09:11 EDT, Igor Mammedov


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2012:0150
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise Linux 5.8 kernel update
Last Updated: 2012-02-21 02:35:24 EST

Description Chris Lalancette 2009-10-02 03:08:15 EDT
Description of problem:
Trying to boot Xen guests on newer hardware is always an adventure.  One of the reasons for this is that, by default, the Xen hypervisor only masks out features it knows it can't support.  However, it can't know about support for newer features until the features are there.  So what it should instead do is to mask out *all* features, and then selectively enable the ones that it knows it can support.  Upstream currently does not work like this (nor does RHEL-5), so we will have to submit patches there.
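
As a rough illustration of the difference described above (not the hypervisor's actual code), the sketch below contrasts the two filtering policies on a single 32-bit feature word. The bit positions and the names FEATURE_A, FEATURE_B and FEATURE_NEW are hypothetical and chosen only for the demonstration.

```python
# Hypothetical bit positions within one 32-bit CPUID feature word.
FEATURE_A = 1 << 0    # feature the hypervisor knows and supports
FEATURE_B = 1 << 1    # feature the hypervisor knows it cannot support
FEATURE_NEW = 1 << 5  # feature introduced after the hypervisor was written

KNOWN_UNSUPPORTED = FEATURE_B  # blacklist: the only bits that get cleared
KNOWN_SUPPORTED = FEATURE_A    # whitelist: the only bits allowed through


def blacklist_filter(host_bits: int) -> int:
    """Clear only the bits known to be unsupported; unknown bits leak through."""
    return host_bits & ~KNOWN_UNSUPPORTED & 0xFFFFFFFF


def whitelist_filter(host_bits: int) -> int:
    """Pass only the bits known to be supported; unknown bits are masked."""
    return host_bits & KNOWN_SUPPORTED


# A newer host advertises a feature the hypervisor has never heard of.
host = FEATURE_A | FEATURE_B | FEATURE_NEW

print(f"blacklist: {blacklist_filter(host):#x}")  # prints "blacklist: 0x21"
print(f"whitelist: {whitelist_filter(host):#x}")  # prints "whitelist: 0x1"
```

With the blacklist, the unknown bit still reaches the guest; with the whitelist, it stays hidden until someone explicitly adds it to the supported set.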
Comment 1 Paolo Bonzini 2010-06-22 12:58:36 EDT
Chris,

considering upstream's handling of CPUID is completely different and done in the tools, the solution to this issue would likely have to be done twice, once for RHEL-5 and once for upstream.

It seems more palatable (if anything) to backport upstream's userspace handling of CPUID.  The patches are relatively large but quite self-contained, and it would make it easier to tweak the defaults without requiring kernel upgrades.  What do you think?
Comment 2 Chris Lalancette 2010-06-29 16:43:12 EDT
Hey Paolo,
     I'm OK with going with upstream's userspace implementation, though I'm not quite sure how it works.  In particular, does it *always* send a list of supported flags down to the hypervisor when starting a guest?  As long as there is always a whitelist (that will mask out things like GB pages, etc), then I think doing the userspace version would be just fine.  The only thing we'll have to be careful of is that since this is (probably?) a change to the hypervisor/tools ABI, we'll have to have a compat mode so that a new userspace could run on an older hypervisor.

Chris Lalancette
Comment 4 Andrew Jones 2011-06-14 03:52:08 EDT
Some features I'd currently like to mask out haven't necessarily caused problems, but one never knows going into the future, and of course the idea behind this bug is to guard against features that don't currently exist.

One current feature I'd like to mask is X86_FEATURE_HT. This hasn't caused problems yet, but it does cause a warning to be output on every boot of RHEL6 PV guests.

CPU: Unsupported number of siblings

This is output from detect_ht(). After that warning, the guest kernel decides to forget the whole thing and is fine. The warning could be avoided by simply masking the HT feature though.
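
For illustration only, masking that feature amounts to clearing one bit in the CPUID leaf 1 result before the guest sees it. The HTT flag sits in EDX bit 28, which is what X86_FEATURE_HT refers to. The sketch below is a model of that bit operation, not the attached patch; the helper names and the sample EDX value are assumptions.

```python
HT_EDX_BIT = 28  # CPUID leaf 1, EDX bit 28 (HTT)
HT_MASK = 1 << HT_EDX_BIT


def mask_ht(edx: int) -> int:
    """Return the leaf 1 EDX value with the HTT bit cleared."""
    return edx & ~HT_MASK & 0xFFFFFFFF


def guest_sees_ht(edx: int) -> bool:
    """True if the guest would find the HTT bit set."""
    return bool(edx & HT_MASK)


# Arbitrary sample value with bit 28 set, used only for the demonstration.
host_edx = 0x178BFBFF
guest_edx = mask_ht(host_edx)

print(guest_sees_ht(host_edx))   # True: the guest would try to detect siblings
print(guest_sees_ht(guest_edx))  # False: the HT feature is hidden from the guest
```

All other bits in the word pass through unchanged, so only the HT feature disappears from the guest's view.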
Comment 5 Andrew Jones 2011-10-07 11:57:23 EDT
Created attachment 526923
cpuid whitelist function
Comment 7 Konrad Rzeszutek Wilk 2011-10-07 14:35:58 EDT
It looks like it could be quite useful in upstream Xen? Why not post there as well?
Comment 8 Laszlo Ersek 2011-10-07 15:48:44 EDT
Hello Konrad,

it was our understanding (... any inaccuracy in representing my colleagues' understanding is my fault ...) that upstream Xen "has a mix of white and black listing depending on guest type and does its cpuid management in userspace".

The set of whitelisted features might be useful for upstream, but then it should be specified somewhere in the vm configs or another default setting in userspace, shouldn't it? (E.g., tools/libxc/xc_cpuid_x86.c, amd_xc_cpuid_policy() / intel_xc_cpuid_policy().)
Comment 9 Igor Mammedov 2011-10-10 08:50:52 EDT
*** Bug 711070 has been marked as a duplicate of this bug. ***
Comment 10 Igor Mammedov 2011-10-18 09:11:57 EDT
Created attachment 528802
Do not expose X86_FEATURE_POPCNT feature to avoid crash on migration to a host that doesn't have it

An FC16 HVM guest will crash with an invalid opcode after migration if it was started on a host with the X86_FEATURE_POPCNT feature but has been migrated to a host without it.

Attached patch, applied on top of white-listing-V2, fixes this issue.
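
To make the failure mode concrete, here is a small model of the check involved. It is a sketch, not the attached patch: POPCNT is reported in CPUID leaf 1, ECX bit 23, and a guest is only safe to migrate if every feature it was shown also exists on the destination. The helper names and the sample ECX values are assumptions.

```python
POPCNT_ECX_BIT = 23  # CPUID leaf 1, ECX bit 23 (POPCNT)
POPCNT = 1 << POPCNT_ECX_BIT
SSE3 = 1 << 0        # CPUID leaf 1, ECX bit 0, used as a common baseline feature


def hide_popcnt(ecx: int) -> int:
    """Return the leaf 1 ECX value with the POPCNT bit cleared."""
    return ecx & ~POPCNT & 0xFFFFFFFF


def missing_on_destination(guest_ecx: int, dest_ecx: int) -> int:
    """Bits the guest was shown that the destination host lacks."""
    return guest_ecx & ~dest_ecx & 0xFFFFFFFF


# Sample hosts for the demonstration: only the source has POPCNT.
source_ecx = SSE3 | POPCNT
dest_ecx = SSE3

# Without masking, the guest may use POPCNT and fault on the destination.
print(missing_on_destination(source_ecx, dest_ecx) == POPCNT)          # True
# With POPCNT hidden at guest start, nothing is missing after migration.
print(missing_on_destination(hide_popcnt(source_ecx), dest_ecx) == 0)  # True
```

A nonzero result from missing_on_destination() marks exactly the features a guest could have started relying on before the move.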
Comment 14 Jarod Wilson 2011-10-27 09:09:32 EDT
Patch(es) available in kernel-2.6.18-294.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.
Comment 16 Andrew Jones 2011-11-06 05:27:34 EST
*** Bug 711322 has been marked as a duplicate of this bug. ***
Comment 22 Qin Guan 2012-01-18 03:24:06 EST
Testing of this problem is covered by running acceptance/functional tests with
several Snapshot builds (from Snapshot1 to Snapshot4) on different CPU models.

No problems were found during testing, so it is marked as Verified:SanityOnly.
Comment 24 errata-xmlrpc 2012-02-20 22:26:24 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html
