Bug 626806 - [RHEL 5.5] 32-bit pvhvm guest on 64-bit host crash w/xm mem-set
Summary: [RHEL 5.5] 32-bit pvhvm guest on 64-bit host crash w/xm mem-set
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.6
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Michal Novotny
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 674514 674516
Blocks: 514499
 
Reported: 2010-08-24 13:25 UTC by Andrew Jones
Modified: 2014-02-02 22:38 UTC
CC List: 13 users

Fixed In Version: xen-3.0.3-124.el5
Doc Type: Bug Fix
Doc Text:
Previously, a 32-bit HVM guest running under a 64-bit hypervisor terminated unexpectedly if the "xm mem-set" command attempted to change the guest's memory reservation. With this update, a patch has been provided that checks whether the HVM domain is 32-bit or 64-bit. The patch then disallows the aforementioned method of setting up memory for guests, thus ensuring this bug can no longer occur.
Clone Of: 605697
: 674514
Environment:
Last Closed: 2011-07-21 09:16:59 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to implement domain_is_32_bit_hvm() and forbid setting up memory for 32-on-64 bit HVM guests (3.96 KB, patch)
2011-02-23 16:44 UTC, Michal Novotny


Links
  System ID: Red Hat Product Errata RHBA-2011:1070
  Private: 0
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: xen bug fix and enhancement update
  Last Updated: 2011-07-21 09:12:56 UTC

Comment 1 Andrew Jones 2010-08-24 13:38:56 UTC
We need to keep users from attempting 'xm mem-set' on 32b HVM guests when running on 64b hosts. Ballooning for 32on64 HVM guests is currently not implemented in the HV, so it returns -ENOSYS. But when the HV returns -ENOSYS, the guest kernel hits a BUG() and crashes. It would be nice to protect against this in the guest kernel (rather than the tools), but that's not so easy, for two reasons: 1) the guest needs to know it's on a 64b host (since 32on32 should work); 2) the balloon driver currently isn't designed in a way that makes error propagation easy to implement. At some point we may try to support 32on64 HVM ballooning (as upstream does), but not in the immediate future. We should protect against this in the tools in the meantime.

Comment 2 Andrew Jones 2010-08-24 13:54:52 UTC
Just some thoughts for getting the 32on64 detection done.

Backport the xc_get_bit_size() bits from upstream changeset 20558 for the guest
bitsize detection, and use os.uname()[4] for the host bitsize detection.

That might not be the best way, so I'll leave it to Radim to sort out.

Drew
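
For illustration, the host-side half of that check is a one-liner. A minimal sketch in Python (the helper name is made up for this example; it is not part of xend):

    import os

    def host_is_64bit():
        # os.uname()[4] is the machine hardware name, e.g. 'x86_64' or 'i686'
        return os.uname()[4] == 'x86_64'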

Comment 3 Miroslav Rezanina 2010-09-17 09:50:28 UTC
As we have a fix for bug 605697, this is not so important to fix now. However, we should still add this check for non-RHEL guests - we do not support this feature and users should be informed about that.

Moving to 5.7

Comment 4 Paolo Bonzini 2010-09-17 11:25:08 UTC
HVM guests always have the same bitsize as the host; what really matters for deciding 32-bit vs. 64-bit is whether long mode is active or not.

Comment 7 Michal Novotny 2011-01-31 12:23:42 UTC
(In reply to comment #4)
> HVM guests always have the same bitsize as the host, what really matters to
> decide 32-bit vs. 64-bit, is whether long-mode is active or not.

Paolo,
you're saying that the approach used in c/s 20558 is not the right way to go, and that the right way is to determine whether the guest is in long mode, right?

Does that mean checking for CR0.PG = 1, CR4.PAE = 1 and EFER.LME = 1 in qemu-dm? At least that is how I understand the description of entering long mode in [1]. Or does anything else need to be done (since there's a note about compatibility mode)? We don't want to enter long mode, though, just detect whether the VM is in it.

Thanks,
Michal

[1] http://en.wikibooks.org/wiki/X86_Assembly/Protected_Mode#Entering_Long_Mode

Comment 9 Andrew Jones 2011-01-31 13:12:01 UTC
You can use the same technique used by the balloon driver to decide if it's possible or not. Simply attempt a decrease with a size of 0. If you get back -ENOSYS from the HV, then it doesn't work, and you can then output a nice message to the user. So you'll add code to the mem-set command that does something like this:

import errno

if domain_memory_decrease_reservation(domid=d, size=0) == -errno.ENOSYS:
    print "no can do for domain %d" % d

You'll need to copy+paste+modify pyxc_domain_memory_increase_reservation and the corresponding code, since domain_memory_decrease_reservation doesn't currently exist in userspace.

Comment 10 Paolo Bonzini 2011-01-31 13:56:10 UTC
> Does that mean checking for CR0.PG = 1, CR4.PAE = 1 and EFER.LME = 1 in
> qemu-dm?

Not in qemu-dm, but you can use the hypervisor data structures.  Checking
EFER.LMA = 1 is enough.

This requires backporting:

- c/s 19168 and 19169 for the hypervisor

- c/s 19170 for libxc 

On top of this you can pass the information to Python.  Take a look at c/s
19171 for how to detect long-mode.  Changeset 20558 that Andrew pointed out is
not necessary, but you will need a Python binding for the new code in libxc and
that changeset can help you figure that out.
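
For reference, EFER.LMA ("long mode active") is bit 10 of the EFER MSR. A minimal sketch of the test in Python - the efer value itself would have to come from the hypervisor via the backported libxc call, and the function name here is made up:

    EFER_LMA = 1 << 10  # "long mode active" bit of the EFER MSR

    def guest_in_long_mode(efer):
        # True for a 64-bit (long-mode) guest, False for a 32-bit one
        return bool(efer & EFER_LMA)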

This also enables us to fix xenctx for 32-on-64 and HVM domains.  This is
upstream c/s 18962, c/s 18963, the xc_pagetab.c changes of c/s 19170, c/s
19171, c/s 19453.  Of course that would be a separate BZ.

That said, it looks like fixing 32-on-64 ballooning should not be too hard.  The relevant changesets are between 17780 and 17855.  Unless we want the xenctx changes anyway, it's probably better if we first try to implement the feature for real in the hypervisor.

Comment 11 Andrew Jones 2011-01-31 14:10:32 UTC
Ok, so now we have 3 options:

1) simply do the -ENOSYS check and print a message to the user
2) more complicated changes, but they enable fixing another "bug" - 32on64 HVM xenctx (we need to file a new bug for that)
3) fix decrease_memory_reservation in the HV - that requires moving this bug over to kernel-xen 

or 

4) a combo of (2) and (3), in order to get both a working xenctx and decrease_memory_reservation

I'm all for number (4), considering Paolo has already done all the upstream analysis. We can at least open the bugs with low priority and work on them when able. It would have been nice to do (1) for 5.6, but it's too late now... Maybe we should still do (1) now, in case the low-priority bugs (2) and (3) don't get scheduled in time for 5.7? That adds a required revert, though, if we then do number (3).

Comment 12 Paolo Bonzini 2011-01-31 14:30:20 UTC
Ugh, forgot to add this in my message.

The ENOSYS check wouldn't work, because the implementation of decrease_reservation is different depending on whether the _caller_ (not the referenced domain!!!) is PV or HVM.  The latter is in arch/x86/hvm/hvm.c, see hvm_hypercall32_table.

Let's create xen and kernel-xen bugs for xenctx (item 2 in Drew's list) for now...

Comment 13 Michal Novotny 2011-02-02 09:04:43 UTC
(In reply to comment #12)
> Ugh, forgot to add this in my message.
> 
> The ENOSYS check wouldn't work, because the implementation of
> decrease_reservation is different depending on whether the _caller_ (not the
> referenced domain!!!) is PV or HVM.  The latter is in arch/x86/hvm/hvm.c, see
> hvm_hypercall32_table.
> 
> Let's create xen and kernel-xen bugs for xenctx (item 2 in Drew's list) for
> now...

Paolo, did you already create the xen and kernel-xen bugs for xenctx? If so, could you please add the bug numbers to this bug?

Also, what should we do about this one? Make it blocked by those kernel-xen and xen bugs you've created (or will create, respectively)?

Thanks,
Michal

Comment 14 Michal Novotny 2011-02-23 16:44:28 UTC
Created attachment 480524
Patch to implement domain_is_32_bit_hvm() and forbid setting up memory for 32-on-64 bit HVM guests

This patch implements a new domain_is_32_bit_hvm() API in the libxc Python binding to check whether an HVM domain is 32-bit or 64-bit. The function returns 1 if the domain is 32-bit, 0 if it's 64-bit, and -1 if the domain is a PV domain.

This function (together with a check for a 64-bit host taken from the capabilities string) is used in the setMemoryTarget() method of the Python XenD code to forbid setting memory for 32-on-64-bit HVM guests, since that is not supported.

Michal
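
To make the shape of that check concrete, here is a rough sketch of the guard described above. The return-value convention is taken from the patch description; the method body, the exception type, and the use of os.uname() for the host check (the actual patch reads the Xen capabilities string) are assumptions for illustration only:

    import os

    def set_memory_target(xc, domid, target_mb):
        # domain_is_32_bit_hvm(): 1 = 32-bit HVM, 0 = 64-bit HVM, -1 = PV
        host_is_64bit = os.uname()[4] == 'x86_64'
        if host_is_64bit and xc.domain_is_32_bit_hvm(domid) == 1:
            raise RuntimeError("mem-set is not supported for 32-bit HVM "
                               "guests on 64-bit hosts")
        # ... otherwise fall through to the normal balloon path ...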

Comment 18 Yufang Zhang 2011-03-25 07:11:56 UTC
QA verified this bug on xen-3.0.3-126.el5:

1. Start a 32-bit pvHVM guest on an x86_64 host

2. Try to balloon the guest down

# xm mem-set vm1 400
Error: (38, 'Function not implemented')
Usage: xm mem-set <Domain> <Mem>

Set the current memory usage for a domain.

So change this bug to VERIFIED.
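
As a side note, the (38, 'Function not implemented') tuple is simply errno.ENOSYS as propagated to the tools; a quick sanity check in Python:

    import errno, os

    print errno.ENOSYS               # 38 on Linux
    print os.strerror(errno.ENOSYS)  # 'Function not implemented'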

Comment 19 Tomas Capek 2011-07-13 13:25:57 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, a 32-bit HVM guest running under a 64-bit hypervisor terminated unexpectedly if the "xm mem-set" command attempted to change the guest's memory reservation. With this update, a patch has been provided that implements a new domain_is_32_bit_hvm() API into the libxc Python binding library to check whether the HVM domain is 32-bit or 64-bit. The patch also disallows the aforementioned method of setting up memory for guests, thus ensuring this bug can no longer occur.

Comment 20 Paolo Bonzini 2011-07-13 14:53:21 UTC
I don't think internal API names should be included in the technical notes.

Comment 21 Paolo Bonzini 2011-07-13 14:53:21 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, a 32-bit HVM guest running under a 64-bit hypervisor terminated unexpectedly if the "xm mem-set" command attempted to change the guest's memory reservation. With this update, a patch has been provided that implements a new domain_is_32_bit_hvm() API into the libxc Python binding library to check whether the HVM domain is 32-bit or 64-bit. The patch also disallows the aforementioned method of setting up memory for guests, thus ensuring this bug can no longer occur.
+Previously, a 32-bit HVM guest running under a 64-bit hypervisor terminated unexpectedly if the "xm mem-set" command attempted to change the guest's memory reservation. With this update, a patch has been provided that checks whether the HVM domain is 32-bit or 64-bit. The patch then disallows the aforementioned method of setting up memory for guests, thus ensuring this bug can no longer occur.

Comment 22 errata-xmlrpc 2011-07-21 09:16:59 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1070.html
