Bug 1398116 - [Docs][Compute] Document constraints of using 1GB huge pages
Summary: [Docs][Compute] Document constraints of using 1GB huge pages
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ga
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Irina
QA Contact: James Parker
URL:
Whiteboard: docs-accepted
Depends On:
Blocks: 1791319 1800565 1800568 1800678 1800682
Reported: 2016-11-24 07:31 UTC by VIKRANT
Modified: 2022-08-10 09:46 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
: 1398343 1800565 (view as bug list)
Environment:
Last Closed: 2020-02-07 11:01:20 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-7813 0 None None None 2022-08-10 09:46:34 UTC
Red Hat Knowledge Base (Solution) 2778991 0 None None None 2016-11-24 10:35:49 UTC

Description VIKRANT 2016-11-24 07:31:22 UTC
Description of problem:

Huge pages cannot be set in the guest OS if the cpu_model parameter is set on the compute node. If this parameter is not set on the compute node, huge pages can be set in the guest OS without any issue.



Version-Release number of selected component (if applicable):
RHEL OSP 7

Other versions are likely affected by this issue.

How reproducible:
Every time.


Steps to Reproduce:
1. Set the "cpu_model" parameter in the /etc/nova/nova.conf file of the compute node.
2. Try to set huge pages on a guest instance; they are not set.
3. Remove the cpu_model parameter; huge pages can now be set in the guest OS without any issue.

Actual results:
We are not able to set guest OS huge pages while cpu_model is set on the compute node.

Expected results:
We should be able to set guest OS huge pages irrespective of whether cpu_model is set on the compute node.


Additional info:

Issue looks similar to upstream Bug [1]

[1] https://bugs.launchpad.net/nova/+bug/1538565

Comment 2 Daniel Berrangé 2016-11-24 10:13:08 UTC
(In reply to VIKRANT from comment #0)

It isn't quite that simple. If you use 2 MB huge pages, you can use *any* CPU model, as no special features are required for that to work. If you want to use 1 GB huge pages, you need to pick a CPU model that contains the "pdpe1gb" flag. *None* of the Intel CPU models supported by QEMU include that flag, so realistically at this time you need to use the "host-passthrough" CPU mode.
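As a rough illustration of the flag requirement above, the following Python sketch checks whether /proc/cpuinfo-style text lists "pdpe1gb". The function names and the shortened sample text are hypothetical and only serve the example; this is not something Nova or QEMU ships.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough for this sketch.
            return set(value.split())
    return set()


def supports_1g_pages(cpuinfo_text):
    """True if the pdpe1gb flag (needed for 1 GB huge pages) is listed."""
    return "pdpe1gb" in cpu_flags(cpuinfo_text)


# Hypothetical, heavily shortened sample of /proc/cpuinfo content.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme pse pdpe1gb rdtscp lm\n"

print(supports_1g_pages(SAMPLE))  # prints True
```

On a real x86 Linux host, or inside the guest, the same check could be run against the contents of /proc/cpuinfo, for example with supports_1g_pages(open("/proc/cpuinfo").read()).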

Nova *does* need a feature enhancement to extend the "cpu_model" syntax to allow us to request extra CPU flags beyond what the basic model allows, e.g. so you could do:

cpu_model=Haswell
cpu_model_extra_flags="pdpe1gb"

That will obviously need some work upstream to support.

Comment 3 VIKRANT 2016-11-24 10:39:18 UTC
Daniel,

Thanks. When you say "*none* of the Intel CPU models supported by QEMU include that flag", does that mean this limitation exists outside Nova? If the customer does not set "cpu_model", huge pages work fine.

Comment 4 Daniel Berrangé 2016-11-24 10:44:00 UTC
(In reply to VIKRANT from comment #3)
> Daniel,
> 
> Thanks. When you saying " *none* of the Intel CPU models supported by QEMU
> include that flag." does that mean this limitation is present outside nova ?
> But if the Cu is not setting "cpu_model" then in that case huge pages are
> working fine.

The features included in the named CPU models are defined by QEMU, not Nova, so they apply to any application using those models. Only the Opteron_G4 and Opteron_G5 CPU models include the pdpe1gb feature flag. For every other CPU model you need to request the flag manually, which Nova does not yet support.

For Nova, if neither 'cpu_mode' nor 'cpu_model' is set in nova.conf, Nova will fall back to "host-model", which will include the flag.
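The behaviour described in this comment can be summarised in a small Python sketch. It is a simplified model of what is stated in this thread, not Nova's actual code; the function name, the host_has_flag parameter, and the treatment of a bare cpu_model as "custom" mode are assumptions made for illustration.

```python
# CPU models that, per this thread, include pdpe1gb in QEMU's definitions.
MODELS_WITH_PDPE1GB = {"Opteron_G4", "Opteron_G5"}


def guest_sees_pdpe1gb(cpu_mode=None, cpu_model=None, host_has_flag=True):
    """Estimate whether the guest CPU exposes pdpe1gb for a given config."""
    if cpu_mode is None and cpu_model is None:
        # Neither option set: fall back to host-model, as described above.
        cpu_mode = "host-model"
    elif cpu_mode is None:
        # Assumption: a named model without an explicit mode means "custom".
        cpu_mode = "custom"

    if cpu_mode in ("host-model", "host-passthrough"):
        # The guest inherits the flag only if the host CPU has it.
        return host_has_flag
    # Named model: only the listed models carry the flag.
    return cpu_model in MODELS_WITH_PDPE1GB


print(guest_sees_pdpe1gb())                     # prints True
print(guest_sees_pdpe1gb(cpu_model="Haswell"))  # prints False
```

The second call corresponds to the reported problem: a named Intel model such as Haswell hides the flag from the guest, so 1 GB huge pages cannot be used there.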

Comment 5 Stephen Gordon 2016-11-24 13:38:54 UTC
I'm going to move this one to documentation for now: while there is a long-term need to provide more flexibility in Nova here, it doesn't necessarily seem imminent.

Comment 6 Daniel Berrangé 2016-11-24 13:49:49 UTC
Cloned this back to Nova to track future CPU flag enhancement, since this is not the first time this general requirement has come up

 https://bugzilla.redhat.com/show_bug.cgi?id=1398343

Comment 7 Lucy Bopf 2016-11-25 05:34:32 UTC
(In reply to Stephen Gordon from comment #5)
> I'm going to move this one to documentation for now, as while there is a
> long term need to provide some more flexibility in Nova here it doesn't
> necessarily seem imminent.

Steve, are there specific updates you're looking to have added to the documentation to cover this bug?

I'll update the summary accordingly.

Comment 8 Stephen Gordon 2016-12-05 16:05:07 UTC
Basically, the documentation around the use of huge pages, specifically 1G ones, needs to call out the constraints Dan mentions in comment #2.

At this time, this effectively means they must use libvirt_cpu_mode=host-passthrough on the compute nodes if the intent is to use 1G huge pages in the guests. This has ramifications for cluster compatibility for live migration (see also: https://wiki.openstack.org/wiki/LibvirtXMLCPUModel).
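A minimal Python sketch of how a compute node's setting could be checked follows. It assumes the option appears as cpu_mode in a [libvirt] section of nova.conf, rather than under the older libvirt_cpu_mode name used in this comment; the sample file content and the function name are hypothetical.

```python
import configparser

# Hypothetical nova.conf fragment; the section and option name are assumptions.
NOVA_CONF = """
[libvirt]
cpu_mode = host-passthrough
"""


def configured_cpu_mode(conf_text):
    """Return the cpu_mode value from nova.conf-style text, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("libvirt", "cpu_mode", fallback=None)


mode = configured_cpu_mode(NOVA_CONF)
print(mode == "host-passthrough")  # prints True
```

Because host-passthrough exposes the exact host CPU to the guest, a check like this would need to give the same answer on every compute node that is expected to take part in live migration.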

Comment 9 Lucy Bopf 2016-12-06 03:29:49 UTC
Thanks, Steve. Updating the summary to reflect the requirement.

Comment 12 Matthew Booth 2019-07-05 14:41:22 UTC
SME: Kashyap Chamarty <kchamart>

