Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1308744

Summary: [RFE] PAPR Hash Page Table (HPT) resizing (RHV)
Product: Red Hat Enterprise Virtualization Manager Reporter: Karen Noel <knoel>
Component: ovirt-engine Assignee: Michal Skrivanek <michal.skrivanek>
Status: CLOSED ERRATA QA Contact: Liran Rotenberg <lrotenbe>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0.0 CC: bugproxy, dgibson, eedri, gveitmic, hannsj_uhl, libvirt-maint, lsurette, mavital, mdeng, michal.skrivanek, michen, mtessun, qzhang, Rhev-m-bugs, royoung, sbonazzo, srevivo, tburke, virt-bugs, virt-maint, xuhan, xuma
Target Milestone: ovirt-4.3.0 Keywords: FutureFeature, TestOnly, ZStream
Target Release: --- Flags: mavital: testing_plan_complete+
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 1308743
: 1624911 Environment:
Last Closed: 2019-05-08 12:36:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1305398, 1308743    
Bug Blocks: 1248279, 1284775, 1577865, 1624911    

Description Karen Noel 2016-02-15 22:33:13 UTC
RHEV should determine if the guest supports HPT resizing, then increase the default max size of memory.

+++ This bug was initially created as a clone of Bug #1308743 +++

This BZ covers the case where libvirt requires changes to support HPT resizing. If no changes are required, set this BZ to TestOnly.

+++ This bug was initially created as a clone of Bug #1305398 +++

Description of problem:

Allow the hash page table (HPT) of PAPR guests to be resized at runtime.

This is important for practical memory hotplug.  Without this the HPT needs to be sized for the guest's maximum possible memory - since RHEV wants to set that to 4T, this can result in a much bigger than necessary HPT which wastes host resources and can cause allocation failures.  With HV KVM the HPT is unswappable, contiguous host memory.
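
To put rough numbers on the paragraph above, here is a minimal sketch of the sizing arithmetic. It assumes the HPT is sized at 1/128 of the guest's maximum RAM, rounded up to a power of two and clamped to between 2^18 and 2^46 bytes. That ratio and the helper names (`hpt_shift_for_ramsize`, `hpt_bytes`) are assumptions for illustration and are not stated in this bug; the actual sizing policy may differ between qemu versions.

```python
GIB = 1 << 30
TIB = 1 << 40


def hpt_shift_for_ramsize(ram_bytes: int) -> int:
    """Return log2 of the HPT size for a guest with ram_bytes of maximum RAM.

    Assumption (not from this bug): HPT is 1/128 of RAM, rounded up to a
    power of two, and clamped to the range 2**18 .. 2**46 bytes.
    """
    if ram_bytes <= 0:
        raise ValueError("ram_bytes must be positive")
    # log2 of ram_bytes rounded up to the next power of two
    ram_shift = (ram_bytes - 1).bit_length()
    # Dividing by 128 is the same as subtracting 7 from the shift
    return max(18, min(46, ram_shift - 7))


def hpt_bytes(ram_bytes: int) -> int:
    """Return the HPT size in bytes under the same assumption."""
    return 1 << hpt_shift_for_ramsize(ram_bytes)


if __name__ == "__main__":
    for label, ram in (("16 GiB", 16 * GIB), ("4 TiB", 4 * TIB)):
        print(f"max memory {label}: HPT of {hpt_bytes(ram) // (1 << 20)} MiB")
```

Under that assumption, a guest whose maximum memory is 4 TiB would need a 32 GiB HPT even if it only ever runs with 16 GiB, where 128 MiB would be enough.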

This BZ covers the qemu parts of this, including TCG and PR KVM implementation of the necessary hypercalls, feature negotiation with the guest and enabling the necessary KVM host pieces.

--- Additional comment from David Gibson on 2016-02-07 19:52:00 EST ---

An RFC has been posted upstream:

https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg05852.html

Comment 1 Karen Noel 2016-02-15 22:40:31 UTC
*** Bug 1308746 has been marked as a duplicate of this bug. ***

Comment 2 David Gibson 2016-02-16 02:58:45 UTC
The resize-hpt=required machine option in qemu might be useful for RHEV to determine if the guest supports HPT resizing.  With that option qemu will refuse to boot a guest which does not support it (exiting with an error during boot).

So, RHEV could boot with that option, then if the boot fails adjust the max memory size down and restart without the option to run the non-HPT-resize aware guest.

Obviously RHEV might then want to cache the value, and/or pre-populate it when it knows the distro / version of the guest.
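
The fallback described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `launch` is a hypothetical callable standing in for whatever actually starts the guest and reports whether it booted, the cache is a plain dict keyed by a hypothetical guest identifier, and `machine_arg` merely builds the value passed to qemu's `-machine` option. None of this is RHEV or libvirt code.

```python
from typing import Callable, Dict


def machine_arg(resize_hpt: str) -> str:
    """Build the -machine value for a pseries guest with a resize-hpt setting."""
    return f"pseries,resize-hpt={resize_hpt}"


def boot_with_hpt_probe(
    guest_id: str,
    large_max_mem: int,
    small_max_mem: int,
    launch: Callable[[str, int], bool],  # hypothetical: returns True if boot succeeded
    cache: Dict[str, bool],
) -> int:
    """Boot a guest and return the max memory size it ended up running with.

    First try resize-hpt=required with the large max memory. If that boot
    fails, restart without the option and with the smaller max memory.
    The outcome is cached so later boots can skip the failed attempt.
    """
    if cache.get(guest_id) is not False:  # unknown, or known to support resizing
        if launch(machine_arg("required"), large_max_mem):
            cache[guest_id] = True
            return large_max_mem
        # Simplification: any boot failure is treated as "no HPT resize support"
        cache[guest_id] = False

    if not launch("pseries", small_max_mem):
        raise RuntimeError(f"guest {guest_id} failed to boot without resize-hpt")
    return small_max_mem
```

A real implementation would need to tell a failure caused by missing HPT resize support apart from unrelated boot failures, and could pre-populate the cache from the known distro and version of the guest, as suggested above.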

Comment 5 David Gibson 2017-09-13 00:55:02 UTC
Note that the necessary qemu and kvm parts for this are now merged downstream.

That leaves only libvirt work as a prerequisite (which should be relatively simple) to use this in RHV.

Comment 6 David Gibson 2018-06-28 02:29:46 UTC
It should be possible to unblock this.  The qemu and libvirt changes are merged and released downstream, and seem to be working well (after some initial bugs were found and fixed).

There shouldn't be a lot that's needed on the oVirt / RHV side.  Basically qemu and the guest should negotiate HPT resizing automatically, and it will be triggered automatically when memory is hotplugged or unplugged.  libvirt should correctly handle locked memory allocation for it.

The only real impact for RHV is that for guests which do support HPT resizing, it can (again) freely use a large maximum memory size without that causing allocation of excessively sized HPTs any more.

Comment 7 Michal Skrivanek 2018-06-28 11:04:03 UTC
I guess we do not really need to change anything on the RHV side. Since the bug was opened we added a notion of max memory (upper hotplug limit) for all platforms and use a conservative default of 4x the configured memory. That helped "enough" with the original problem: for sane VMs we're not allocating 4TB (2TB on ppc) blindly, and users have a way to change that when they do not plan to use hotplug at all.

With HPT resizing implemented we will just see more VMs fit on the host, as the default overhead of accounting for 4x as much memory is gone. But that's entirely transparent to the end user, it will just "work better" now.

Comment 8 David Gibson 2018-06-29 00:40:47 UTC
Michal,

Great, sounds like we're in agreement.

Comment 9 Liran Rotenberg 2018-07-11 08:52:06 UTC
Verified on PPC environment.
Reference to verification:
https://bugzilla.redhat.com/show_bug.cgi?id=1228543
https://bugzilla.redhat.com/show_bug.cgi?id=515840

Comment 12 Eyal Edri 2018-09-03 15:12:25 UTC
We cloned this bug to 4.2.6, I wasn't sure if the status for the 4.3 should stay verified or move back to ON_QA, please update if QE plans to verify it also on 4.3

Comment 13 Raz Tamir 2018-09-04 07:45:32 UTC
(In reply to Eyal Edri from comment #12)
> We cloned this bug to 4.2.6, I wasn't sure if the status for the 4.3 should
> stay verified or move back to ON_QA, please update if QE plans to verify it
> also on 4.3

You can keep it on MODIFIED, as it is medium severity and TestOnly.

Thanks

Comment 14 Eyal Edri 2018-09-04 07:51:00 UTC
Moving back to MODIFIED so QE can test it for 4.3.

Comment 15 Liran Rotenberg 2019-01-15 11:56:32 UTC
As in comment #9, the tests passed on a 4.3 environment.

Power9 hosts, RHEL7.6
ovirt-engine-4.3.0-0.6.alpha2.el7.noarch
vdsm-4.30.4-1.el7ev.ppc64le

Moving to verified.

Comment 17 errata-xmlrpc 2019-05-08 12:36:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1085