The scheduler should be able to schedule VMs that have memory backed by memory - such VMs do not consume regular memory pages we normally consider, but rather hugepages from previously allocated pool. Specification of hugepages feature: - hugepage cannot be split: VM with 800 MB ram and hugepage size of 1048576 will consume exactly one hugepage - only the memory backing of the guest is stored on hugepages; QEMU overhead resides in regular memory - host reports available hugepages for VM scheduling in it's stats ``` ... "hugepages": { "1048576": { "resv_hugepages": 0, "free_hugepages": 16, "nr_overcommit_hugepages": 0, "surplus_hugepages": 0, "vm.free_hugepages": 16, # <--- the amount of hugepages that we may use for VMs (with hugepage size of 1048576) "nr_hugepages": 16, "nr_hugepages_mempolicy": 16 }, "2048": { "resv_hugepages": 0, "free_hugepages": 0, "nr_overcommit_hugepages": 0, "surplus_hugepages": 0, "vm.free_hugepages": 0, # <--- the amount of hugepages that we may use for VMs (with hugepage size of 2048) "nr_hugepages": 0, "nr_hugepages_mempolicy": 0 } }, "dateTime": "2017-06-14T13:02:53 GMT", ... ``` - - scheduler *must* use the highlighted field as certain number of system free hugepages may be reserved for something else (e.g. DPDK) - host reports available hugepages sizes in capabilities ``` ... "hostedEngineDeployed": false, "hugepages": [ 1048576, 2048 ], "kvmEnabled": "true", ... ``` - allocating hugepages reduces available system memory by number of hugepages times hugepages size (even if the hugepages are unused) - - in other words, each hugepage size can be treated as separate memory pool - - example: if the host has 64 GiB RAM and 14 1048576 hugepages, it's free memory follows 0 <= free memory <= (64 - 14) GiB - - the hugepages will be allocated at the boot time - as far as scheduler is concerned, the hugepages cannot be dynamically (de)allocated - migration for VMs with hugepages is disabled - memory hot(unplug) is disabled - no overcommit - quota support (maximum number of hugepages consumed by user) - no special permissions required - not numa aware at the moment
Do we wish to have different clusters for hosts with static huge pages?
it wouldn't really help much with most of the work needed. once scheduler is aware there is not real need to separate. I mean, sure, it may make sense to do that, but I do not see a reason to enforce it
sorry, it was supposed to be an oVirt bug
Verified according to polarion plan. ovirt-engine-4.2.0-0.0.master.20170921184504.gitfcfc9a7.el7.centos.noarch Small notes: 1) Quota does not recognize hugepages - Martin Sivak said it expected behavior. 2) Memory hot-plug does not work for 1 GB pages - https://bugzilla.redhat.com/show_bug.cgi?id=1495535 3) Memory hot-unplug works for 2 MB hugepages, I need to get answer if it expected behavior or bug(see under description "memory hot(unplug) is disabled")
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.