Bug 1461476

Summary: Extend sla architecture to consider hugepages-enabled VMs
Product: [oVirt] ovirt-engine Reporter: Martin Polednik <mpoledni>
Component: Backend.Core    Assignee: Martin Sivák <msivak>
Status: CLOSED CURRENTRELEASE QA Contact: Artyom <alukiano>
Severity: high Docs Contact:
Priority: high    
Version: 4.2.0    CC: apinnick, bugs, dfediuck, lsurette, mavital, mgoldboi, michal.skrivanek, mpoledni, rbalakri, Rhev-m-bugs, srevivo, ykaul
Target Milestone: ovirt-4.2.0    Keywords: FutureFeature, Triaged
Target Release: 4.2.0    Flags: rule-engine: ovirt-4.2+
rule-engine: blocker+
alukiano: testing_plan_complete+
mgoldboi: planning_ack+
rule-engine: devel_ack+
mavital: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-20 11:43:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: SLA    RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1481246    
Bug Blocks: 1457239    

Description Martin Polednik 2017-06-14 14:31:47 UTC
The scheduler should be able to schedule VMs whose memory is backed by hugepages - such VMs do not consume the regular memory pages we normally consider, but rather hugepages from a previously allocated pool.

Specification of hugepages feature:
- hugepage cannot be split: a VM with 800 MB of RAM and a hugepage size of 1048576 KiB (1 GiB) will consume exactly one hugepage
- only the memory backing of the guest is stored on hugepages; QEMU overhead resides in regular memory
- host reports available hugepages for VM scheduling in its stats
```
    ...
    "hugepages": {
        "1048576": {
            "resv_hugepages": 0,
            "free_hugepages": 16,
            "nr_overcommit_hugepages": 0,
            "surplus_hugepages": 0,
            "vm.free_hugepages": 16, # <--- the amount of hugepages that we may use for VMs (with hugepage size of 1048576)
            "nr_hugepages": 16,
            "nr_hugepages_mempolicy": 16
        },
        "2048": {
            "resv_hugepages": 0,
            "free_hugepages": 0,
            "nr_overcommit_hugepages": 0,
            "surplus_hugepages": 0,
            "vm.free_hugepages": 0, # <--- the amount of hugepages that we may use for VMs (with hugepage size of 2048)
            "nr_hugepages": 0,
            "nr_hugepages_mempolicy": 0
        }
    },
    "dateTime": "2017-06-14T13:02:53 GMT",
    ...
```
- - scheduler *must* use the highlighted field, as a certain number of the system's free hugepages may be reserved for something else (e.g. DPDK); see the sketch after this list
- host reports available hugepages sizes in capabilities
```
    ...
    "hostedEngineDeployed": false,
    "hugepages": [
        1048576,
        2048
    ],
    "kvmEnabled": "true",
    ...
```
- allocating hugepages reduces available system memory by the number of hugepages times the hugepage size (even if the hugepages are unused)
- - in other words, each hugepage size can be treated as separate memory pool
- - example: if the host has 64 GiB RAM and 14 hugepages of 1048576 KiB (1 GiB each), its free memory satisfies 0 <= free memory <= (64 - 14) GiB
- - the hugepages will be allocated at boot time
- as far as scheduler is concerned, the hugepages cannot be dynamically (de)allocated
- migration for VMs with hugepages is disabled
- memory hot(un)plug is disabled
- no overcommit
- quota support (maximum number of hugepages consumed by user)
- no special permissions required
- not NUMA-aware at the moment

Comment 1 Yaniv Kaul 2017-06-14 15:45:46 UTC
Do we wish to have different clusters for hosts with static huge pages?

Comment 2 Michal Skrivanek 2017-06-15 11:08:30 UTC
It wouldn't really help much with most of the work needed. Once the scheduler is aware of hugepages, there is no real need to separate the hosts. It may make sense to do that, but I do not see a reason to enforce it.

Comment 5 Michal Skrivanek 2017-06-21 10:21:10 UTC
sorry, it was supposed to be an oVirt bug

Comment 6 Artyom 2017-09-26 10:22:57 UTC
Verified according to the Polarion plan.
ovirt-engine-4.2.0-0.0.master.20170921184504.gitfcfc9a7.el7.centos.noarch

Small notes:
1) Quota does not recognize hugepages - Martin Sivak said this is expected behavior.
2) Memory hot-plug does not work for 1 GB pages - https://bugzilla.redhat.com/show_bug.cgi?id=1495535
3) Memory hot-unplug works for 2 MB hugepages; I need to get an answer on whether this is expected behavior or a bug (see "memory hot(un)plug is disabled" in the description)

Comment 9 Sandro Bonazzola 2017-12-20 11:43:51 UTC
This bugzilla is included in the oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in the oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.