Description of problem:
The engine does not subtract VMs using dynamic hugepages from committed memory when calculating host scheduling memory, so the scheduling memory does not go down, allowing more and more VMs to be scheduled on the host. This can cause out of memory.

Version-Release number of selected component (if applicable):
vdsm-4.30.40-1.el7ev.x86_64
ovirt-engine-4.3.8.2-0.4.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Configure dynamic hugepages on the host (no static hugepages config on the kernel cmdline)
1.1. /etc/vdsm/vdsm.conf
     [performance]
     use_dynamic_hugepages = true
1.2. Administration -> Configure -> Scheduling Policies -> Edit -> Disable the "Huge Pages" filter
2. Check the Scheduling Memory of the host:
   <max_scheduling_memory>7963934720</max_scheduling_memory>
3. Run a VM with 2G of 1G hugepages:
   2020-02-18 15:23:18,687+1000 INFO (vm/8f54d8f6) [virt.vm] (vmId='8f54d8f6-3a16-49cd-980d-73ec684796c5') Allocating 2 (1048576) hugepages (memsize 2097152) (vm:2270)
4. Check the Scheduling Memory again; it appears to have been reduced only by the VM overhead, not by the VM's actual memory (see below):
   <max_scheduling_memory>7775191040</max_scheduling_memory>

Actual results:
Overcommit

Expected results:
Prevent overcommit

Additional info:
The Committed Memory used in the Scheduling Memory calculation does not include the hugepages of each VM. The loop in getTotalRequiredMemoryInMb over all VMs on the host does not take them into account: it calls the following method, which deliberately does not account for hugepages.

/**
 * A convenience method that makes it easier to express the difference
 * between normal and huge pages backed VM in scheduler.
 *
 * @return The amount of non huge page memory needed for the VM
 */
public static Integer getRequiredMemoryWithoutHugePages(VmBase vmBase) {
    if (isBackedByHugepages(vmBase)) {
        return 0;
    } else {
        return vmBase.getMemSizeMb();
    }
}
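As an illustration only (a hedged sketch, not the actual engine fix; the method name and the dynamicHugePages parameter below are hypothetical): dynamic hugepages are carved out of normal host memory when the VM starts, so such a VM would still need to count against committed memory, and only statically reserved hugepages can reasonably be excluded.

/**
 * Hypothetical sketch only, in the style of the method above.
 * Dynamic hugepages are taken from normal host memory at VM start,
 * so a dynamically backed VM should contribute its full memory to the
 * committed total; only statically reserved hugepages are excluded.
 */
public static Integer getRequiredMemoryConsideringHugePages(VmBase vmBase, boolean dynamicHugePages) {
    if (isBackedByHugepages(vmBase) && !dynamicHugePages) {
        // Static hugepages are pre-reserved on the kernel cmdline and already
        // removed from the host's available memory, so do not count them again.
        return 0;
    }
    // Normal VMs and dynamic-hugepages VMs both consume host memory.
    return vmBase.getMemSizeMb();
}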
As this can get confusing:

BZ1804037 - Scheduling Memory calculation disregards huge-pages ---> for not considering statically allocated hugepages on the kernel cmdline when calculating scheduling memory

BZ1804046 - Engine does not reduce scheduling memory when a VM with dynamic hugepages runs ---> for not considering VMs running with dynamic hugepages when calculating scheduling memory
See also Docs BZ1785507.
Isn't this a dupe of the other bug you just opened? Seems like docs can cover it. Andrej, thoughts?
Ryan, I'm not sure, why would this be a DUP? And how can this be fixed by a Docs change? I'm confused...
Ok, so it's more or less two scheduling bugs around hugepages which look very similar, and a docs bug about making hugepages less confusing. Andrej is the scheduler engineer, but I'm curious what the expected resolution for this set of bugs is from a customer perspective. From my POV, we'd document it and let a relatively tricky edge case get managed per use case: "Engine does not reduce scheduling memory when a VM with dynamic hugepages runs" vs. "Scheduling Memory calculation disregards huge-pages".
I agree it's a bit confusing, but IMHO they are different.

1) The Docs bug is about dynamic hugepages, which are involved only in BZ1804046. The other BZ happens with static hugepages.

2) The other two bugs are about free scheduling memory not being adjusted, but each bug has a different reason:
"Engine does not reduce scheduling memory when a VM with dynamic hugepages runs" --> a hugepages VM runs and the scheduling memory is not reduced by the VM memory.
"Scheduling Memory calculation disregards huge-pages" --> the scheduling memory is not reduced by the amount of static hugepages configured on the host.

I don't think a Docs bug will solve these; it sounds like a code change is needed. And if you come up with a patch that fixes both at once, I'm happy to close one as a DUP :)
Is this all about the confusion around static/dynamic hugepages? getRequiredMemoryWithoutHugePages only makes sense for VMs that use static hugepages, right? From a customer perspective: why even suggest static hugepages when you could have it all dynamic with zero manual grub configuration? I think I already suggested this in the docs bug: make dynamic hugepages the default again and make the hugepages feature easier to use. Then you can just treat a hugepages VM like you would treat any other VM that needs memory.
Changing the SLA team to virt; we're not tracking SLA separately anymore.
Verifying on rhv-4.4.0-31. Based on the tests below, the engine now does reduce the scheduling memory when a VM with dynamic hugepages runs, but there is a bug where the scheduling memory is not increased back after the VM is reconfigured without hugepages and restarted (see Test3 below). This is the reason for changing the status to FailedQA.

Pre-condition:
GET https://{{host}}/ovirt-engine/api/hosts response:
<max_scheduling_memory>33205256192</max_scheduling_memory>
On the host:
/etc/vdsm/vdsm.conf
[performance]
use_dynamic_hugepages = true

Test1.
On the host: echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
Configure a VM (size 2048) with hugepages=1048576 (custom properties).
Run the VM.
Result ok: max_scheduling_memory is reduced accordingly:
<max_scheduling_memory>30952914944</max_scheduling_memory>

Test2.
echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
Configure a VM (size 4096) with hugepages=1048576 (custom properties).
Result ok: max_scheduling_memory is reduced accordingly:
<max_scheduling_memory>28801236992</max_scheduling_memory>
Shut down the VM.
Result ok: max_scheduling_memory increases back to 33205256192.

Test3.
Configure the same VM (size 4096) with no hugepages.
Restart the VM.
Result failed: max_scheduling_memory is reduced as if the hugepages were still configured:
<max_scheduling_memory>28801236992</max_scheduling_memory> (??)
Expected: max_scheduling_memory should be reduced only by the size of the VM:
<max_scheduling_memory>33205252096</max_scheduling_memory>, where 33205252096 = 33205256192 - 4096 (VM size)

Test4.
echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
Configure VM1 (size 4096) with hugepages=1048576, VM2 (size 1024) with hugepages=1048576, VM3 (size 2048) with hugepages=1048576.
Run VM1: max_scheduling_memory is reduced to 28801236992
Run VM2: max_scheduling_memory is reduced to 27566014464
Run VM3: max_scheduling_memory is reduced to 25254952960
Too many scenarios here. Let's get a new bug for Test#3 and verify the rest with a documented limitation so we can at least try to get part of this to 4.3.z
Test3 looks to be working as expected; there is just a mistake with the units. The 'max_scheduling_memory' is displayed in Bytes and the VM size is in MiB. So after starting the VM, the scheduling memory should be 33205256192 - 4096 * 1024 * 1024 = 28910288896 Bytes (27571 MiB), minus some overhead, so the reported scheduling memory of 28801236992 Bytes (27467 MiB) looks correct.
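For reference, a trivial sketch of that conversion (the class and variable names are illustrative only, not engine code):

// Illustrative only: max_scheduling_memory is reported in Bytes, VM memory in MiB.
public class SchedulingMemoryCheck {
    public static void main(String[] args) {
        long maxSchedulingMemoryBytes = 33_205_256_192L;
        long vmMemSizeMiB = 4096;
        long expectedBytes = maxSchedulingMemoryBytes - vmMemSizeMiB * 1024 * 1024;
        // Prints 28910288896 (27571 MiB); the reported 28801236992 Bytes
        // (27467 MiB) is this value minus the per-VM overhead.
        System.out.println(expectedBytes);
    }
}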
So, I'm verifying on the basis of the testing described in https://bugzilla.redhat.com/show_bug.cgi?id=1804046#c22. Opened a new bug for Test3: https://bugzilla.redhat.com/show_bug.cgi?id=1828290
I filed 1828290 before I had read comment 24. So, after re-testing, it will be closed as not a bug. Sorry for the mess :)
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Virtualization security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:3807