Description of problem:
PowerSaving policy does not balance VMs from a host with over-utilized memory.

Version-Release number of selected component (if applicable):
rhvm-4.2.1.3-0.1.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. See the system overview below.
2.
3.

Actual results:
VM golden_env_mixed_virtio_0 does not migrate to the host host_mixed_2.

Expected results:
VM golden_env_mixed_virtio_0 must migrate to the host host_mixed_2 because of the memory balancing condition.

Additional info:
System overview:
{
  "golden_env_mixed_1": {
    "hosts": {
      "host_mixed_1": {
        "id": "cabaf6e9-b730-4b95-adfe-8b8a8e3fd8c9",
        "max_scheduling_memory": "3177MB",
        "status": "up",
        "vms": {
          "HostedEngine": {"guaranteed_memory": "8192MB", "id": "0deec603-834a-42ad-aa77-9271b0297d4a", "memory": "8192MB", "status": "up"},
          "golden_env_mixed_virtio_0": {"guaranteed_memory": "1024MB", "id": "32c5cab7-4398-4b4a-84d3-c5a4f79d8ff7", "memory": "1024MB", "status": "up"},
          "vm_overutilized_0": {"guaranteed_memory": "10655MB", "id": "8499f5a5-4493-4543-8943-59a600ee68c9", "memory": "10655MB", "status": "up"}
        }
      },
      "host_mixed_2": {
        "id": "850b9a24-0807-47ef-a763-33864e8dbb48",
        "max_scheduling_memory": "7202MB",
        "status": "up",
        "vms": {
          "golden_env_mixed_virtio_1": {"guaranteed_memory": "1024MB", "id": "5e277e7c-9ec9-4c7d-a08e-1c70e051d368", "memory": "1024MB", "status": "up"},
          "golden_env_mixed_virtio_5": {"guaranteed_memory": "258MB", "id": "14c3fdd0-3038-4511-af4f-e9edbe1eb3c1", "memory": "258MB", "status": "up"},
          "vm_normalutilized_1": {"guaranteed_memory": "6559MB", "id": "5d193f25-41bb-49e1-9509-d57234f02de8", "memory": "6559MB", "status": "up"}
        }
      },
      "host_mixed_3": {
        "id": "a6d3d3e8-9465-47bd-bca0-03e93bdf17c8",
        "max_scheduling_memory": "15087MB",
        "status": "up",
        "vms": {
          "golden_env_mixed_virtio_4": {"guaranteed_memory": "239MB", "id": "9e801f02-ff77-410a-a0e2-667a2865ee1f", "memory": "239MB", "status": "up"}
        }
      }
    },
    "id": "77cb9110-0734-11e8-aac6-001a4a16109f",
    "policy": {
      "custom_power_saving_memory": {
        "balances": {
          "OptimalForPowerSaving": {"id": "736999d0-1023-46a4-9a75-1316ed50e151"}
        },
        "filters": {
          "CPUOverloaded": {"id": "98842bc5-4094-4b83-8224-7b50f86a94c9"},
          "CpuPinning": {"id": "6d636bf6-a35c-4f9d-b68d-0731f731cddc"},
          "HostDevice": {"id": "728a21f1-f97e-4d32-bc3e-b3cc49756abb"},
          "Memory": {"id": "c9ddbb34-0e1d-4061-a8d7-b0893fa80932"},
          "Migration": {"id": "e659c871-0bf1-4ccc-b748-f28f5d08ddda"},
          "Network": {"id": "72163d1c-9468-4480-99d9-0888664eb143"},
          "PinToHost": {"id": "12262ab6-9690-4bc3-a2b3-35573b172d54"},
          "VmAffinityGroups": {"id": "84e6ddee-ab0d-42dd-82f0-c297779db566"},
          "VmToHostsAffinityGroups": {"id": "e69808a9-8a41-40f1-94ba-dd5d385d82d8"}
        },
        "id": "6405fe75-b642-4494-924c-c418ebe6a39c",
        "weights": {
          "OptimalForCpuPowerSaving": {"id": "736999d0-1023-46a4-9a75-1316ed50e15b"},
          "OptimalForMemoryPowerSaving": {"id": "9dfe6086-646d-43b8-8eef-4d94de8472c8"},
          "PreferredHosts": {"id": "591cdb81-ba67-45b4-9642-e28f61a97d57"}
        }
      }
    },
    "policy_params": {
      "CpuOverCommitDurationMinutes": "1",
      "HighUtilization": "75",
      "LowUtilization": "35",
      "MaxFreeMemoryForOverUtilized": "5535",
      "MinFreeMemoryForUnderUtilized": "9631"
    }
  }
}
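To make the expected behavior concrete, here is a minimal sketch of the memory classification implied by the policy_params above. This is illustrative arithmetic only, not the engine's actual scheduler code; the function names and the simplified "free memory vs. threshold" model are assumptions.

```python
# Simplified sketch (NOT the ovirt-engine implementation) of how the
# custom_power_saving_memory policy is expected to classify hosts,
# using policy_params and max_scheduling_memory from the overview above.

MAX_FREE_FOR_OVERUTILIZED = 5535   # MB, MaxFreeMemoryForOverUtilized
MIN_FREE_FOR_UNDERUTILIZED = 9631  # MB, MinFreeMemoryForUnderUtilized

hosts_free_mb = {
    "host_mixed_1": 3177,   # runs golden_env_mixed_virtio_0 (1024 MB)
    "host_mixed_2": 7202,
    "host_mixed_3": 15087,
}

def is_overutilized(free_mb):
    """A host with too little free memory should give VMs away."""
    return free_mb < MAX_FREE_FOR_OVERUTILIZED

def can_accept(free_mb, vm_mb):
    """A destination must not itself become over-utilized after the move."""
    return not is_overutilized(free_mb - vm_mb)

# host_mixed_1 is over-utilized (3177 < 5535), so the balancer should
# try to move golden_env_mixed_virtio_0 (1024 MB) off it.
assert is_overutilized(hosts_free_mb["host_mixed_1"])

# host_mixed_2 would still have 7202 - 1024 = 6178 MB free, above the
# 5535 MB threshold, so a migration there is the expected result.
print(can_accept(hosts_free_mb["host_mixed_2"], 1024))  # True
```

By this simplified model the migration in "Expected results" should be allowed, which is why the observed behavior is reported as a bug.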
You can start looking into the log from this line:
2018-02-04 17:49:31,277+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-20) [clusters_update_bc4d2fd2-8ac5-4cb7] EVENT_ID: USER_UPDATE_CLUSTER(811), Host cluster golden_env_mixed_1 was updated by admin@internal-authz
Created attachment 1391022 [details]
engine log
Artyom, can you make the golden_env_mixed_virtio_0 VM a bit smaller and try again? We won't migrate when the destination host would become overloaded itself. And it is not just the 1 GB that enters the equation, but also the VM's static and dynamic overhead.
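A rough sketch of the destination check described above, charging the VM's overhead as well as its guaranteed memory. The overhead values below are made up for illustration; the engine computes the real static and dynamic overhead itself, and this helper is not its actual formula.

```python
# Hypothetical illustration of the destination-side check: the balancer
# charges the destination not just the VM's guaranteed memory but also
# its static and dynamic overhead.  The overhead figures passed in below
# are ASSUMED values for the example, not the engine's real numbers.

MAX_FREE_FOR_OVERUTILIZED = 5535  # MB, from policy_params

def fits_on_destination(free_mb, vm_mb, static_overhead_mb, dynamic_overhead_mb):
    """Destination stays acceptable only if, after charging the VM plus
    its overheads, it does not drop below the over-utilization threshold."""
    needed = vm_mb + static_overhead_mb + dynamic_overhead_mb
    return free_mb - needed >= MAX_FREE_FOR_OVERUTILIZED

# With 7202 MB free on host_mixed_2, a 1024 MB VM plus e.g. 64 + 128 MB
# of assumed overhead still leaves 5986 MB, above the 5535 MB threshold,
# so overhead alone would not obviously explain the skipped migration.
print(fits_on_destination(7202, 1024, 64, 128))  # True
```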
Created attachment 1394588 [details]
new_engine.log

Hi Martin, I reduced the memory of the VM golden_env_mixed_virtio_0 to 512 MB, but I can still see the same issue.

System overview:
{
  "golden_env_mixed_1": {
    "hosts": {
      "host_mixed_1": {
        "id": "cabaf6e9-b730-4b95-adfe-8b8a8e3fd8c9",
        "max_scheduling_memory": "3690MB",
        "status": "up",
        "vms": {
          "HostedEngine": {"guaranteed_memory": "8192MB", "id": "0deec603-834a-42ad-aa77-9271b0297d4a", "memory": "8192MB", "status": "up"},
          "golden_env_mixed_virtio_0": {"guaranteed_memory": "512MB", "id": "32c5cab7-4398-4b4a-84d3-c5a4f79d8ff7", "memory": "512MB", "status": "up"},
          "vm_overutilized_0": {"guaranteed_memory": "10654MB", "id": "140f7b8f-7c3a-4a0d-8e2c-cb2c31efe8d1", "memory": "10654MB", "status": "up"}
        }
      },
      "host_mixed_2": {
        "id": "850b9a24-0807-47ef-a763-33864e8dbb48",
        "max_scheduling_memory": "7202MB",
        "status": "up",
        "vms": {
          "golden_env_mixed_virtio_1": {"guaranteed_memory": "1024MB", "id": "5e277e7c-9ec9-4c7d-a08e-1c70e051d368", "memory": "1024MB", "status": "up"},
          "golden_env_mixed_virtio_5": {"guaranteed_memory": "259MB", "id": "14c3fdd0-3038-4511-af4f-e9edbe1eb3c1", "memory": "259MB", "status": "up"},
          "vm_normalutilized_1": {"guaranteed_memory": "6558MB", "id": "e4e8d537-bd28-44d8-8093-925bc192a880", "memory": "6558MB", "status": "up"}
        }
      },
      "host_mixed_3": {
        "id": "a6d3d3e8-9465-47bd-bca0-03e93bdf17c8",
        "max_scheduling_memory": "15086MB",
        "status": "up",
        "vms": {
          "golden_env_mixed_virtio_4": {"guaranteed_memory": "240MB", "id": "9e801f02-ff77-410a-a0e2-667a2865ee1f", "memory": "240MB", "status": "up"}
        }
      }
    },
    "id": "77cb9110-0734-11e8-aac6-001a4a16109f",
    "policy": {
      "custom_power_saving_memory": {
        "balances": {
          "OptimalForPowerSaving": {"id": "736999d0-1023-46a4-9a75-1316ed50e151"}
        },
        "filters": {
          "CPUOverloaded": {"id": "98842bc5-4094-4b83-8224-7b50f86a94c9"},
          "CpuPinning": {"id": "6d636bf6-a35c-4f9d-b68d-0731f731cddc"},
          "HostDevice": {"id": "728a21f1-f97e-4d32-bc3e-b3cc49756abb"},
          "Memory": {"id": "c9ddbb34-0e1d-4061-a8d7-b0893fa80932"},
          "Migration": {"id": "e659c871-0bf1-4ccc-b748-f28f5d08ddda"},
          "Network": {"id": "72163d1c-9468-4480-99d9-0888664eb143"},
          "PinToHost": {"id": "12262ab6-9690-4bc3-a2b3-35573b172d54"},
          "VmAffinityGroups": {"id": "84e6ddee-ab0d-42dd-82f0-c297779db566"},
          "VmToHostsAffinityGroups": {"id": "e69808a9-8a41-40f1-94ba-dd5d385d82d8"}
        },
        "id": "f07cacab-e3cc-4c8e-ba40-514ce0132b40",
        "weights": {
          "OptimalForCpuPowerSaving": {"id": "736999d0-1023-46a4-9a75-1316ed50e15b"},
          "OptimalForMemoryPowerSaving": {"id": "9dfe6086-646d-43b8-8eef-4d94de8472c8"},
          "PreferredHosts": {"id": "591cdb81-ba67-45b4-9642-e28f61a97d57"}
        }
      }
    },
    "policy_params": {
      "CpuOverCommitDurationMinutes": "1",
      "HighUtilization": "75",
      "LowUtilization": "35",
      "MaxFreeMemoryForOverUtilized": "5534",
      "MinFreeMemoryForUnderUtilized": "9630"
    }
  }
}
You can start looking at the log from this line:
2018-02-11 15:22:33,521+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-29) [clusters_update_fedaac2b-9540-4176] EVENT_ID: USER_UPDATE_CLUSTER(811), Host cluster golden_env_mixed_1 was updated by admin@internal-authz
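The same illustrative check can be re-run with the reduced 512 MB VM and the updated figures from the second overview. As before, this is a sketch under a simplified "free memory vs. threshold" model, not the engine's code; the thresholds come from the policy_params shown above.

```python
# Re-running the illustrative check with the reduced 512 MB VM and the
# figures from the second system overview.  The threshold comes from the
# updated policy_params; the helper is a sketch, not engine code.

MAX_FREE_FOR_OVERUTILIZED = 5534  # MB, MaxFreeMemoryForOverUtilized

def is_overutilized(free_mb):
    return free_mb < MAX_FREE_FOR_OVERUTILIZED

# host_mixed_1 (3690 MB free) is still over-utilized...
assert is_overutilized(3690)

# ...and host_mixed_2 (7202 MB free) could take the 512 MB VM and keep
# 6690 MB free, so the migration is still expected but does not happen.
print(is_overutilized(7202 - 512))  # False
```

Under this simplified model the smaller VM should fit even more comfortably, which supports the observation that reducing the VM's memory did not change the behavior.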
The same problem is reproduced on the 4.1 build (tested on rhv-release-4.1.10-6-001.noarch); the engine.log with scheduler debug enabled is attached. Please start looking from 2018-03-19 11:43:28.
Created attachment 1409769 [details]
engine.log for rhv-release-4.1.10-6-001.noarch
The problem still happens on rhv-release-4.2.3-2-001.noarch.
I've tested it on the latest build, rhv-release-4.2.3-4-001.noarch, and the problem is solved. The test steps are in the attached test.txt.
Created attachment 1428415 [details]
test.txt
This bug is verified in 4.2.3, but it is targeted to 4.2.4. Can you please check and, if appropriate, move the target milestone to 4.2.3?
This bugzilla is included in the oVirt 4.2.3 release, published on May 4th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.3, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.