Description of problem:
Using the 'vm_evenly_distributed' cluster scheduling policy, when a host is placed in maintenance mode, all of its VMs are migrated to a single host and are then redistributed to balance the cluster. If no policy is used, the migrations appear to be distributed evenly. So the question is: with a 'vm_evenly_distributed' policy, is this the correct behaviour?

Version-Release number of selected component (if applicable):
- RHV 4.0.4
- RHEL 7.3 hosts

How reproducible:
100% in my testing.

Steps to Reproduce:
1. Configure the 'vm_evenly_distributed' cluster policy with the following:
   HighVmCount=4
   SpmVmGrace=5
   MigrationThreshold=2
2. Three hosts, one with 'N' VMs (I had 14), the other two with none.
3. Place the host with the 14 VMs into maintenance mode.
4. Observe the initial distribution of VMs (once they have all been migrated, some will be migrated again to balance the load). In my case, all 14 went to one host.

AND:

1. Configure the 'vm_evenly_distributed' cluster policy with the following:
   HighVmCount=4
   SpmVmGrace=5
   MigrationThreshold=2
2. Three hosts, two with 5 VMs, one with 4.
3. Place the host with 4 VMs into maintenance mode.
4. Observe the initial distribution of VMs (once they have all been migrated, some will be migrated again to balance the load). In my case, all 4 went to one host.

Actual results:
All VMs are initially migrated to one host.

Expected results:
One might expect that, with a 'vm_evenly_distributed' policy, the VMs would be evenly distributed in this scenario. The policy does kick in after the initial migrations and redistributes the VMs evenly. However, this results in additional work and overhead.

Additional info:
Thanks for the information, Gordon. We do indeed have a bug there. The balancing rule works fine except for one small issue: we only count running VMs on the destination hosts, but most of the VMs are only just starting their migrations there, so they are missing from the computation. We have a mechanism to track and count what we call pending VMs; we just somehow forgot to use it here.
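To illustrate the effect, here is a minimal sketch of the first reproduction scenario (14 VMs, two empty destination hosts). It is a simplified model, not the engine's actual scheduling code: the function name place_vms, the greedy "least-loaded host" selection and the tie-breaking by host name are all assumptions made for illustration.

```python
def place_vms(running, n_vms, count_pending):
    """Place n_vms migrating VMs, one at a time, on the least-loaded host.

    running: dict mapping host name -> VMs already running on that host.
    count_pending: if False, only running VMs are counted (the bug described
    above); if True, VMs still migrating to a host are counted as well.
    Returns a dict mapping host name -> VMs sent to that host.
    """
    pending = {host: 0 for host in running}

    def load(host):
        # Migrating VMs are not yet "running" on the destination.
        return running[host] + (pending[host] if count_pending else 0)

    for _ in range(n_vms):
        # Ties are broken by host name (hypothetical rule, for determinism).
        target = min(sorted(running), key=load)
        pending[target] += 1
    return pending


destinations = {"host_2": 0, "host_3": 0}
without_pending = place_vms(destinations, 14, count_pending=False)
with_pending = place_vms(destinations, 14, count_pending=True)
print(without_pending)
print(with_pending)
```

In this model, ignoring pending VMs means every destination looks equally empty for the whole burst of migrations, so the same host keeps being chosen; counting them spreads the VMs across both hosts.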
Verified on rhevm-4.1.0.2-0.2.el7.noarch

Environment has 3 hosts (host_1, host_2, host_3); host_1 is SPM.

1) Set scheduling policy to 'vm_evenly_distributed' with parameters
   HighVmCount=4
   SpmVmGrace=5
   MigrationThreshold=2
2) Start 17 VMs
3) host_1 has 2 VMs
   host_2 has 8 VMs
   host_3 has 7 VMs
4) Put host_2 into maintenance
5) host_1 has 6 VMs
   host_3 has 11 VMs
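The 6/11 split in step 5 can look uneven at first glance. A short sketch of the arithmetic, assuming that SpmVmGrace is treated as that many extra VMs counted on the SPM host (this interpretation, and the helper name effective_count, are assumptions and not taken from the engine code):

```python
SPM_VM_GRACE = 5  # SpmVmGrace from the policy parameters above


def effective_count(vm_count, is_spm, grace=SPM_VM_GRACE):
    """VM count as the policy would see it, under the assumption that the
    SPM host is charged `grace` extra VMs."""
    return vm_count + (grace if is_spm else 0)


# State after host_2 went into maintenance: (VM count, is SPM)
after = {"host_1": (6, True), "host_3": (11, False)}
effective = {host: effective_count(n, spm) for host, (n, spm) in after.items()}
print(effective)
```

Under that assumption both hosts end up with the same effective count, which is consistent with the result being reported as verified.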