Description of problem:
We have 3 oVirt nodes with about 40 VMs running on each of them. We needed to restart them one by one, so we put one of them into maintenance. Although the cluster policy is vm_evenly_distributed (KSM, memory ballooning enabled), oVirt apparently chose a single host as the destination for all the VMs, which in our case led to an ugly crash with data loss.

Version-Release number of selected component (if applicable):
3.5.3.1

How reproducible:
When the destination node already runs a substantial number of VMs and its resources are not enough to accommodate all the machines from the node being put into maintenance.

Steps to Reproduce:
1. Put node X into maintenance (with Y VMs running on it)
2. oVirt chooses a single node Z as the destination for all Y VMs, even though more hosts are available as migration targets.

Actual results:
A node crash, in our case.

Expected results:
Perhaps the VMs should be evenly distributed among the remaining active nodes?
Hi, what do you mean by "node crash"? Can you please provide the log files (engine, and vdsm from both hosts) from the relevant time?
Created attachment 1087447 [details]
Log file around the time of the crash

The crash time is 08:44:12. I included some time before and after it so you can spot any relevant facts.
By crash I mean the following behavior:
1. Node 1 was put into standby
2. Node 3 was chosen as the destination for *all* the VMs running on node 1
3. About 20 VMs were migrated from node 1 to node 3
4. Migration then stopped, and node 3 became unresponsive
5. Node 3 stayed unresponsive for about 2 minutes, then its status changed back to "Up", but with 0 machines running on it

The machines that were running on node 3 at crash time were powered off, and some of them suffered data loss. If you need any additional information, don't hesitate to ask.
I see that EvenGuestDistribution was in use before the maintenance command, which should have tried to keep the same number of VMs on each host. But it also reports that all hosts are over-utilized. Can you please describe the cluster a bit more? How many VMs were there on the nodes before the maintenance? Was there any fencing configured? I see IPMI commands in the log, and although we report them as unsuccessful, they might have caused a reboot. A vdsm log from the affected host would also help, if you still have it.
At the time, the cluster had 3 hosts, each with 64 GB of RAM and 32 CPUs, and there were about 35-40 VMs running on each host, so the over-utilized report might be correct. I had put one of the nodes into maintenance when I realized all its VMs were being migrated to the same host. Fencing was indeed configured and the node was rebooted. Unfortunately, the log rotated long ago, so I no longer have that file.
What about the engine log you cut that snippet from? It would help us to see what happened during the maintenance command. The log usually contains the reasons why hosts were not selected as migration destinations. My guess right now is that the host got too overloaded with all the VMs (there might be more reasons for that, especially in 3.5) and did not respond in time. The engine then initiated fencing and rebooted it.
I don't have that log either, sorry; we were not sending those logs to a central server at the time. Yes, I concur with your guess, but the strange thing is why all the VMs from the node going into maintenance were distributed to only one node, even though another node was available to balance them onto. I guess there's not much you can investigate without logs, so it's fine if you decide to close this bug. I can try to reproduce it in a test environment, but it may take some time as I need to gather hardware for it.
Meital, can you please help with reproducing this with the vm_evenly_distributed policy and 3 hosts?
Artyom, can you please try to reproduce this?
I checked it on rhevm-3.6.3.4-0.1.el6.noarch:
1) Started with 3 hosts (host_1, host_2, host_3)
2) Started 40 VMs (host_1: 14, host_2: 8, host_3: 18)
3) Put host_2 into maintenance; all VMs migrated to host_1

The vm_evenly_distributed weight module prefers hosts with fewer VMs as the migration destination, so it looks like the scheduler chose host_1 for all the VMs because it had the fewest. If we really want to distribute them evenly between hosts, we need to implement the same mechanism the scheduler uses for memory, with a pending list of VMs. In any case it must not crash the server, because of the memory filter (if a host does not have enough memory, it must be filtered out).
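To illustrate the pending-list idea, here is a minimal sketch (hypothetical code, not the actual oVirt scheduler; function and host names are made up) using the counts from this test, where 8 VMs leave host_2 while host_1 runs 14 VMs and host_3 runs 18. Counting migrations already scheduled to a host makes each subsequent decision see the updated load, so the VMs spread out instead of all landing on the least-loaded host:

```python
def pick_destinations(vm_ids, host_vm_counts):
    """Assign each VM to the candidate host with the fewest
    (running + already pending) VMs.

    host_vm_counts maps host name -> current VM count.
    """
    pending = {host: 0 for host in host_vm_counts}
    assignments = {}
    for vm in vm_ids:
        # Weight = current load + migrations already scheduled there,
        # so each decision sees the effect of the previous ones.
        dest = min(host_vm_counts,
                   key=lambda h: host_vm_counts[h] + pending[h])
        pending[dest] += 1
        assignments[vm] = dest
    return assignments

# Scenario from the test above: host_2 in maintenance with 8 VMs,
# host_1 running 14 VMs, host_3 running 18.
result = pick_destinations([f"vm{i}" for i in range(8)],
                           {"host_1": 14, "host_3": 18})
# Both hosts end up with 20 VMs (host_1 receives 6, host_3 receives 2)
# instead of all 8 going to host_1.
```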
This needs to be tested with a larger difference between the hosts, because we ended up with H1=22 vs H3=18. Let's test with H1=16, H2=8, H3=8, put H1 into maintenance, and see where the 8 VMs migrate to.
I believe you mean putting H2 or H3 into maintenance.

Start with:
H1: 20
H2: 10
H3: 10

Put host H2 into maintenance. All VMs migrated to H3:
H1: 20
H2: maintenance
H3: 20
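For contrast, here is a minimal sketch (hypothetical code, not the oVirt scheduler) of placement without pending accounting: every per-VM decision is weighed against the same static snapshot of per-host VM counts, so the least-loaded host wins every time. With the counts above (H1: 20, H3: 10, and H2's 10 VMs to place), all 10 VMs are assigned to H3, matching the observed H1: 20 / H3: 20 end state:

```python
def pick_destinations_snapshot(vm_ids, host_vm_counts):
    """Greedy placement with no pending accounting: every VM is
    weighed against the same snapshot of counts, so the single
    least-loaded host is chosen for all of them."""
    return {vm: min(host_vm_counts, key=host_vm_counts.get)
            for vm in vm_ids}

# H2 in maintenance with 10 VMs; remaining hosts H1: 20, H3: 10.
result = pick_destinations_snapshot([f"vm{i}" for i in range(10)],
                                    {"H1": 20, "H3": 10})
# All 10 VMs are assigned to H3.
```

Here the outcome happens to be even (20/20), which is why this particular test passes; the comment-11 test (host_1: 14, host_3: 18) shows the same snapshot logic producing an uneven 22/18 split.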
Created attachment 1134810 [details]
engine log
Created attachment 1139623 [details]
another test

Another test, starting with cluster policy "none":
H1: 11
H2: 39
H3: 10

Changed the cluster policy to even VM distribution with the default parameters and put H2 into maintenance:
H1: 30
H2: 0
H3: 30

So it looks like everything works fine.
Based on comment 15, we're unable to reproduce this on version 3.6.3.4-0.1.el6.noarch. If you can reproduce it on this version or above, please re-open with all the relevant information.