Bug 843058

Summary: Can't run a large number of VMs simultaneously; getting error "Cant find VDS to run the VM". Product: Red Hat Enterprise Virtualization Manager
Product: Red Hat Enterprise Virtualization Manager    Reporter: Leonid Natapov <lnatapov>
Component: ovirt-engine    Assignee: Roy Golan <rgolan>
Status: CLOSED ERRATA QA Contact: vvyazmin <vvyazmin>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.1.0    CC: chetan, dyasny, hateya, iheim, lpeer, mhuth, ofrenkel, pstehlik, Rhev-m-bugs, sgrinber, yeylon, ykaul
Target Milestone: ---    Keywords: Reopened
Target Release: 3.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: virt
Fixed In Version: sf2 Doc Type: Bug Fix
Doc Text:
The pending memory count increases when the RunVm call is issued, and decreases when the virtual machine changes to an Up state. Because the count was not decreased in time during a burst of starts, the accumulated pending memory exceeded the host's free memory, which prevented the host from being selected to run further virtual machines. Consequently, a large number of virtual machines could not be run simultaneously. This update implements an interleaving solution where the pending memory count is monitored and virtual machine starts are throttled if there is insufficient memory. Bulk running of virtual machines now succeeds.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-06-10 21:08:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 915537    
Attachments:
  engine log
  vdsm log
  engine debug log

Description Leonid Natapov 2012-07-25 13:17:55 UTC
Created attachment 600296 [details]
engine log

Can't run a large number of VMs simultaneously; getting the error "Cant find VDS to run the VM". I have 20+ VMs that I am trying to start simultaneously. Some VMs turn on and switch to the Powering Up state, but some VMs fail to run. After the Powering Up VMs are Up, I can successfully start the VMs that previously failed to run; I can run them one by one and it works fine. In the backend I get the following error:

2012-07-25 16:02:12,134 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (pool-3-thread-43) [40d6b26e] Cant find VDS to run the VM e53f8a2e-4fc0-4d5d-81ea-53135622f577 on, so this VM will not be run.

Full engine log attached.

Comment 2 Roy Golan 2012-07-25 14:17:36 UTC
Can you specify your setup: number of hosts, and memory and CPU of the VMs and hosts?

Comment 3 Simon Grinberg 2012-07-26 11:10:35 UTC
(In reply to comment #2)
> can you specify your setup: num of hosts and memory and CPU of the VMS and
> HOSTS

Leonid, could it be that you overcommit memory? If so, then it's a known issue; you must wait until KSM kicks in before you can run further VMs.

Comment 6 Roy Golan 2012-08-06 08:37:53 UTC
I'm not sure it's a KSM issue; it could be I/O, a timeout on the VDSM semaphore lock for running qemu, etc. Leonid, please specify which VMs you used and attach the VDSM log.

Comment 8 Leonid Natapov 2012-08-06 10:54:38 UTC
Attaching vdsm.log file.

I am running 1 host in the cluster. The VMs are server machines with no OS installed.

Comment 9 Leonid Natapov 2012-08-06 10:55:08 UTC
Created attachment 602482 [details]
vdsm log

Comment 10 Roy Golan 2012-08-08 11:22:38 UTC
Created attachment 602998 [details]
engine debug log

Comment 11 Roy Golan 2012-08-08 11:43:21 UTC
The problem is that we increase the pending memory count from RunVm and only decrease it when VdsUpdateRunTimeInfo detects that the VM goes to UP, so a burst of running VMs will always fail shortly after roughly half of the VMs have started.

One of the solutions I can come up with is to throttle the VM runs so that the *monitoring* is able to interleave and decrement the pending memory. This probably means a slower flow, because we need a way to fire the monitoring (maybe only parts of it, by code sharing?) after every VM run.

Anyway, I find it very bad UX when you have a monster host but you just can't bulk-run a mass of VMs on it.
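
(For illustration only: a minimal Python sketch of the accounting described above. The real logic lives in the engine's RunVm and monitoring code; the class and function names below are hypothetical.)

# RunVm adds the VM's memory to a pending counter at schedule time, but the
# counter is only decremented when monitoring later sees the VM as UP, so a
# burst of starts exhausts the apparent free memory before any VM is UP.

class Host:
    def __init__(self, free_mem_mb):
        self.free_mem_mb = free_mem_mb
        self.pending_mem_mb = 0

    def can_run(self, vm_mem_mb):
        return self.free_mem_mb - self.pending_mem_mb >= vm_mem_mb


def run_vm(host, vm_mem_mb):
    if not host.can_run(vm_mem_mb):
        raise RuntimeError("Cant find VDS to run the VM")
    host.pending_mem_mb += vm_mem_mb       # incremented by RunVm


def on_vm_up(host, vm_mem_mb):
    host.pending_mem_mb -= vm_mem_mb       # decremented only when monitoring sees UP


host = Host(free_mem_mb=16384)
for i in range(20):                        # burst of 2 GB VMs, no monitoring in between
    try:
        run_vm(host, 2048)
    except RuntimeError as e:
        print("VM %d: %s" % (i, e))        # fails from the 9th VM on, although no VM is UP yet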

Comment 12 Simon Grinberg 2012-08-08 14:41:04 UTC
(In reply to comment #11)
> The problem is that we are summing the increasing pending memory count from
> the RunVm and decreasing it when VdsUpdateRunTimeInfo detects that the VM
> goes to UP  so a burst running VMs will always fail short after ~ 1/2 of the
> VMs to run.
> 
> one of the solutions I can come with is to throttle the VM run in a way the
> *monitoring* will be able to interleave and decrement the pending memory .
> this means probably slower flow because we need a way to fire the monitoring
> (maybe parts of it by code sharing?) after every VM run?
> 
> Anyway I find it very bad UX when you have a monster Host but you just can't
> bulk run a mass of VMs on it.

There are other consequences of firing up multiple VMs at the same time. For example, a timeout on 'wait for launch' may happen when you spawn many VMs at once, I/O storms occur when all VMs try to boot from the same shared storage, etc. You need to throttle anyhow.

The solution is to make the creation of multiple objects asynchronous, and then throttle the actual creation. It's not bad UX, it's a reasonable limitation to prevent the Monday-morning effect. Actually we have an RFE to do just that, I just can't find it ATM.
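
(Illustrative only: a rough Python sketch of the throttled asynchronous start described above. This is not the RFE's actual design; the names and the concurrency limit are made up.)

# Accept all start requests up front, then let only a limited number of
# launches proceed at a time; the pool size acts as the throttle.
import concurrent.futures
import time

MAX_CONCURRENT_STARTS = 5                  # hypothetical throttle value


def launch(vm_name):                       # stand-in for the real start call
    time.sleep(0.1)                        # simulate the launch taking a while
    return "%s started" % vm_name


def start_all(vm_names):
    # Returns immediately with futures; launches run in the background.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=MAX_CONCURRENT_STARTS)
    return [pool.submit(launch, name) for name in vm_names]


futures = start_all(["vm-%02d" % i for i in range(20)])
for f in concurrent.futures.as_completed(futures):
    print(f.result())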

Comment 13 Roy Golan 2012-08-12 14:15:08 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > The problem is that we are summing the increasing pending memory count from
> > the RunVm and decreasing it when VdsUpdateRunTimeInfo detects that the VM
> > goes to UP  so a burst running VMs will always fail short after ~ 1/2 of the
> > VMs to run.
> > 
> > one of the solutions I can come with is to throttle the VM run in a way the
> > *monitoring* will be able to interleave and decrement the pending memory .
> > this means probably slower flow because we need a way to fire the monitoring
> > (maybe parts of it by code sharing?) after every VM run?
> > 
> > Anyway I find it very bad UX when you have a monster Host but you just can't
> > bulk run a mass of VMs on it.
> 
> There are other consequences of firing up multiple VMs at the same time. 
> For example - timeout on 'wait for launch' that may happen when you spawn
> many VMs at once, IO storms when all VMs try to boot from the same shared
> storage, etc. You need to throttle anyhow. 
> 
> The solution is to have the creation of multiple object asynchronous, and
> then throttle the actual creation. It's not a bad UX, it's a reasonable
> limitation to prevent Monday morning effect. Actually we have an RFE to do
> just that, I just can find it ATM

I am not sure about the I/O storm you mentioned. I know VDSM has a semaphore for running a VM, with the number of cores as the semaphore count.

Anyhow, my take on this now is to decrease the pending memory count when the VM status changes to POWERING_UP instead of UP, and to see if this hurries things up.
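
(Expressed against the hypothetical sketch from comment 11, the proposed change is simply to release pending memory on POWERING_UP rather than waiting for UP.)

# Hypothetical hook called by the monitoring cycle; previously the pending
# memory was released only when new_status == "UP".
def on_vm_status_change(host, vm_mem_mb, new_status):
    if new_status == "POWERING_UP":
        host.pending_mem_mb -= vm_mem_mb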

Comment 14 Roy Golan 2012-08-15 08:44:21 UTC
http://gerrit.ovirt.org/#/c/7204/

Comment 24 Doron Fediuck 2013-03-27 11:14:25 UTC
*** Bug 927078 has been marked as a duplicate of this bug. ***

Comment 27 vvyazmin@redhat.com 2013-05-26 20:27:29 UTC
No issues were found when running 150 VMs simultaneously (via the Python SDK); each VM had 256 MB RAM.

Verified on RHEVM 3.2 - SF17.1 environment:

RHEVM: rhevm-3.2.0-11.28.el6ev.noarch
VDSM: vdsm-4.10.2-21.0.el6ev.x86_64
LIBVIRT: libvirt-0.10.2-18.el6_4.5.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.3.x86_64
SANLOCK: sanlock-2.6-2.el6.x86_64
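
(For reference, a sketch of how a bulk start like this can be driven from the RHEV 3.x Python SDK (ovirtsdk); the URL, credentials and VM names are placeholders, and the 256 MB VMs are assumed to already exist.)

from ovirtsdk.api import API

api = API(url='https://rhevm.example.com/api',
          username='admin@internal', password='password', insecure=True)

for i in range(150):
    vm = api.vms.get(name='test_vm_%03d' % i)
    vm.start()                             # queue all 150 starts back to back

api.disconnect()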

Comment 28 errata-xmlrpc 2013-06-10 21:08:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0888.html