Bug 691874

Summary: [vdsm] [scale]When running 140 vms 'Recovering from crash or Initializing' takes ~33 minutes
Product: Red Hat Enterprise Linux 6 Reporter: David Naori <dnaori>
Component: vdsm Assignee: Federico Simoncelli <fsimonce>
Status: CLOSED CURRENTRELEASE QA Contact: yeylon <yeylon>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 6.1 CC: abaron, bazulay, dnaori, hateya, iheim, mgoldboi, srevivo, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: vdsm-4.9-65.el6.x86_64 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-05-09 09:51:35 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
vdsm-logs none

Description David Naori 2011-03-29 18:34:58 UTC
Description of problem:
When running ~140 VMs and restarting vdsmd, vdsm takes 20 minutes to recover.

Comment 3 David Naori 2011-03-29 19:08:42 UTC
Created attachment 488534 [details]
vdsm-logs

Comment 4 David Naori 2011-03-29 19:09:33 UTC
Description of problem:

When running ~140 VMs and restarting vdsmd, vdsm takes about 33 minutes to recover.

It looks like the communication with libvirt in _domDependentInit is very slow for every single VM.
* Tested without SASL: same behaviour.

Version-Release number of selected component (if applicable):
-vdsm 4-9-57
-libvirt-0.8.7-15.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Run ~140 VMs on a single host
2. Restart vdsmd

Comment 14 Federico Simoncelli 2011-05-09 09:51:35 UTC
Closing according to:

https://trac.qa.lab.tlv.redhat.com/trac/integration/ticket/262

Could not reproduce on the latest build, 65 (vdsm-4.9-65.el6.x86_64).

Flow:
 - The host runs 141 VMs (each installed with a real OS on a 5G disk).
 - Once all VMs were up, the vdsm service was restarted; vdsm finished recovering all VMs in 26 seconds.

First recover message:
clientIFinit::DEBUG::2011-05-09 09:47:56,176::clientIF::1209::vds::(_recoverVm) Trying to recover a0be6b38-15e5-4294-8faa-188cf48c27b5

Last recover message:
clientIFinit::DEBUG::2011-05-09 09:48:22,415::clientIF::1209::vds::(_recoverVm) Trying to recover 664e81b3-97ba-4654-862b-13c38cc665fb
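
The 26-second figure can be cross-checked from the two log timestamps above. The following is a minimal sketch, not part of vdsm: the helper name elapsed_seconds and the format string are assumptions made for illustration, and the result only measures the gap between the first and last "Trying to recover" messages, not the completion of the last VM's recovery.

```python
from datetime import datetime

# Timestamps copied from the first and last _recoverVm log lines above.
FIRST = "2011-05-09 09:47:56,176"
LAST = "2011-05-09 09:48:22,415"

# Assumed layout of the timestamp field: date, time, comma, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Return the seconds between two timestamps in LOG_TIME_FORMAT.

    Hypothetical helper for this illustration only.
    """
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


recovery = elapsed_seconds(FIRST, LAST)
print("recovery window: %.3f s" % recovery)
# 141 is the VM count stated in the flow above.
print("average per VM: %.3f s" % (recovery / 141))
```

Dividing by the 141 VMs in the flow gives a rough per-VM average, which can be compared with the roughly 33 minutes for ~140 VMs in the original report.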