Bug 1285738

Summary: Hosted engine setup fails when VDSM is slow to initialize
Product: [oVirt] ovirt-hosted-engine-setup Reporter: Martin Sivák <msivak>
Component: General Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE QA Contact: Artyom <alukiano>
Severity: high Docs Contact:
Priority: high    
Version: 1.3.1 CC: bugs, dfediuck, didi, lveyde, mavital, msivak, rmartins, sbonazzo, stirabos, ylavi
Target Milestone: ovirt-3.6.1 Keywords: Triaged
Target Release: 1.3.1.1 Flags: rule-engine: ovirt-3.6.z+
ylavi: planning_ack+
dfediuck: devel_ack+
mavital: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
hosted-engine setup has to wait for VDSM to become ready. The wait time has been improved so that setup can also run on overloaded environments, for example for testing purposes. Setup now fails with a clear error when the wait times out.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-23 09:19:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1284979    
Attachments:
Description   Flags
Screenshot    none
setup.log     none

Description Martin Sivák 2015-11-26 11:22:31 UTC
Created attachment 1099228 [details]
Screenshot

Description of problem:

VDSM is slow to initialize and the setup does not wait long enough, so it fails with VDSM reporting error 99 - Recovering or initializing.

Version-Release number of selected component (if applicable):

ovirt-node 20151104 el7.2

How reproducible:

Always on my nested VM setup

Comment 1 Martin Sivák 2015-11-26 11:23:09 UTC
Created attachment 1099229 [details]
setup.log

Comment 2 Nikolai Sednev 2015-11-30 09:37:19 UTC
Hi Martin,
May I ask for exact reproduction steps? On my setup, which runs on real hardware, I don't see the slowness, and deployment on RHEL 7.2 hosts succeeds.
Engine:
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch
ovirt-vmconsole-1.0.0-1.el6ev.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
ovirt-vmconsole-proxy-1.0.0-1.el6ev.noarch
rhevm-3.6.1-0.2.el6.noarch
ovirt-engine-extension-aaa-jdbc-1.0.3-1.el6ev.noarch

Host:
ovirt-vmconsole-host-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch
ovirt-release36-002-2.noarch
sanlock-3.2.4-1.el7.x86_64
ovirt-setup-lib-1.0.1-0.0.master.20151126203321.git2da7763.el7.centos.noarch
ovirt-engine-sdk-python-3.6.1.1-0.1.20151127.git2400b22.el7.centos.noarch
vdsm-4.17.11-7.gitc0752ac.el7.noarch
ovirt-vmconsole-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch
ovirt-release36-snapshot-002-2.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.3.x86_64
mom-0.5.1-2.el7.noarch
ovirt-hosted-engine-ha-1.3.3.1-0.0.master.20151125134310.20151125134307.git2718494.el7.noarch
ovirt-hosted-engine-setup-1.3.1.1-0.0.master.20151124151641.git8763f36.el7.centos.noarch
ovirt-host-deploy-1.4.2-0.0.master.20151122153544.gitfc808fc.el7.noarch
libvirt-client-1.2.17-13.el7.x86_64

Comment 3 Martin Sivák 2015-11-30 10:25:50 UTC
Well, there is no special reproducer. It just took 12 seconds on my nested environment. Try using nesting and limiting the CPU power of the VM; that might be enough.

Comment 4 Artyom 2016-02-18 16:22:58 UTC
Verified on ovirt-hosted-engine-setup-1.3.3.1-1.el7ev.noarch
Added a 130-second sleep to __main__ of /usr/share/vdsm/daemonAdapter and started the deployment.
After 120 seconds I can see:
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ ERROR ] Failed to execute stage 'Environment setup': VDSM did not start within 120 seconds
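
For readers following along, the behaviour verified above (keep polling while VDSM reports error 99, then fail clearly once the timeout is reached) can be sketched roughly as below. This is only an illustration and not the actual ovirt-hosted-engine-setup code: wait_for_vdsm, get_status, the 1-second poll interval, and the assumption that status code 0 means "ready" are all hypothetical. Only the 120-second timeout, error code 99, and the two messages are taken from this report.

```python
import time

# Status code quoted in this report: "99 - Recovering or initializing".
VDSM_RECOVERING = 99
# Assumption for this sketch: 0 means VDSM answered successfully.
VDSM_OK = 0


def wait_for_vdsm(get_status, timeout=120, interval=1,
                  clock=time.monotonic, sleep=time.sleep, log=print):
    """Poll get_status() until VDSM is ready or the timeout expires.

    get_status is a hypothetical callable returning a VDSM status code.
    Returns the number of polls that were needed.
    """
    deadline = clock() + timeout
    polls = 0
    while True:
        polls += 1
        code = get_status()
        if code == VDSM_OK:
            return polls
        if code != VDSM_RECOVERING:
            # Any other error is not a "still starting" condition.
            raise RuntimeError("Unexpected VDSM status code: %d" % code)
        if clock() >= deadline:
            # Fail clearly instead of continuing with a half-started VDSM.
            raise TimeoutError(
                "VDSM did not start within %d seconds" % timeout
            )
        log("Waiting for VDSM hardware info")
        sleep(interval)
```

The clock and sleep parameters are injectable only so the loop can be exercised without real waiting; a caller would normally pass just get_status and, if needed, a different timeout.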