Bug 1220310 - [hosted-engine-setup] [Gluster support] Deployment gets stuck: "oVirt API connection failure"
Keywords:
Status: CLOSED DUPLICATE of bug 1201355
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-hosted-engine-setup
Version: 3.6
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Sandro Bonazzola
QA Contact: Elad
URL:
Whiteboard: integration
Depends On:
Blocks: Hosted_Engine_External_GlusterFS oVirt_Hosted_Engine_GlusterFS
Reported: 2015-05-11 09:47 UTC by Elad
Modified: 2015-05-12 08:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-05-12 07:26:35 UTC
oVirt Team: ---


Attachments
/var/log/ from the host (5.85 MB, application/x-gzip)
2015-05-11 09:47 UTC, Elad

Description Elad 2015-05-11 09:47:01 UTC
Created attachment 1024142
/var/log/ from the host

Description of problem:
Tried to deploy hosted engine over Gluster. The deployment reached the phase where the DB health check had completed and hosted-engine setup was waiting for the VDSM host to become operational in the engine. It got stuck in this phase.


Version-Release number of selected component (if applicable):
ovirt-3.6.0-1 
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150401110307.git9665976.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1.
- Created a new volume in the Gluster server:

gluster volume create elad3 replica 3 transport tcp 10.35.160.6:/export/elad3 10.35.160.202:/home/elad/1 10.35.160.203:/home/elad/1 force

- Changed owner-uid and owner-gid to vdsm:kvm (36:36):

gluster volume set elad3 owner-uid 36
gluster volume set elad3 owner-gid 36

- Started the volume:

gluster volume start elad3 

2. Executed hosted-engine --deploy, picked glusterfs and gave it the path of the volume
3. Installed RHEL6.6 on the VM and executed engine-setup
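
Step 1 intends both the owner UID and the owner GID of the volume to be 36 (vdsm:kvm). A minimal sketch of a check for that, assuming the text comes from `gluster volume info elad3` and that the options appear under "Options Reconfigured:" with their full names `storage.owner-uid` and `storage.owner-gid`. The helper names are hypothetical and not part of any oVirt or Gluster tool.

```python
def parse_reconfigured_options(volume_info):
    """Collect 'key: value' lines that follow 'Options Reconfigured:'."""
    options = {}
    in_options = False
    for line in volume_info.splitlines():
        line = line.strip()
        if line.startswith("Options Reconfigured"):
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key.strip()] = value.strip()
    return options


def missing_owner_options(volume_info, expected="36"):
    """Return the owner options that are absent or differ from `expected`."""
    options = parse_reconfigured_options(volume_info)
    wanted = ("storage.owner-uid", "storage.owner-gid")
    return [key for key in wanted if options.get(key) != expected]
```

An empty list means both options are set as expected. A non-empty list names the option that still needs `gluster volume set`.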

Actual results:
After DB health check completed, the installation got stuck with the following:

[ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO  ] Still waiting for VDSM host to become operational...



 I got this error in the setup log:

20**FILTERED**5-05-**FILTERED** 09:00:2**FILTERED** DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:**FILTERED**89 VDSM host in  state
20**FILTERED**5-05-**FILTERED** 09:02:29 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:**FILTERED**83 Error fetching host state: [ERROR]::oVirt API connection failure, (7, 'Failed connect to elad-he.qa.lab.tlv.redhat.com:443; Connection timed out')
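
For illustration only, the wait loop that produces these messages could be bounded so that repeated API connection failures end in an error instead of an indefinite wait. This is a sketch under stated assumptions, not the actual `add_host._wait_host_ready` code: `fetch_state` is a hypothetical callable that returns the host state string or raises `OSError` on connection failure, and `"up"` is assumed to be the operational state name.

```python
import time


class HostNotReady(Exception):
    """Raised when the host does not become operational in time."""


def wait_host_ready(fetch_state, timeout=600.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_state() until it returns "up" or the timeout expires."""
    deadline = clock() + timeout
    last_error = None
    while clock() < deadline:
        try:
            state = fetch_state()
        except OSError as exc:
            # Connection failures are remembered and retried until the deadline.
            last_error = exc
        else:
            if state == "up":  # assumed name of the operational state
                return state
            last_error = None
        sleep(interval)
    raise HostNotReady(
        "host not operational after %ss (last error: %s)" % (timeout, last_error)
    )
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.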



Expected results:
Hosted-engine deployment over Gluster should end successfully.

Additional info:
/var/log/ from the host

Comment 2 Sandro Bonazzola 2015-05-11 11:17:29 UTC
The vdsm logs end at 2015-05-11 08:21:07, while the logs quoted above are from 09:02:29.

At that time the setup log shows:
20**FILTERED**5-05-**FILTERED** 08:2**FILTERED**:05 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._wait_host_ready:**FILTERED**89 VDSM host in installing state

vdsm has been stopped by ovirt-host-deploy, executed by ovirt-engine, and it has not been restarted.
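
The gap between the last vdsm log entry and the API connection failure can be computed from the two timestamps quoted in this comment. A small sketch, where the function name is hypothetical and the date is taken from the comment text:

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def log_gap_seconds(last_vdsm_entry, setup_error):
    """Seconds between the last vdsm log line and the setup error."""
    start = datetime.strptime(last_vdsm_entry, TIMESTAMP_FORMAT)
    end = datetime.strptime(setup_error, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Timestamps quoted in this comment: 2482 seconds, about 41 minutes.
gap = log_gap_seconds("2015-05-11 08:21:07", "2015-05-11 09:02:29")
```

So vdsm had been silent for roughly 41 minutes when the connection failure was logged.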

I need the host-deploy logs and/or the engine logs in order to understand why vdsm has not been restarted.

Comment 3 Doron Fediuck 2015-05-12 07:26:35 UTC
See possible workarounds in the duplicate bz.

*** This bug has been marked as a duplicate of bug 1201355 ***

Comment 4 Sandro Bonazzola 2015-05-12 07:28:23 UTC
Closed as duplicate since it seems to be the same issue described in bug #1201355.
When vdsmd service is stopped, it kills glusterfs process causing the storage domain to disappear.
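
One way to confirm this on an affected host is to check whether the Gluster FUSE mount is still listed after vdsmd has been stopped. A minimal sketch, assuming the text comes from `/proc/mounts` and that Gluster FUSE mounts appear there with filesystem type `fuse.glusterfs`. The function name is hypothetical.

```python
def glusterfs_mounts(mounts_text):
    """Return (source, mount_point) pairs for Gluster FUSE mounts."""
    mounts = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: source, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] == "fuse.glusterfs":
            mounts.append((fields[0], fields[1]))
    return mounts
```

Typical use would be `glusterfs_mounts(open("/proc/mounts").read())` before and after stopping vdsmd. An entry that disappears would match the behaviour described here.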

Comment 5 Elad 2015-05-12 08:37:18 UTC
The engine VM moves to Paused, so it does seem to be the issue reported in bug #1201355.

