Bug 1398601

Summary: Unable to enable gluster capability on the cluster
Product: [oVirt] ovirt-engine Reporter: SATHEESARAN <sasundar>
Component: Frontend.WebAdmin Assignee: Martin Perina <mperina>
Status: CLOSED WORKSFORME QA Contact: Pavel Stehlik <pstehlik>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.1.0 CC: bugs, knarra, oourfali, sabose, sasundar, ylavi
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment: hci
Last Closed: 2016-12-01 09:57:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
part of the engine log with errors and tracebacks none

Description SATHEESARAN 2016-11-25 11:50:05 UTC
Description of problem:
-----------------------
I created a hosted-engine setup with 3 hosts, with gluster volumes as the backend.

The hosted engine was up and running. I edited the cluster and checked 'Enable gluster service' to enable gluster capability on the cluster.

After I saved the change, the operation ran for some time, and then enabling gluster capability on the cluster failed.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
vdsm-gluster-4.18.999-978.gitcaed6de.el7.centos.noarch
vdsm-4.18.999-978.gitcaed6de.el7.centos.x86_64
ovirt-hosted-engine-setup-2.1.0-0.0.master.20161122155539.git00fff8b.el7.centos.noarch

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Install hosted-engine
2. Edit the cluster and check 'Enable gluster service'

Actual results:
---------------
Enabling gluster service on the cluster waits for some time and then fails

Expected results:
------------------
Enabling gluster service on the cluster should succeed

Comment 1 SATHEESARAN 2016-11-25 11:50:41 UTC
Created attachment 1224227 [details]
part of the engine log with errors and tracebacks

Comment 2 Sahina Bose 2016-11-28 08:09:01 UTC
There are two issues here.
The root cause seems to be a communication issue between the engine and vdsm on the host:
2016-11-25 11:20:44,899Z ERROR [org.ovirt.engine.core.vdsbroker.gluster.GetGlusterHostUUIDVDSCommand] (default task-38) [118de521] Command 'GetGlusterHostUUIDVDSCommand(HostName = host3, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='6cc5b665-363c-41bf-a486-9a6897c0b5eb'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues

Once the VDS command fails, there is also an error when setting the host to non-operational, due to a failure in getting a resource. (Not sure if this is caused by the DI changes.)
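
For anyone triaging a full engine.log for the same symptom, here is a minimal sketch that pulls out ERROR lines reporting a VDSNetworkException. It assumes every line follows the layout of the single log line quoted in this comment (timestamp, level, logger in brackets, thread in parentheses, correlation ID in brackets, message). The function name `find_vds_timeouts` and the regular expressions are illustrative and are not part of oVirt; lines in other layouts are simply skipped.

```python
import re

# Assumed layout, taken from the one line quoted in this comment:
# "<date> <time>Z <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})Z?\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+"
    r"\[(?P<correlation>[^\]]*)\]\s+"
    r"(?P<message>.*)$"
)
# Host name as it appears in the command parameters of the quoted line.
HOST_RE = re.compile(r"HostName = (?P<host>[^,]+),")


def find_vds_timeouts(lines):
    """Yield one dict per ERROR line whose message mentions VDSNetworkException."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("level") != "ERROR":
            continue
        message = match.group("message")
        if "VDSNetworkException" not in message:
            continue
        host = HOST_RE.search(message)
        yield {
            "timestamp": match.group("timestamp"),
            # Last component of the logger name, i.e. the failing VDS command class.
            "command": match.group("logger").rsplit(".", 1)[-1],
            "host": host.group("host") if host else None,
            "correlation": match.group("correlation"),
        }


# The log line quoted in this comment, used as sample input.
SAMPLE = (
    "2016-11-25 11:20:44,899Z ERROR "
    "[org.ovirt.engine.core.vdsbroker.gluster.GetGlusterHostUUIDVDSCommand] "
    "(default task-38) [118de521] Command 'GetGlusterHostUUIDVDSCommand("
    "HostName = host3, VdsIdVDSCommandParametersBase:{runAsync='true', "
    "hostId='6cc5b665-363c-41bf-a486-9a6897c0b5eb'})' execution failed: "
    "VDSGenericException: VDSNetworkException: Message timeout which can be "
    "caused by communication issues"
)

if __name__ == "__main__":
    for hit in find_vds_timeouts([SAMPLE]):
        print(hit)
```

To scan a real log, pass an open file object instead of the sample list, for example `find_vds_timeouts(open(path))`, where `path` is wherever engine.log lives on your engine machine.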

Comment 3 Sahina Bose 2016-11-28 08:10:02 UTC
Are you testing with engine from master?

Comment 4 SATHEESARAN 2016-11-28 14:01:00 UTC
(In reply to Sahina Bose from comment #3)
> Are you testing with engine from master?

Hi Sahina,

Yes, as mentioned in comment 0, I have been testing this with ovirt-master.

Comment 5 Oved Ourfali 2016-11-30 09:52:03 UTC
What's the status of the host?
Can you provide complete logs and not only partial ones?

Comment 6 SATHEESARAN 2016-12-01 09:57:19 UTC
I have reprovisioned the setup, so I could not retrieve the engine logs.

I installed the HC setup again with the latest ovirt-master ( http://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7Server/noarch/ovirt-release-master-4.1.0-0.5.master.20161201000129.gitf370ec3.el7.centos.noarch.rpm )

This time, I am not seeing any such problems while enabling gluster capability on the cluster.

It looks like the problem was resolved with the latest ovirt-master.

Closing this bug, as the issue does not occur with master. I will re-open the bug if the problem recurs.