Bug 1330831

Summary: 3rd node addition brings down the hosted engine appliance
Product: [oVirt] ovirt-hosted-engine-setup Reporter: Bhaskarakiran <byarlaga>
Component: General Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED INSUFFICIENT_DATA QA Contact: meital avital <mavital>
Severity: high Docs Contact:
Priority: unspecified    
Version: --- CC: bugs, byarlaga, mzywusko, sbonazzo, stirabos, ylavi
Target Milestone: ovirt-3.6.8 Flags: ylavi: ovirt-3.6.z?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-06-27 12:30:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
sosreports of hosted engine none

Description Bhaskarakiran 2016-04-27 06:13:13 UTC
Created attachment 1151201
sosreports of hosted engine

Description of problem:
-----------------------
During the 3rd node addition to the hosted engine, the appliance goes down (it stops running and does not respond to ping). The VM has to be started manually with "hosted-engine --vm-start". This happens every time.

Attaching the hosted engine sos reports.


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Yaniv Lavi 2016-05-05 13:16:09 UTC
How did you get to this step if BZ #1329202 was blocking you?

Comment 2 Bhaskarakiran 2016-05-08 12:12:20 UTC
Yaniv,

All these errors and disconnects are seen during the 3rd node addition. On the first attempt the appliance went down; on the second there were certificate errors, and so on. Maybe the order in which I filed the bugs is different.

Comment 3 Yaniv Lavi 2016-06-02 08:49:02 UTC
Do you still have this issue?

Comment 4 Sandro Bonazzola 2016-06-02 08:50:19 UTC
Simone can you try to reproduce?

Comment 5 Bhaskarakiran 2016-06-03 06:39:54 UTC
I have separated the networks for oVirt and gluster and no longer see the issue, but if the same network is used for both then there are issues. I don't have spare systems to reproduce. Maybe someone else can try.

Comment 6 Simone Tiraboschi 2016-06-06 10:30:58 UTC
In my opinion this is just the combination of side effects from two different issues that we are going to solve with 3.6.6/3.6.7:

The first one was the SetupNetwork issue when deploying a host:
https://bugzilla.redhat.com/show_bug.cgi?id=1322257
https://bugzilla.redhat.com/show_bug.cgi?id=1320128

If host-deploy failed because of that issue, your host can end up with the management network not properly configured.
If you also used the same network for gluster, you lost the storage connection as well.
If you use two separate networks for management and storage, the storage connection will probably survive.

The second one was here:
https://bugzilla.redhat.com/show_bug.cgi?id=1298693

You had a single point of failure when deploying hosted-engine on gluster, so if you lose gluster due to the SetupNetwork issue on the host pointed to by the hosted-engine configuration, the engine VM will go down.
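
The chain of conditions described in this comment can be written down as a small toy model. This is only a sketch for reasoning about the scenario, not oVirt code: the function name and all four parameters are hypothetical, and the last branch assumes that having fallback gluster servers configured is what removes the single point of failure.

```python
def engine_vm_survives(host_deploy_failed: bool,
                       shared_mgmt_and_storage_network: bool,
                       failed_host_is_gluster_entry_point: bool,
                       has_fallback_gluster_servers: bool) -> bool:
    """Toy model of the failure chain described in this comment (hypothetical)."""
    # No SetupNetwork failure during host-deploy: nothing breaks.
    if not host_deploy_failed:
        return True
    # Separate management and storage networks: only the management
    # network is misconfigured, so the storage connection probably survives.
    if not shared_mgmt_and_storage_network:
        return True
    # Shared network: the affected host loses gluster. This only hurts the
    # engine VM if that host is the one the hosted-engine configuration
    # points to for storage.
    if not failed_host_is_gluster_entry_point:
        return True
    # Assumption: configured fallback servers remove the single point of failure.
    return has_fallback_gluster_servers


# The situation reported in this bug: single network, single entry point.
print(engine_vm_survives(True, True, True, False))
```

With separate networks (second argument False) the model returns True, which matches what the reporter observed in comment 5.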

Bhaskarakiran, can you please try to reproduce with 3.6.7 RC and a single network, but using custom mount options as described in https://bugzilla.redhat.com/show_bug.cgi?id=1298693#c20 ?
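
As an illustration of what such a custom mount option string might look like: the gluster mount helper accepts a backup-volfile-servers option that takes a colon-separated list of fallback servers. Whether that is the exact option recommended in the linked comment is an assumption, and the helper function and host names below are hypothetical.

```python
def gluster_mount_options(backup_servers):
    """Build a gluster mount option string listing fallback volfile servers.

    Hypothetical helper; 'backup-volfile-servers' is the gluster mount
    option name, the host names passed in are placeholders.
    """
    # Drop empty entries and surrounding whitespace.
    servers = [s.strip() for s in backup_servers if s.strip()]
    if not servers:
        raise ValueError("at least one backup server is required")
    # The option takes a colon-separated list of servers.
    return "backup-volfile-servers=" + ":".join(servers)


# Placeholder host names, not taken from this bug report.
print(gluster_mount_options(["host2.example.com", "host3.example.com"]))
# prints: backup-volfile-servers=host2.example.com:host3.example.com
```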

Comment 7 Bhaskarakiran 2016-06-07 09:11:46 UTC
I will need some time, as I don't have machines with a single network. Will update as I progress.

Comment 8 Yaniv Lavi 2016-06-27 12:30:05 UTC
Please reopen if you can reproduce and provide the needed info.