Bug 1297845 - SPM should never run on the same host as the self hosted engine VM
Summary: SPM should never run on the same host as the self hosted engine VM
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: General
Version: 2.0.0
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Martin Sivák
QA Contact: Ilanit Stein
URL:
Whiteboard: sla
Duplicates: 1297844
Depends On:
Blocks:
Reported: 2016-01-12 15:37 UTC by Jonas Lindholm
Modified: 2016-01-14 18:34 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-01-14 18:34:03 UTC
oVirt Team: SLA
Embargoed:
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
agent.log from a failed reproduction attempt (125.06 KB, text/plain)
2016-01-13 08:40 UTC, Simone Tiraboschi

Description Jonas Lindholm 2016-01-12 15:37:14 UTC
Description of problem:

When the host running the engine VM is also the SPM host, a crash of that host prevents every other host in the same cluster from starting the engine VM, because there is no SPM host.
This is like a catch-22: a new SPM host can't be selected because the engine VM is not running, and the engine VM can't be started because the storage domains are down and can't be brought up without the SPM.


Version-Release number of selected component (if applicable):


How reproducible:
Select the host the engine VM is running on as SPM and then power off that host.

Steps to Reproduce:
1. 
2.
3.

Actual results:
None of the other hosts can spin up the engine VM.

Expected results:


Additional info:

The fix would be to not allow the engine VM to run on the SPM host. The only exception would be when there is a single host left in the cluster.
As soon as a second host is available, the SPM role should be moved over to that host.
If the administrator migrates the engine VM to a host that is the SPM, the SPM role should move. The administrator should confirm that the SPM role will move before the migration of the engine VM starts.
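
The placement rule proposed above could be sketched roughly as follows. This is only an illustration of the request: `pick_spm_host` and its parameters are hypothetical names, not part of ovirt-hosted-engine-ha or the engine, and host names are assumed to be plain strings.

```python
from typing import List, Optional


def pick_spm_host(
    hosts_up: List[str],
    engine_vm_host: str,
    current_spm: Optional[str] = None,
) -> Optional[str]:
    """Return the host that should hold the SPM role under the proposed rule.

    All names here are hypothetical and only illustrate the request.
    """
    if not hosts_up:
        # No host is up, so nothing can hold the SPM role.
        return None

    # Hosts that are up and are not running the engine VM.
    candidates = [h for h in hosts_up if h != engine_vm_host]

    if not candidates:
        # Single host left: SPM and engine VM have to share it.
        return hosts_up[0]

    if current_spm in candidates:
        # The current SPM is already on a different host; leave it alone.
        return current_spm

    # Otherwise move the SPM role away from the engine VM host.
    return candidates[0]
```

For example, with `host1` running the engine VM and also holding the SPM role, `pick_spm_host(["host1", "host2"], "host1", "host1")` selects `host2`, while a cluster with only `host1` up keeps the role on `host1`.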

Comment 1 Jonas Lindholm 2016-01-12 15:40:29 UTC
*** Bug 1297844 has been marked as a duplicate of this bug. ***

Comment 2 Doron Fediuck 2016-01-13 07:27:48 UTC
Thanks for the report.
We need to verify the behavior here, but in general, if this is a real issue, the main problem is that the engine VM needs an SPM in order to start, regardless of which host is trying to run the VM. The HA agent should be capable of starting the VM on any healthy hosted-engine node without additional dependencies.

Comment 3 Simone Tiraboschi 2016-01-13 08:39:26 UTC
Unable to reproduce here with ovirt-hosted-engine-ha 1.3.3.6-1 using iSCSI for the hosted-engine storage domain.

I have two hosts: one was the SPM and the engine VM was running there.
I brutally powered it off and after about 4 minutes the engine VM successfully restarted on the other host.

I'm attaching agent.log from my reproduction attempt.

Jonas, could you please provide agent logs from your case to check what happened there?

Comment 4 Simone Tiraboschi 2016-01-13 08:40:46 UTC
Created attachment 1114315
agent.log from a failed reproduction attempt

Comment 5 Doron Fediuck 2016-01-13 13:31:14 UTC
Please provide your logs and reproduction steps.

Comment 6 Simone Tiraboschi 2016-01-14 18:34:03 UTC
ovirt-ha-agent doesn't rely on the SPM to start the engine VM.

