Bug 1410174 - When blocking connection to SPM host SPM cannot start on a different host
Summary: When blocking connection to SPM host SPM cannot start on a different host
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.0
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.1.0-beta
Assignee: Liron Aravot
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-01-04 16:31 UTC by Lilach Zitnitski
Modified: 2017-01-09 10:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-09 10:26:58 UTC
oVirt Team: Storage
tnisan: ovirt-4.1?
lzitnits: planning_ack?
lzitnits: devel_ack?
lzitnits: testing_ack?


Attachments
logs (46.51 KB, application/zip)
2017-01-04 16:31 UTC, Lilach Zitnitski

Description Lilach Zitnitski 2017-01-04 16:31:20 UTC
Description of problem:
When the connection from the engine to the SPM host is blocked, the SPM role does not start on the remaining active host, and without an SPM all the storage domains become inactive.

Version-Release number of selected component (if applicable):
ovirt-engine-4.1.0-0.4.master.20170103091953.gitfaae662.el7.centos.noarch
vdsm-4.19.1-17.gitf1272bf.el7.centos.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Run getAllTasksStatuses to determine which host is the SPM.
2. Block the connection from the engine to that host.

Actual results:
The SPM host and all storage domains become inactive.

Expected results:
The remaining active host takes over the SPM role and the storage domains stay active.

Additional info:

engine.log

2017-01-04 18:22:13,808+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler8) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM blond-vdsf command failed: Connection issue java.rmi.ConnectException: Connection timeout
2017-01-04 18:22:13,809+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler8) [] Command 'SpmStatusVDSCommand(HostName = blond-vdsf, SpmStatusVDSCommandParameters:{runAsync='true', hostId='1529693d-d0fd-4dd6-bb76-8d66de7daeea', storagePoolId='00000001-0001-0001-0001-000000000311'})' execution failed: VDSGenericException: VDSNetworkException: Connection issue java.rmi.ConnectException: Connection timeout

Comment 1 Lilach Zitnitski 2017-01-04 16:31:55 UTC
Created attachment 1237234
logs

engine and vdsm

Comment 2 Liron Aravot 2017-01-09 10:26:58 UTC
That's the expected behavior - we can't start the SPM on a different host while there's a host holding the role.
To free the role, the host can be fenced by our automatic fencing, rebooted manually (and then confirmed in the engine as rebooted by right-clicking it -> "confirm host was rebooted"), or have its connectivity to the engine restored.
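
The rule described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the engine's actual code: the `Host`, `HostState`, `can_start_spm_elsewhere` and `release_spm` names are hypothetical, as is the second host name. It shows why a host that is merely unreachable still blocks SPM election, while a host that was fenced or confirmed as rebooted does not. Restoring connectivity is not modeled here.

```python
from dataclasses import dataclass
from enum import Enum


class HostState(Enum):
    # Hypothetical states for illustration; not the engine's real status enum.
    UP = "up"                          # engine can reach the host
    NON_RESPONSIVE = "non_responsive"  # engine lost the connection
    CONFIRMED_DOWN = "confirmed_down"  # fenced, or reboot confirmed by an admin


@dataclass
class Host:
    name: str
    state: HostState
    holds_spm: bool = False


def can_start_spm_elsewhere(hosts):
    """Return True only if no host might still be holding the SPM role."""
    for host in hosts:
        # An unreachable host may still be running as SPM, so it blocks election.
        if host.holds_spm and host.state is not HostState.CONFIRMED_DOWN:
            return False
    return True


def release_spm(host):
    """Model fencing or 'confirm host was rebooted': the role is known to be free."""
    host.state = HostState.CONFIRMED_DOWN
    host.holds_spm = False


hosts = [
    Host("blond-vdsf", HostState.NON_RESPONSIVE, holds_spm=True),
    Host("other-host", HostState.UP),  # hypothetical second host
]
blocked_result = can_start_spm_elsewhere(hosts)   # connection blocked, role not freed
release_spm(hosts[0])                             # fenced or reboot confirmed
released_result = can_start_spm_elsewhere(hosts)  # role is free again
```

In this model the scenario from the description corresponds to the first check: the engine only lost its connection to the SPM host, so it cannot assume the role is free.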

