
Bug 1409125

Summary:	SPDM job commands may not end while the performing host is non responsive
Product:	[oVirt] ovirt-engine	Reporter:	Liron Aravot <laravot>
Component:	BLL.Storage	Assignee:	Liron Aravot <laravot>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Kevin Alon Goldblatt <kgoldbla>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	4.1.0	CC:	bugs, gklein, tnisan
Target Milestone:	ovirt-4.1.0-beta	Flags:	rule-engine: ovirt-4.1+
Target Release:	4.1.0.2
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-02-15 15:02:00 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	Storage	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Liron Aravot 2016-12-29 17:30:09 UTC

Description of problem:
When an SPDM job is executing, the engine polls the performing host until the job ends.
If the host becomes non responsive after the operation has started, the engine may still be able to poll the entity the job is performed on to determine the job status.
But if the host becomes non responsive before the job has started, the engine cannot end the command, because the job might still start (or might not, for example if the host was powered off). In that case the engine must wait for the host to become responsive again in order to determine the status of the operation.
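
The three cases above can be summarized as a small decision function. This is only an illustrative sketch, not the engine's code: the names `Decision` and `next_step` and both parameters are hypothetical, and it assumes the engine knows whether the job had already started when the host stopped responding.

```python
from enum import Enum


class Decision(Enum):
    """Hypothetical outcomes for one polling round of an SPDM job command."""
    USE_HOST_REPORT = "use the job status reported by the performing host"
    POLL_ENTITY = "try polling the entity the job is performed on"
    WAIT_FOR_HOST = "wait for the host to become responsive again"


def next_step(host_responsive: bool, job_known_started: bool) -> Decision:
    """Pick the next polling action, following the cases in the description.

    Both parameters are assumptions made for illustration only.
    """
    if host_responsive:
        # Normal case: the performing host can report the job status.
        return Decision.USE_HOST_REPORT
    if job_known_started:
        # Host is down but the job started: the entity may reveal the result.
        return Decision.POLL_ENTITY
    # Host went down before the job started: the job may or may not run,
    # so the command cannot be ended yet. This is the case the bug is about.
    return Decision.WAIT_FOR_HOST
```

The last branch is the problematic one: without some form of fencing, the command stays open for as long as the host is down.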

How reproducible:
Always

Steps to Reproduce:
1. Move disk in data center with version >= 4.1
2. Stop the vdsm service on the performing host before the job starts.

Actual results:
The engine waits for the host to become responsive again in order to determine the status of the operation.

Expected results:
In flows that support it, the engine fences the operation by updating the entity the job is performed on, so that the job, if it does start, fails before it modifies that entity.
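
One way such fencing could work is sketched below. This report does not describe the actual mechanism, so everything here is an assumption for illustration: the entity is assumed to carry a generation counter, the job is assumed to be submitted with the generation it expects, and `Entity`, `fence`, `run_job` and `StaleGenerationError` are hypothetical names.

```python
class StaleGenerationError(Exception):
    """Raised when a job finds the entity changed since it was submitted."""


class Entity:
    """Hypothetical entity a job operates on, with a generation counter."""

    def __init__(self) -> None:
        self.generation = 0
        self.data = "original"


def fence(entity: Entity) -> None:
    """Engine side: bump the generation so a not-yet-started job will fail."""
    entity.generation += 1


def run_job(entity: Entity, expected_generation: int, new_data: str) -> None:
    """Host side: verify the generation before touching the entity."""
    if entity.generation != expected_generation:
        # The entity was fenced (or otherwise updated); fail without modifying it.
        raise StaleGenerationError(
            f"expected generation {expected_generation}, found {entity.generation}"
        )
    entity.data = new_data
```

With a check of this kind, the engine can end the command as failed right after fencing, because a job that starts later is rejected before it changes anything.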

Comment 1 Kevin Alon Goldblatt 2017-02-06 12:47:49 UTC

Verified with the following code:
-----------------------------------------------------------------------
Version-Release number of selected component (if applicable):
vdsm-4.19.4-1.el7ev.x86_64
rhevm-4.1.0.3-0.1.el7.noarch
ovirt-engine-4.1.0.3-0.1.el7.noarch

Verified with the following scenario:
-----------------------------------------------------------------------
Steps to Reproduce:
1. Move disk in data center with version >= 4.1
2. stop the vdsm service on the performing host before the job starts - Jobs fail gracefully



Moving to VERIFIED!