Description of problem:
When an SPDM job is executed, the engine polls the performing host until the job ends. If the host becomes non-responsive after the operation has started, the engine may be able to poll the entity the job is performed on to determine the job status. If the host becomes non-responsive before the job has started, however, the engine can't end the command, because the job might still start (or might not, for example if the host was powered off). In that case the engine must wait for the host to become responsive again in order to determine the status of the operation.

How reproducible:
Always

Steps to Reproduce:
1. Move a disk in a data center with version >= 4.1.
2. Stop the vdsm service on the performing host before the job starts.

Actual results:
The engine waits for the host to become responsive again in order to decide on the status of the operation.

Expected results:
The engine fences the operation on supporting flows by updating the job entity, so that the job fails before it modifies that entity.
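
To illustrate the expected behavior, here is a minimal sketch of fencing a job by updating its entity. It is not the engine or vdsm code. It assumes the entity carries a generation counter that the job checks before modifying it; the counter and the names Entity, Job, fence and GenerationMismatch are all hypothetical and do not come from this report.

```python
from dataclasses import dataclass


class GenerationMismatch(Exception):
    """Raised when the entity changed after the job was issued (hypothetical)."""


@dataclass
class Entity:
    # Hypothetical stand-in for the entity a job operates on.
    # The generation counter is an assumption, not taken from the report.
    generation: int = 0
    data: str = "original"


@dataclass
class Job:
    entity: Entity
    expected_generation: int  # generation seen when the job was issued

    def run(self, new_data: str) -> None:
        # Check the entity before touching it, so a fenced job
        # fails without modifying anything.
        if self.entity.generation != self.expected_generation:
            raise GenerationMismatch(
                f"expected generation {self.expected_generation}, "
                f"found {self.entity.generation}"
            )
        self.entity.data = new_data
        self.entity.generation += 1


def fence(entity: Entity) -> None:
    """Update the entity so that any job issued earlier fails when it starts."""
    entity.generation += 1


def demo() -> tuple:
    entity = Entity()
    # The job is issued, but the host goes down before it starts.
    job = Job(entity, expected_generation=entity.generation)
    # The engine fences the operation instead of waiting for the host.
    fence(entity)
    try:
        job.run("modified")  # the host comes back and starts the job late
        outcome = "modified"
    except GenerationMismatch:
        outcome = "failed"
    return outcome, entity.data
```

In this sketch, a job that starts after the fence raises before writing, so the engine could safely treat the operation as failed without waiting for the host to respond.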
Verified with the following code:
-----------------------------------------------------------------------
Version-Release number of selected component (if applicable):
vdsm-4.19.4-1.el7ev.x86_64
rhevm-4.1.0.3-0.1.el7.noarch
ovirt-engine-4.1.0.3-0.1.el7.noarch

Verified with the following scenario:
-----------------------------------------------------------------------
Steps to Reproduce:
1. Move disk in data center with version >= 4.1
2. Stop the vdsm service on the performing host before the job starts - Jobs fail gracefully

Moving to VERIFIED!