Bug 975053

Summary: engine: when cloning a VM from a template whose disks span multiple storage domains and one of those domains is in maintenance, the engine still sends the command to VDSM; the operation fails with an unclear error and causes an SPM change
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Status: CLOSED UPSTREAM
Severity: medium
Priority: unspecified
Target Milestone: ---
Target Release: 3.3.0
Keywords: EasyFix, Triaged
Whiteboard: storage
Fixed In Version: is6
Doc Type: Bug Fix
Doc Text:
Reporter: Dafna Ron <dron>
Assignee: Maor <mlipchuk>
QA Contact: Elad <ebenahar>
Docs Contact:
CC: acanan, acathrow, amureini, iheim, jkt, lpeer, mlipchuk, Rhev-m-bugs, scohen, yeylon
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-19 06:06:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments: logs (flags: none)

Description Dafna Ron 2013-06-17 14:04:50 UTC
Created attachment 762034 [details]
logs

Description of problem:

I created a template with multiple disks spread across multiple storage domains.
When I try to create a VM from the template, we select the available domain and attempt the creation.
The creation fails with the following error in VDSM, and the SPM changes:

Thread-4421::ERROR::2013-06-17 16:49:17,683::task::850::TaskManager.Task::(_setError) Task=`b0b17919-e95d-4d50-9556-0a7e3a105cd5`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 41, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 601, in spmStop
    pool.stopSpm()
  File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper
    raise SecureError()
SecureError
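For context, the SecureError at the bottom of the traceback comes from VDSM's securable wrapper, which refuses to run SPM-only methods on a pool object that is not currently secured (i.e. the host is not the SPM). The sketch below is a minimal illustration of that guard pattern, not VDSM's actual code; the class and attribute names are invented.

```python
class SecureError(Exception):
    """Raised when a protected method runs while the object is not secured."""


def secured(fn):
    """Guard decorator: only allow the call when the object is secured."""
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "_secured", False):
            raise SecureError()
        return fn(self, *args, **kwargs)
    return wrapper


class Pool:
    def __init__(self):
        self._secured = False   # this host is not the SPM

    @secured
    def stopSpm(self):
        return "SPM stopped"


pool = Pool()
try:
    pool.stopSpm()              # not secured -> SecureError, as in the log
except SecureError:
    print("SecureError")
```

This is why the failure surfaces as a bare "SecureError" rather than anything mentioning the maintenance-mode domain: the guard fires before any storage logic runs.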


Version-Release number of selected component (if applicable):

sf18

How reproducible:

100%

Steps to Reproduce:
1. In a two-host cluster with multiple iSCSI storage domains, create a template with disks on each of the storage domains.
2. Put the non-master domain in maintenance.
3. Try to create a server-type VM from the template.

Actual results:

The engine sends the command to VDSM even though the source domain is in maintenance.
VM creation fails with a generic error in the event log and an unclear error in both the engine and VDSM logs.
The SPM also changes because of the failure to create the disk.

Expected results:

The engine needs to verify that the source domain is up before sending the create command.
If it does not, it should at least produce a clear error message so the user can debug easily.

Additional info: logs

Comment 1 Maor 2013-07-09 08:58:16 UTC
Verifying the storage right before running the command could be racy and would not provide a complete solution.

We should add a CDA (CanDoAction) check in AddVmCommand#canAddVM which calls validate(storageDomainValidator.isDomainExistAndActive()).
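The actual fix lives in the engine's Java code (AddVmCommand#canAddVM), but the validate-before-send idea can be sketched in a few lines. Everything below, including the data shapes, messages, and function names, is illustrative only and is not the engine's actual API.

```python
def is_domain_exist_and_active(domains, domain_id):
    """Return (ok, message); mirrors the idea of isDomainExistAndActive."""
    domain = domains.get(domain_id)
    if domain is None:
        return False, "storage domain %s does not exist" % domain_id
    if domain["status"] != "Active":
        return False, "storage domain %s is %s, not Active" % (
            domain_id, domain["status"])
    return True, None


def can_add_vm(template_disks, domains):
    """CDA-style check: fail fast, before any command is sent to VDSM."""
    for disk in template_disks:
        ok, msg = is_domain_exist_and_active(domains, disk["domain_id"])
        if not ok:
            return False, msg
    return True, None


# One disk's domain is in maintenance, so the whole clone is refused up front
# with a message naming the offending domain, instead of a late SecureError.
domains = {"master": {"status": "Active"},
           "data2": {"status": "Maintenance"}}
disks = [{"domain_id": "master"}, {"domain_id": "data2"}]
print(can_add_vm(disks, domains))
# -> (False, 'storage domain data2 is Maintenance, not Active')
```

Because the check runs in CanDoAction, it happens once at command validation time rather than per storage call, which avoids the racy mid-command verification mentioned above.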

Comment 2 Maor 2013-07-10 07:13:32 UTC
The storage validation when creating a VM should be the same as the template validation.
The fix will add a validation in the CDA.

Comment 3 Maor 2013-07-11 16:03:43 UTC
Merged, with a CDA validation for storage domain activity.