Created attachment 1095712 [details]
engine and vdsm logs

Description of problem:
When performing live storage migration (file to file) of one of a VM's disks while another of its disks resides on a storage domain in maintenance, creation of the Auto-generated snapshot fails.

From engine.log:

2015-11-17 22:36:52,905 INFO [org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand] (org.ovirt.thread.pool-7-thread-19) [disks_syncAction_fb0cbd8b-3c54-453c] Running command: LiveMigrateVmDisksCommand Task handler: LiveSnapshotTaskHandler internal: false. Entities affected : ID: 1fb1c29b-eab3-4443-b94f-e3c9198d7c11 Type: DiskAction group DISK_LIVE_STORAGE_MIGRATION with role type USER

2015-11-17 22:36:53,005 WARN [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (org.ovirt.thread.pool-7-thread-19) [4eab3fc8] CanDoAction of action 'CreateAllSnapshotsFromVm' failed for user admin@internal. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL2,$status Maintenance

Version-Release number of selected component (if applicable):
rhevm-3.6.0.3-0.1.el6.noarch
vdsm-4.17.10.1-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
Setup: 2 storage domains (NFS in my case), a VM with 1 bootable disk, and 2 more disks attached to the VM, each on a different storage domain: disk1 on nfs_sd1 and disk2 on nfs_sd2.
1. Deactivate disk2 and move nfs_sd2 to maintenance.
2. Live migrate disk1.

Actual results:
The task "Creating VM Snapshot Auto-generated for Live Storage Migration for VM live_storage_migration_nfs" appears in the Tasks tab and fails.

Expected results:
LSM should work.

Additional info:
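For illustration only, here is a self-contained toy model (in Java, since the engine is Java) of the kind of domain-status gate that produces the ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL2 failure above. All class, enum, and method names in this sketch are made-up stand-ins, not the actual oVirt engine code:

// Toy model of the domain-status check behind the CanDoAction failure above.
// Everything here is an illustrative stand-in, not real engine code.
public class DomainStatusCheckSketch {

    enum StorageDomainStatus { Active, Maintenance, Inactive }

    // A snapshot (and hence LSM, which starts with an auto-generated
    // snapshot) is refused if a validated disk sits on a non-active domain.
    static boolean isDomainValidForSnapshot(StorageDomainStatus status) {
        return status == StorageDomainStatus.Active;
    }

    public static void main(String[] args) {
        StorageDomainStatus[] domainsOfValidatedDisks = {
            StorageDomainStatus.Active,      // domain of disk1 (being migrated)
            StorageDomainStatus.Maintenance  // domain of the deactivated disk2
        };
        for (StorageDomainStatus status : domainsOfValidatedDisks) {
            if (!isDomainValidForSnapshot(status)) {
                System.out.println("CanDoAction failed: "
                    + "ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL2, $status " + status);
                return;
            }
        }
        System.out.println("CanDoAction passed");
    }
}

Running it fails the check as soon as the disk on the maintenance domain is validated, even though that disk is not the one being migrated.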
After taking a look at the code, this is what I've found:
1. The CDA (CanDoAction) of CreateAllSnapshotsFromVmCommand fails on its last step, validateStorage().
2. Patch I9f42f387781425d16f53a0e8a34d859365808ec0 changed the disks to be validated from only those returned by getDisksListForChecks() to all of the VM's disks (including the inactive one).
3. For some reason, the command receives the id of the disk we want to move (the active one) as the id of a disk that should be ignored (getParameters().getDiskIdsToIgnoreInChecks()), which might break getDisksListForChecks().
4. In any case, the command doesn't end with an error message; the disk stays locked and leaves the environment unusable. The only way I managed to recover was to remove the disk from the db and then from the storage manually.

It might be enough to replace the call to getSnappableVmDisks() (the first line of validateStorage()) with a call to getDisksListForChecks(), but since the above-mentioned patch changed this a few months ago, we should dig a bit deeper to see what's going on there. A toy model of that suggested change is sketched below.
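To make the suggestion concrete, here is a minimal, self-contained toy model of the proposed direction for validateStorage(): validate only the disks returned by getDisksListForChecks() (which drops getDiskIdsToIgnoreInChecks()) instead of every snappable disk. Only the method names come from the analysis above; the Disk record and all method bodies are invented stand-ins, and per point 3 the real contents of the ignore list in this flow are still unclear:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Toy model of the suggested change; all types and bodies are stand-ins.
public class ValidateStorageSketch {

    record Disk(String id, boolean onActiveDomain) {}

    // Stand-in for getSnappableVmDisks(): every snappable disk of the VM.
    static List<Disk> getSnappableVmDisks() {
        return Arrays.asList(
            new Disk("disk1", true),    // disk being live-migrated
            new Disk("disk2", false));  // deactivated disk on a maintenance domain
    }

    // Stand-in for getDisksListForChecks(): snappable disks minus the ones
    // the caller asked to ignore. NOTE: per point 3 above, in the real flow
    // the ignore list surprisingly carries the *moved* disk's id, so this
    // only models the mechanism, not the observed behavior.
    static List<Disk> getDisksListForChecks(List<String> diskIdsToIgnoreInChecks) {
        return getSnappableVmDisks().stream()
            .filter(d -> !diskIdsToIgnoreInChecks.contains(d.id()))
            .collect(Collectors.toList());
    }

    // Stand-in for validateStorage(): fails if any validated disk sits on
    // a non-active domain.
    static boolean validateStorage(List<Disk> disks) {
        return disks.stream().allMatch(Disk::onActiveDomain);
    }

    public static void main(String[] args) {
        // Current behavior: all snappable disks are validated, so disk2
        // (maintenance domain) fails the whole snapshot -> prints false.
        System.out.println(validateStorage(getSnappableVmDisks()));
        // Suggested behavior: the irrelevant disk is filtered out -> true.
        System.out.println(validateStorage(getDisksListForChecks(List.of("disk2"))));
    }
}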
oVirt 3.6.2 RC1 has been released for testing, moving to ON_QA
Verified on:
rhevm-3.6.2-0.1.el6.noarch
vdsm-4.17.15-0.el7ev.noarch