Created attachment 1119495 [details]
export-snaps.tgz

Description of problem:
Some of the snapshots are not moved to the new storage domain when moving a disk.

Version-Release number of selected component (if applicable):
3.6.3-1

How reproducible:
100%

I originally noticed this on an environment that was upgraded from 3.5, on iSCSI. This BZ & logs are from a new, clean test environment on NFS.

Steps to Reproduce:
1. Have a VM with 1 disk and 2 data SDs.
2. In the VM Snapshots subtab, 3+1 (current) snapshots are visible.
3. In the Disks tab, 1x VM disk & 4x snapshot disks are visible.
4. Start moving the HDD (VM Disks subtab, Move button).
5a. In Storage tab / OrigSD / Disks: 2x OVF_STORE files.
5b. In Storage tab / OrigSD / Disk Snapshots: none.
6a. In Storage tab / DestSD / Disks: 1x VM disk & 2x OVF_STORE.
6b. In Storage tab / DestSD / Disk Snapshots: 3 disks with snapshots.
7. After the migration (no error shown, logs look good), put the original (also master) SD into maintenance.
8. Attach an export domain.
9. Try to export the VM (no collapse, no override) => boom! (see additional info below)
10. The failure is not clear and the logs need to be investigated - it tries to fetch disks from an SD which is already gone.

Actual results:
Some of the snapshots stayed on the original SD, some were moved to the new SD.

Expected results:
All snapshots are moved with the VM disk to the new destination.

Additional info:
Jan 29, 2016 4:53:27 PM User admin@internal moving disk myvm01_Disk1 to domain ps-ovirt-str01-ps02-n04.
Jan 29, 2016 4:57:48 PM User admin@internal finished moving disk myvm01_Disk1 to domain ps-ovirt-str01-ps02-n04.
Jan 29, 2016 6:07:01 PM Failed to export Vm myvm01 to ps-ovirt-str01-ps2-n05-exp

vdsm.log:
==============
...
jsonrpc.Executor/4::DEBUG::2016-01-29 18:06:59,741::task::550::Storage.TaskManager.Task::(__state_aborting) Task=`e6a911e3-e4e8-4cec-8e13-d1340f07f3d2`::_aborting: recover policy none
jsonrpc.Executor/4::DEBUG::2016-01-29 18:06:59,741::task::595::Storage.TaskManager.Task::(_updateState) Task=`e6a911e3-e4e8-4cec-8e13-d1340f07f3d2`::moving from state aborting -> state failed
jsonrpc.Executor/4::DEBUG::2016-01-29 18:06:59,741::resourceManager::943::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
jsonrpc.Executor/4::DEBUG::2016-01-29 18:06:59,742::resourceManager::980::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
jsonrpc.Executor/4::ERROR::2016-01-29 18:06:59,742::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Storage domain does not exist: (u'd14c3046-f972-4427-8a56-6fadf865aaf0',)", 'code': 358}}
5a0585ff-c38b-4eb7-9653-0bdc9af4638a::DEBUG::2016-01-29 18:06:59,848::task::752::Storage.TaskManager.Task::(_save) Task=`5a0585ff-c38b-4eb7-9653-0bdc9af4638a`::_save: orig /rhev/data-center/00000001-0001-0001-0001-0000000001ec/mastersd/master/tasks/5a0585ff-c38b-4eb7-9653-0bdc9af4638a temp /rhev/data-center/00000001-0001-0001-0001-0000000001ec/mastersd/master/tasks/5a0585ff-c38b-4eb7-9653-0bdc9af4638a.temp
5a0585ff-c38b-4eb7-9653-0bdc9af4638a::DEBUG::2016-01-29 18:07:00,040::fileVolume::535::Storage.Volume::(validateVolumePath) validate path for 8caa631a-ba1e-4813-94c2-dc8cb2110894
...
The bug reproduces on master. To reproduce it, the snapshots must be taken while the VM is up and must include its memory. After taking the snapshot, move the disk to another storage domain, then put the source domain into maintenance. Exporting this VM will now fail.
So there is no problem with moving the disks; the problem is being unable to export a memory snapshot from an inaccessible domain. This is a known issue that has existed since memory snapshots were introduced in 3.3, and it will be handled in 4.0 as part of modelling these volumes as proper disks.
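The failure mode described above can be modeled with a minimal sketch. This is purely illustrative and uses hypothetical names (it is not vdsm's or the engine's actual code): the disk move relocates the disk's volume chain, but the memory volume recorded on each snapshot is not modelled as a proper disk, so it stays on the original domain and a later export dereferences that now-detached domain.

```python
# Minimal illustrative model of the reported failure.
# All names here are hypothetical; this is NOT vdsm's actual code.

class StorageDomainDoesNotExist(Exception):
    """Stands in for the "Storage domain does not exist" error (code 358)."""

def move_disk(vm, dst_sd):
    # The disk's volume chain is relocated to the destination domain...
    vm["disk_sd"] = dst_sd
    # ...but each snapshot's memory volume is not a "proper disk",
    # so the move leaves it pointing at the original domain.

def export_vm(vm, attached_domains):
    # Export walks every snapshot and dereferences its memory volume.
    for snap in vm["snapshots"]:
        mem_sd = snap.get("memory_volume_sd")
        if mem_sd is not None and mem_sd not in attached_domains:
            raise StorageDomainDoesNotExist(mem_sd)
    return "exported"

if __name__ == "__main__":
    orig_sd = "d14c3046-f972-4427-8a56-6fadf865aaf0"  # UUID from the log above
    dest_sd = "dest-sd-uuid"                          # hypothetical
    vm = {"disk_sd": orig_sd,
          "snapshots": [{"name": "snap1", "memory_volume_sd": orig_sd}]}
    attached = {orig_sd, dest_sd}

    move_disk(vm, dest_sd)         # step: move the disk
    attached.discard(orig_sd)      # step: source domain into maintenance
    try:
        export_vm(vm, attached)    # step: export => fails
    except StorageDomainDoesNotExist as e:
        print("export failed:", e)
```

The point of the sketch is only the shape of the bug: the move updates one reference but not the other, and the export is the first operation that notices.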
What's current status?
(In reply to Jiri Belka from comment #3)
> What's current status?

No work has started here. It's not very high on our priority list.
(In reply to Allon Mureinik from comment #4)
> (In reply to Jiri Belka from comment #3)
> > What's current status?
> No work has started here. It's not very high on our priority list.

I'm confused; this depends on BZ1150239, which is done.
(In reply to Jiri Belka from comment #5)
> (In reply to Allon Mureinik from comment #4)
> > (In reply to Jiri Belka from comment #3)
> > > What's current status?
> > No work has started here. It's not very high on our priority list.
>
> I'm confused; this depends on BZ1150239, which is done.

Regardless, it is not high on the priority list. It is in our scope, but we currently have more urgent issues.
Moving out all non-blockers/exceptions.
Since 4.2 it is possible to move memory disks just as you do regular VM disks, which circumvents this issue.
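A minimal self-contained sketch of why this resolves the failure (hypothetical names again, not the engine's actual code): once each snapshot's memory volume can be relocated like a regular disk, nothing is left behind on the source domain, so a later export only references domains that are still attached.

```python
# Toy model of the 4.2 behavior: memory disks are movable like regular
# disks, so no volume is left behind before the source domain goes into
# maintenance. All names are hypothetical; this is NOT engine code.

def move_disk_and_memory(vm, dst_sd):
    vm["disk_sd"] = dst_sd
    for snap in vm["snapshots"]:
        if snap.get("memory_volume_sd") is not None:
            snap["memory_volume_sd"] = dst_sd  # moved alongside the disk

def export_references(vm):
    """All storage domains an export of this VM would touch."""
    refs = {vm["disk_sd"]}
    refs.update(s["memory_volume_sd"] for s in vm["snapshots"]
                if s.get("memory_volume_sd") is not None)
    return refs

if __name__ == "__main__":
    orig_sd, dest_sd = "orig-sd-uuid", "dest-sd-uuid"
    vm = {"disk_sd": orig_sd,
          "snapshots": [{"name": "snap1", "memory_volume_sd": orig_sd}]}
    attached = {orig_sd, dest_sd}

    move_disk_and_memory(vm, dest_sd)   # 4.2+: memory disks move too
    attached.discard(orig_sd)           # source domain into maintenance
    # Every domain the export would reference is still attached:
    print(export_references(vm) <= attached)  # True
```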