Created attachment 752245
logs

Description of problem:
Storage live migration of a disk belonging to a paused VM (Run-Once as Paused) fails with "Drive image file %s could not be found".

Version-Release number of selected component (if applicable):
sf17.1
vdsm-4.10.2-21.0.el6ev.x86_64
libvirt-0.10.2-18.el6_4.5.x86_64

How reproducible:
100%

Steps to Reproduce:
1. On iSCSI storage with two hosts, create a VM and start it with Run-Once as Paused on the HSM host
2. Live migrate the disk

Actual results:
Live migration of the disk fails with "Drive image file %s could not be found".

Expected results:
We should be able to live migrate the disk.

Additional info: logs

engine:

013-05-23 17:07:34,232 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-37) [3c337780] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found

vdsm:

Thread-39710::DEBUG::2013-05-23 17:07:38,169::BindingXMLRPC::920::vds::(wrapper) return vmDiskReplicateStart with {'status': {'message': 'Drive image file %s could not be found', 'code': 13}}

Thread-39711::DEBUG::2013-05-23 17:07:38,208::BindingXMLRPC::913::vds::(wrapper) client [10.35.161.49]::call vmDiskReplicateFinish with ('6ab79646-1109-40c1-adfe-20a510d7389c', {'device': 'disk', 'domainID': '81ef11d0-4c0c-47b4-8953-d61a6af442d8', 'volumeID': 'fb20e948-f963-4970-a483-f47c7a988746', 'poolID': '7fd33b43-a9f4-4eb7-a885-e9583a929ceb', 'imageID': 'e8ce721d-61a0-40b7-b5e8-2009c606c554'}, {'device': 'disk', 'domainID': '81ef11d0-4c0c-47b4-8953-d61a6af442d8', 'volumeID': 'fb20e948-f963-4970-a483-f47c7a988746', 'poolID': '7fd33b43-a9f4-4eb7-a885-e9583a929ceb', 'imageID': 'e8ce721d-61a0-40b7-b5e8-2009c606c554'}) {} flowID [3c337780]

Thread-39711::DEBUG::2013-05-23 17:07:38,208::BindingXMLRPC::920::vds::(wrapper) return vmDiskReplicateFinish with {'status': {'message': 'Drive image file %s could not be found', 'code': 13}}

Thread-39710::ERROR::2013-05-23 17:07:38,168::libvirtvm::2317::vm.Vm::(diskReplicateStart) vmId=`6ab79646-1109-40c1-adfe-20a510d7389c`::Unable to find the disk for '{'device': 'disk', 'domainID': '81ef11d0-4c0c-47b4-8953-d61a6af442d8', 'volumeID': 'fb20e948-f963-4970-a483-f47c7a988746', 'poolID': '7fd33b43-a9f4-4eb7-a885-e9583a929ceb', 'imageID': 'e8ce721d-61a0-40b7-b5e8-2009c606c554'}'
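When triaging logs like these, the `return <verb> with {...}` lines can be parsed mechanically to list which verbs failed and with what status. The following is a minimal sketch, not part of vdsm: the name `parse_return_line` is hypothetical, and it assumes the status payload is printed as a Python literal, as it is in the excerpts above.

```python
import ast
import re

# Matches the tail of a BindingXMLRPC "return" line as seen in the vdsm excerpts.
RETURN_RE = re.compile(r"return (\w+) with (\{.*\})\s*$")


def parse_return_line(line):
    """Return (verb, code, message) for a 'return <verb> with {...}' line, else None."""
    match = RETURN_RE.search(line)
    if match is None:
        return None
    verb, payload = match.groups()
    # The payload is a printed Python dict, so literal_eval can read it safely.
    status = ast.literal_eval(payload).get("status", {})
    return verb, status.get("code"), status.get("message")


# Log line copied from the report.
line = (
    "Thread-39710::DEBUG::2013-05-23 17:07:38,169::BindingXMLRPC::920::vds::(wrapper) "
    "return vmDiskReplicateStart with {'status': {'message': "
    "'Drive image file %s could not be found', 'code': 13}}"
)
print(parse_return_line(line))
```

For the line above this yields the verb `vmDiskReplicateStart`, code 13, and the message with its `%s` still unfilled.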
Minor bug: srcDisk is not printed.

    self.log.error("Unable to find the disk for '%s'", srcDisk)

Need to understand what exactly fails, though.
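The literal `%s` in the returned status message ('Drive image file %s could not be found') suggests a message template that is returned without being interpolated. The following is a minimal sketch of that failure mode, not vdsm's actual code: `IMAGE_NOT_FOUND_TEMPLATE`, `status_unformatted` and `status_formatted` are hypothetical names, and only the message text and code 13 come from the logs.

```python
# Message text and code copied from the status seen in the logs.
IMAGE_NOT_FOUND_TEMPLATE = "Drive image file %s could not be found"
IMAGE_NOT_FOUND_CODE = 13


def status_unformatted():
    """Return the template as-is, so the caller sees a raw '%s' placeholder."""
    return {"status": {"code": IMAGE_NOT_FOUND_CODE,
                       "message": IMAGE_NOT_FOUND_TEMPLATE}}


def status_formatted(drive):
    """Interpolate the drive description into the template before returning it."""
    # A one-element tuple keeps %-formatting safe even if `drive` is a dict or tuple.
    return {"status": {"code": IMAGE_NOT_FOUND_CODE,
                       "message": IMAGE_NOT_FOUND_TEMPLATE % (drive,)}}


print(status_unformatted()["status"]["message"])
print(status_formatted({"volumeID": "fb20e948-f963-4970-a483-f47c7a988746"})["status"]["message"])
```

The first call reproduces the message seen by the engine; the second shows the same template with the drive filled in.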
Backend forgot to send the "snapshot" command to the VM:

Thread-39662::DEBUG::2013-05-23 17:05:53,278::BindingXMLRPC::913::vds::(wrapper) client [10.35.161.49]::call vmCreate with ({'vmId': '6ab79646-1109-40c1-adfe-20a510d7389c', ...},) {} flowID [1bb413aa]

One disk attached (volume 30e4d88e-e807-4fb9-9b41-39c988c338ad):

{
  'vmId': '6ab79646-1109-40c1-adfe-20a510d7389c',
  'devices': [
    ...
    {'bootOrder': '1',
     'device': 'disk',
     'deviceId': 'e8ce721d-61a0-40b7-b5e8-2009c606c554',
     'domainID': '81ef11d0-4c0c-47b4-8953-d61a6af442d8',
     'format': 'cow',
     'iface': 'ide',
     'imageID': 'e8ce721d-61a0-40b7-b5e8-2009c606c554',
     'index': 0,
     'optional': 'false',
     'poolID': '7fd33b43-a9f4-4eb7-a885-e9583a929ceb',
     'propagateErrors': 'off',
     'readonly': 'false',
     'shared': 'false',
     'specParams': {},
     'type': 'disk',
     'volumeID': '30e4d88e-e807-4fb9-9b41-39c988c338ad'},
    ...
  ],
  ...
}

(...missing vmSnapshot call...)

Thread-39710::DEBUG::2013-05-23 17:07:38,168::BindingXMLRPC::913::vds::(wrapper) client [10.35.161.49]::call vmDiskReplicateStart with ('6ab79646-1109-40c1-adfe-20a510d7389c', {'device': 'disk', 'domainID': '81ef11d0-4c0c-47b4-8953-d61a6af442d8', 'volumeID': 'fb20e948-f963-4970-a483-f47c7a988746', 'poolID': '7fd33b43-a9f4-4eb7-a885-e9583a929ceb', 'imageID': 'e8ce721d-61a0-40b7-b5e8-2009c606c554'}, {'device': 'disk', 'domainID': '38755249-4bb3-4841-bf5b-05f4a521514d', 'volumeID': 'fb20e948-f963-4970-a483-f47c7a988746', 'poolID': '7fd33b43-a9f4-4eb7-a885-e9583a929ceb', 'imageID': 'e8ce721d-61a0-40b7-b5e8-2009c606c554'}) {} flowID [3c337780]

(Volume fb20e948-f963-4970-a483-f47c7a988746 is never mentioned on the HSM host.)

Is this still happening? We fixed several bugs in this area since May.
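The mismatch described above can be illustrated with a small lookup. This is a sketch under stated assumptions, not vdsm's implementation: `find_drive` is a hypothetical helper that matches a disk by domainID, imageID and volumeID, and the UUIDs are copied from the vmCreate and vmDiskReplicateStart calls in the log. Because the VM was created with volume 30e4d88e-... and no snapshot switched it to fb20e948-..., a lookup by the requested volume finds nothing.

```python
def find_drive(devices, request):
    """Return the first disk whose storage UUIDs all match the request, else None."""
    keys = ("domainID", "imageID", "volumeID")
    for dev in devices:
        if dev.get("device") == "disk" and all(dev.get(k) == request.get(k) for k in keys):
            return dev
    return None


# Disk as passed to vmCreate (only the fields relevant to the lookup).
vm_devices = [{
    "device": "disk",
    "domainID": "81ef11d0-4c0c-47b4-8953-d61a6af442d8",
    "imageID": "e8ce721d-61a0-40b7-b5e8-2009c606c554",
    "volumeID": "30e4d88e-e807-4fb9-9b41-39c988c338ad",
}]

# Source disk as passed to vmDiskReplicateStart.
src_disk = {
    "device": "disk",
    "domainID": "81ef11d0-4c0c-47b4-8953-d61a6af442d8",
    "imageID": "e8ce721d-61a0-40b7-b5e8-2009c606c554",
    "volumeID": "fb20e948-f963-4970-a483-f47c7a988746",
}

print(find_drive(vm_devices, src_disk))  # None: the requested volume is not attached
```

With the volumeID the VM actually runs on, the same lookup succeeds, which is consistent with the missing vmSnapshot call being the cause.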
The error still occurs when trying to LSM a paused VM's disk:

engine:

2013-07-09 17:36:54,481 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-49) [70fe1c97] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found

vdsm:

Thread-30819::ERROR::2013-07-09 17:36:56,268::vm::3671::vm.Vm::(diskReplicateStart) vmId=`fd5bed8a-033d-4cd7-ac40-dcd4c1ecbc7b`::Unable to find the disk for '{'device': 'disk', 'domainID': '283c1cc8-1d44-47f6-970d-5df9f4b4dedf', 'volumeID': '0da7287c-04e3-44a1-aa96-36dfef8235e5', 'poolID': '8510a3ba-2457-4df6-9140-8257f127bca5', 'imageID': '49ee6c6a-8779-4b50-992f-29967baeb34b'}'

Thread-30819::DEBUG::2013-07-09 17:36:56,268::BindingXMLRPC::936::vds::(wrapper) return vmDiskReplicateStart with {'status': {'message': 'Drive image file %s could not be found', 'code': 13}}

Dummy-51::DEBUG::2013-07-09 17:36:56,276::storage_mailbox::727::Storage.Misc.excCmd::(_checkForMail) 'dd if=/rhev/data-center/8510a3ba-2457-4df6-9140-8257f127bca5/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000' (cwd None)

Checked on:
vdsm-4.11.0-69.gitd70e3d5.el6.x86_64
rhevm-3.3.0-0.6.master.el6ev.noarch
LSM now works when the VM is in paused state.

Checked on RHEVM3.3-IS13
Closing - RHEV 3.3 Released