Created attachment 777619 [details] ## Logs rhevm, vdsm, libvirt, thread dump, superVdsm Description of problem: Live Storage Migration failed on VmReplicateDiskStartVDS command. Version-Release number of selected component (if applicable): RHEVM 3.3 - IS6 environment: RHEVM: rhevm-3.3.0-0.9.master.el6ev.noarch VDSM: vdsm-4.12.0-rc1.12.git8ee6885.el6.x86_64 LIBVIRT: libvirt-0.10.2-18.el6_4.9.x86_64 QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64 SANLOCK: sanlock-2.6-2.el6.x86_64 How reproducible: 100% Steps to Reproduce: Create FCP or iSCSI Data Center with two hosts connected to two Storage Domains. Create VM with one disk Try run Live Storage Migration (LSM) Actual results: Failed run LSM Expected results: Succeed run LSM Impact on user: Workaround: none Additional info: /var/log/ovirt-engine/engine.log 2013-07-23 11:50:47,018 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] START, VmReplicateDiskStartVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = 3cca5914-a9 84-45e5-9e02-1e22f2073049, vmId=0f91a910-8758-44dd-b3e6-5888f1c064e0), log id: 504979cb 2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Failed in VmReplicateDiskStartVDS method 2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDis kStartVDS, error = Drive image file %s could not be found 2013-07-23 11:50:47,036 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand return value StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=13, mMessage=Drive image file %s could not be found]] 2013-07-23 11:50:47,036 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] HostName = tigris02.scl.lab.tlv.redhat.com 2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Command VmReplicateDiskStartVDS execution failed. Exception: VDSErrorException: VDSGenericException: V DSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found 2013-07-23 11:50:47,036 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] FINISH, VmReplicateDiskStartVDSCommand, log id: 504979cb 2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.bll.lsm.VmReplicateDiskStartTaskHandler] (pool-5-thread-123) [17aa3b6] Failed VmReplicateDiskStart (Disk 506c4e21-0da2-4bee-b0a3-8813645f6eac , VM 0f91a910-8758-44dd-b3e6-5888f1c064e0) 2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-5-thread-123) [17aa3b6] Command org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand throw Vdc Bll exception. With error message VdcBLLException: Drive image file imageErr could not be found (Failed with VDSM error imageErr and code 13) 2013-07-23 11:50:47,038 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] START, VmReplicateDiskFinishVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = 3cca5914- a984-45e5-9e02-1e22f2073049, vmId=0f91a910-8758-44dd-b3e6-5888f1c064e0), log id: 5d1d5a05 2013-07-23 11:50:47,044 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Failed in VmReplicateDiskFinishVDS method 2013-07-23 11:50:47,044 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDi skFinishVDS, error = Drive image file %s could not be found 2013-07-23 11:50:47,044 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand return value StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=13, mMessage=Drive image file %s could not be found]] 2013-07-23 11:50:47,045 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] HostName = tigris02.scl.lab.tlv.redhat.com 2013-07-23 11:50:47,045 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found /var/log/vdsm/vdsm.log
Created attachment 777620 [details] ## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (iSCSI)
Verified, tested on RHEVM 3.3 - IS10 environment: RHEVM: rhevm-3.3.0-0.15.master.el6ev.noarch PythonSDK: rhevm-sdk-python-3.3.0.10-1.el6ev.noarch VDSM: vdsm-4.12.0-61.git8178ec2.el6ev.x86_64 LIBVIRT: libvirt-0.10.2-18.el6_4.9.x86_64 QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64 SANLOCK: sanlock-2.8-1.el6.x86_64
Reopening, still occurs in is10 rhevm-3.3.0-0.15.master.el6ev.noarch, vdsm-4.12.0-61.git8178ec2.el6ev.x86_64
Created attachment 789154 [details] Logs from jenkins job 2013-08-19 17:17:10,718 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Failed in VmReplicateDiskStartVDS method 2013-08-19 17:17:10,718 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found 2013-08-19 17:17:10,719 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand return value 2013-08-19 17:17:10,719 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Command VmReplicateDiskStartVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found 2013-08-19 17:17:10,719 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] FINISH, VmReplicateDiskStartVDSCommand, log id: 6a1203a7 2013-08-19 17:17:10,721 ERROR [org.ovirt.engine.core.bll.lsm.VmReplicateDiskStartTaskHandler] (pool-5-thread-48) [2b2fbb88] Failed VmReplicateDiskStart (Disk 16df32ed-598b-479c-b91a-6a806d338a62 , VM 9466621b-d553-4055-a361-9b8ec89f50b4) 2013-08-19 17:17:10,722 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-5-thread-48) [2b2fbb88] Command org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand throw Vdc Bll exception. With error message VdcBLLException: Drive image file imageErr could not be found (Failed with error imageErr and code 13) 2013-08-19 17:17:10,725 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] START, VmReplicateDiskFinishVDSCommand(HostName = 10.35.160.63, HostId = 17b889df-40e6-46c4-9d94-faf4282f190c, vmId=9466621b-d553-4055-a361-9b8ec89f50b4), log id: 53c00b6 2013-08-19 17:17:10,735 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Failed in VmReplicateDiskFinishVDS method 2013-08-19 17:17:10,736 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found 2013-08-19 17:17:10,736 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand return value 2013-08-19 17:17:10,736 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found 2013-08-19 17:17:10,736 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] FINISH, VmReplicateDiskFinishVDSCommand, log id: 53c00b6
Mismatch between vmSnapshot / vmDiskReplicateStart calls: Thread-906::DEBUG::2013-08-19 17:16:40,690::BindingXMLRPC::974::vds::(wrapper) client [10.35.161.89]::call vmSnapshot with ('9466621b-d553-4055-a361-9b8ec89f50b4', [{'baseVolumeID': '5ec428b9-5659-49f6-9495-d801a7e69a69', 'domainID': '47bd0b3f-c6e7-4dde-80bd-da4fc94a8095', 'volumeID': 'cff033d1-c465-4af2-be1c-231d5e1c24fa', 'imageID': '2da54f8a-8e93-46c1-9d3b-3fcbd017997f'}], '') {} Thread-931::DEBUG::2013-08-19 17:17:10,736::BindingXMLRPC::974::vds::(wrapper) client [10.35.161.89]::call vmDiskReplicateStart with ('9466621b-d553-4055-a361-9b8ec89f50b4', {'device': 'disk', 'domainID': 'b596943e-d3d6-4d0d-a8a1-86ba9a73de6b', 'volumeID': '2006d7e3-419e-412f-9a3a-1c7f53f8721b', 'poolID': 'ccc4e869-894b-4e12-9974-805019c9db07', 'imageID': '16df32ed-598b-479c-b91a-6a806d338a62'}, {'device': 'disk', 'domainID': '47bd0b3f-c6e7-4dde-80bd-da4fc94a8095', 'volumeID': '2006d7e3-419e-412f-9a3a-1c7f53f8721b', 'poolID': 'ccc4e869-894b-4e12-9974-805019c9db07', 'imageID': '16df32ed-598b-479c-b91a-6a806d338a62'}) {} flowID [2b2fbb88] vmSnapshot is called with (baseVolumeID -> volumeID): 5ec428b9-5659-49f6-9495-d801a7e69a69 -> cff033d1-c465-4af2-be1c-231d5e1c24fa vmDiskReplicateStart is called with (volumeID): 2006d7e3-419e-412f-9a3a-1c7f53f8721b No idea where that comes from... another snapshot/disk/vm?
Jakub, * can you please attach the db dump as well? * which jenkins job is failing on it?
Created attachment 789575 [details] DB dump (In reply to Daniel Erez from comment #11) > Jakub, > * can you please attach the db dump as well? Attached > * which jenkins job is failing on it? storage live migration sanity http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_live_migration_sanity-iscsi-rest/30/testReport/storage_live_migration.test_sanity/001-LiveMigrate;test_vms_live_migration/LiveMigrate_test_vms_live_migration/
IIUC, it seems that the test tries to live migrate an unplugged disk (not sure whether it's intentional or not). As for the error message, it should be resolved as part of bug 957494 (which will gracefully block the operation). Since live migration of an unplugged disk is not applicable, can we verify in the test that the disk is plugged before trying to live migrate it?
The disks were plugged before running live migration.
After examining the rest tests (executed manually), it seems that the second disk was indeed unplugged before invoking live migration. Should figure out what is needed to be changed in the tests.
* Removing TestBlocker from Keywords as the test needs to be fixed (verify that the disk is plugged before initiating live migrating). * Leaving the bug open for resolving the error issue; i.e. block live migrating an unplugged disk that is attached to a running VM.
To clarify, this is the required behavior when selecting N disks of a VM and sending a "move" command: * if the VM is down - cold move all the disks * if the VM is up: - if all the disks are plugged - take a live snapshot and live migrate all of them - if all the disks are unplugged - cold move all the disks - if some are plugged and some unplugged - fail with a canDoAction error.
Tested on FCP Data Centers Verified, tested on RHEVM 3.3 - IS18 environment: Host OS: RHEL 6.5 RHEVM: rhevm-3.3.0-0.25.beta1.el6ev.noarch PythonSDK: rhevm-sdk-python-3.3.0.15-1.el6ev.noarch VDSM: vdsm-4.13.0-0.2.beta1.el6ev.x86_64 LIBVIRT: libvirt-0.10.2-27.el6.x86_64 QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.412.el6.x86_64 SANLOCK: sanlock-2.8-1.el6.x86_64
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag. Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information: * Cause: What actions or circumstances cause this bug to present. * Consequence: What happens when the bug presents. * Fix: What was done to fix the bug. * Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore') Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug. For further details on the Cause, Consequence, Fix, Result format please refer to: https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes Thanks in advance.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2014-0038.html
*** Bug 927203 has been marked as a duplicate of this bug. ***