Bug 987783 - Live Storage Migration attempted on an unplugged disk of a running VM (instead of a simple cold move)
Summary: Live Storage Migration attempted on an unplugged disk of a running VM (instead of a simple cold move)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.3.0
Assignee: Daniel Erez
QA Contact: yeylon@redhat.com
URL:
Whiteboard: storage
Duplicates: 927203
Depends On:
Blocks: 962497 965676 966153 966618
 
Reported: 2013-07-24 07:23 UTC by vvyazmin@redhat.com
Modified: 2018-12-04 15:38 UTC (History)
CC List: 17 users

Fixed In Version: is18
Doc Type: Enhancement
Doc Text:
Previously, live storage migration of unplugged disks on a running virtual machine would fail. With this feature, administrators can now migrate unplugged disks on a running virtual machine as though the virtual machine were offline. Furthermore, error messages are now displayed when attempting to migrate a mixture of plugged and unplugged disks, to notify administrators that such operations are not permitted.
Clone Of:
Environment:
Last Closed: 2014-01-21 17:34:39 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
amureini: Triaged+


Attachments
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (2.61 MB, application/x-gzip), 2013-07-24 07:23 UTC, vvyazmin@redhat.com
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (iSCSI) (3.01 MB, application/x-gzip), 2013-07-24 07:23 UTC, vvyazmin@redhat.com
Logs from jenkins job (5.59 MB, application/x-bzip), 2013-08-22 11:23 UTC, Jakub Libosvar
DB dump (1.16 MB, text/x-sql), 2013-08-23 12:00 UTC, Jakub Libosvar


Links
Red Hat Product Errata RHSA-2014:0038 (SHIPPED_LIVE): Important: Red Hat Enterprise Virtualization Manager 3.3.0 update, 2014-01-21 22:03:06 UTC
oVirt gerrit 19105

Description vvyazmin@redhat.com 2013-07-24 07:23:00 UTC
Created attachment 777619 [details]
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm

Description of problem:
Live Storage Migration failed on the VmReplicateDiskStartVDS command.

Version-Release number of selected component (if applicable):
RHEVM 3.3 - IS6 environment:

RHEVM: rhevm-3.3.0-0.9.master.el6ev.noarch
VDSM: vdsm-4.12.0-rc1.12.git8ee6885.el6.x86_64
LIBVIRT: libvirt-0.10.2-18.el6_4.9.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64
SANLOCK: sanlock-2.6-2.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create an FCP or iSCSI Data Center with two hosts connected to two Storage Domains.
2. Create a VM with one disk.
3. Try to run Live Storage Migration (LSM).

Actual results:
LSM fails.

Expected results:
LSM succeeds.

Impact on user:

Workaround:
none

Additional info:

/var/log/ovirt-engine/engine.log

2013-07-23 11:50:47,018 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] START, VmReplicateDiskStartVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = 3cca5914-a984-45e5-9e02-1e22f2073049, vmId=0f91a910-8758-44dd-b3e6-5888f1c064e0), log id: 504979cb
2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Failed in VmReplicateDiskStartVDS method
2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found
2013-07-23 11:50:47,036 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand return value 
 StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=13, mMessage=Drive image file %s could not be found]]
2013-07-23 11:50:47,036 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] HostName = tigris02.scl.lab.tlv.redhat.com
2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] Command VmReplicateDiskStartVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found
2013-07-23 11:50:47,036 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-123) [17aa3b6] FINISH, VmReplicateDiskStartVDSCommand, log id: 504979cb
2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.bll.lsm.VmReplicateDiskStartTaskHandler] (pool-5-thread-123) [17aa3b6] Failed VmReplicateDiskStart (Disk 506c4e21-0da2-4bee-b0a3-8813645f6eac , VM 0f91a910-8758-44dd-b3e6-5888f1c064e0)
2013-07-23 11:50:47,036 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-5-thread-123) [17aa3b6] Command org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand throw Vdc Bll exception. With error message VdcBLLException: Drive image file imageErr could not be found (Failed with VDSM error imageErr and code 13)
2013-07-23 11:50:47,038 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] START, VmReplicateDiskFinishVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = 3cca5914-a984-45e5-9e02-1e22f2073049, vmId=0f91a910-8758-44dd-b3e6-5888f1c064e0), log id: 5d1d5a05
2013-07-23 11:50:47,044 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Failed in VmReplicateDiskFinishVDS method
2013-07-23 11:50:47,044 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found
2013-07-23 11:50:47,044 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand return value 
 StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=13, mMessage=Drive image file %s could not be found]]
2013-07-23 11:50:47,045 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] HostName = tigris02.scl.lab.tlv.redhat.com
2013-07-23 11:50:47,045 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-123) [17aa3b6] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found

/var/log/vdsm/vdsm.log

Comment 1 vvyazmin@redhat.com 2013-07-24 07:23:41 UTC
Created attachment 777620 [details]
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (iSCSI)

Comment 7 vvyazmin@redhat.com 2013-08-18 07:51:10 UTC
Verified, tested on RHEVM 3.3 - IS10 environment:

RHEVM:  rhevm-3.3.0-0.15.master.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.10-1.el6ev.noarch
VDSM:  vdsm-4.12.0-61.git8178ec2.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-18.el6_4.9.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64

Comment 8 Jakub Libosvar 2013-08-22 11:21:30 UTC
Reopening; still occurs in is10 (rhevm-3.3.0-0.15.master.el6ev.noarch, vdsm-4.12.0-61.git8178ec2.el6ev.x86_64).

Comment 9 Jakub Libosvar 2013-08-22 11:23:22 UTC
Created attachment 789154 [details]
Logs from jenkins job

2013-08-19 17:17:10,718 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Failed in VmReplicateDiskStartVDS method
2013-08-19 17:17:10,718 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found
2013-08-19 17:17:10,719 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand return value 
2013-08-19 17:17:10,719 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] Command VmReplicateDiskStartVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive image file %s could not be found
2013-08-19 17:17:10,719 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (pool-5-thread-48) [2b2fbb88] FINISH, VmReplicateDiskStartVDSCommand, log id: 6a1203a7
2013-08-19 17:17:10,721 ERROR [org.ovirt.engine.core.bll.lsm.VmReplicateDiskStartTaskHandler] (pool-5-thread-48) [2b2fbb88] Failed VmReplicateDiskStart (Disk 16df32ed-598b-479c-b91a-6a806d338a62 , VM 9466621b-d553-4055-a361-9b8ec89f50b4)
2013-08-19 17:17:10,722 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-5-thread-48) [2b2fbb88] Command org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand throw Vdc Bll exception. With error message VdcBLLException: Drive image file imageErr could not be found (Failed with error imageErr and code 13)
2013-08-19 17:17:10,725 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] START, VmReplicateDiskFinishVDSCommand(HostName = 10.35.160.63, HostId = 17b889df-40e6-46c4-9d94-faf4282f190c, vmId=9466621b-d553-4055-a361-9b8ec89f50b4), log id: 53c00b6
2013-08-19 17:17:10,735 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Failed in VmReplicateDiskFinishVDS method
2013-08-19 17:17:10,736 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found
2013-08-19 17:17:10,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Command org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand return value 
2013-08-19 17:17:10,736 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found
2013-08-19 17:17:10,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (pool-5-thread-48) [2b2fbb88] FINISH, VmReplicateDiskFinishVDSCommand, log id: 53c00b6

Comment 10 Federico Simoncelli 2013-08-22 13:26:52 UTC
Mismatch between vmSnapshot / vmDiskReplicateStart calls:

Thread-906::DEBUG::2013-08-19 17:16:40,690::BindingXMLRPC::974::vds::(wrapper) client [10.35.161.89]::call vmSnapshot with ('9466621b-d553-4055-a361-9b8ec89f50b4', [{'baseVolumeID': '5ec428b9-5659-49f6-9495-d801a7e69a69', 'domainID': '47bd0b3f-c6e7-4dde-80bd-da4fc94a8095', 'volumeID': 'cff033d1-c465-4af2-be1c-231d5e1c24fa', 'imageID': '2da54f8a-8e93-46c1-9d3b-3fcbd017997f'}], '') {}

Thread-931::DEBUG::2013-08-19 17:17:10,736::BindingXMLRPC::974::vds::(wrapper) client [10.35.161.89]::call vmDiskReplicateStart with ('9466621b-d553-4055-a361-9b8ec89f50b4', {'device': 'disk', 'domainID': 'b596943e-d3d6-4d0d-a8a1-86ba9a73de6b', 'volumeID': '2006d7e3-419e-412f-9a3a-1c7f53f8721b', 'poolID': 'ccc4e869-894b-4e12-9974-805019c9db07', 'imageID': '16df32ed-598b-479c-b91a-6a806d338a62'}, {'device': 'disk', 'domainID': '47bd0b3f-c6e7-4dde-80bd-da4fc94a8095', 'volumeID': '2006d7e3-419e-412f-9a3a-1c7f53f8721b', 'poolID': 'ccc4e869-894b-4e12-9974-805019c9db07', 'imageID': '16df32ed-598b-479c-b91a-6a806d338a62'}) {} flowID [2b2fbb88]

vmSnapshot is called with (baseVolumeID -> volumeID):

 5ec428b9-5659-49f6-9495-d801a7e69a69 -> cff033d1-c465-4af2-be1c-231d5e1c24fa

vmDiskReplicateStart is called with (volumeID):

 2006d7e3-419e-412f-9a3a-1c7f53f8721b

No idea where that comes from...  another snapshot/disk/vm?
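
For readers triaging similar logs, here is a minimal consistency check over the two calls quoted above (plain Python, illustrative only, not vdsm code); it shows that the replicate parameters do not refer to the snapshotted disk at all:

# Illustrative sketch: cross-check that the volume handed to
# vmDiskReplicateStart belongs to the snapshot that vmSnapshot created.
# Parameters are copied verbatim from the log lines above.

vm_snapshot_args = [{
    'baseVolumeID': '5ec428b9-5659-49f6-9495-d801a7e69a69',
    'domainID': '47bd0b3f-c6e7-4dde-80bd-da4fc94a8095',
    'volumeID': 'cff033d1-c465-4af2-be1c-231d5e1c24fa',
    'imageID': '2da54f8a-8e93-46c1-9d3b-3fcbd017997f',
}]

replicate_src = {
    'device': 'disk',
    'domainID': 'b596943e-d3d6-4d0d-a8a1-86ba9a73de6b',
    'volumeID': '2006d7e3-419e-412f-9a3a-1c7f53f8721b',
    'poolID': 'ccc4e869-894b-4e12-9974-805019c9db07',
    'imageID': '16df32ed-598b-479c-b91a-6a806d338a62',
}

def replicate_mismatches(snapshot_disks, replicate):
    # Return human-readable mismatches between the two calls.
    snapped_images = {d['imageID'] for d in snapshot_disks}
    snapped_volumes = {d['volumeID'] for d in snapshot_disks}
    problems = []
    if replicate['imageID'] not in snapped_images:
        problems.append('imageID %s was never snapshotted' % replicate['imageID'])
    if replicate['volumeID'] not in snapped_volumes:
        problems.append('volumeID %s is not a snapshot volume' % replicate['volumeID'])
    return problems

for problem in replicate_mismatches(vm_snapshot_args, replicate_src):
    print(problem)  # both checks fire here: the calls refer to different disks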

Comment 11 Daniel Erez 2013-08-22 13:40:12 UTC
Jakub,
* can you please attach the db dump as well?
* which jenkins job is failing on it?

Comment 12 Jakub Libosvar 2013-08-23 12:00:39 UTC
Created attachment 789575 [details]
DB dump

(In reply to Daniel Erez from comment #11)
> Jakub,
> * can you please attach the db dump as well?
Attached

> * which jenkins job is failing on it?
storage live migration sanity

http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_live_migration_sanity-iscsi-rest/30/testReport/storage_live_migration.test_sanity/001-LiveMigrate;test_vms_live_migration/LiveMigrate_test_vms_live_migration/

Comment 13 Daniel Erez 2013-09-01 11:27:16 UTC
IIUC, it seems that the test tries to live migrate an unplugged disk (not sure whether it's intentional or not). As for the error message, it should be resolved as part of bug 957494 (which will gracefully block the operation).

Since live migration of an unplugged disk is not applicable, can we verify in the test that the disk is plugged before trying to live migrate it?

Comment 15 Meital Bourvine 2013-09-02 13:04:13 UTC
The disks were plugged before running live migration.

Comment 16 Daniel Erez 2013-09-02 15:40:37 UTC
After examining the REST tests (executed manually), it seems that the second disk was indeed unplugged before invoking live migration. We should figure out what needs to be changed in the tests.
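
A minimal sketch of the guard the test could add before invoking LSM (plain Python, illustrative only; the 'active' field is an assumption standing in for however the test framework reads the plugged state of a VM disk over the REST API):

# Illustrative test-side guard: refuse to start LSM unless every
# selected disk is plugged. The 'active' field is assumed to mirror
# the plugged state reported for disks of a running VM.

def all_disks_plugged(disks):
    return all(d['active'] for d in disks)

# The state comment 16 describes: the second disk was unplugged.
disks = [
    {'id': 'disk-1', 'active': True},
    {'id': 'disk-2', 'active': False},  # unplugged -> LSM is not applicable
]

# The test should plug the disk (or power the VM off) rather than call LSM.
assert not all_disks_plugged(disks)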

Comment 18 Daniel Erez 2013-09-03 11:42:43 UTC
* Removing TestBlocker from Keywords as the test needs to be fixed (verify that the disk is plugged before initiating live migration).

* Leaving the bug open to resolve the error issue; i.e., block live migration of an unplugged disk that is attached to a running VM.

Comment 19 Allon Mureinik 2013-09-09 08:34:21 UTC
To clarify, this is the required behavior when selecting N disks of a VM and sending a "move" command:

* if the VM is down - cold move all the disks
* if the VM is up:
 - if all the disks are plugged - take a live snapshot and live migrate all of them
 - if all the disks are unplugged - cold move all the disks
 - if some are plugged and some unplugged - fail with a canDoAction error.
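
A minimal sketch of that decision table (plain Python, illustrative only; in the engine the mixed case surfaces as a canDoAction failure):

# Illustrative sketch of the decision table above; not actual engine code.

def plan_move(vm_is_up, plugged_flags):
    # plugged_flags: one bool per selected disk (True = plugged).
    if not vm_is_up:
        return 'cold_move'                  # VM down: cold move all disks
    if all(plugged_flags):
        return 'live_migrate'               # live snapshot + live migrate all
    if not any(plugged_flags):
        return 'cold_move'                  # all unplugged: move as if offline
    # Mixed selection: the engine rejects this with a canDoAction error.
    raise ValueError('cannot mix plugged and unplugged disks in one move')

assert plan_move(False, [True, False]) == 'cold_move'
assert plan_move(True, [True, True]) == 'live_migrate'
assert plan_move(True, [False, False]) == 'cold_move'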

Comment 20 vvyazmin@redhat.com 2013-10-20 10:53:03 UTC
Tested on FCP Data Centers
Verified, tested on RHEVM 3.3 - IS18 environment:

Host OS: RHEL 6.5

RHEVM:  rhevm-3.3.0-0.25.beta1.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.15-1.el6ev.noarch
VDSM:  vdsm-4.13.0-0.2.beta1.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-27.el6.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.412.el6.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64

Comment 23 Charlie 2013-11-28 00:10:08 UTC
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata, please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise, to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 25 errata-xmlrpc 2014-01-21 17:34:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0038.html

Comment 26 Allon Mureinik 2015-01-24 09:42:31 UTC
*** Bug 927203 has been marked as a duplicate of this bug. ***

