Bug 977261

Summary: live-migration: Shutdown of VM during migration causes the disk to stay locked forever
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-engine
Version: 3.2.0
Target Release: 3.3.0
Hardware: All
OS: Linux
Severity: high
Priority: unspecified
Status: CLOSED CURRENTRELEASE
Fixed In Version: is1
Doc Type: Bug Fix
Type: Bug
Whiteboard: storage
oVirt Team: Storage
Reporter: Jakub Libosvar <jlibosva>
Assignee: Nobody's working on this, feel free to take it <nobody>
QA Contact: Aharon Canan <acanan>
CC: acathrow, amureini, derez, iheim, jkt, lpeer, Rhev-m-bugs, scohen, yeylon

Attachments: Engine vdsm logs

Description Jakub Libosvar 2013-06-24 07:49:14 UTC
Created attachment 764473 [details]
Engine vdsm logs

Description of problem:
When a VM is shut down while its disk is being live-migrated to another storage domain, the migration task ends with an error. No rollback is performed and the disk stays Locked forever.
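
The "locked forever" part can be confirmed directly in the engine database, where the image status stays LOCKED after the failed task. A minimal sketch, assuming local access to an engine database named "engine" and an images table with image_guid/imagestatus columns in which 2 means LOCKED; these names, values and credentials are assumptions about the engine schema of this era, not taken from the report:

# Hedged sketch: check whether a disk image is still LOCKED in the engine DB.
# Table/column names and the LOCKED value (2) are assumptions, as are the credentials.
import psycopg2

IMAGE_GUID = "443092b8-c6ae-42f6-a2c1-4877b1403d91"  # image id from the VDSM error below

conn = psycopg2.connect(dbname="engine", user="engine",
                        password="engine-db-password", host="localhost")
try:
    with conn.cursor() as cur:
        cur.execute("SELECT imagestatus FROM images WHERE image_guid = %s",
                    (IMAGE_GUID,))
        row = cur.fetchone()
        if row is None:
            print("image not found")
        elif row[0] == 2:  # 2 == LOCKED in the engine's ImageStatus enum (assumed)
            print("image is still LOCKED - the failed migration never rolled back")
        else:
            print("image status:", row[0])
finally:
    conn.close()

Rather than editing the database by hand, the engine's unlock_entity.sh dbutils script is the usual way to clear such a stale lock; the query above only makes the stuck state visible.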

VDSM:
d521257b-c9c9-4b3a-8c62-649c2c6984f1::ERROR::2013-06-24 09:29:48,070::task::850::TaskManager.Task::(_setError) Task=`d521257b-c9c9-4b3a-8c62-649c2c6984f1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1802, in syncImageData
    syncType)
  File "/usr/share/vdsm/storage/image.py", line 682, in syncData
    {'srcChain': srcChain, 'dstChain': dstChain})
  File "/usr/share/vdsm/storage/image.py", line 576, in _interImagesCopy
    size=srcSize)
  File "/usr/share/vdsm/storage/misc.py", line 472, in ddWatchCopy
    raise se.MiscBlockWriteException(dst, offset, size)
MiscBlockWriteException: Internal block device write failure: 'name=/rhev/data-center/0c7fe5e7-9936-4681-94ba-dc82ce816604/835c6aa5-89e7-4d48-bf31-1498d93c558c/images/d504473f-2482-4fb2-93a8-79ad1661f510/443092b8-c6ae-42f6-a2c1-4877b1403d91, offset=0, size=1073741824'
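
The traceback shows the dd-based copy in ddWatchCopy failing once the VM goes down mid-copy; the task only records the error, and nothing resets the image from LOCKED or removes the partially written destination. A minimal illustration of the cleanup step the report describes as missing (plain Python, not vdsm's actual task code; copy_image, teardown_destination and unlock_image are hypothetical stand-ins):

# Simplified illustration of the missing rollback; not vdsm code.
class BlockWriteError(Exception):
    pass

def copy_image(src, dst):
    # Stand-in for the dd-based copy, which fails when the VM is shut down mid-copy.
    raise BlockWriteError("Internal block device write failure: %s" % dst)

def teardown_destination(dst):
    print("removing partially written destination image:", dst)

def unlock_image(image_id):
    print("resetting image status back to OK:", image_id)

def sync_image_data(src, dst, image_id):
    try:
        copy_image(src, dst)
    except BlockWriteError:
        # The step this bug says is missing: without it the disk stays
        # LOCKED forever once the copy aborts.
        teardown_destination(dst)
        unlock_image(image_id)
        raise

if __name__ == "__main__":
    try:
        sync_image_data("/rhev/data-center/.../src-volume",
                        "/rhev/data-center/.../dst-volume",
                        "443092b8-c6ae-42f6-a2c1-4877b1403d91")
    except BlockWriteError as exc:
        print("copy failed, but the lock was released:", exc)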

Engine:
2013-06-24 09:29:47,474 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-4-thread-50) [7d91f0ab] Ending command with failure: org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand
2013-06-24 09:29:57,439 ERROR [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-55) BaseAsyncTask::LogEndTaskFailure: Task d521257b-c9c9-4b3a-8c62-649c2c6984f1 (Parent Command LiveMigrateDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Internal block device write failure,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Internal block device write failure


2013-06-24 09:29:57,460 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (pool-4-thread-50) [175b7633] CreateCommand failed: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: Vds with id: 00000000-0000-0000-0000-000000000000 was not found
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.initializeVdsBroker(VdsBrokerCommand.java:50) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.<init>(VdsBrokerCommand.java:29) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskVDSCommand.<init>(VmReplicateDiskVDSCommand.java:9) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand.<init>(VmReplicateDiskFinishVDSCommand.java:8) [engine-vdsbroker.jar:]

2013-06-24 09:29:57,477 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-50) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: javax.ejb.EJBException: java.lang.RuntimeException: VdcBLLException: Vds with id: 00000000-0000-0000-0000-000000000000 was not found
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]

Comment 1 Jakub Libosvar 2013-06-24 07:51:17 UTC
Version-Release number of selected component (if applicable):
rhevm-3.2.1-0.31.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have a running VM
2. Live-migrate its disk to another storage domain
3. Shut down the VM while the migration is still running (a scripted version of these steps is sketched below)
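
For reference, the same steps scripted with the oVirt Python SDK (ovirtsdk4, which postdates the RHEV 3.2 SDK in use here); the engine URL, credentials, VM name, storage domain name and sleep times are placeholders, and the calls reflect my understanding of the SDK, so treat this as a sketch rather than a verified reproducer:

# Hedged reproduction sketch using ovirtsdk4; connection details and names are placeholders.
import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(url="https://engine.example.com/ovirt-engine/api",
                            username="admin@internal",
                            password="password",
                            insecure=True)
try:
    system = connection.system_service()
    vms_service = system.vms_service()
    vm = vms_service.list(search="name=testvm")[0]          # 1. a running VM
    vm_service = vms_service.vm_service(vm.id)

    attachment = vm_service.disk_attachments_service().list()[0]
    disk_service = system.disks_service().disk_service(attachment.disk.id)

    # 2. start live storage migration of the disk to another storage domain
    disk_service.move(storage_domain=types.StorageDomain(name="target-domain"))

    # 3. shut the VM down while the copy is still in flight
    time.sleep(5)
    vm_service.shutdown()

    # With this bug, the disk ends up LOCKED instead of finishing or rolling back.
    time.sleep(60)
    print("disk status after shutdown:", disk_service.get().status)
finally:
    connection.close()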

Actual results:
The disk is stuck in the Locked state; I'm not sure whether the data is intact.

Expected results:
The disk should either finish migrating or be rolled back.

Additional info:
Logs attached

Comment 2 Aharon Canan 2013-08-15 11:52:41 UTC
Verified on is10 following the steps above.
The disk is not locked, the VM can be started, and the logs look fine.

Comment 3 Itamar Heim 2014-01-21 22:22:16 UTC
Closing - RHEV 3.3 Released
