Bug 879167 - engine [Live Storage Migration]: cannot run vm, create template or export a vm after live storage migration failure
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64
OS: Linux
Severity: high
Priority: urgent
Target Milestone: ---
Target Release: 3.2.0
Assignee: Daniel Erez
QA Contact: Dafna Ron
URL:
Whiteboard: storage
Depends On: 915354
Blocks: 915537
 
Reported: 2012-11-22 09:08 UTC by Dafna Ron
Modified: 2016-02-10 20:22 UTC (History)
10 users

Fixed In Version: sf3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
logs from failure (879.67 KB, application/x-gzip)
2012-11-22 09:08 UTC, Dafna Ron
image does not exist logs (276.88 KB, application/x-gzip)
2012-11-22 09:10 UTC, Dafna Ron
logs and dbdump (1.45 MB, application/x-gzip)
2013-01-20 17:22 UTC, Dafna Ron
new logs from 3.2 (2.73 MB, application/x-gzip)
2013-02-06 13:56 UTC, Dafna Ron


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 10154 0 None MERGED core,ui: support multiple concurrent disks migration 2020-05-18 07:45:18 UTC

Description Dafna Ron 2012-11-22 09:08:13 UTC
Created attachment 649588 [details]
logs from failure

Description of problem:

After a live storage migration failure I stopped my vm.
Trying to re-run it, create a template from it, or export it fails with an error that the image does not exist.

Version-Release number of selected component (if applicable):

si24.4

How reproducible:

100%

Steps to Reproduce:
1. run several vms and move their disks
2. after live migration fails, stop the vms and try to re-run them
  
Actual results:

The engine rolled back the live storage migration after getting an error from vdsm that it failed to remove a logical volume. As a result, when we shut down the vm and try to run it, create a template, or export it, the engine sends the wrong vg uuid to vdsm.

Expected results:

The engine should not roll back on every step failure in live storage migration.

Additional info: logs from the original failure + logs after shutting down the vm.
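A minimal sketch (hypothetical names, not actual engine code) of the decision the eventual fix implies: once the copy to the destination has succeeded, a failure to delete the source image should not trigger a full rollback, because rolling back deletes the destination image while the database may still reference it.

```python
# Hypothetical sketch of end-of-migration cleanup logic.
# The bug: a failed *source* delete after a successful copy triggered a
# full rollback, removing the destination image the VM now depends on.

def finish_live_migration(copy_succeeded, delete_source):
    """Return which storage domain the VM disk should point to.

    delete_source is a callable that may raise on failure
    (e.g. CannotRemoveLogicalVolume when the LVs are still open).
    """
    if not copy_succeeded:
        # The copy itself failed: rolling back to the source is correct.
        return "source"
    try:
        delete_source()
    except Exception:
        # Source cleanup failed, but the data was copied successfully:
        # keep the destination and leave the stale source for later
        # cleanup instead of rolling back and deleting the destination.
        pass
    return "destination"
```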

failure to delete the volume: 

Thread-17922::ERROR::2012-11-21 17:05:04,056::task::853::TaskManager.Task::(_setError) Task=`c07ee9e7-fff6-4d00-96d1-b88f36cd36de`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1349, in deleteImage
    dom.deleteImage(sdUUID, imgUUID, volsByImg)
  File "/usr/share/vdsm/storage/blockSD.py", line 945, in deleteImage
    deleteVolumes(sdUUID, toDel)
  File "/usr/share/vdsm/storage/blockSD.py", line 177, in deleteVolumes
    lvm.removeLVs(sdUUID, vols)
  File "/usr/share/vdsm/storage/lvm.py", line 1010, in removeLVs
    raise se.CannotRemoveLogicalVolume(vgName, str(lvNames))
CannotRemoveLogicalVolume: Cannot remove Logical Volume: ('d40978c8-3fab-483b-b786-2f1e1c5cf130', "('34ff2273-e1cd-41b9-9c30-61defdc85948', '98d1cf94-5e59-4f85-8696-698b0269e347')")
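The CannotRemoveLogicalVolume above is what lvremove reports when an LV's device is still open (here, still in use by the running VM). Whether an LV is open is visible in the lvm "lv_attr" string: its 6th character is 'o' while the device is open. A small illustrative parser (not vdsm code) over `lvs --noheadings -o lv_name,lv_attr` output:

```python
# Sketch: filter LVs that are safe to remove, based on the lvm lv_attr
# string as printed by `lvs -o lv_name,lv_attr`. The 6th attribute
# character is 'o' while the LV's device is open, which is exactly when
# lvremove fails as in the traceback above.

def lv_is_open(lv_attr):
    """True if the LV's device is open (6th lv_attr character is 'o')."""
    return len(lv_attr) >= 6 and lv_attr[5] == 'o'

def removable_lvs(lvs_output):
    """Return names of closed LVs from `lvs --noheadings -o lv_name,lv_attr` lines."""
    result = []
    for line in lvs_output.strip().splitlines():
        name, attr = line.split()
        if not lv_is_open(attr):
            result.append(name)
    return result
```

For example, given the two lines `lv1 -wi-a---` and `lv2 -wi-ao--`, only lv1 is removable.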


image does not exist error: 

7b870179-d4e8-488c-9b6e-a99b0a1a2fc5::ERROR::2012-11-22 10:39:09,989::task::853::TaskManager.Task::(_setError) Task=`7b870179-d4e8-488c-9b6e-a99b0a1a2fc5`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1741, in moveImage
    image.Image(repoPath).move(srcDomUUID, dstDomUUID, imgUUID, vmUUID, op, postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 635, in move
    chains = self._createTargetImage(destDom, srcSdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 484, in _createTargetImage
    srcChain = self.getChain(srcSdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 314, in getChain
    raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
ImageDoesNotExistInSD: Image does not exist in domain: 'image=270835d7-b3bb-4e1c-a34d-f09d0538affd, domain=8c0ef67f-03c1-4fbf-b099-3e3668405cfc'

engine is sending sdUUID a5f10bab-bd9d-4834-b1d9-b29d0ec887dc

2012-11-22 10:19:48,246 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.CopyImageVDSCommand] (ajp-/127.0.0.1:8702-5) [71667ddd] -- copyImage parameters:
                sdUUID=a5f10bab-bd9d-4834-b1d9-b29d0ec887dc
                spUUID=edf0ee04-0cc2-4e13-877d-1e89541aea55
                vmGUID=818ebfe3-c74c-4230-a272-8287463f77e8
                srcImageGUID=8e2e185f-6789-4b99-b684-079990ada9a5
                srcVolUUID=6f5e0d04-1d04-4573-b3ab-37d1b3e79387
                dstImageGUID=e70899b1-fe8d-4f8c-98ec-d79e2e070885
                dstVolUUID=d46b8d44-3131-4ccb-ba5f-37c165fe9357
                descr=Auto-generated for Live Storage Migration of NFS-RHEL6_iSCSI_Disk1
                

lv is under vg d40978c8-3fab-483b-b786-2f1e1c5cf130:

[root@gold-vdsc ~]# lvs |grep 98d1cf94-5e59-4f85-8696-698b0269e347
  98d1cf94-5e59-4f85-8696-698b0269e347 d40978c8-3fab-483b-b786-2f1e1c5cf130 -wi-a---   2.00g

Comment 1 Dafna Ron 2012-11-22 09:10:16 UTC
Created attachment 649592 [details]
image does not exist logs

Comment 2 Eduardo Warszawski 2012-11-23 07:38:08 UTC
The removal of the src image failed since the LVs were still open.
Afterwards, the destination image was successfully removed.
The engine still looks for the image at the destination.


Thread-16680::INFO::2012-11-21 16:44:45,322::logUtils::37::dispatcher::(wrapper) Run and protect: createVolume(sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', size='16106127360', volFormat=4, preallocate=2, diskType=2, volUUID='34ff2273-e1cd-41b9-9c30-61defdc85948', desc='', srcImgUUID='a8eb3963-520a-435d-aba4-80a0dd2d9983', srcVolUUID='39f89a6a-7fbb-43c0-a5ea-19b271f51829')
Thread-17083::INFO::2012-11-21 16:51:17,001::logUtils::37::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', volUUID='34ff2273-e1cd-41b9-9c30-61defdc85948')
Thread-17083::INFO::2012-11-21 16:51:17,419::logUtils::39::dispatcher::(wrapper) Run and protect: prepareImage, Return response: {'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/34ff2273-e1cd-41b9-9c30-61defdc85948', 'chain': [{'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/39f89a6a-7fbb-43c0-a5ea-19b271f51829', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': '39f89a6a-7fbb-43c0-a5ea-19b271f51829', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}, {'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/34ff2273-e1cd-41b9-9c30-61defdc85948', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': '34ff2273-e1cd-41b9-9c30-61defdc85948', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}]}
Thread-17595::INFO::2012-11-21 17:00:27,241::logUtils::37::dispatcher::(wrapper) Run and protect: createVolume(sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', size='16106127360', volFormat=4, preallocate=2, diskType=2, volUUID='98d1cf94-5e59-4f85-8696-698b0269e347', desc='', srcImgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', srcVolUUID='34ff2273-e1cd-41b9-9c30-61defdc85948')
Thread-17639::INFO::2012-11-21 17:00:56,601::logUtils::37::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', volUUID='98d1cf94-5e59-4f85-8696-698b0269e347')
Thread-17639::INFO::2012-11-21 17:00:58,525::logUtils::39::dispatcher::(wrapper) Run and protect: prepareImage, Return response: {'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/98d1cf94-5e59-4f85-8696-698b0269e347', 'chain': [{'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/39f89a6a-7fbb-43c0-a5ea-19b271f51829', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': '39f89a6a-7fbb-43c0-a5ea-19b271f51829', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}, {'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/34ff2273-e1cd-41b9-9c30-61defdc85948', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': '34ff2273-e1cd-41b9-9c30-61defdc85948', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}, {'path': '/rhev/
data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/d40978c8-3fab-483b-b786-2f1e1c5cf130/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/98d1cf94-5e59-4f85-8696-698b0269e347', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': '98d1cf94-5e59-4f85-8696-698b0269e347', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}]}
Thread-17662::INFO::2012-11-21 17:01:26,372::logUtils::37::dispatcher::(wrapper) Run and protect: cloneImageStructure(spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', dstSdUUID='8c0ef67f-03c1-4fbf-b099-3e3668405cfc')
Thread-17705::INFO::2012-11-21 17:02:01,007::logUtils::37::dispatcher::(wrapper) Run and protect: prepareImage(sdUUID='8c0ef67f-03c1-4fbf-b099-3e3668405cfc', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', volUUID='98d1cf94-5e59-4f85-8696-698b0269e347')
Thread-17705::INFO::2012-11-21 17:02:01,731::logUtils::39::dispatcher::(wrapper) Run and protect: prepareImage, Return response: {'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/8c0ef67f-03c1-4fbf-b099-3e3668405cfc/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/98d1cf94-5e59-4f85-8696-698b0269e347', 'chain': [{'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/8c0ef67f-03c1-4fbf-b099-3e3668405cfc/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/39f89a6a-7fbb-43c0-a5ea-19b271f51829', 'domainID': '8c0ef67f-03c1-4fbf-b099-3e3668405cfc', 'volumeID': '39f89a6a-7fbb-43c0-a5ea-19b271f51829', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}, {'path': '/rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/8c0ef67f-03c1-4fbf-b099-3e3668405cfc/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/34ff2273-e1cd-41b9-9c30-61defdc85948', 'domainID': '8c0ef67f-03c1-4fbf-b099-3e3668405cfc', 'volumeID': '34ff2273-e1cd-41b9-9c30-61defdc85948', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}, {'path': '/rhev/
data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/8c0ef67f-03c1-4fbf-b099-3e3668405cfc/images/270835d7-b3bb-4e1c-a34d-f09d0538affd/98d1cf94-5e59-4f85-8696-698b0269e347', 'domainID': '8c0ef67f-03c1-4fbf-b099-3e3668405cfc', 'volumeID': '98d1cf94-5e59-4f85-8696-698b0269e347', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}]}
Thread-17705::INFO::2012-11-21 17:02:31,017::logUtils::37::dispatcher::(wrapper) Run and protect: teardownImage(sdUUID='8c0ef67f-03c1-4fbf-b099-3e3668405cfc', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', volUUID=None)
Thread-17722::INFO::2012-11-21 17:02:32,872::logUtils::37::dispatcher::(wrapper) Run and protect: syncImageData(spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', dstSdUUID='8c0ef67f-03c1-4fbf-b099-3e3668405cfc', syncType='INTERNAL')
Thread-17922::INFO::2012-11-21 17:04:52,415::logUtils::37::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', postZero='false', force='false')
Thread-17939::INFO::2012-11-21 17:05:04,896::logUtils::37::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='8c0ef67f-03c1-4fbf-b099-3e3668405cfc', spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', postZero='false', force='false')

Comment 3 Daniel Erez 2012-12-31 07:27:52 UTC
patch merged:
http://gerrit.ovirt.org/#/c/10154/
Change-Id: Iadcffa5748b58b1af40535b0447487dde6c2d6cb

Comment 6 Dafna Ron 2013-01-20 17:20:57 UTC
tested on sf3 with vdsm-4.10.2-3.0.el6ev.x86_64

I live-migrated two disks of the same vm, and after the move started I added disks to a second vm on the src domain so that we would have low disk space.

we failed to delete volumes on one of the disks: 

2013-01-20 17:20:05,964 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-3-thread-40) [3e493079] START, DeleteImageGroupVDSCommand( storagePoolId = afcde1c5-6022-4077-ab06-2beed7e5e404, ignoreFailoverLimit = false, compatabilityVersion = null, storageDomainId = 8bcf7e0d-a418-4210-a79d-8a7888a26c5c, imageGroupId = d4e4d029-d9c5-47ee-8df9-06b70e2536f4, postZeros = false, forceDelete = false), log id: 5a354ebc
2013-01-20 17:20:12,316 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-40) [3e493079] Failed in DeleteImageGroupVDS method
2013-01-20 17:20:12,316 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-3-thread-40) [3e493079] Error code CannotRemoveLogicalVolume and error message IRSGenericException: IRSErrorException: Failed to DeleteImageGroupVDS, error = Cannot remove Logical Volume: ('8bcf7e0d-a418-4210-a79d-8a7888a26c5c', "('f4d19fed-171f-46d1-a403-8d9736a6c280', 'adcc64f2-f4d7-4e4e-8a96-ee90f592e217')")
2013-01-20 17:20:12,316 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-3-thread-40) [3e493079] IrsBroker::Failed::DeleteImageGroupVDS due to: IRSErrorException: IRSGenericException: IRSErrorException: Failed to DeleteImageGroupVDS, error = Cannot remove Logical Volume: ('8bcf7e0d-a418-4210-a79d-8a7888a26c5c', "('f4d19fed-171f-46d1-a403-8d9736a6c280', 'adcc64f2-f4d7-4e4e-8a96-ee90f592e217')")
2013-01-20 17:20:12,366 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-3-thread-40) [3e493079] FINISH, DeleteImageGroupVDSCommand, log id: 5a354ebc
2013-01-20 17:20:12,681 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-3-thread-40) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSGenericException: IRSErrorException: Failed to DeleteImageGroupVDS, error = Cannot remove Logical Volume: ('8bcf7e0d-a418-4210-a79d-8a7888a26c5c', "('f4d19fed-171f-46d1-a403-8d9736a6c280', 'adcc64f2-f4d7-4e4e-8a96-ee90f592e217')")


we also have exception in the logs: 

2013-01-20 17:20:12,681 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-3-thread-40) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSGenericException: IRSErrorException: Failed to DeleteImageGroupVDS, error = Cannot remove Logical Volume: ('8bcf7e0d-a418-4210-a79d-8a7888a26c5c', "('f4d19fed-171f-46d1-a403-8d9736a6c280', 'adcc64f2-f4d7-4e4e-8a96-ee90f592e217')")
        at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:168) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AbstractSPMAsyncTaskHandler.compensate(AbstractSPMAsyncTaskHandler.java:51) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.revertPreviousHandlers(CommandBase.java:595) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.internalEndWithFailure(CommandBase.java:535) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.endActionInTransactionScope(CommandBase.java:472) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1465) [engine-bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:166) [engine-utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:108) [engine-utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.endAction(CommandBase.java:416) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.Backend.endAction(Backend.java:376) [engine-bll.jar:]
        at sun.reflect.GeneratedMethodAccessor218.invoke(Unknown Source) [:1.7.0_09-icedtea]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_09-icedtea]
        at java.lang.reflect.Method.invoke(Method.java:601) [rt.jar:1.7.0_09-icedtea]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptorFactory$ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptorFactory.java:72) [jboss-as-ee.jar:7.1.3.Final-redhat-4]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:374) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.ovirt.engine.core.utils.ThreadLocalSessionCleanerInterceptor.injectWebContextToThreadLocal(ThreadLocalSessionCleanerInterceptor.java:11) [engine-utils.jar:]


the UI is reporting the action as rolled back - moving back to devel.

Comment 7 Dafna Ron 2013-01-20 17:22:51 UTC
Created attachment 683755 [details]
logs and dbdump

Comment 8 Daniel Erez 2013-01-24 23:39:19 UTC
The issue described in comment #6 is during roll-back/cleanup phase (i.e. deleting the target image). The original bug fix is to prevent rollback on failure of source image deletion. Moving to ON_QA for verification.

Dafna, can you please open a separate bug on the issue described in the comment?

Comment 9 Dafna Ron 2013-02-06 13:55:48 UTC
I tested this scenario on vdsm-4.10.2-5.0.el6ev.x86_64 with libvirt-0.10.2-18.el6.x86_64 and qemu-kvm-rhev-0.12.1.2-2.348.el6.x86_64

we seem to be hitting multiple issues with this scenario with the same result - cannot run a vm after live storage migration. 

two examples: one vm is getting the error below after running the vm

Thread-76880::ERROR::2013-02-06 15:25:48,795::dispatcher::66::Storage.Dispatcher.Protect::(run) {'status': {'message': "Logical volume does not exist: ('04a91189-8741-4589-900b-3adbe0908d63/04512717-e9a9-4fdc-8c2c-a0adf845e09f',)", 'code': 610}}
Thread-76880::DEBUG::2013-02-06 15:25:48,797::vm::676::vm.Vm::(_startUnderlyingVm) vmId=`9f4dbd09-e734-496e-8ba7-64297082912e`::_ongoingCreations released
Thread-76880::ERROR::2013-02-06 15:25:48,800::vm::700::vm.Vm::(_startUnderlyingVm) vmId=`9f4dbd09-e734-496e-8ba7-64297082912e`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 662, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1441, in _run
    devices = self.buildConfDevices()
  File "/usr/share/vdsm/vm.py", line 499, in buildConfDevices
    self._normalizeVdsmImg(drv)
  File "/usr/share/vdsm/vm.py", line 406, in _normalizeVdsmImg
    drv['truesize'] = res['truesize']
KeyError: 'truesize'
Thread-76880::DEBUG::2013-02-06 15:25:48,883::vm::1047::vm.Vm::(setDownStatus) vmId=`9f4dbd09-e734-496e-8ba7-64297082912e`::Changed state to Down: 'truesize'
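The KeyError above happens because the code assumed the volume-size response always carries a 'truesize' key, but when the logical volume is gone the response only carries a status dict (code 610 here). A defensive sketch (hypothetical helper, not vdsm code) that checks the status before indexing:

```python
# Sketch: why the KeyError: 'truesize' occurs. On failure the response
# looks like {'status': {'code': 610, 'message': 'Logical volume does
# not exist: ...'}} and has no 'truesize' key, so a defensive version
# should check the status first and raise a meaningful error instead.

def normalize_drive_size(drv, res):
    status = res.get('status', {})
    if status.get('code', 0) != 0:
        raise RuntimeError(
            "cannot get volume size for %s: %s"
            % (drv.get('volumeID'), status.get('message')))
    drv['truesize'] = res['truesize']
    drv['apparentsize'] = res['apparentsize']
    return drv
```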

here is the failure to delete (the storage domain is a different UUID) 

Thread-74494::ERROR::2013-02-06 14:44:08,816::dispatcher::66::Storage.Dispatcher.Protect::(run) {'status': {'message': 'Cannot remove Logical Volume: (\'e7d6614c-a33b-4e9d-82b5-34bfd12d390b\', "(\'c481dc85-6289-4215-bb66-3bcf03b00460\',
 \'04512717-e9a9-4fdc-8c2c-a0adf845e09f\')")', 'code': 551}}


other vms are giving a libvirt error about running vms with snapshots:

https://bugzilla.redhat.com/show_bug.cgi?id=903248

Since the big issue was not fixed (can't run the vm after live storage migration), and I think that we might hit several more issues with this scenario, I suggest that we make this bug a tracker bug and start opening bugs for each underlying issue.

Comment 10 Dafna Ron 2013-02-06 13:56:48 UTC
Created attachment 693942 [details]
new logs from 3.2

Comment 14 Ayal Baron 2013-03-04 10:09:54 UTC
Dafna, any update on this?

Comment 16 Ayal Baron 2013-03-04 12:56:24 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > Dafna, any update on this?
> 
Discussed with Haim, bug was VERIFIED with the libvirt scratch build.

Comment 17 Haim 2013-03-04 13:41:44 UTC
We will verify this bug once an official libvirt build with the fix is released.
Moving back to ON_QA.

Comment 18 Dafna Ron 2013-03-13 11:44:56 UTC
verified on sf10 with vdsm-4.10.2-11.0.el6ev.x86_64 and libvirt-0.10.2-18.el6_4.eblake.2.x86_64

Comment 19 Itamar Heim 2013-06-11 09:51:56 UTC
3.2 has been released


