Bug 1392248 - NFS Live Storage migration: Could not remove all image volumes
Summary: NFS Live Storage migration: Could not remove all image volumes
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.17.32
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified vote
Target Milestone: ovirt-4.0.6
: ---
Assignee: Dan Kenigsberg
QA Contact: Aharon Canan
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-11-06 20:37 UTC by Markus Stockhausen
Modified: 2016-11-08 09:57 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-08 09:57:49 UTC
oVirt Team: Storage
amureini: ovirt-4.0.z?
amureini: planning_ack?
amureini: devel_ack?
amureini: testing_ack?


Attachments (Terms of Use)
vdsm.log (831.60 KB, text/plain)
2016-11-06 20:37 UTC, Markus Stockhausen
no flags Details
engine.log (1.99 MB, text/plain)
2016-11-06 20:42 UTC, Markus Stockhausen
no flags Details

Description Markus Stockhausen 2016-11-06 20:37:46 UTC
Created attachment 1217834 [details]
vdsm.log

Description of problem:

NFS Live storage migrations produces error "Could not remove all image volumes"


Version-Release number of selected component (if applicable):

OVirt 4.0.5
VDSM 4.17.32 (Ovirt 3.6 node)


How reproducible:

would say 100% (occured twice today)


Steps to Reproduce:
1. Start single disk live migration between to nfs storage domains

Actual results:

migration succeeds but log gives above error message at the end. The named directory is missing for NFS storage after migration. 

Expected results:

Message should not occur

Additional info:

SPM seems to be relocated directly after the error.

Comment 1 Markus Stockhausen 2016-11-06 20:38:15 UTC
SORRY! Ovirt is 4.0.4

Comment 2 Markus Stockhausen 2016-11-06 20:42:05 UTC
Created attachment 1217835 [details]
engine.log

Comment 3 Allon Mureinik 2016-11-07 09:07:50 UTC
Relevant error in vdsm.log:

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1544, in deleteImage
    pool.deleteImage(dom, imgUUID, volsByImg)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1982, in deleteImage
    domain.deleteImage(domain.sdUUID, imgUUID, volsByImg)
  File "/usr/share/vdsm/storage/fileSD.py", line 407, in deleteImage
    raise se.ImageDeleteError("%s %s" % (imgUUID, str(e)))
ImageDeleteError: Could not remove all image's volumes: (u'e9edf93f-f20b-4ccb-b298-45b4fa8da820 [Errno 39] Directory not empty',)

From the engine log we can see that there are volumes with the "_backup" suffix added to them that seem to muck things up:

2016-11-06 21:19:32,372 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (DefaultQuartzScheduler3) [74729fc2] Command 'DeleteImageGroupVDSCommand( DeleteImageGroupVDSCommandParameters:{runAsync='true', storagePoolId='94ed7a19-fade-4bd6-83f2-2cbb2f730b95', ignoreFailoverLimit='false', storageDomainId='c86d8409-4dd6-4e30-86dc-b5175a7ceb86', imageGroupId='e9edf93f-f20b-4ccb-b298-45b4fa8da820', postZeros='false', forceDelete='false'})' execution failed: IRSGenericException: IRSErrorException: Could not remove all image's volumes: (u'e9edf93f-f20b-4ccb-b298-45b4fa8da820 [Errno 39] Directory not empty',)
2016-11-06 21:19:32,372 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (DefaultQuartzScheduler3) [74729fc2] FINISH, DeleteImageGroupVDSCommand, log id: 2e0a458e
2016-11-06 21:19:32,375 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.GetImagesListVDSCommand] (DefaultQuartzScheduler3) [74729fc2] START, GetImagesListVDSCommand( GetImagesListVDSCommandParameters:{runAsync='true', storagePoolId='94ed7a19-fade-4bd6-83f2-2cbb2f730b95', ignoreFailoverLimit='false', sdUUID='c86d8409-4dd6-4e30-86dc-b5175a7ceb86'}), log id: 3b2ad90c
2016-11-06 21:19:32,524 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler3) [74729fc2] ERROR, GetImagesListVDSCommand( GetImagesListVDSCommandParameters:{runAsync='true', storagePoolId='94ed7a19-fade-4bd6-83f2-2cbb2f730b95', ignoreFailoverLimit='false', sdUUID='c86d8409-4dd6-4e30-86dc-b5175a7ceb86'}), exception: For input string: "d75a16fec85a_backup", log id: 3b2ad90c
2016-11-06 21:19:32,524 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler3) [74729fc2] Exception: java.lang.NumberFormatException: For input string: "d75a16fec85a_backup"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) [rt.jar:1.8.0_102]
	at java.lang.Long.parseLong(Long.java:589) [rt.jar:1.8.0_102]
	at java.lang.Long.valueOf(Long.java:776) [rt.jar:1.8.0_102]
	at java.lang.Long.decode(Long.java:928) [rt.jar:1.8.0_102]
	at java.util.UUID.fromString(UUID.java:206) [rt.jar:1.8.0_102]
	at org.ovirt.engine.core.compat.Guid.<init>(Guid.java:73) [compat.jar:]
	at org.ovirt.engine.core.vdsbroker.irsbroker.GetImagesListVDSCommand.executeIrsBrokerCommand(GetImagesListVDSCommand.java:23) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.lambda$executeVDSCommand$0(IrsBrokerCommand.java:161) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData.runInControlledConcurrency(IrsProxyData.java:248) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.executeVDSCommand(IrsBrokerCommand.java:158) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73) [vdsbroker.jar:]
	at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
	at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:451) [vdsbroker.jar:]
	at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2171) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.StorageHandlingCommandBase.runVdsCommand(StorageHandlingCommandBase.java:657) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand.isImageRemovedFromStorage(RemoveImageCommand.java:108) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand.executeCommand(RemoveImageCommand.java:89) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1305) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1447) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2075) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:166) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:105) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1490) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:398) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:493) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:475) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:670) [bll.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_102]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_102]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_102]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_102]
	at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
	at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:80) [wildfly-weld-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:93) [wildfly-weld-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
	at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl-2.3.2.Final.jar:2.3.2.Final]
	at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:83) [wildfly-weld-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
	at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:263) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:243) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:43) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:66) [wildfly-ejb3-10.0.0.Final.jar:10.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:64)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:356)
	at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:636)
	at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:61)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:356)
	at org.jboss.invocation.PrivilegedWithCombinerInterceptor.processInvocation(PrivilegedWithCombinerInterceptor.java:80)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
	at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:195)
	at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
	at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:73)
	at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view3.runInternalAction(Unknown Source) [bll.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_102]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_102]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_102]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_102]
	at org.jboss.weld.util.reflection.Reflections.invokeAndUnwrap(Reflections.java:436) [weld-core-impl-2.3.2.Final.jar:2.3.2.Final]
	at org.jboss.weld.bean.proxy.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:127) [weld-core-impl-2.3.2.Final.jar:2.3.2.Final]
	at org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56) [weld-core-impl-2.3.2.Final.jar:2.3.2.Final]
	at org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:100) [weld-core-impl-2.3.2.Final.jar:2.3.2.Final]
	at org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$2049259618$Proxy$_$$_Weld$EnterpriseProxy$.runInternalAction(Unknown Source) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInternalAction(CommandBase.java:2471) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2496) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2491) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.lsm.LiveStorageMigrationHelper.removeImage(LiveStorageMigrationHelper.java:23) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.performNextOperation(LiveMigrateDiskCommand.java:104) [bll.jar:]
	at org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:28) [bll.jar:]
	at org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:64) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:112) [bll.jar:]
	at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source) [:1.8.0_102]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_102]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_102]
	at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
	at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
	at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_102]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_102]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_102]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_102]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_102]


Nir - any idea where these volumes come from?

Comment 4 Markus Stockhausen 2016-11-07 10:15:45 UTC
Snapshots produced some Quite Strange errors in the past. So wie sometimes Backup images folders with Suffix _backup.

That should not produce SPM relocation

Comment 5 Nir Soffer 2016-11-07 17:22:55 UTC
(In reply to Markus Stockhausen from comment #4)
> Snapshots produced some Quite Strange errors in the past. So wie sometimes
> Backup images folders with Suffix _backup.
> 
> That should not produce SPM relocation

In a perfect world, yes. In the current system you cannot add arbitrary files
inside storage domain, in particular no in an image directory.

Please move images/volumes backup files elsewhere, this will probably fix your
failing live storage migration.

Comment 6 Markus Stockhausen 2016-11-07 20:15:10 UTC
Thanks for the carification. Removing the backup folders fixed our problem. Nevertheless wouldnt it make sense to ignore folders that do not match the UUID scheme?

Comment 7 Nir Soffer 2016-11-07 20:28:03 UTC
(In reply to Markus Stockhausen from comment #6)
> Thanks for the carification. Removing the backup folders fixed our problem.
> Nevertheless wouldnt it make sense to ignore folders that do not match the
> UUID scheme?

When we delete an image directory, we delete:

- the volume data file (uuid)
- the volume metadata file (uuid.meta)
- the volume lease file (uuid.lease)

We assume that there are no other files in the image directory, and the directory
must be empty after removing all files.

We remove any other files in this directory but this means we will delete your
backups. Maybe the best thing would be to warn that the directory is not empty
without failing the request.

Comment 8 Yaniv Lavi 2016-11-08 09:57:49 UTC
The storage domains are not a API you can use and change. We have API for backup if you want to backup VMs. Copying files within the storage domain is not something we can support. I'm closing this bug. Please clone the VM, export it to a export domain or back it up using the backup API, if you need to maintain a copy of the VM.


Note You need to log in before you can comment on or make changes to this bug.