Bug 1508844
Summary: | ExtendImageTicketVDSCommand might get stuck if host is restarted during image transfer | ||
---|---|---|---|
Product: | [oVirt] ovirt-engine | Reporter: | Yaniv Kaul <ykaul> |
Component: | BLL.Storage | Assignee: | Daniel Erez <derez> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Avihai <aefrat> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.2.0 | CC: | amureini, bugs, derez |
Target Milestone: | ovirt-4.2.0 | Flags: | rule-engine:
ovirt-4.2+
|
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-12-20 10:56:11 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Yaniv Kaul
2017-11-02 11:02:07 UTC
Hi Daniel , I moved hosts to maintenance & rebooted the host during upload -> the disk went to 'Paused by System' state & the transfer does not continue - is this the expected behavior? I also got the following exception in the engine & upload can not be resumed: 2017-11-15 16:35:57,055+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-95) [13893784-f38d-4e15-a016-0160804106e9] Failed to stop image transfer session for ticket '2910890b-77a8-419d-9a65-da8ebbc0f554': {}: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to RemoveImageTicketVDS, error = Image daemon request failed: u"status=404, code=404, title=Not Found, explanation=The resource could not be found., detail=No such ticket u'2910890b-77a8-419d-9a65-da8ebbc0f554', reason=Not Found", code = 482 (Failed with error ImageDeamonError and code 482) at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:] at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:] at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommand.stopImageTransferSession(TransferImageCommand.java:639) [bll.jar:] at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommand.handlePaused(TransferImageCommand.java:441) [bll.jar:] at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommand.handlePausedSystem(TransferImageCommand.java:344) [bll.jar:] at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommand.executeStateHandler(TransferImageCommand.java:162) [bll.jar:] at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommand.proceedCommandExecution(TransferImageCommand.java:140) [bll.jar:] at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommandCallback.doPolling(TransferImageCommandCallback.java:21) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:145) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:106) [bll.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_151] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_151] at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383) [javax.enterprise.concurrent-1.0.jar:] at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534) [javax.enterprise.concurrent-1.0.jar:] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_151] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_151] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151] at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent-1.0.jar:] at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78) Caused by: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to RemoveImageTicketVDS, error = Image daemon request failed: u"status=404, code=404, title=Not Found, explanation=The resource could not be found., detail=No such ticket u'2910890b-77a8-419d-9a65-da8ebbc0f554', reason=Not Found", code = 482 at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createDefaultConcreteException(VdsBrokerCommand.java:81) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.createException(BrokerCommandBase.java:223) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:193) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.RemoveImageTicketVDSCommand.executeVdsBrokerCommand(RemoveImageTicketVDSCommand.java:13) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:112) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73) [vdsbroker.jar:] at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:] at org.ovirt.engine.core.vdsbroker.vdsbroker.DefaultVdsCommandExecutor.execute(DefaultVdsCommandExecutor.java:14) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:387) [vdsbroker.jar:] at org.ovirt.engine.core.vdsbroker.ResourceManager$Proxy$_$$_WeldSubclass.runVdsCommand$$super(Unknown Source) [vdsbroker.jar:] at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source) [:1.8.0_151] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_151] at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_151] at org.jboss.weld.interceptor.proxy.TerminalAroundInvokeInvocationContext.proceedInternal(TerminalAroundInvokeInvocationContext.java:49) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.jboss.weld.interceptor.proxy.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:77) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.ovirt.engine.core.common.di.interceptor.LoggingInterceptor.apply(LoggingInterceptor.java:12) [common.jar:] at sun.reflect.GeneratedMethodAccessor66.invoke(Unknown Source) [:1.8.0_151] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_151] at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_151] at org.jboss.weld.interceptor.reader.SimpleInterceptorInvocation$SimpleMethodInvocation.invoke(SimpleInterceptorInvocation.java:73) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.executeAroundInvoke(InterceptorMethodHandler.java:84) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.executeInterception(InterceptorMethodHandler.java:72) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.invoke(InterceptorMethodHandler.java:56) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.jboss.weld.bean.proxy.CombinedInterceptorAndDecoratorStackMethodHandler.invoke(CombinedInterceptorAndDecoratorStackMethodHandler.java:79) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.jboss.weld.bean.proxy.CombinedInterceptorAndDecoratorStackMethodHandler.invoke(CombinedInterceptorAndDecoratorStackMethodHandler.java:68) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final] at org.ovirt.engine.core.vdsbroker.ResourceManager$Proxy$_$$_WeldSubclass.runVdsCommand(Unknown Source) [vdsbroker.jar:] ... 18 more Created attachment 1352675 [details]
engine/vdsm/imageio logs
Added engine/vdsm/imageio logs
This behavior is by design. We move the transfer to paused state on error. I.e. we attempt to extend for a couple of times, and move the transfer to pause on failure. The exception is also expected since the tickets are removed from the daemon on restart (tickets are saves in memory). (In reply to Daniel Erez from comment #4) > This behavior is by design. We move the transfer to paused state on error. > I.e. we attempt to extend for a couple of times, and move the transfer to > pause on failure. The exception is also expected since the tickets are > removed from the daemon on restart (tickets are saves in memory). What about the fact that I can not resume/cancel the upload after host is up? I still get the following error all the time now : VDSM host_mixed_1 command RemoveImageTicketVDS failed: Image daemon request failed: u"status=404, code=404, title=Not Found, explanation=The resource could not be found., detail=No such ticket u'2910890b-77a8-419d-9a65-da8ebbc0f554', reason=Not Found" (In reply to Avihai from comment #5) > (In reply to Daniel Erez from comment #4) > > This behavior is by design. We move the transfer to paused state on error. > > I.e. we attempt to extend for a couple of times, and move the transfer to > > pause on failure. The exception is also expected since the tickets are > > removed from the daemon on restart (tickets are saves in memory). > > What about the fact that I can not resume/cancel the upload after host is up? > > I still get the following error all the time now : > VDSM host_mixed_1 command RemoveImageTicketVDS failed: Image daemon request > failed: u"status=404, code=404, title=Not Found, explanation=The resource > could not be found., detail=No such ticket > u'2910890b-77a8-419d-9a65-da8ebbc0f554', reason=Not Found" Does it happen when trying to cancel the upload? If so, can you please open a separate bug on it. Verified on ovirt-engine-4.2.0-0.0.master.20171114111003.git7aa1b91.el7.centos.noarch . will open a new bug on cancel/resume upload after host reboot. This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report. |