* Previous description submitted by mistake.

Description of problem:
Upload image test case failed with a "too many tasks" error in the engine log and "Worker blocked" exceptions in VDSM. The test case adds 4 disks and tries to upload them asynchronously. 3 of the disks were uploaded successfully. The 4th disk failed with errors in the engine log:

2018-11-24 08:28:14,712+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ExtendImageTicketVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [ebbc056d-829a-4b38-9ed9-6fd8ed4a62f1] Command 'ExtendImageTicketVDSCommand(HostName = host_mixed_3, ExtendImageTicketVDSCommandParameters:{hostId='f5382529-b7cc-4727-83f2-cb12b9fb9b80', ticketId='ce2a124a-fac9-48ba-a429-2e0759f2061d', timeout='300'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-11-24 08:28:14,712+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ExtendImageTicketVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [ebbc056d-829a-4b38-9ed9-6fd8ed4a62f1] FINISH, ExtendImageTicketVDSCommand, return: , log id: 1c09a200
2018-11-24 08:28:14,713+02 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engine-Thread-12794) [ebbc056d-829a-4b38-9ed9-6fd8ed4a62f1] Host 'host_mixed_3' is not responding.
2018-11-24 08:28:14,713+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [ebbc056d-829a-4b38-9ed9-6fd8ed4a62f1] Failed to extend image transfer session for ticket 'ce2a124a-fac9-48ba-a429-2e0759f2061d': {}: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
	at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:]
	at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.extendImageTransferSession(TransferDiskImageCommand.java:914) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.extendTicketIfNecessary(TransferDiskImageCommand.java:506) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.handleTransferring(TransferDiskImageCommand.java:495) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.executeStateHandler(TransferDiskImageCommand.java:364) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.proceedCommandExecution(TransferDiskImageCommand.java:345) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommandCallback.doPolling(TransferImageCommandCallback.java:21) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:146) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:107) [bll.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_191]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_191]
	at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]
	at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_191]
	at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]
	at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78)
Caused by: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
	at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:189) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.vdsbroker.ExtendImageTicketVDSCommand.executeVdsBrokerCommand(ExtendImageTicketVDSCommand.java:16) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVdsCommandWithNetworkEvent(VdsBrokerCommand.java:123) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:111) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65) [vdsbroker.jar:]
	at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:31) [dal.jar:]
	at org.ovirt.engine.core.vdsbroker.vdsbroker.DefaultVdsCommandExecutor.execute(DefaultVdsCommandExecutor.java:14) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:396) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.ResourceManager$Proxy$_$$_WeldSubclass.runVdsCommand$$super(Unknown Source) [vdsbroker.jar:]
	at sun.reflect.GeneratedMethodAccessor87.invoke(Unknown Source) [:1.8.0_191]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_191]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_191]
	at org.jboss.weld.interceptor.proxy.TerminalAroundInvokeInvocationContext.proceedInternal(TerminalAroundInvokeInvocationContext.java:51) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.jboss.weld.interceptor.proxy.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:79) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.ovirt.engine.core.common.di.interceptor.LoggingInterceptor.apply(LoggingInterceptor.java:12) [common.jar:]
	at sun.reflect.GeneratedMethodAccessor66.invoke(Unknown Source) [:1.8.0_191]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_191]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_191]
	at org.jboss.weld.interceptor.reader.SimpleInterceptorInvocation$SimpleMethodInvocation.invoke(SimpleInterceptorInvocation.java:73) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.executeAroundInvoke(InterceptorMethodHandler.java:85) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.executeInterception(InterceptorMethodHandler.java:73) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.invoke(InterceptorMethodHandler.java:57) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.jboss.weld.bean.proxy.CombinedInterceptorAndDecoratorStackMethodHandler.invoke(CombinedInterceptorAndDecoratorStackMethodHandler.java:79) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.jboss.weld.bean.proxy.CombinedInterceptorAndDecoratorStackMethodHandler.invoke(CombinedInterceptorAndDecoratorStackMethodHandler.java:68) [weld-core-impl.jar:2.4.7.Final-redhat-1]
	at org.ovirt.engine.core.vdsbroker.ResourceManager$Proxy$_$$_WeldSubclass.runVdsCommand(Unknown Source) [vdsbroker.jar:]
	... 18 more

2018-11-24 08:28:14,765+02 WARN [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [ebbc056d-829a-4b38-9ed9-6fd8ed4a62f1] Failed to renew transfer ticket for Upload disk 'disk_qcow2_v3_2408234260' (disk id: '7a81e99c-03ec-4e12-afc3-5b3f394d0e9f', image id: 'ed1b759c-2c31-4d8e-896d-bc5cdf7bd2ae')
2018-11-24 08:31:27,035+02 INFO [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-22) [ebbc056d-829a-4b38-9ed9-6fd8ed4a62f1] Transfer was paused by system.
Upload disk 'disk_qcow2_v3_2408234260' (disk id: '7a81e99c-03ec-4e12-afc3-5b3f394d0e9f', image id: 'ed1b759c-2c31-4d8e-896d-bc5cdf7bd2ae')

In the VDSM log, ~100 exceptions like this can be found:

2018-11-24 08:28:12,941+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/3 running <Task <JsonRpcTask {'params': {}, 'jsonrpc': '2.0', 'method': u'Host.getStats', 'id': u'ff344c57-2d79-459c-9082-deb14d234808'} at 0x7f81fc5ffc90> timeout=60, duration=1080.00 at 0x7f81fc5fff50> task#=8653 at 0x7f821c357190>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 262, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 305, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194, in _dynamicMethod
  result = fn(*methodArgs)
File: "<string>", line 2, in getStats
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats
  multipath=True)}
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 50, in get_stats
  decStats = stats.produce(first_sample, last_sample)
File: "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 72, in produce
  stats.update(get_interfaces_stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 154, in get_interfaces_stats
  return net_api.network_stats()
File: "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats
  return netstats.report()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 32, in report
  stats = link_stats.report()
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 34, in report
  stats[i.device] = i.statistics()
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/iface.py", line 232, in statistics
  'state': 'up' if self.is_oper_up() else 'down',
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/iface.py", line 193, in is_oper_up
  properties = self.properties()
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/iface.py", line 165, in properties
  info = link.get_link(self._dev)
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 39, in get_link
  link_info = _link_info(link)
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 98, in _link_info
  info['master'] = _link_index_to_name(master_index, cache=cache)
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 112, in _link_index_to_name
  with _get_link(index=link_index) as link:
File: "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
  return self.gen.next()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 139, in _get_link
  with _pool.socket() as sock:
File: "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
  return self.gen.next()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/__init__.py", line 47, in socket
  with self._semaphore:
File: "/usr/lib64/python2.7/threading.py", line 471, in acquire
  self.__cond.wait()
File: "/usr/lib/python2.7/site-packages/pthreading.py", line 127, in wait
  return self.__cond.wait()
File: "/usr/lib/python2.7/site-packages/pthread.py", line 131, in wait
  return _libpthread.pthread_cond_wait(self._cond, m.mutex()) (executor:363)

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.0-0.5.alpha1.el7.noarch
ovirt-imageio-proxy-1.4.6-1.el7ev.noarch
vdsm-4.30.3-1.el7ev.x86_64
qemu-img-rhev-2.12.0-18.el7_6.1.x86_64

How reproducible:
Unknown; it failed in automation and I didn't manage to reproduce it again, either manually or in automation.

Steps to Reproduce:
1.
2.
3.

Actual results:
The upload failed

Expected results:
The upload should succeed

Additional info:
Attached engine, VDSM, art and image-proxy logs
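The bottom frames of the traceback above show the worker waiting on `self._semaphore` inside the netlink socket pool (`vdsm/network/netlink/__init__.py`). As a rough illustration of why such a wait can become unbounded, here is a minimal, hypothetical sketch (not vdsm's actual implementation; the class and names are illustrative) of a semaphore-bounded pool: once all slots are held and never returned, every further `socket()` call blocks on the semaphore indefinitely, which is the shape of the "Worker blocked" reports.

```python
import threading
from contextlib import contextmanager

# Hypothetical sketch of a bounded socket pool, loosely modeled on the
# "_pool.socket()" / "with self._semaphore:" frames in the traceback above.
class SocketPool:
    def __init__(self, size):
        self._semaphore = threading.Semaphore(size)
        self._sockets = list(range(size))  # stand-ins for netlink sockets
        self._lock = threading.Lock()

    @contextmanager
    def socket(self):
        # With no timeout, a caller blocks here forever once the pool
        # is exhausted -- the wait the WARN messages report.
        self._semaphore.acquire()
        try:
            with self._lock:
                sock = self._sockets.pop()
            try:
                yield sock
            finally:
                with self._lock:
                    self._sockets.append(sock)
        finally:
            self._semaphore.release()

pool = SocketPool(size=2)

# Simulate a leak: enter both slots and never exit the contexts.
ctx1, ctx2 = pool.socket(), pool.socket()
ctx1.__enter__()
ctx2.__enter__()

# A third caller would now block; a timed acquire lets us observe it.
starved = not pool._semaphore.acquire(timeout=0.1)
print(starved)  # True: the pool is exhausted
```

The sketch only demonstrates the starvation mechanism; the real pool's sizing and cleanup logic are in vdsm itself.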
Created attachment 1508550 [details] logs
Dan, could you please take a look? From the vdsm.log excerpt above, it seems to me that the VDSM task is blocked in networking code ...
I think that Eddy already has, and can share his current understanding.
Since Eddy is not available, can you please have a look, Dan?
I'm afraid I cannot. I would guess that there is a ctypes bug that can lock a thread; if vdsm is busy regardless, we can see this in the logs. I would not consider it urgent unless it suddenly reproduces more often.
BTW, I assume that "How reproducible:" is "very rarely" or "once ever". Is that true?
(In reply to Dan Kenigsberg from comment #7) > BTW, I assume that "How reproducible:" is "very rarely" or "once ever". Is > that true? Yes. As I wrote in the description, I didn't manage to reproduce it again so I'll have to say "once ever".
Created attachment 1515629 [details]
logs2

Encountered again, this time during start VM:

2018-12-12 05:22:05,983+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/5 running <Task <JsonRpcTask {'params': {}, 'jsonrpc': '2.0', 'method': u'Host.getStats', 'id': u'79e6a07c-00f2-4dfb-9d82-e3f221e506cf'} at 0x7ff1a4342810> timeout=60, duration=960.01 at 0x7ff1a4342ad0> task#=7761 at 0x7ff1c4121410>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 262, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 305, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
  result = fn(*methodArgs)
File: "<string>", line 2, in getStats
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1407, in getStats
  multipath=True)}
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 50, in get_stats
  decStats = stats.produce(first_sample, last_sample)
File: "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 72, in produce
  stats.update(get_interfaces_stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 154, in get_interfaces_stats
  return net_api.network_stats()
File: "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats
  return netstats.report()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 32, in report
  stats = link_stats.report()
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 32, in report
  for iface_properties in iface.list():
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/iface.py", line 257, in list
  for properties in itertools.chain(link.iter_links(), dpdk_links):
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 46, in iter_links
  with _pool.socket() as sock:
File: "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
  return self.gen.next()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/__init__.py", line 47, in socket
  with self._semaphore:
File: "/usr/lib64/python2.7/threading.py", line 471, in acquire
  self.__cond.wait()
File: "/usr/lib/python2.7/site-packages/pthreading.py", line 127, in wait
  return self.__cond.wait()
File: "/usr/lib/python2.7/site-packages/pthread.py", line 131, in wait
  return _libpthread.pthread_cond_wait(self._cond, m.mutex()) (executor:363)
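The "Worker blocked" warnings themselves come from vdsm's executor noticing that a task has exceeded its timeout and dumping the stuck worker's stack. The following is a simplified, illustrative sketch of that kind of watchdog (the `run_with_watchdog` helper is hypothetical, not a vdsm API): a monitor samples the worker thread's frame via `sys._current_frames()` when the deadline passes.

```python
import sys
import threading
import time
import traceback

# Hypothetical watchdog sketch: run func() on a worker thread and, if it
# does not finish within `timeout` seconds, print a "Worker blocked"
# style report with the worker's current stack.
def run_with_watchdog(func, timeout):
    done = threading.Event()

    def worker():
        try:
            func()
        finally:
            done.set()

    t = threading.Thread(target=worker, name="jsonrpc/0")
    t.daemon = True
    t.start()
    if not done.wait(timeout):
        # Sample the worker's stack, like the executor's WARN messages do.
        frame = sys._current_frames().get(t.ident)
        stack = "".join(traceback.format_stack(frame)) if frame else "<gone>"
        print("Worker blocked: <Worker name=%s>, traceback:\n%s"
              % (t.name, stack))
        return False
    return True

# A task that sleeps past its deadline triggers the report.
ok = run_with_watchdog(lambda: time.sleep(1.0), timeout=0.1)
print(ok)  # False
```

Note this only detects the hang; like vdsm's executor, it cannot unblock a thread stuck in an uninterruptible native wait such as pthread_cond_wait.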
Created attachment 1515631 [details] deactivate_host
The problem reproduces on 4.3 during automation when it tries to deactivate a host; see the attached logs "deactivate_host".
This bug reproduces more and more in our automation. Raising severity. Please consider retargeting it to 4.3.
In the following log we can see that it occurs due to a timeout exception. Maybe this is the root cause? See the attached logs: test_autosync_30_networks.zip

2018-12-19 09:17:14,412+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=periodic/2 running <Task <Operation action=<vdsm.virt.sampling.HostMonitor object at 0x7f48ec73ed50> at 0x7f48ec73ed90> timeout=15, duration=15.09 at 0x7f48ec3b2790> task#=316 at 0x7f48ec71ed10>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 186, in __call__
  self._func()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 481, in __call__
  stats = hostapi.get_stats(self._cif, self._samples.stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 50, in get_stats
  decStats = stats.produce(first_sample, last_sample)
File: "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 72, in produce
  stats.update(get_interfaces_stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/stats.py", line 154, in get_interfaces_stats
  return net_api.network_stats()
File: "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 63, in network_stats
  return netstats.report()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netstats.py", line 32, in report
  stats = link_stats.report()
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/stats.py", line 39, in report
  elif i.type() == iface.Type.BOND:
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/iface.py", line 225, in type
  return self.properties().get('type', get_alternative_type(self._dev))
File: "/usr/lib/python2.7/site-packages/vdsm/network/link/iface.py", line 165, in properties
  info = link.get_link(self._dev)
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 39, in get_link
  link_info = _link_info(link)
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 98, in _link_info
  info['master'] = _link_index_to_name(master_index, cache=cache)
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 112, in _link_index_to_name
  with _get_link(index=link_index) as link:
File: "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
  return self.gen.next()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/link.py", line 139, in _get_link
  with _pool.socket() as sock:
File: "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
  return self.gen.next()
File: "/usr/lib/python2.7/site-packages/vdsm/network/netlink/__init__.py", line 47, in socket
  with self._semaphore:
File: "/usr/lib64/python2.7/threading.py", line 471, in acquire
  self.__cond.wait()
File: "/usr/lib/python2.7/site-packages/pthreading.py", line 127, in wait
  return self.__cond.wait()
File: "/usr/lib/python2.7/site-packages/pthread.py", line 131, in wait
  return _libpthread.pthread_cond_wait(self._cond, m.mutex()) (executor:363)
Created attachment 1515843 [details] test_autosync_30_networks.zip
Created attachment 1516824 [details]
logs3

Encountered for 'method': u'Image.syncData':

2018-12-26 11:41:21,868+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/5 running <Task <JsonRpcTask {'params': {u'syncType': u'INTERNAL', u'dstSdUUID': u'99cd05da-d404-449d-8d2f-9330bb087c61', u'storagepoolID': u'9cc6e14a-784e-418f-84db-11fa9ff07a80', u'storagedomainID': u'19bafff8-5d66-4550-a24e-c8710768e7a9', u'imageID': u'320b2ef0-f35d-47dc-b046-e85f2438f6c1'}, 'jsonrpc': '2.0', 'method': u'Image.syncData', 'id': u'8856d7ee-552a-4b88-b101-426fe01cfa2a'} at 0x7f4420029290> timeout=60, duration=120.00 at 0x7f4420029c90> task#=8912 at 0x7f442c0e91d0>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 262, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 305, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
  result = fn(*methodArgs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 907, in syncData
  dstSdUUID, syncType)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 72, in wrapper
  result = ctask.prepare(func, *args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
  return m(self, *a, **kw)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1179, in prepare
  result = self._run(func, *args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
  return fn(*args, **kargs)
File: "<string>", line 2, in syncImageData
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1705, in syncImageData
  vars.task.getSharedLock(STORAGE, dom)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1383, in getSharedLock
  timeout)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/resourceManager.py", line 784, in acquire
  timeout)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/resourceManager.py", line 1026, in acquireResource
  return _manager.acquireResource(namespace, name, lockType, timeout=timeout)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/resourceManager.py", line 465, in acquireResource
  request.wait(timeout)
File: "/usr/lib/python2.7/site-packages/vdsm/storage/resourceManager.py", line 235, in wait
  return self._doneEvent.wait(timeout)
File: "/usr/lib64/python2.7/threading.py", line 622, in wait
  self.__cond.wait(timeout, balancing)
File: "/usr/lib/python2.7/site-packages/pthreading.py", line 125, in wait
  return self.__cond.timedwait(abstime)
File: "/usr/lib/python2.7/site-packages/pthread.py", line 136, in timedwait
  C.pointer(abstime)) (executor:363)

2018-12-26 11:41:21,883+0200 INFO (jsonrpc/5) [vdsm.api] FINISH syncImageData error=Resource timeout: () from=::ffff:10.35.161.127,60606, flow_id=disks_syncAction_1142755c-bfd3-44e7, task_id=76e2dba2-d15b-4acc-88f0-d3961f940bc1 (api:52)
2018-12-26 11:41:21,884+0200 ERROR (jsonrpc/5) [storage.TaskManager.Task] (Task='76e2dba2-d15b-4acc-88f0-d3961f940bc1') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in syncImageData
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1705, in syncImageData
    vars.task.getSharedLock(STORAGE, dom)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1383, in getSharedLock
    timeout)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/resourceManager.py", line 793, in acquire
    raise se.ResourceTimeout()
ResourceTimeout: Resource timeout: ()
2018-12-26 11:41:21,884+0200 DEBUG (jsonrpc/5) [storage.TaskManager.Task] (Task='76e2dba2-d15b-4acc-88f0-d3961f940bc1') Task._run: 76e2dba2-d15b-4acc-88f0-d3961f940bc1 (u'9cc6e14a-784e-418f-84db-11fa9ff07a80', u'19bafff8-5d66-4550-a24e-c8710768e7a9', u'320b2ef0-f35d-47dc-b046-e85f2438f6c1', u'99cd05da-d404-449d-8d2f-9330bb087c61', u'INTERNAL') {} failed - stopping task (task:894)
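The ResourceTimeout path in this log is the bounded counterpart of the netlink hang: the storage resource request waits on an event with a timeout and the caller raises when the wait expires, instead of blocking forever. A hedged sketch of that pattern (class and function names are illustrative, loosely modeled on the resourceManager.py frames above, not the real code):

```python
import threading

# Illustrative error type, mirroring vdsm's se.ResourceTimeout.
class ResourceTimeout(Exception):
    pass

# A pending resource request: whoever holds the resource calls grant()
# when releasing it; waiters block on the event with a deadline.
class Request:
    def __init__(self):
        self._done = threading.Event()

    def grant(self):
        self._done.set()

    def wait(self, timeout):
        # Returns False if the timeout expired before grant().
        return self._done.wait(timeout)

def acquire_resource(request, timeout):
    if not request.wait(timeout):
        # The shared lock never became available in time, matching
        # "FINISH syncImageData error=Resource timeout" above.
        raise ResourceTimeout()
    return True

req = Request()  # never granted, so the acquire must time out
try:
    acquire_resource(req, timeout=0.05)
    result = "acquired"
except ResourceTimeout:
    result = "timeout"
print(result)  # timeout
```

Failing fast like this lets the engine retry or report the error, whereas the semaphore wait in the netlink pool has no deadline at all.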
Verified downstream on v4.3.0.4-0.1.el7.
This bugzilla is included in the oVirt 4.3.0 release, published on February 4th 2019. Since the problem described in this bug report should be resolved in the oVirt 4.3.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.