Description: An RHV host goes non-responsive while deleting a snapshot. The following traceback is seen:

2019-07-24 10:05:37,509+0200 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/4 running <Task <JsonRpcTask {'params': {u'topVolUUID': u'40db792b-7689-4fbd-9b97-bf4514bc116a', u'vmID': u'5b58c6fe-07eb-474c-9f0e-e764f7bb6197', u'drive': {u'imageID': u'ac23b5e4-4a5e-4e26-b033-5f5864203b79', u'volumeID': u'40db792b-7689-4fbd-9b97-bf4514bc116a', u'domainID': u'2f285eb6-80f5-49de-8601-cc297827e9a1', u'poolID': u'58a4326c-0152-0361-02d5-0000000002d9'}, u'bandwidth': u'0', u'jobUUID': u'a87d0363-cf2c-47a8-80c8-f77b655df8a7', u'baseVolUUID': u'66be0e34-2bb3-49c9-b164-3fe26cee82ee'}, 'jsonrpc': '2.0', 'method': u'VM.merge', 'id': u'93f9eeab-f6c1-46fc-b799-afb03acf7998'} at 0x7fd4446bf990> timeout=60, duration=60 at 0x7fd4441e1290> task#=30718 at 0x7fd464056390>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in _dynamicMethod
  result = fn(*methodArgs)
File: "<string>", line 2, in merge
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
  ret = func(*args, **kwargs)
File: "<string>", line 2, in merge
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 739, in merge
  drive, baseVolUUID, topVolUUID, bandwidth, jobUUID)
File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6018, in merge
  bandwidth, flags)
File: "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
  ret = attr(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
  ret = f(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
  return func(inst, *args, **kwargs)
File: "/usr/lib64/python2.7/site-packages/libvirt.py", line 707, in blockCommit
  ret = libvirtmod.virDomainBlockCommit(self._o, disk, base, top, bandwidth, flags) (executor:363)

This looks similar to the kbase https://access.redhat.com/solutions/3625291. According to that solution, the issue occurs when IO threads are enabled on the VM. In this case, however, the host went non-responsive while deleting the snapshot even though IO threads were disabled.
I'm waiting on the sosreports, but if this is really 4.2.5, upgrading to the latest 4.2 release would be advisable.
Closing, as the customer has upgraded to 4.3; will re-open if it recurs. This is likely the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1607130
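
Should the problem recur, the "Worker blocked" warnings quoted in the description can be pulled out of the vdsm log for comparison. The following is only a rough sketch: the regular expressions are derived solely from the single log line in this report, so the exact layout on other vdsm versions is an assumption, and `parse_blocked_worker` and `SAMPLE` are hypothetical names used for illustration. `SAMPLE` is an abbreviated copy of the quoted line (most of `params` is left out).

```python
import re

# Assumption: these patterns mirror the one "Worker blocked" line quoted in
# this report; other vdsm versions may format the warning differently.
BLOCKED_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) WARN\s+"
    r"\(vdsm\.Scheduler\) \[Executor\] Worker blocked: "
    r"<Worker name=(?P<worker>\S+)"
)
METHOD_RE = re.compile(r"'method': u?'\s*(?P<method>[\w.]+)'")
TIMING_RE = re.compile(r"timeout=(?P<timeout>\d+), duration=(?P<duration>\d+)")


def parse_blocked_worker(line):
    """Return a dict describing a 'Worker blocked' warning, or None."""
    head = BLOCKED_RE.search(line)
    if head is None:
        return None  # not a blocked-worker warning
    method = METHOD_RE.search(line)
    timing = TIMING_RE.search(line)
    return {
        "timestamp": head.group("timestamp"),
        "worker": head.group("worker"),
        "method": method.group("method") if method else None,
        "timeout": int(timing.group("timeout")) if timing else None,
        "duration": int(timing.group("duration")) if timing else None,
    }


# Abbreviated copy of the warning from the description (params shortened).
SAMPLE = (
    "2019-07-24 10:05:37,509+0200 WARN (vdsm.Scheduler) [Executor] "
    "Worker blocked: <Worker name=jsonrpc/4 running <Task <JsonRpcTask "
    "{'params': {u'jobUUID': u'a87d0363-cf2c-47a8-80c8-f77b655df8a7'}, "
    "'jsonrpc': '2.0', 'method': u'VM.merge', "
    "'id': u'93f9eeab-f6c1-46fc-b799-afb03acf7998'} at 0x7fd4446bf990> "
    "timeout=60, duration=60 at 0x7fd4441e1290> task#=30718 "
    "at 0x7fd464056390>, traceback:"
)

if __name__ == "__main__":
    print(parse_blocked_worker(SAMPLE))
```

Running each line of a log file through `parse_blocked_worker` and keeping the entries whose `method` is `VM.merge` would show how often, and for how long, merge calls were blocked around the time the host went non-responsive.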