Bug 1733596 - RHV host goes non-responsive while deleting snapshot
Summary: RHV host goes non-responsive while deleting snapshot
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.2.6
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: Lukas Svaty
Depends On:
Reported: 2019-07-26 17:09 UTC by Shruti
Modified: 2020-08-03 15:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-07-29 15:12:37 UTC
oVirt Team: Storage
Target Upstream Version:
lsvaty: testing_plan_complete-


Description Shruti 2019-07-26 17:09:53 UTC
Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 Shruti 2019-07-26 17:18:08 UTC

The RHV host goes non-responsive while deleting a snapshot.

The following traceback is seen:

2019-07-24 10:05:37,509+0200 WARN  (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/4 running <Task <JsonRpcTask {'params': {u'topVolUUID': u'40db792b-7689-4fbd-9b97-bf4514bc116a', u'vmID': u'5b58c6fe-07eb-474c-9f0e-e764f7bb6197', u'drive': {u'imageID': u'ac23b5e4-4a5e-4e26-b033-5f5864203b79', u'volumeID': u'40db792b-7689-4fbd-9b97-bf4514bc116a', u'domainID': u'2f285eb6-80f5-49de-8601-cc297827e9a1', u'poolID': u'58a4326c-0152-0361-02d5-0000000002d9'}, u'bandwidth': u'0', u'jobUUID': u'a87d0363-cf2c-47a8-80c8-f77b655df8a7', u'baseVolUUID': u'66be0e34-2bb3-49c9-b164-3fe26cee82ee'}, 'jsonrpc': '2.0', 'method': u'VM.merge', 'id': u'93f9eeab-f6c1-46fc-b799-afb03acf7998'} at 0x7fd4446bf990> timeout=60, duration=60 at 0x7fd4441e1290> task#=30718 at 0x7fd464056390>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in _dynamicMethod
  result = fn(*methodArgs)
File: "<string>", line 2, in merge
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
  ret = func(*args, **kwargs)
File: "<string>", line 2, in merge
File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 739, in merge
  drive, baseVolUUID, topVolUUID, bandwidth, jobUUID)
File: "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6018, in merge
  bandwidth, flags)
File: "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
  ret = attr(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
  ret = f(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
  return func(inst, *args, **kwargs)
File: "/usr/lib64/python2.7/site-packages/libvirt.py", line 707, in blockCommit
  ret = libvirtmod.virDomainBlockCommit(self._o, disk, base, top, bandwidth, flags) (executor:363)

This looks similar to the kbase article https://access.redhat.com/solutions/3625291. According to that solution, the issue occurs when IO threads are enabled on the VM. In this case, however, the host went non-responsive while deleting the snapshot even though IO threads were disabled.
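
For triage, it may help to list every "Worker blocked" warning in the host's vdsm log and see which API calls are stuck and for how long. The sketch below is not part of vdsm. The pattern is inferred only from the single warning quoted in this comment, and the names BLOCKED_RE and find_blocked_workers are hypothetical. It assumes each warning occupies a single line in the log file.

```python
import re

# Pattern inferred from the one "Worker blocked" warning quoted in this bug;
# other vdsm versions may format the line differently (assumption).
BLOCKED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) WARN\s+"
    r"\(vdsm\.Scheduler\) \[Executor\] Worker blocked: "
    r"<Worker name=(?P<worker>\S+) .*?"
    r"'method': u?'(?P<method>[^']+)'.*?"
    r"timeout=(?P<timeout>\d+), duration=(?P<duration>\d+)"
)


def find_blocked_workers(lines):
    """Yield one dict per 'Worker blocked' warning found in `lines`.

    Each dict has the keys ts, worker, method, timeout and duration;
    timeout and duration are converted to integers (seconds).
    """
    for line in lines:
        match = BLOCKED_RE.search(line)
        if match is None:
            continue  # not a blocked-worker warning
        record = match.groupdict()
        record["timeout"] = int(record["timeout"])
        record["duration"] = int(record["duration"])
        yield record
```

To use it, open the log (on a default install this is usually /var/log/vdsm/vdsm.log, which is an assumption about the host layout) and pass the file object to find_blocked_workers. Repeated records with method VM.merge and a growing duration would match the behaviour described here, a worker that stays blocked in blockCommit.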

Comment 3 Ryan Barry 2019-07-27 02:33:05 UTC
I'm waiting on the sosreports, but if this is really 4.2.5, an upgrade to the latest 4.2 would be advisable

Comment 5 Ryan Barry 2019-07-29 15:12:37 UTC
Closing as the customer has upgraded to 4.3, and will re-open if it recurs.

Likely to be https://bugzilla.redhat.com/show_bug.cgi?id=1607130
