Bug 873701 (scale) - [RFE] Change formatStorageDomain verb to be async
Keywords:
Status: CLOSED WONTFIX
Alias: scale
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: high
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Allon Mureinik
QA Contact: yeylon@redhat.com
URL:
Whiteboard: storage
Depends On: 1080372 1185830
Blocks:
Reported: 2012-11-06 14:15 UTC by Dafna Ron
Modified: 2016-04-18 06:49 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-17 08:04:12 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
scohen: needinfo+
sherold: Triaged+


Attachments
logs (1.27 MB, application/x-gzip)
2012-11-06 14:15 UTC, Dafna Ron

Description Dafna Ron 2012-11-06 14:15:24 UTC
Created attachment 639392
logs

Description of problem:

I removed 10 domains concurrently; 3 of them failed on formatStorageDomain.
It seems the domains were in fact removed: the engine reported a communication error with the host and rolled back the domains, but when we try to remove them again, the removal fails in vdsm:

Thread-8243::INFO::2012-11-06 15:37:06,592::task::1157::TaskManager.Task::(prepare) Task=`24c09012-01f8-4f19-85a0-5007af87a8e1`::aborting: Task is aborted: u'Failed reload: e32df40d-9c2a-4bd3-8cdd-eb9b917311a8' - code 100

vgs shows that the domain's volume group no longer exists:

[root@gold-vdsc ~]# vgs e32df40d-9c2a-4bd3-8cdd-eb9b917311a8
  Volume group "e32df40d-9c2a-4bd3-8cdd-eb9b917311a8" not found

Version-Release number of selected component (if applicable):

vdsm-4.9.6-41.0.el6_3.x86_64
si24

How reproducible:

100%

Steps to Reproduce:
1. In a two-host cluster, create, attach, and detach 10 iSCSI domains.
2. Remove all the domains concurrently.
  
Actual results:

Removing some of the domains fails: an error is reported on timeout.

Expected results:

The removal should not fail.

Additional info: logs


It looks like the task dies:

Thread-8238::ERROR::2012-11-06 15:37:03,582::dispatcher::69::Storage.Dispatcher.Protect::(run) Failed reload: c85eac8b-7802-4bc9-b963-beb37f75a963
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 61, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1164, in prepare
    raise self.error
AttributeError: Failed reload: c85eac8b-7802-4bc9-b963-beb37f75a963
Thread-8245::DEBUG::2012-11-06 15:37:03,616::BindingXMLRPC::171::vds::(wrapper) [10.35.97.65]
Thread-8245::DEBUG::2012-11-06 15:37:03,617::task::588::TaskManager.Task::(_updateState) Task=`db14bcc4-cef8-44f4-9c6d-f9fbe8625abd`::moving from state init -> state preparing


Trying to remove the domain again gives a second error:

Thread-8243::ERROR::2012-11-06 15:37:06,590::task::853::TaskManager.Task::(_setError) Task=`24c09012-01f8-4f19-85a0-5007af87a8e1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2328, in formatStorageDomain
    if not misc.parseBool(autoDetach) and sd.getPools():
  File "/usr/share/vdsm/storage/sd.py", line 371, in getPools
    pools = self.getMetaParam(key=DMDK_POOLS)
  File "/usr/share/vdsm/storage/sd.py", line 689, in getMetaParam
    return self._metadata[key]
  File "/usr/share/vdsm/storage/persistentDict.py", line 85, in __getitem__
    return dec(self._dict[key])
  File "/usr/share/vdsm/storage/persistentDict.py", line 193, in __getitem__
    with self._accessWrapper():
  File "/usr/lib64/python2.6/contextlib.py", line 16, in __enter__
    return self.gen.next()
  File "/usr/share/vdsm/storage/persistentDict.py", line 147, in _accessWrapper
    self.refresh()
  File "/usr/share/vdsm/storage/persistentDict.py", line 224, in refresh
    lines = self._metaRW.readlines()
  File "/usr/share/vdsm/storage/blockSD.py", line 186, in readlines
    for tag in vg.tags:
  File "/usr/share/vdsm/storage/lvm.py", line 68, in __getattr__
    raise AttributeError("Failed reload: %s" % self.name)
AttributeError: Failed reload: e32df40d-9c2a-4bd3-8cdd-eb9b917311a8
Thread-8243::DEBUG::2012-11-06 15:37:06,591::task::872::TaskManager.Task::(_run) Task=`24c09012-01f8-4f19-85a0-5007af87a8e1`::Task._run: 24c09012-01f8-4f19-85a0-5007af87a8e1 ('e32df40d-9c2a-4bd3-8cdd-eb9b917311a8', False) {} failed - stopping task
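
The second traceback ends in the __getattr__ of lvm.py, which suggests that the VG object held by the domain is a placeholder that raises on any attribute access once the reload has failed. A minimal sketch of that pattern follows; the class and function names are hypothetical and this is not vdsm's actual code, only an illustration of why reading vg.tags surfaces as an AttributeError:

```python
class UnreloadableVG(object):
    """Hypothetical placeholder for a volume group whose reload failed."""

    def __init__(self, name):
        self.name = name

    def __getattr__(self, attr):
        # Python calls __getattr__ only when normal lookup fails, so
        # "name" (set in __init__) stays readable while any other
        # attribute, such as "tags", raises.
        raise AttributeError("Failed reload: %s" % self.name)


def read_tags(vg):
    """Return the VG tags, or a short error string if the VG is a stub."""
    try:
        return list(vg.tags)
    except AttributeError as e:
        return "error: %s" % e
```

With such a stub, every later metadata read on the removed domain fails the same way, which matches the "Failed reload" message seen on the repeated remove.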

Comment 1 Dafna Ron 2012-11-07 09:00:14 UTC
Grepping for formatStorageDomain shows that Thread-8238 never logged a return:


Thread-8238::INFO::2012-11-06 15:36:52,908::logUtils::37::dispatcher::(wrapper) Run and protect: formatStorageDomain(sdUUID='c85eac8b-7802-4bc9-b963-beb37f75a963', autoDetach=False, options=None)
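
The same check can be scripted rather than done by eye. The sketch below pairs each "Run and protect: <verb>(" call line with a return line from the same thread and reports the threads that never returned. The call pattern is taken from the log line above; the return-line pattern ("Run and protect: <verb>, Return response:") is an assumption about the log format, since no return line appears in this report, and the function name is hypothetical:

```python
import re

# Call line, as seen in the log excerpt above.
CALL_RE = re.compile(r"^(Thread-\d+)::.*Run and protect: (\w+)\(")
# Return line: ASSUMED format, not shown anywhere in this report.
RETURN_RE = re.compile(
    r"^(Thread-\d+)::.*Run and protect: (\w+), Return response:")


def calls_without_return(lines, verb):
    """Return the sorted thread names that called `verb` and never returned."""
    pending = {}
    for line in lines:
        m = CALL_RE.match(line)
        if m and m.group(2) == verb:
            pending[m.group(1)] = line
            continue
        m = RETURN_RE.match(line)
        if m and m.group(2) == verb:
            pending.pop(m.group(1), None)
    return sorted(pending)
```

Usage would be along the lines of calls_without_return(open(path), "formatStorageDomain"), where path points at the vdsm log from the attachment.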

Comment 2 Ayal Baron 2012-11-25 10:56:29 UTC
Does this happen with fewer domains? 3? 5?

Comment 3 Dafna Ron 2012-11-25 12:14:06 UTC
(In reply to comment #2)
> Does this happen with fewer domains? 3? 5?

I tested with 3 and there was no problem.

Comment 4 RHEL Program Management 2012-12-14 07:52:34 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 5 Ayal Baron 2012-12-26 09:57:21 UTC
This requires changing the API to be async.
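
For illustration only, an async verb could take the following shape: the call returns a task ID immediately, and the caller polls for the outcome instead of holding the connection open until it times out. This is a rough sketch under stated assumptions, not vdsm's task framework; AsyncVerbRunner, start, and status are hypothetical names.

```python
import threading
import uuid


class AsyncVerbRunner(object):
    """Hypothetical registry: run a verb in a thread, poll it by task id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = {}

    def start(self, func, *args):
        """Start func(*args) in the background and return a task id."""
        task_id = str(uuid.uuid4())
        with self._lock:
            self._tasks[task_id] = {"state": "running", "error": None}
        t = threading.Thread(target=self._run, args=(task_id, func, args))
        t.daemon = True
        t.start()
        return task_id

    def _run(self, task_id, func, args):
        # Record the outcome instead of propagating it to the caller.
        try:
            func(*args)
            state, error = "finished", None
        except Exception as e:
            state, error = "failed", str(e)
        with self._lock:
            self._tasks[task_id] = {"state": state, "error": error}

    def status(self, task_id):
        """Return a copy of the task record: state and error message."""
        with self._lock:
            return dict(self._tasks[task_id])
```

With this shape, a slow format of one domain would no longer surface to the engine as a communication error; the engine would see "running" until the task reports "finished" or "failed".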

