Bug 1118349 - [vdsm] Creating DataCenter 3.5 using master domain V1 fails with InquireNotSupportedError
Summary: [vdsm] Creating DataCenter 3.5 using master domain V1 fails with InquireNotSupportedError
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-3.6.0-rc3
Target Release: 3.6.0
Assignee: Ala Hino
QA Contact: Raz Tamir
URL:
Whiteboard:
Duplicates: 1139401 1166066
Depends On: 1120712
Blocks: 1242092
 
Reported: 2014-07-10 13:28 UTC by Gadi Ickowicz
Modified: 2016-03-09 19:23 UTC
CC: 24 users

Fixed In Version: v4.17.8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned to: 1242092
Environment:
Last Closed: 2016-03-09 19:23:12 UTC
oVirt Team: Storage
amureini: needinfo+


Attachments (Terms of Use)
engine and vdsm logs (725.33 KB, application/x-bzip)
2014-07-10 13:28 UTC, Gadi Ickowicz


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0362 normal SHIPPED_LIVE vdsm 3.6.0 bug fix and enhancement update 2016-03-09 23:49:32 UTC
oVirt gerrit 45965 None None None Never
oVirt gerrit 45966 None None None Never
oVirt gerrit 45967 None None None Never
oVirt gerrit 45968 None None None Never
oVirt gerrit 46019 None None None Never
oVirt gerrit 46020 None None None Never
oVirt gerrit 46021 None None None Never

Description Gadi Ickowicz 2014-07-10 13:28:36 UTC
Created attachment 917078 [details]
engine and vdsm logs

Description of problem:
We have an automated test that attempts to create, attach, and activate the first storage domain (iSCSI) in a DC, and it fails on getSpmStatus with:

Thread-13::INFO::2014-07-10 10:37:56,625::logUtils::47::dispatcher::(wrapper) Run and protect: connectStoragePool, Return response: True
Thread-13::DEBUG::2014-07-10 10:37:56,625::task::1191::Storage.TaskManager.Task::(prepare) Task=`638d35cf-31d9-4436-8ba2-c99f93ad9fcd`::finished: True
Thread-13::DEBUG::2014-07-10 10:37:56,625::task::595::Storage.TaskManager.Task::(_updateState) Task=`638d35cf-31d9-4436-8ba2-c99f93ad9fcd`::moving from state preparing -> state finished
Thread-13::DEBUG::2014-07-10 10:37:56,626::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-13::DEBUG::2014-07-10 10:37:56,626::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-07-10 10:37:56,626::task::993::Storage.TaskManager.Task::(_decref) Task=`638d35cf-31d9-4436-8ba2-c99f93ad9fcd`::ref 0 aborting False
Thread-13::DEBUG::2014-07-10 10:37:56,672::BindingXMLRPC::298::vds::(wrapper) client [10.35.161.69] flowID [564b5dd0]
Thread-13::DEBUG::2014-07-10 10:37:56,672::task::595::Storage.TaskManager.Task::(_updateState) Task=`0b81ee10-7f13-4667-a14a-345b3c059903`::moving from state init -> state preparing
Thread-13::INFO::2014-07-10 10:37:56,672::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='31510a2b-e15b-4476-b120-8b8fb1db0400', options=None)
Thread-13::ERROR::2014-07-10 10:37:56,673::task::866::Storage.TaskManager.Task::(_setError) Task=`0b81ee10-7f13-4667-a14a-345b3c059903`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 611, in getSpmStatus
    status = self._getSpmStatusInfo(pool)
  File "/usr/share/vdsm/storage/hsm.py", line 605, in _getSpmStatusInfo
    (pool.spmRole,) + pool.getSpmStatus()))
  File "/usr/share/vdsm/storage/sp.py", line 126, in getSpmStatus
    return self._backend.getSpmStatus()
  File "/usr/share/vdsm/storage/spbackends.py", line 416, in getSpmStatus
    lVer, spmId = self.masterDomain.inquireClusterLock()
  File "/usr/share/vdsm/storage/sd.py", line 511, in inquireClusterLock
    return self._clusterLock.inquire()
  File "/usr/share/vdsm/storage/clusterlock.py", line 119, in inquire
    raise InquireNotSupportedError()
InquireNotSupportedError
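
For context on the error: the cluster-lock implementation depends on the storage domain version. V1/V2 domains use the legacy safelease-based lock, which cannot report the current lease owner, so its inquire() raises InquireNotSupportedError; V3 domains use sanlock, which can. A minimal illustrative sketch of that split (simplified, not the actual vdsm clusterlock.py code):

class InquireNotSupportedError(Exception):
    pass


class SafeLease(object):
    # Legacy lock used by V1/V2 domains: safelease has no way to read back
    # the current lease version/owner, so inquire is simply unsupported.
    def inquire(self):
        raise InquireNotSupportedError()


class SANLock(object):
    # sanlock-based lock used by V3 domains: it can report the lease
    # version and the host id (SPM id) currently holding the lease.
    def inquire(self):
        lease_version, spm_id = 1, 1  # placeholder values for illustration
        return lease_version, spm_id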



Version-Release number of selected component (if applicable):


How reproducible:
Unclear; seems to reproduce 100% on this specific automated test (but so far *only* on this test).

Steps to Reproduce:
1. Create new DC, Cluster, Add new host
2. Add new iscsi storage domain to DC

Actual results:
Fails with the error listed above.

Expected results:
Should succeed

Additional info:

Comment 1 Federico Simoncelli 2014-07-17 11:57:31 UTC
The issue happens when you try to create a data center 3.5 using a master domain V1 (domVersion='0' in vdsm).

It shouldn't impact more than just a single test (create data center 3.5 with master domain V1).

Normally, any data center >= 3.1 should be created using a V3 master domain.
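
Assuming (as the traceback suggests) that a 3.5 pool reads SPM status by inquiring the master domain's cluster lock rather than from on-disk pool metadata, the problematic combination above could in principle be rejected up front. A hedged sketch of such a guard; the function and parameter names are made up for illustration and are not the vdsm API:

def validate_master_domain_version(dc_version, master_domain_version):
    # dc_version is a (major, minor) tuple, e.g. (3, 5).
    # A 3.5 data center needs to inquire the cluster lock for SPM status,
    # which only sanlock-backed (V3+) domains support, so refuse to build
    # the pool on top of an older master domain.
    if dc_version >= (3, 5) and master_domain_version < 3:
        raise ValueError(
            "Data center %d.%d requires a V3 or newer master domain, got V%d"
            % (dc_version[0], dc_version[1], master_domain_version))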

Relevant logs:

Thread-13::INFO::2014-07-15 09:25:19,376::logUtils::44::dispatcher::(wrapper) Run and protect: createStorageDomain(storageType=3, sdUUID='1c482ce1-fd64-4601-8687-a5ba4dcbf3c4', domainName='iscsi_0', typeSpecificArg='FuDRBH-I4ME-BLI0-55cy-lV7k-OyB0-bdSSb4', domClass=1, domVersion='0', options=None)

Thread-14::INFO::2014-07-15 09:25:26,872::logUtils::44::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='50840a07-b2eb-4486-bde3-f8e6e3592676', poolName='datacenter_async_tasks', masterDom='1c482ce1-fd64-4601-8687-a5ba4dcbf3c4', domList=['1c482ce1-fd64-4601-8687-a5ba4dcbf3c4'], masterVersion=1, lockPolicy=None, lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None)

Thread-14::INFO::2014-07-15 09:25:49,139::logUtils::44::dispatcher::(wrapper) Run and protect: connectStoragePool(spUUID='50840a07-b2eb-4486-bde3-f8e6e3592676', hostID=1, msdUUID='1c482ce1-fd64-4601-8687-a5ba4dcbf3c4', masterVersion=1, domainsMap={'1c482ce1-fd64-4601-8687-a5ba4dcbf3c4': 'active'}, options=None)

Thread-14::INFO::2014-07-15 09:25:49,624::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='50840a07-b2eb-4486-bde3-f8e6e3592676', options=None)
Thread-14::ERROR::2014-07-15 09:25:49,624::task::866::Storage.TaskManager.Task::(_setError) Task=`7c9416c5-81f5-4f93-97d6-30fd003fe869`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 611, in getSpmStatus
    status = self._getSpmStatusInfo(pool)
  File "/usr/share/vdsm/storage/hsm.py", line 605, in _getSpmStatusInfo
    (pool.spmRole,) + pool.getSpmStatus()))
  File "/usr/share/vdsm/storage/sp.py", line 126, in getSpmStatus
    return self._backend.getSpmStatus()
  File "/usr/share/vdsm/storage/spbackends.py", line 416, in getSpmStatus
    lVer, spmId = self.masterDomain.inquireClusterLock()
  File "/usr/share/vdsm/storage/sd.py", line 511, in inquireClusterLock
    return self._clusterLock.inquire()
  File "/usr/share/vdsm/storage/clusterlock.py", line 119, in inquire
    raise InquireNotSupportedError()
InquireNotSupportedError

Comment 2 Allon Mureinik 2014-07-21 12:37:07 UTC
(In reply to Federico Simoncelli from comment #1)
> It shouldn't impact more than just a single test (create data center 3.5
> with master domain V1).
Removing AutomationBlocker based on this.

Comment 3 Sven Kieske 2014-08-05 11:28:54 UTC
Shouldn't this test be removed if this feature is not important to test?

Comment 4 Allon Mureinik 2014-08-05 11:31:51 UTC
The flow of creating a domain and then creating a pool on top of it is a valid flow, and should be tested.

What should be removed is the V1 test (which we'll handle in bug 1120712).

Comment 6 Allon Mureinik 2014-08-27 16:29:41 UTC
Moved to ON_QA as bug 1120712 is already merged.

Comment 7 Raz Tamir 2014-09-01 11:59:36 UTC
verified

Comment 8 Federico Simoncelli 2014-09-08 22:41:15 UTC
Bug 1120712 is not a complete fix for this.

Comment 9 Federico Simoncelli 2014-09-10 07:53:35 UTC
The impact is: you cannot create a new 3.5 Data Center using a Storage Domain < V3 as its first master domain.

Bug 1120712 mitigated the issue but the problem is still present.

Considering that the fix would involve some complex logic that we'll throw away soon, I am not sure we want to fix this (ever).
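
For illustration only, one possible shape of that logic (a hypothetical fallback, not the fix that eventually shipped; InquireNotSupportedError is the exception from the traceback above, and getMetadataSpmInfo is an assumed helper):

def get_spm_status_info(pool):
    try:
        # Normal path for V3 (sanlock) master domains.
        lver, spm_id = pool.masterDomain.inquireClusterLock()
    except InquireNotSupportedError:
        # Fallback for V1/V2 (safelease) master domains: read the lease
        # version and SPM id from the master domain metadata instead of
        # failing the whole getSpmStatus verb.
        lver, spm_id = pool.getMetadataSpmInfo()
    return {'spmStatus': pool.spmRole, 'spmLver': lver, 'spmId': spm_id}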

Comment 10 Federico Simoncelli 2014-09-10 07:56:49 UTC
(In reply to Federico Simoncelli from comment #1)
> The issue happens when you try to create a data center 3.5 using a master
> domain V1 (domVersion='0' in vdsm).
> 
> It shouldn't impact more than just a single test (create data center 3.5
> with master domain V1).

Gil, I mentioned that we should have had an automated test failing on this for 3.5; the bug couldn't have been VERIFIED.

This scenario must be tested for data centers < 3.5.

Do you want to review together the matrix for the automated tests?

Comment 11 Allon Mureinik 2014-09-10 08:51:40 UTC
(In reply to Federico Simoncelli from comment #9)
> The impact is: you cannot create a new 3.5 Data Center using a Storage
> Domain < V3 as its first master domain.
> 
> Bug 1120712 mitigated the issue but the problem is still present.
> 
> Considering that the fix would involve some complex logic that we'll throw
> away soon, I am not sure we want to fix this (ever).

Reducing priority since you'd have to explicitly create a V1 domain to hit this, and pushing out to 3.5.1 while we rethink whether this is even worth the effort to fix.

Comment 12 Gil Klein 2014-09-10 10:54:32 UTC
Federico, could you please email me your suggestion for the testing matrix?

I'll pull in the relevant people from QE, and see if/when we could cover this in automation.

Comment 13 Nir Soffer 2014-09-10 16:39:49 UTC
*** Bug 1139401 has been marked as a duplicate of this bug. ***

Comment 14 Allon Mureinik 2014-11-20 13:28:08 UTC
*** Bug 1166066 has been marked as a duplicate of this bug. ***

Comment 15 Sandro Bonazzola 2015-03-03 12:57:06 UTC
Re-targeting to 3.5.3 since this bug has not been marked as a blocker for 3.5.2 and we have already released the 3.5.2 Release Candidate.

Comment 18 Red Hat Bugzilla Rules Engine 2015-09-22 07:43:25 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 19 Yaniv Lavi 2015-10-07 13:10:39 UTC
Can you please check whether this is fixed and how to test it?

Comment 20 Allon Mureinik 2015-10-15 09:44:19 UTC
Ala, what patch solves this? How come it's on MODIFIED?

Comment 21 Eyal Edri 2015-11-02 12:29:31 UTC
The bug is missing a patch in the external tracker.
Can you please add the relevant fix? Otherwise it's impossible to match/verify whether the fix is really in.

Comment 22 Raz Tamir 2015-11-10 12:46:38 UTC
Verified on vdsm-4.17.10.1-0.el7ev.noarch.
Followed the steps to reproduce.

Comment 25 errata-xmlrpc 2016-03-09 19:23:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html

