Bug 1118349 - [vdsm] Creating DataCenter 3.5 using master domain V1 fails with InquireNotSupportedError
Summary: [vdsm] Creating DataCenter 3.5 using master domain V1 fails with InquireNotSupportedError
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-3.6.0-rc3
Target Release: 3.6.0
Assignee: Ala Hino
QA Contact: Raz Tamir
URL:
Whiteboard:
Duplicates: 1139401 1166066 (view as bug list)
Depends On: 1120712
Blocks: 1242092
 
Reported: 2014-07-10 13:28 UTC by Gadi Ickowicz
Modified: 2019-10-10 09:22 UTC
24 users

Fixed In Version: v4.17.8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1242092 (view as bug list)
Environment:
Last Closed: 2016-03-09 19:23:12 UTC
oVirt Team: Storage
Target Upstream Version:
amureini: needinfo+


Attachments
engine and vdsm logs (725.33 KB, application/x-bzip)
2014-07-10 13:28 UTC, Gadi Ickowicz
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0362 0 normal SHIPPED_LIVE vdsm 3.6.0 bug fix and enhancement update 2016-03-09 23:49:32 UTC
oVirt gerrit 45965 0 None None None Never
oVirt gerrit 45966 0 None None None Never
oVirt gerrit 45967 0 None None None Never
oVirt gerrit 45968 0 None None None Never
oVirt gerrit 46019 0 None None None Never
oVirt gerrit 46020 0 None None None Never
oVirt gerrit 46021 0 None None None Never

Description Gadi Ickowicz 2014-07-10 13:28:36 UTC
Created attachment 917078 [details]
engine and vdsm logs

Description of problem:
We have an automated test that attempts to create, attach, and activate the first storage domain in a DC (an iSCSI storage domain) and fails on getSpmStatus with:

Thread-13::INFO::2014-07-10 10:37:56,625::logUtils::47::dispatcher::(wrapper) Run and protect: connectStoragePool, Return response: True
Thread-13::DEBUG::2014-07-10 10:37:56,625::task::1191::Storage.TaskManager.Task::(prepare) Task=`638d35cf-31d9-4436-8ba2-c99f93ad9fcd`::finished: True
Thread-13::DEBUG::2014-07-10 10:37:56,625::task::595::Storage.TaskManager.Task::(_updateState) Task=`638d35cf-31d9-4436-8ba2-c99f93ad9fcd`::moving from state preparing -> state finished
Thread-13::DEBUG::2014-07-10 10:37:56,626::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-13::DEBUG::2014-07-10 10:37:56,626::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-13::DEBUG::2014-07-10 10:37:56,626::task::993::Storage.TaskManager.Task::(_decref) Task=`638d35cf-31d9-4436-8ba2-c99f93ad9fcd`::ref 0 aborting False
Thread-13::DEBUG::2014-07-10 10:37:56,672::BindingXMLRPC::298::vds::(wrapper) client [10.35.161.69] flowID [564b5dd0]
Thread-13::DEBUG::2014-07-10 10:37:56,672::task::595::Storage.TaskManager.Task::(_updateState) Task=`0b81ee10-7f13-4667-a14a-345b3c059903`::moving from state init -> state preparing
Thread-13::INFO::2014-07-10 10:37:56,672::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='31510a2b-e15b-4476-b120-8b8fb1db0400', options=None)
Thread-13::ERROR::2014-07-10 10:37:56,673::task::866::Storage.TaskManager.Task::(_setError) Task=`0b81ee10-7f13-4667-a14a-345b3c059903`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 611, in getSpmStatus
    status = self._getSpmStatusInfo(pool)
  File "/usr/share/vdsm/storage/hsm.py", line 605, in _getSpmStatusInfo
    (pool.spmRole,) + pool.getSpmStatus()))
  File "/usr/share/vdsm/storage/sp.py", line 126, in getSpmStatus
    return self._backend.getSpmStatus()
  File "/usr/share/vdsm/storage/spbackends.py", line 416, in getSpmStatus
    lVer, spmId = self.masterDomain.inquireClusterLock()
  File "/usr/share/vdsm/storage/sd.py", line 511, in inquireClusterLock
    return self._clusterLock.inquire()
  File "/usr/share/vdsm/storage/clusterlock.py", line 119, in inquire
    raise InquireNotSupportedError()
InquireNotSupportedError
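For context, the raise at the bottom of the trace comes from the cluster lock layer: the legacy lease backend used by V1 domains has no way to report the current lease owner, while the sanlock backend used by V3 domains does. A minimal sketch of that pattern (hypothetical simplified classes loosely modeled on vdsm's clusterlock.py, not the actual implementation):

```python
class InquireNotSupportedError(Exception):
    """Raised when a lock backend cannot report the current lease owner."""


class SafeLease:
    """Sketch of the legacy lock backend used by V1/V2 storage domains.

    Its on-disk lease format carries no queryable owner information,
    so inquire() can only fail.
    """

    def inquire(self):
        raise InquireNotSupportedError()


class SANLock:
    """Sketch of the V3-domain backend; sanlock can report the lease
    version and the current owner's host id."""

    def inquire(self):
        return (1, 2)  # illustrative (lVer, spmId) pair


def get_spm_status(master_domain_lock):
    # getSpmStatus delegates to the *master* domain's cluster lock,
    # which is why a V1 master domain makes the whole call fail.
    return master_domain_lock.inquire()
```

In other words, getSpmStatus succeeds or fails purely depending on which lock backend the pool's master domain happens to use.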



Version-Release number of selected component (if applicable):


How reproducible:
? - seems to reproduce 100% on this specific automated test (but so far *only* on this test)

Steps to Reproduce:
1. Create new DC, Cluster, Add new host
2. Add new iscsi storage domain to DC

Actual results:
Fails with error listed above

Expected results:
Should succeed

Additional info:

Comment 1 Federico Simoncelli 2014-07-17 11:57:31 UTC
The issue happens when you try to create a data center 3.5 using a master domain V1 (domVersion='0' in vdsm).

It shouldn't impact more than just a single test (create data center 3.5 with master domain V1).

As a rule, you should create any data center >= 3.1 using a V3 master domain.
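The rule above could be expressed as a pre-flight check (a hypothetical helper for illustration, not vdsm code):

```python
def validate_master_domain(dc_version, domain_version):
    """Hypothetical guard: data centers >= 3.1 should be created on a
    V3 master domain, whose lock backend supports inquire().

    dc_version is a (major, minor) tuple; domain_version an int.
    """
    if dc_version >= (3, 1) and domain_version < 3:
        raise ValueError(
            "DC %d.%d needs a V3 master storage domain, got V%d"
            % (dc_version + (domain_version,)))
```

With such a check, the unsupported combination would be rejected at creation time instead of surfacing later as an InquireNotSupportedError in getSpmStatus.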

Relevant logs:

Thread-13::INFO::2014-07-15 09:25:19,376::logUtils::44::dispatcher::(wrapper) Run and protect: createStorageDomain(storageType=3, sdUUID='1c482ce1-fd64-4601-8687-a5ba4dcbf3c4', domainName='iscsi_0', typeSpecificArg='FuDRBH-I4ME-BLI0-55cy-lV7k-OyB0-bdSSb4', domClass=1, domVersion='0', options=None)

Thread-14::INFO::2014-07-15 09:25:26,872::logUtils::44::dispatcher::(wrapper) Run and protect: createStoragePool(poolType=None, spUUID='50840a07-b2eb-4486-bde3-f8e6e3592676', poolName='datacenter_async_tasks', masterDom='1c482ce1-fd64-4601-8687-a5ba4dcbf3c4', domList=['1c482ce1-fd64-4601-8687-a5ba4dcbf3c4'], masterVersion=1, lockPolicy=None, lockRenewalIntervalSec=5, leaseTimeSec=60, ioOpTimeoutSec=10, leaseRetries=3, options=None)

Thread-14::INFO::2014-07-15 09:25:49,139::logUtils::44::dispatcher::(wrapper) Run and protect: connectStoragePool(spUUID='50840a07-b2eb-4486-bde3-f8e6e3592676', hostID=1, msdUUID='1c482ce1-fd64-4601-8687-a5ba4dcbf3c4', masterVersion=1, domainsMap={'1c482ce1-fd64-4601-8687-a5ba4dcbf3c4': 'active'}, options=None)

Thread-14::INFO::2014-07-15 09:25:49,624::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='50840a07-b2eb-4486-bde3-f8e6e3592676', options=None)
Thread-14::ERROR::2014-07-15 09:25:49,624::task::866::Storage.TaskManager.Task::(_setError) Task=`7c9416c5-81f5-4f93-97d6-30fd003fe869`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 611, in getSpmStatus
    status = self._getSpmStatusInfo(pool)
  File "/usr/share/vdsm/storage/hsm.py", line 605, in _getSpmStatusInfo
    (pool.spmRole,) + pool.getSpmStatus()))
  File "/usr/share/vdsm/storage/sp.py", line 126, in getSpmStatus
    return self._backend.getSpmStatus()
  File "/usr/share/vdsm/storage/spbackends.py", line 416, in getSpmStatus
    lVer, spmId = self.masterDomain.inquireClusterLock()
  File "/usr/share/vdsm/storage/sd.py", line 511, in inquireClusterLock
    return self._clusterLock.inquire()
  File "/usr/share/vdsm/storage/clusterlock.py", line 119, in inquire
    raise InquireNotSupportedError()
InquireNotSupportedError

Comment 2 Allon Mureinik 2014-07-21 12:37:07 UTC
(In reply to Federico Simoncelli from comment #1)
> It shouldn't impact more than just a single test (create data center 3.5
> with master domain V1).
Removing AutomationBlocker based on this.

Comment 3 Sven Kieske 2014-08-05 11:28:54 UTC
Shouldn't this test be removed if it is not important
to test this feature?

Comment 4 Allon Mureinik 2014-08-05 11:31:51 UTC
The flow of creating a domain and then creating a pool on top of it is a valid flow, and should be tested.

What should be removed is the V1 test (which we'll handle in bug 1120712)

Comment 6 Allon Mureinik 2014-08-27 16:29:41 UTC
Moved to ON_QA as bug 1120712 is already merged.

Comment 7 Raz Tamir 2014-09-01 11:59:36 UTC
verified

Comment 8 Federico Simoncelli 2014-09-08 22:41:15 UTC
Bug 1120712 is not a complete fix for this.

Comment 9 Federico Simoncelli 2014-09-10 07:53:35 UTC
Impact of this is: you cannot create a new Data Center 3.5 using as first master domain a Storage Domain < V3.

Bug 1120712 mitigated the issue but the problem is still present.

Considering that the fix would involve some complex logic that we'll throw away soon, I am not sure we want to fix this (ever).

Comment 10 Federico Simoncelli 2014-09-10 07:56:49 UTC
(In reply to Federico Simoncelli from comment #1)
> The issue happens when you try to create a data center 3.5 using a master
> domain V1 (domVersion='0' in vdsm).
> 
> It shouldn't impact more than just a single test (create data center 3.5
> with master domain V1).

Gil, as I mentioned, we should have had an automated test failing on this for 3.5; the bug couldn't have been VERIFIED.

This scenario must be tested for data centers < 3.5.

Do you want to review together the matrix for the automated tests?

Comment 11 Allon Mureinik 2014-09-10 08:51:40 UTC
(In reply to Federico Simoncelli from comment #9)
> Impact of this is: you cannot create a new Data Center 3.5 using as first
> master domain a Storage Domain < V3.
> 
> Bug 1120712 mitigated the issue but the problem is still present.
> 
> Considering that the fix would involve some complex logic that we'll throw
> away soon I am not sure if we want to fix this (ever).

Reducing priority since you'd have to explicitly create a V1 domain for this, and pushing out to 3.5.1 while we rethink if this is even worth the effort to fix.

Comment 12 Gil Klein 2014-09-10 10:54:32 UTC
Federico, could you please email me your suggestion for the testing matrix?

I'll pull in the relevant people from QE, and see if/when we could cover this in automation.

Comment 13 Nir Soffer 2014-09-10 16:39:49 UTC
*** Bug 1139401 has been marked as a duplicate of this bug. ***

Comment 14 Allon Mureinik 2014-11-20 13:28:08 UTC
*** Bug 1166066 has been marked as a duplicate of this bug. ***

Comment 15 Sandro Bonazzola 2015-03-03 12:57:06 UTC
Re-targeting to 3.5.3 since this bug has not been marked as blocker for 3.5.2 and we have already released 3.5.2 Release Candidate.

Comment 18 Red Hat Bugzilla Rules Engine 2015-09-22 07:43:25 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 19 Yaniv Lavi 2015-10-07 13:10:39 UTC
Can you please check if this is fixed and how to test this?

Comment 20 Allon Mureinik 2015-10-15 09:44:19 UTC
Ala, what patch solves this? How come it's on MODIFIED?

Comment 21 Eyal Edri 2015-11-02 12:29:31 UTC
The bug is missing its patch in the external tracker;
can you please add the relevant fix? Otherwise it's impossible to match/verify whether the fix is really in.

Comment 22 Raz Tamir 2015-11-10 12:46:38 UTC
Verified on vdsm-4.17.10.1-0.el7ev.noarch.
Followed the steps to reproduce.

Comment 25 errata-xmlrpc 2016-03-09 19:23:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html

