Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 670432

Summary: [vdsm] [storage] migrate master scenario - vdsm uses old values in case a restart takes place during the operation
Product: Red Hat Enterprise Linux 6
Reporter: Haim <hateya>
Component: vdsm
Assignee: Saggi Mizrahi <smizrahi>
Status: CLOSED ERRATA
QA Contact: Haim <hateya>
Severity: high
Docs Contact:
Priority: low
Version: 6.1
CC: abaron, bazulay, danken, dnaori, ewarszaw, iheim, mgoldboi, smizrahi, yeylon
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: vdsm-4.9-61
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-12-06 07:04:20 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
vdsm log (flags: none)

Description Haim 2011-01-18 09:47:02 UTC
Description of problem:

In deactivateStorageDomain during a migrate-master scenario, vdsm uses old values of the master version and therefore fails to connect to the storage pool (pool not connected).
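A minimal sketch of the failure mode described above, assuming a simplified model of vdsm's pool metadata (the class and exception names mirror the log below, but the logic here is illustrative, not vdsm's actual code): migrating the master bumps the on-disk master version, while a restart mid-operation leaves the value cached for reconnection at the old version, so validation on reconnect fails.

```python
class StoragePoolWrongMaster(Exception):
    """Raised when the cached master version does not match the pool metadata."""


class StoragePool:
    def __init__(self, spUUID, master_version):
        self.spUUID = spUUID
        # Authoritative value, as stored in the pool metadata on disk.
        self._on_disk_master_version = master_version
        # Value cached locally for reconnecting after a service restart.
        self.cached_master_version = master_version

    def migrate_master(self, new_version):
        # deactivateStorageDomain / migrate-master bumps the on-disk version.
        # A vdsm restart in the middle of the operation leaves the cached
        # value at the old version.
        self._on_disk_master_version = new_version

    def reconnect(self):
        # On restart, _restorePool() reconnects using the cached (stale)
        # masterVersion, and the validation against the pool metadata fails.
        if self.cached_master_version != self._on_disk_master_version:
            raise StoragePoolWrongMaster(
                "Wrong Master domain or its version: pool=%s" % self.spUUID)
        return True
```

With a stale cache, `reconnect()` raises StoragePoolWrongMaster, matching the traceback in the log below; refreshing the cached value lets the reconnect succeed.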

Thread-2904::INFO::2011-01-17 15:15:44,269::dispatcher::95::irs::Run and protect: deactivateStorageDomain, args: ( sdUUID=ae0b976c-83b0-458c-be2a-265637529d78 spUUID=04422aa0-39e6-475c-adac-ffb2ddf1e40c msdUUID=29b93fd7-1a68-406e-bfcf-3e85828575b7 masterVersion=2)


MainThread::ERROR::2011-01-17 16:15:17,891::misc::65::irs::Wrong Master domain or its version: 'SD=ae0b976c-83b0-458c-be2a-265637529d78, pool=04422aa0-39e6-475c-adac-ffb2ddf1e40c'
MainThread::ERROR::2011-01-17 16:15:17,892::misc::66::irs::Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 223, in __init__
    self._restorePool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 417, in _restorePool
    pool.reconnect()
  File "/usr/share/vdsm/storage/sp.py", line 514, in reconnect
    return self.connect(hostId, scsiKey, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 411, in connect
    mDom = self.getMasterDomain(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1223, in getMasterDomain
    self.masterDomain = self.findMasterDomain(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1293, in findMasterDomain
    raise e
StoragePoolWrongMaster: Wrong Master domain or its version: 'SD=ae0b976c-83b0-458c-be2a-265637529d78, pool=04422aa0-39e6-475c-adac-ffb2ddf1e40c'

MainThread::INFO::2011-01-17 16:15:17,910::dispatcher::139::irs::Starting StorageDispatcher...
Thread-17::INFO::2011-01-17 16:15:18,411::dispatcher::95::irs::Run and protect: getSpmStatus, args: ( spUUID=04422aa0-39e6-475c-adac-ffb2ddf1e40c)
Thread-17::DEBUG::2011-01-17 16:15:18,411::task::577::irs::Task 459b58b1-297d-493a-9963-eb170ad729bb: moving from state init -> state preparing
Thread-17::ERROR::2011-01-17 16:15:18,412::misc::65::irs::Unknown pool id, pool not connected: ('04422aa0-39e6-475c-adac-ffb2ddf1e40c',)
Thread-17::ERROR::2011-01-17 16:15:18,414::misc::66::irs::Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 978, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/spm.py", line 578, in public_getSpmStatus
    hsm.HSM.validateConnectedPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 86, in validateConnectedPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('04422aa0-39e6-475c-adac-ffb2ddf1e40c',)

The backend then sends connectStorageServer and connectStoragePool again, and the host is connected to the pool.
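The recovery path can be sketched as follows. This is a hypothetical model, not vdsm's or the backend's real API: after the host reports StoragePoolUnknown, the backend re-issues the connect calls with the current msdUUID and masterVersion, after which pool queries succeed again.

```python
class Host:
    """Toy stand-in for a vdsm host; tracks which pools it is connected to."""

    def __init__(self):
        self.connected_pools = {}

    def connectStorageServer(self, spUUID):
        # Establish storage connections for the pool (no-op in this sketch).
        return True

    def connectStoragePool(self, spUUID, msdUUID, masterVersion):
        # Record the up-to-date master domain and version for the pool.
        self.connected_pools[spUUID] = (msdUUID, masterVersion)
        return True

    def getSpmStatus(self, spUUID):
        if spUUID not in self.connected_pools:
            raise KeyError("Unknown pool id, pool not connected: %s" % spUUID)
        return "Free"


def recover(host, spUUID, msdUUID, masterVersion):
    # Re-send the connect sequence with current values, then re-query status.
    host.connectStorageServer(spUUID)
    host.connectStoragePool(spUUID, msdUUID, masterVersion)
    return host.getSpmStatus(spUUID)
```

Once the connect sequence is replayed with the current master values, getSpmStatus no longer fails with an unknown-pool error.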

Reproduction steps:

1) work with several storage domains 
2) put master domain in maintenance

notes: 

1) Ayal reviewed this bug on RHEL 5.5.6 and asked to open a bug on RHEL 6 (2.3)
2) see attached log

Comment 1 Haim 2011-01-18 09:49:41 UTC
Note: "restart" means a restart of the vdsm service.

Comment 3 Haim 2011-01-18 12:06:09 UTC
Created attachment 474036 [details]
vdsm log.

Comment 5 Saggi Mizrahi 2011-04-07 16:44:56 UTC
Patches in gerrit:
http://gerrit.usersys.redhat.com/247

Comment 6 Haim 2011-05-01 16:16:28 UTC
Verified on vdsm-4.9-62: migrated the master domain several times, restarted the service, and the operation passed as expected.

Comment 7 errata-xmlrpc 2011-12-06 07:04:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html