Bug 1397189 - [HE] hosted_engine storage domain fail to take master domain role
Summary: [HE] hosted_engine storage domain fail to take master domain role
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.HostedEngine
Version: 4.0.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Doron Fediuck
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks: 1400127
Reported: 2016-11-21 20:33 UTC by Raz Tamir
Modified: 2017-05-11 09:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-22 07:31:57 UTC
oVirt Team: SLA
Embargoed:
ratamir: planning_ack?
ratamir: devel_ack?
ratamir: testing_ack?


Attachments
engine and vdsm logs (441.54 KB, application/x-gzip)
2016-11-21 20:33 UTC, Raz Tamir

Description Raz Tamir 2016-11-21 20:33:47 UTC
Created attachment 1222475
engine and vdsm logs

Description of problem:
In an environment with one data domain (the master) and one hosted_storage storage domain, deactivating the master domain should cause the hosted_storage domain to take over the master domain role, but it fails to do so.
First, the ovirt-engine service is restarted. Once the service is running again, a ReconstructMasterDomain action also fails (engine.log):
2016-11-21 14:45:40,382 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler10) [] IrsBroker::Failed::GetStoragePoolInfoVDS: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Wrong Master domain or its version: u'SD=12f05e4f-382b-4319-bcca-b88703bb79ca, pool=c1390294-e3f4-45ba-84cc-a03a2ef561ff'
2016-11-21 14:45:40,535 WARN  [org.ovirt.engine.core.bll.storage.pool.ReconstructMasterDomainCommand] (org.ovirt.thread.pool-6-thread-50) [e6c81c5] Validation of action 'ReconstructMasterDomain' failed for user SYSTEM. Reasons: VAR__ACTION__RECONSTRUCT_MASTER,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL2,$status PreparingForMaintenance

In vdsm.log:
jsonrpc.Executor/7::ERROR::2016-11-21 21:44:49,659::task::868::Storage.TaskManager.Task::(_setError) Task=`39bba449-8ef3-405c-b3ac-5a7db84ff3e3`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 988, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/share/vdsm/storage/hsm.py", line 1053, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 646, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1237, in __rebuild
    raise se.StoragePoolWrongMaster(self.spUUID, msdUUID)
StoragePoolWrongMaster: Wrong Master domain or its version: u'SD=12f05e4f-382b-4319-bcca-b88703bb79ca, pool=c1390294-e3f4-45ba-84cc-a03a2ef561ff'
jsonrpc.Executor/7::DEBUG::2016-11-21 21:44:49,659::task::887::Storage.TaskManager.Task::(_run) Task=`39bba449-8ef3-405c-b3ac-5a7db84ff3e3`::Task._run: 39bba449-8ef3-405c-b3ac-5a7db84ff3e3 (u'c1390294-e3f4-45ba-84cc-a03a2ef561ff', 4, u'12f05e4f-382b-4319-bcca-b88703bb79ca', 1, {u'9e1ce814-e23f-427b-ab41-b675ccd15e28': u'attached', u'129310da-7b83-4067-b35a-f377e6468310': u'attached', u'd24e3477-6799-48e2-aa8e-bf4776ec8463': u'attached', u'01bc79af-a55b-48e4-b451-cc7ff59ae8e6': u'active', u'7376201f-0c83-4fe7-a2ca-24893dd1de8c': u'attached', u'40d08016-cb96-4771-bd65-3910157ecefa': u'attached', u'6d03d0e7-4758-46c2-9cef-80f156851710': u'active', u'2a82ecbd-e7bd-473f-a713-0143ec06170a': u'attached', u'12f05e4f-382b-4319-bcca-b88703bb79ca': u'attached', u'b44a3e15-2ee0-4330-9d45-380109bffd54': u'attached', u'c9311021-8765-4058-b0c6-c02228574117': u'attached'}) {} failed - stopping task
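
For triage, the domain map passed to connectStoragePool in the failing task above can be checked against the requested master domain UUID. The following is a minimal Python sketch, not part of vdsm. The helper names `master_status` and `active_domains` are hypothetical, and the values are copied from the task arguments in the log line above (the map is abbreviated to three entries).

```python
# Hypothetical triage helpers; values copied from the vdsm.log task arguments above.
MSD_UUID = "12f05e4f-382b-4319-bcca-b88703bb79ca"  # master domain requested in connectStoragePool

# Subset of the domainsMap argument from the failing task.
DOMAINS_MAP = {
    "01bc79af-a55b-48e4-b451-cc7ff59ae8e6": "active",
    "6d03d0e7-4758-46c2-9cef-80f156851710": "active",
    "12f05e4f-382b-4319-bcca-b88703bb79ca": "attached",
}


def master_status(domains_map, msd_uuid):
    """Return the status the map records for the requested master domain, or None."""
    return domains_map.get(msd_uuid)


def active_domains(domains_map):
    """Return the sorted UUIDs whose recorded status is 'active'."""
    return sorted(uuid for uuid, status in domains_map.items() if status == "active")


if __name__ == "__main__":
    print(master_status(DOMAINS_MAP, MSD_UUID))
    print(active_domains(DOMAINS_MAP))
```

In this log the requested master domain is recorded as attached rather than active. Whether that is related to the StoragePoolWrongMaster error is not established by this report.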



Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
Setup: 1 data SD (master), 1 hosted_engine SD
1. Deactivate the master storage domain

Actual results:
The ovirt-engine service is restarted. Five minutes later, once the environment is accessible again, a ReconstructMasterDomain action also fails.


Expected results:
The hosted_engine storage domain should become the master domain.

Additional info:

Comment 1 Doron Fediuck 2016-11-22 07:31:57 UTC
The hosted storage domain cannot become the master domain because it is controlled externally (the connect was done by the HA agent via vdsm).
This is by design: we need to decide who is in control, the engine or the HA agent, and the current implementation keeps the agent in control.
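
The design described above can be pictured as a filter applied when choosing a replacement master domain. This is only an illustrative Python sketch under stated assumptions: the `StorageDomain` class, its `externally_managed` flag, and `pick_new_master` are hypothetical names, not ovirt-engine code, and the real selection logic may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageDomain:
    name: str
    status: str               # e.g. "active", "maintenance"
    is_data: bool
    externally_managed: bool  # hypothetical flag: connected by the HA agent, not the engine


def pick_new_master(domains: List[StorageDomain], old_master: str) -> Optional[StorageDomain]:
    """Return the first active data domain that the engine itself controls, or None."""
    for domain in domains:
        if domain.name == old_master:
            continue  # the domain being deactivated is not a candidate
        if domain.is_data and domain.status == "active" and not domain.externally_managed:
            return domain
    return None


if __name__ == "__main__":
    pool = [
        StorageDomain("data1", "active", True, False),
        StorageDomain("hosted_storage", "active", True, True),
    ]
    print(pick_new_master(pool, old_master="data1"))
```

With the setup from this report (one data domain plus hosted_storage), the sketch finds no candidate, which matches the behavior described as being by design.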

