Created attachment 1222475
engine and vdsm logs

Description of problem:
In an environment with 1 data domain (the master) and 1 hosted_storage storage domain, deactivating the master domain should cause the hosted_storage domain to take over the master role, but this fails.
First, the ovirt-engine service is restarted. Once the service is running again, a ReconstructMasterDomain action also fails (engine.log):

2016-11-21 14:45:40,382 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler10) [] IrsBroker::Failed::GetStoragePoolInfoVDS: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Wrong Master domain or its version: u'SD=12f05e4f-382b-4319-bcca-b88703bb79ca, pool=c1390294-e3f4-45ba-84cc-a03a2ef561ff'

2016-11-21 14:45:40,535 WARN [org.ovirt.engine.core.bll.storage.pool.ReconstructMasterDomainCommand] (org.ovirt.thread.pool-6-thread-50) [e6c81c5] Validation of action 'ReconstructMasterDomain' failed for user SYSTEM. Reasons: VAR__ACTION__RECONSTRUCT_MASTER,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_STATUS_ILLEGAL2,$status PreparingForMaintenance

In vdsm.log:

jsonrpc.Executor/7::ERROR::2016-11-21 21:44:49,659::task::868::Storage.TaskManager.Task::(_setError) Task=`39bba449-8ef3-405c-b3ac-5a7db84ff3e3`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 988, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/share/vdsm/storage/hsm.py", line 1053, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 646, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1237, in __rebuild
    raise se.StoragePoolWrongMaster(self.spUUID, msdUUID)
StoragePoolWrongMaster: Wrong Master domain or its version: u'SD=12f05e4f-382b-4319-bcca-b88703bb79ca, pool=c1390294-e3f4-45ba-84cc-a03a2ef561ff'
jsonrpc.Executor/7::DEBUG::2016-11-21 21:44:49,659::task::887::Storage.TaskManager.Task::(_run) Task=`39bba449-8ef3-405c-b3ac-5a7db84ff3e3`::Task._run: 39bba449-8ef3-405c-b3ac-5a7db84ff3e3 (u'c1390294-e3f4-45ba-84cc-a03a2ef561ff', 4, u'12f05e4f-382b-4319-bcca-b88703bb79ca', 1, {u'9e1ce814-e23f-427b-ab41-b675ccd15e28': u'attached', u'129310da-7b83-4067-b35a-f377e6468310': u'attached', u'd24e3477-6799-48e2-aa8e-bf4776ec8463': u'attached', u'01bc79af-a55b-48e4-b451-cc7ff59ae8e6': u'active', u'7376201f-0c83-4fe7-a2ca-24893dd1de8c': u'attached', u'40d08016-cb96-4771-bd65-3910157ecefa': u'attached', u'6d03d0e7-4758-46c2-9cef-80f156851710': u'active', u'2a82ecbd-e7bd-473f-a713-0143ec06170a': u'attached', u'12f05e4f-382b-4319-bcca-b88703bb79ca': u'attached', u'b44a3e15-2ee0-4330-9d45-380109bffd54': u'attached', u'c9311021-8765-4058-b0c6-c02228574117': u'attached'}) {} failed - stopping task

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
Setup: 1 data SD (master), 1 hosted_engine SD
1. Deactivate the master storage domain

Actual results:
The ovirt-engine service is restarted and, 5 minutes later, once the environment is accessible again, a ReconstructMasterDomain action fails as well.

Expected results:
The hosted_engine storage domain should become the master domain.

Additional info:

The hosted storage domain cannot become the master domain because it is controlled externally (the connect was done by the ha agent via vdsm). This is by design: either the engine or the ha agent has to be in control of the domain, and the current implementation keeps the agent in control.
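
To illustrate the rule described in the comment above, here is a minimal Python sketch of a master-candidate selection that skips any domain the engine does not control. It is only an illustration and is not taken from the engine or vdsm code: the Domain class, the externally_managed flag and pick_new_master are hypothetical names, and the real reconstruct logic may apply other checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Domain:
    # Hypothetical model for illustration, not an engine or vdsm structure.
    name: str
    status: str               # e.g. "active" or "attached"
    externally_managed: bool  # True when the ha agent, not the engine, connected it


def pick_new_master(domains: List[Domain], current_master: str) -> Optional[Domain]:
    """Return the first active, engine-controlled domain other than the current master.

    Returns None when no such domain exists, which corresponds to the setup in
    this report: the only other domain is the hosted storage domain.
    """
    for domain in domains:
        if domain.name == current_master:
            continue  # the domain being deactivated cannot replace itself
        if domain.status != "active":
            continue  # only active domains are considered in this sketch
        if domain.externally_managed:
            continue  # controlled by the ha agent, so never a master candidate
        return domain
    return None


if __name__ == "__main__":
    setup = [
        Domain("data", "active", externally_managed=False),
        Domain("hosted_storage", "active", externally_managed=True),
    ]
    # With only the hosted storage domain left, there is no candidate.
    print(pick_new_master(setup, current_master="data"))
```

Under these assumptions, the environment from this report has no valid candidate once the master data domain is deactivated, while adding a second regular data domain would give the selection something to return.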