Description of problem:
When attaching a storage domain to a DC and restarting the engine service while the attachment is in progress, the storage domain appears Locked in the UI and no actions can be performed on it, except destroy.

Version-Release number of selected component (if applicable):
ovirt-engine-4.1.0-0.2.master.20161203231307.gitd7d920b.el7.centos.noarch
vdsm-4.18.999-1138.git6c51957.el7.centos.x86_64

How reproducible:
Tried with 2 SDs, reproduced on both.

Steps to Reproduce:
1. Attach a storage domain to a DC
2. While the process is still running, restart the ovirt-engine service
3. Wait for the UI to come back and check the storage domain's status

Actual results:
The storage domain appears Locked and no actions can be performed on it (except destroy).

Expected results:
The storage domain should be unattached and the user should be able to attach it to the DC.

Additional info:

vdsm.log
2016-12-14 14:23:36,308 INFO (jsonrpc/1) [dispatcher] Run and protect: connectStoragePool(spUUID=u'cb93d507-6f32-4eda-b916-c99ff6a7afe1', hostID=1, msdUUID=u'bd0c9dd0-ca22-4ce5-bb47-3c903409baec', masterVersion=12, domainsMap={u'bd0c9dd0-ca22-4ce5-bb47-3c903409baec': u'active', u'8c14efe4-c881-47e7-a5b8-0fa8d3179e07': u'active', u'67944510-99f4-4746-88a0-ba5c6aeaf21d': u'active', u'ed6f577a-2d9c-4c31-ac08-720edf376940': u'active'}, options=None) (logUtils:49)

engine.log
2016-12-14 14:21:48,529+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainInfoVDSCommand] (org.ovirt.thread.pool-6-thread-6) [ed406bd5-ba7c-401e-a444-4b6be6b17010] FINISH, HSMGetStorageDomainInfoVDSCommand, return: <StorageDomainStatic:{name='unattached_sd2', id='67944510-99f4-4746-88a0-ba5c6aeaf21d'}, null>, log id: 5dc0c956
2016-12-14 14:21:48,533+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.AttachStorageDomainVDSCommand] (org.ovirt.thread.pool-6-thread-6) [ed406bd5-ba7c-401e-a444-4b6be6b17010] START, AttachStorageDomainVDSCommand( AttachStorageDomainVDSCommandParameters:{runAsync='true', storagePoolId='cb93d507-6f32-4eda-b916-c99ff6a7afe1', ignoreFailoverLimit='false', storageDomainId='67944510-99f4-4746-88a0-ba5c6aeaf21d'}), log id: 5a796712
Created attachment 1231753: logs zip (engine.log, vdsm.log)
Looking through the patch attached to the BZ is a bit unsettling. While it should indeed solve the bug described here, the issue is deeper than just this flow. The bug occurs in the compensation infrastructure and would, in theory, affect every flow that uses it if the engine is restarted in the middle of that flow.

Raz, at the very least I think we should wait with engine-restart tests until QA has a build with this fix. Do you want to track this here, or open separate BZ(s) for it?
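To illustrate why a bug in compensation would not be limited to the attach flow, here is a minimal sketch of the general idea. It is a toy model, not the engine's code: the names `CompensationStore`, `begin_attach`, `finish_attach` and `compensate_on_startup` are hypothetical, and the in-memory dicts stand in for the engine database. A flow saves the entity's previous status before locking it, and on startup any leftover snapshots are replayed to undo flows that never finished.

```python
class CompensationStore:
    """Hypothetical stand-in for a durable table of pre-change snapshots."""

    def __init__(self):
        # command_id -> list of (entity_id, previous_status)
        self.snapshots = {}

    def snapshot(self, command_id, entity_id, previous_status):
        self.snapshots.setdefault(command_id, []).append(
            (entity_id, previous_status)
        )

    def clear(self, command_id):
        self.snapshots.pop(command_id, None)


def begin_attach(db, store, command_id, domain_id):
    # Save the old status first, then lock the domain for the operation.
    store.snapshot(command_id, domain_id, db[domain_id])
    db[domain_id] = "Locked"


def finish_attach(db, store, command_id, domain_id):
    # The flow completed, so its snapshots are no longer needed.
    db[domain_id] = "Active"
    store.clear(command_id)


def compensate_on_startup(db, store):
    # Any snapshot still present belongs to a flow that was interrupted.
    # Restore the saved statuses in reverse order and drop the snapshots.
    for command_id in list(store.snapshots):
        for entity_id, previous_status in reversed(store.snapshots[command_id]):
            db[entity_id] = previous_status
        store.clear(command_id)


# Simulate the scenario from this BZ: attach starts, engine restarts mid-flow.
db = {"unattached_sd2": "Unattached"}
store = CompensationStore()
begin_attach(db, store, "cmd-1", "unattached_sd2")
status_before_restart = db["unattached_sd2"]
compensate_on_startup(db, store)  # runs when the engine comes back up
status_after_restart = db["unattached_sd2"]
```

In this model, if the startup step is skipped or fails, the domain stays Locked, and the same would be true for any other flow that relies on the same snapshots.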
Allon, we can track it here.
--------------------------------------
Tested with the following code:
----------------------------------------
rhevm-4.1.0-0.3.beta2.el7.noarch
vdsm-4.19.1-1.el7ev.x86_64

Tested with the following scenario:

Steps to Reproduce:
1. attach storage domain to a dc
2. while the process is still running, restart the ovirt-engine service
3. wait for the UI to come back and check the storage domains' status

Actual results:
After ovirt-engine restart, the attached storage domain appears unattached and can be attached again.

Moving to VERIFIED!