Created attachment 1365572 [details]
engine and vdsm logs

Description of problem:
In our automation, reconstructing the master storage domain fails after deactivating a gluster-type master storage domain.

engine.log:

2017-12-09 04:33:43,042+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.DeactivateStorageDomainVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-20) [30c9dcfd] Failed in 'DeactivateStorageDomainVDS' method
2017-12-09 04:33:43,049+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-20) [30c9dcfd] EVENT_ID: IRS_BROKER_COMMAND_FAILURE(10,803), VDSM command DeactivateStorageDomainVDS failed: Error in storage domain action: ('sdUUID=650155db-bb01-4234-a974-724658ea8365, spUUID=93512097-d821-43d7-806d-1b7cb44091b4, msdUUID=6fae4482-ded3-4d6f-bf82-91ec82619032, masterVersion=3',)
2017-12-09 04:33:43,050+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-20) [30c9dcfd] IrsBroker::Failed::DeactivateStorageDomainVDS: IRSGenericException: IRSErrorException: Failed to DeactivateStorageDomainVDS, error = Error in storage domain action: ('sdUUID=650155db-bb01-4234-a974-724658ea8365, spUUID=93512097-d821-43d7-806d-1b7cb44091b4, msdUUID=6fae4482-ded3-4d6f-bf82-91ec82619032, masterVersion=3',), code = 350

vdsm.log:

2017-12-09 04:33:42,988+0200 ERROR (jsonrpc/0) [storage.StoragePool] migration to new master failed (sp:900)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 889, in masterMigrate
    exclude=('./lost+found',))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 67, in tarCopy
    raise TarCopyFailed(tsrc.returncode, tdst.returncode, out, err)
TarCopyFailed: (1, 0, '', '')
2017-12-09 04:33:42,992+0200 DEBUG (jsonrpc/0) [storage.PersistentDict] Starting transaction (persistent:169)
2017-12-09 04:33:42,993+0200 DEBUG (jsonrpc/0) [storage.PersistentDict] Finished transaction (persistent:177)
2017-12-09 04:33:42,994+0200 INFO (jsonrpc/0) [storage.SANLock] Releasing Lease(name='SDM', path=u'/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge3__volume__2/6fae4482-ded3-4d6f-bf82-91ec82619032/dom_md/leases', offset=1048576) (clusterlock:435)
2017-12-09 04:33:43,023+0200 INFO (jsonrpc/0) [storage.SANLock] Successfully released Lease(name='SDM', path=u'/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge3__volume__2/6fae4482-ded3-4d6f-bf82-91ec82619032/dom_md/leases', offset=1048576) (clusterlock:444)
2017-12-09 04:33:43,024+0200 INFO (jsonrpc/0) [vdsm.api] FINISH deactivateStorageDomain error=(1, 0, '', '') from=::ffff:10.35.161.183,38888, flow_id=30c9dcfd, task_id=614923f7-de0b-41fd-a33e-f7e5c7be83e6 (api:50)
2017-12-09 04:33:43,025+0200 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='614923f7-de0b-41fd-a33e-f7e5c7be83e6') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in deactivateStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1234, in deactivateStorageDomain
    pool.deactivateSD(sdUUID, msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1191, in deactivateSD
    self.masterMigrate(sdUUID, newMsdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 889, in masterMigrate
    exclude=('./lost+found',))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 67, in tarCopy
    raise TarCopyFailed(tsrc.returncode, tdst.returncode, out, err)
TarCopyFailed: (1, 0, '', '')
2017-12-09 04:33:43,026+0200 DEBUG (jsonrpc/0) [storage.TaskManager.Task] (Task='614923f7-de0b-41fd-a33e-f7e5c7be83e6') Task._run: 614923f7-de0b-41fd-a33e-f7e5c7be83e6 ('650155db-bb01-4234-a974-724658ea8365', '93512097-d821-43d7-806d-1b7cb44091b4', '6fae4482-ded3-4d6f-bf82-91ec82619032', 3) {} failed - stopping task (task:894)

After this failure, the environment contains two master storage domains: one is the "old" gluster master, and the second master role rotates among the remaining storage domains (I have 8 more, of types gluster, nfs and iscsi).

Version-Release number of selected component (if applicable):
vdsm-4.20.9-1.el7ev.x86_64
rhvm-4.2.0-0.6.el7

How reproducible:
Seen once so far

Steps to Reproduce:
1. Deactivate master storage domain of gluster type
2.
3.

Actual results:
Deactivating the master fails and a second master storage domain appears

Expected results:

Additional info:
* Couldn't reproduce it manually
Created attachment 1365573 [details] screenshot
*** This bug has been marked as a duplicate of bug 1514025 ***
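Note on reading the TarCopyFailed: (1, 0, '', '') value in the vdsm.log traceback: the raise statement in the traceback shows the field order as (tsrc.returncode, tdst.returncode, out, err), so the source side exited with 1 and the destination side with 0, with no captured output. The sketch below decodes such a tuple. It assumes the binary behind tarCopy is GNU tar (not confirmed by the logs); the status descriptions come from the GNU tar manual, and decode_tar_copy_failed is a hypothetical helper, not part of vdsm.

```python
import ast

# Exit statuses as documented in the GNU tar manual.
# Assumption: the tar binary used by tarCopy is GNU tar.
GNU_TAR_STATUS = {
    0: "success",
    1: "some files differ (in create mode: a file changed while being archived)",
    2: "fatal error",
}


def describe_status(returncode):
    """Map a tar exit status to a short description."""
    return GNU_TAR_STATUS.get(returncode, "unknown status %d" % returncode)


def decode_tar_copy_failed(text):
    """Decode the tuple printed after 'TarCopyFailed:' in vdsm.log.

    Field order follows the raise statement shown in the traceback:
    (tsrc.returncode, tdst.returncode, out, err).
    This helper is hypothetical and only illustrates how to read the tuple.
    """
    src_rc, dst_rc, out, err = ast.literal_eval(text.strip())
    return {
        "source": describe_status(src_rc),
        "destination": describe_status(dst_rc),
        "out": out,
        "err": err,
    }


if __name__ == "__main__":
    info = decode_tar_copy_failed("(1, 0, '', '')")
    for key in ("source", "destination", "out", "err"):
        print("%s: %r" % (key, info[key]))
```

If the GNU tar assumption holds, status 1 on the source side would mean that a file changed while it was being archived, not that the copy hit a fatal error. The empty out and err fields mean the log alone does not say which file was involved.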