Created attachment 602365 [details]
logs

Description of problem:
After a storage failure, the system sends a reconstruct master, and I sent a reinitialize to a second domain (also a reconstruct). As a result, the reinitialize failed in the backend and the whole action was rolled back, removing the second domain I had added (via reinitialize) from the pool. vdsm, however, has two master domains under the same pool. We are unable to recover from this state, since we get the wrong master domain in the backend and vdsm hits a sanlock error: AcquireLockFailure: Cannot obtain lock.

Version-Release number of selected component (if applicable):
si13

How reproducible:
100%

Steps to Reproduce:
1. 1 domain and 1 host, iSCSI. Block connectivity to the domain from the vds.
2. Once we get "can't find master domain", remove the iptables block from the vds.
3. Reinitialize storage.

Actual results:
The backend gets an exception on MarkPoolInReconstructModeVDSCommand and rolls back the reinitialize (in the backend only, leaving only the original domain in the pool in the db). Both actions continue in vdsm, which creates two master domains on the vds, plus a domain which is not connected to the pool in the backend db but is connected to the pool in vdsm. We cannot recover; vdsm reports a sanlock error: AcquireLockFailure: Cannot obtain lock.

Expected results:
We should block one of the actions (whichever started first).

Additional info:
Both vdsm and backend logs are attached.

vdsm log:

e0a63f-a60e-4bb6-8394-80e3043c6204::INFO::2012-08-05 17:35:50,086::safelease::211::SANLock::(acquire) Acquiring cluster lock for domain e3c8df23-d94e-4d9f-ba77-6114c76776af (id: 1)
1de0a63f-a60e-4bb6-8394-80e3043c6204::ERROR::2012-08-05 17:35:50,087::task::853::TaskManager.Task::(_setError) Task=`1de0a63f-a60e-4bb6-8394-80e3043c6204`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/sp.py", line 255, in startSpm
    self.masterDomain.acquireClusterLock(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 439, in acquireClusterLock
    self._clusterLock.acquire(hostID)
  File "/usr/share/vdsm/storage/safelease.py", line 228, in acquire
    "Cannot acquire cluster lock", str(e))
AcquireLockFailure: Cannot obtain lock: "id=e3c8df23-d94e-4d9f-ba77-6114c76776af, rc=17, out=Cannot acquire cluster lock, err=(17, 'Sanlock resource not acquired', 'File exists')"
1de0a63f-a60e-4bb6-8394-80e3043c6204::DEBUG::2012-08-05 17:35:50,088::task::872::TaskManager.Task::(_run) Task=`1de0a63f-a60e-4bb6-8394-80e3043c6204`::Task._run: 1de0a63f-a60e-4bb6-8394-80e3043c6204 () {} failed - stopping task
1de0a63f-a60e-4bb6-8394-80e3043c6204::DEBUG::2012-08-05 17:35:50,088::task::1199::TaskManager.Task::(stop) Task=`1de0a63f-a60e-4bb6-8394-80e3043c6204`::stopping in state running (force False)
1de0a63f-a60e-4bb6-8394-80e3043c6204::DEBUG::2012-08-05 17:35:50,089::task::978::TaskManager.Task::(_decref) Task=`1de0a63f-a60e-4bb6-8394-80e3043c6204`::ref 1 aborting True
1de0a63f-a60e-4bb6-8394-80e3043c6204::DEBUG::2012-08-05 17:35:50,089::task::903::TaskManager.Task::(_runJobs) Task=`1de0a63f-a60e-4bb6-8394-80e3043c6204`::abortin

backend (engine) log:

2012-08-05 15:10:01,269 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-4-thread-44) [58baf64e] clear domain error-timers for pool ddc2020c-c909-4214-89ff-b3cdde436183
2012-08-05 15:10:01,270 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (pool-4-thread-44) [58baf64e] FINISH, MarkPoolInReconstructModeVDSCommand, log id: 34105a9a
2012-08-05 15:10:01,272 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (pool-4-thread-44) [58baf64e] START, DisconnectStoragePoolVDSCommand(vdsId = f0e829f6-dcbe-11e1-b93d-001a4a169741, storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, vds_spm_id = 1, masterDomainId = 00000000-0000-0000-0000-000000000000, masterVersion = 0), log id: 504d243b
2012-08-05 15:10:01,303 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (pool-4-thread-44) [58baf64e] FINISH, DisconnectStoragePoolVDSCommand, log id: 504d243b
2012-08-05 15:10:01,308 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ReconstructMasterVDSCommand] (pool-4-thread-44) [58baf64e] Cannot perform logging: java.util.UnknownFormatConversionException: Conversion = 'i'
    at java.util.Formatter$FormatSpecifier.conversion(Formatter.java:2646) [rt.jar:1.7.0_05-icedtea]
    at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2675) [rt.jar:1.7.0_05-icedtea]
    at java.util.Formatter.parse(Formatter.java:2528) [rt.jar:1.7.0_05-icedtea]
    at java.util.Formatter.format(Formatter.java:2469) [rt.jar:1.7.0_05-icedtea]
    at java.util.Formatter.format(Formatter.java:2423) [rt.jar:1.7.0_05-icedtea]
    at java.lang.String.format(String.java:2845) [rt.jar:1.7.0_05-icedtea]
    at org.ovirt.engine.core.common.vdscommands.ReconstructMasterVDSCommandParameters.toString(ReconstructMasterVDSCommandParameters.java:86) [engine-common.jar:]
    at org.ovirt.engine.core.vdsbroker.VDSCommandBase.toString(VDSCommandBase.java:49) [engine-vdsbroker.jar:]
    at java.text.MessageFormat.subformat(MessageFormat.java:1271) [rt.jar:1.7.0_05-icedtea]
    at java.text.MessageFormat.format(MessageFormat.java:860) [rt.jar:1.7.0_05-icedtea]
    at java.text.Format.format(Format.java:157) [rt.jar:1.7.0_05-icedtea]
    at java.text.MessageFormat.format(MessageFormat.java:836) [rt.jar:1.7.0_05-icedtea]
    at org.ovirt.engine.core.utils.log.Log.transform(Log.java:159) [engine-utils.jar:]
    at org.ovirt.engine.core.utils.log.Log.infoFormat(Log.java:120) [engine-utils.jar:]
    at org.ovirt.engine.core.utils.log.LoggedUtils.log(LoggedUtils.java:162) [engine-utils.jar:]
    at org.ovirt.engine.core.utils.log.LoggedUtils.logEntry(LoggedUtils.java:80) [engine-utils.jar:]
    at org.ovirt.engine.core.dal.VdcCommandBase.Execute(VdcCommandBase.java:34) [engine-dal.jar:]
    at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:390) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:35) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:1322) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand.reconstructMaster(ReconstructMasterDomainCommand.java:138) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand.executeCommand(ReconstructMasterDomainCommand.java:149) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand.executeCommand(RecoveryStoragePoolCommand.java:103) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.ExecuteWithoutTransaction(CommandBase.java:810) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:901) [engine-bll.jar:]

vdsm state (note both domains report role = Master in the same pool):

[root@blond-vdsh tmp]# vdsClient -s 0 getStorageDomainsList
7a4c3779-ff28-49ea-9b75-76e34c1c0fc7
e3c8df23-d94e-4d9f-ba77-6114c76776af

[root@blond-vdsh tmp]# vdsClient -s 0 getStorageDomainInfo 7a4c3779-ff28-49ea-9b75-76e34c1c0fc7
	uuid = 7a4c3779-ff28-49ea-9b75-76e34c1c0fc7
	vguuid = caaAuF-pzB1-ZFVP-9iiE-pEiq-GADI-gui5Qu
	lver = -1
	state = OK
	version = 3
	role = Master
	pool = ['ddc2020c-c909-4214-89ff-b3cdde436183']
	spm_id = -1
	type = ISCSI
	class = Data
	master_ver = 0
	name = Dafna-si13-02

[root@blond-vdsh tmp]# vdsClient -s 0 getStorageDomainInfo e3c8df23-d94e-4d9f-ba77-6114c76776af
	uuid = e3c8df23-d94e-4d9f-ba77-6114c76776af
	vguuid = o6KZEf-FY7m-t0o0-fSIi-OGHH-Kh6P-hEaif3
	lver = 3
	state = OK
	version = 3
	role = Master
	pool = ['ddc2020c-c909-4214-89ff-b3cdde436183']
	spm_id = 1
	type = ISCSI
	class = Data
	master_ver = 1
	name = Dafna-si13-01
Expected results: this attempt should be blocked, as RHEVM shouldn't allow parallel reconstruct attempts. Please see the related bug opened for VDSM: https://bugzilla.redhat.com/show_bug.cgi?id=845839. Note that the engine does the right thing in trying to reconstruct the pool, though spmStart is failing due to an issue in sanlock (vdsm bug).
This is an error flow plus a race between an automatic action and a user-initiated action; removing beta1 blocker.
Proposed http://gerrit.ovirt.org/#/c/7964/ as a fix.
Correct patch: http://gerrit.ovirt.org/#/c/7982/
The problem is the following: Recovery triggered a reconstruct, the reconstruct succeeded, and vdsm has a master, but Recovery was rolled back on the engine side - a split: engine and vdsm have different masters. The rollback was caused by a failure during the update of the VMs' OVFs.

Place in the logs.

Success of Reconstruct:

2012-08-05 15:10:34,215 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ReconstructMasterVDSCommand] (pool-4-thread-44) [58baf64e] FINISH, ReconstructMasterVDSCommand, log

Rollback of Recovery:

2012-08-05 15:10:46,238 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-44) [58baf64e] Error code VMPathNotExists and error message IRSGenericException: IRSErrorException: Failed to UpdateVMVDS, error = Cannot found VMs directory: ('/rhev/data-center/mnt/blockSD/7a4c3779-ff28-49ea-9b75-76e34c1c0fc7/master/vms',)
2012-08-05 15:10:46,238 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-4-thread-44) [58baf64e] IrsBroker::Failed::UpdateVMVDS due to: IRSErrorException: IRSGenericException: IRSErrorException: Failed to UpdateVMVDS, error = Cannot found VMs directory: ('/rhev/data-center/mnt/blockSD/7a4c3779-ff28-49ea-9b75-76e34c1c0fc7/master/vms',)
2012-08-05 15:10:46,242 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.UpdateVMVDSCommand] (pool-4-thread-44) [58baf64e] FINISH, UpdateVMVDSCommand, log id: 60af7e71
2012-08-05 15:10:46,243 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (pool-4-thread-44) [58baf64e] START, MarkPoolInReconstructModeVDSCommand(storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, ignoreFailoverLimit = false, compatabilityVersion = null, reconstructMarkAction = ClearCache), log id: 43157064
2012-08-05 15:10:46,243 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-4-thread-44) [58baf64e] clearing cache for problematic entities in pool ddc2020c-c909-4214-89ff-b3cdde436183
2012-08-05 15:10:46,243 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (pool-4-thread-44) [58baf64e] FINISH, MarkPoolInReconstructModeVDSCommand, log id: 43157064
2012-08-05 15:10:46,243 ERROR [org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand] (pool-4-thread-44) [58baf64e] Command org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSGenericException: IRSErrorException: Failed to UpdateVMVDS, error = Cannot found VMs directory: ('/rhev/data-center/mnt/blockSD/7a4c3779-ff28-49ea-9b75-76e34c1c0fc7/master/vms',)
2012-08-05 15:10:46,288 INFO [org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand] (pool-4-thread-44) [58baf64e] Command [id=9af8b87c-a4ba-4159-9027-9f8b453ec4f1]: Compensating CHANGED_ENTITY of org.ovirt.engine.core.common.businessentities.storage_domain_static; snapshot: id=e3c8df23-d94e-4d9f-ba77-6114c76776af.
2012-08-05 15:10:46,298 INFO [org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand] (pool-4-thread-44) [58baf64e] Command [id=9af8b87c-a4ba-4159-9027-9f8b453ec4f1]: Compensating CHANGED_ENTITY of org.ovirt.engine.core.common.businessentities.storage_domain_static; snapshot: id=7a4c3779-ff28-49ea-9b75-76e34c1c0fc7.
2012-08-05 15:10:46,302 INFO [org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand] (pool-4-thread-44) [58baf64e] Command [id=9af8b87c-a4ba-4159-9027-9f8b453ec4f1]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.storage_pool_iso_map; snapshot: storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, storageId = 7a4c3779-ff28-49ea-9b75-76e34c1c0fc7.
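For context, the "Compensating CHANGED_ENTITY / NEW_ENTITY_ID" lines above are the engine's compensation mechanism undoing its own database changes. A minimal sketch of that idea, with hypothetical names (not the actual CommandBase/CompensationContext code): before each change the command records an undo action, and when executeCommand() throws, the recorded snapshots are replayed in reverse. Crucially, this rolls back the engine database only; vdsm has already switched masters by then, which is exactly how the two sides ended up disagreeing.

    // Sketch only; hypothetical names, not the real engine classes.
    import java.util.ArrayDeque;
    import java.util.Deque;

    class CompensationSketch {
        // Undo actions recorded before each DB change, replayed in reverse on failure.
        private final Deque<Runnable> undoStack = new ArrayDeque<Runnable>();

        void beforeChange(Runnable undo) {
            undoStack.push(undo);
        }

        void execute(Runnable command) {
            try {
                command.run(); // e.g. the UpdateVM step that failed in the logs
            } catch (RuntimeException e) {
                // Engine-side rollback only: anything vdsm changed meanwhile
                // (such as the new master domain) is NOT undone here.
                while (!undoStack.isEmpty()) {
                    undoStack.pop().run();
                }
                throw e;
            }
        }
    }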
mkublin, one command is the ReconstructMasterDomainCommand and one is RecoveryStoragePoolCommand (which calls reconstruct's execute() method directly and does not use the runAction mechanism). The issue does seem to be related to two reconstructs on the same pool running simultaneously, so this patch should solve the issue, and I'd like pre-integration to verify it.
(In reply to comment #8)
> mkublin, one command is the ReconstructMasterDomainCommand and one is
> RecoveryStoragePoolCommand (which calls reconstruct's execute() method
> directly and does not use the runAction mechanism). The issue does seem to
> be related to two reconstructs on the same pool running simultaneously, so
> this patch should solve the issue, and I'd like pre-integration to verify it.

Prove it with logs.
2012-08-05 15:09:50,079 INFO [org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand] (QuartzScheduler_Worker-58) [417a872b] Running command: ReconstructMasterDomainCommand internal: true. Entities affected : ID: e3c8df23-d94e-4d9f-ba77-6114c76776af Type: Storage
2012-08-05 15:09:50,127 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (QuartzScheduler_Worker-58) [417a872b] START, MarkPoolInReconstructModeVDSCommand(storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, ignoreFailoverLimit = false, compatabilityVersion = null, reconstructMarkAction = ClearCache), log id: 10aa801c
2012-08-05 15:09:50,128 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-58) [417a872b] clearing cache for problematic entities in pool ddc2020c-c909-4214-89ff-b3cdde436183
2012-08-05 15:09:50,128 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (QuartzScheduler_Worker-58) [417a872b] FINISH, MarkPoolInReconstructModeVDSCommand, log id: 10aa801c
2012-08-05 15:09:59,583 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-75) domain e3c8df23-d94e-4d9f-ba77-6114c76776af in problem. vds: blond-vdsh
2012-08-05 15:10:00,001 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-77) Checking autorecoverable hosts
2012-08-05 15:10:00,003 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-77) Autorecovering 0 hosts
2012-08-05 15:10:00,003 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-77) Checking autorecoverable hosts done
2012-08-05 15:10:00,003 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-77) Checking autorecoverable storage domains
2012-08-05 15:10:00,003 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-77) Autorecovering 0 storage domains
2012-08-05 15:10:00,003 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-77) Checking autorecoverable storage domains done
2012-08-05 15:10:00,175 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-76) hostFromVds::selectedVds - blond-vdsh, spmStatus Unknown_Pool, storage pool iSCSI
2012-08-05 15:10:00,193 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (QuartzScheduler_Worker-76) START, ConnectStoragePoolVDSCommand(vdsId = f0e829f6-dcbe-11e1-b93d-001a4a169741, storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, vds_spm_id = 1, masterDomainId = e3c8df23-d94e-4d9f-ba77-6114c76776af, masterVersion = 1), log id: 57b2a23
2012-08-05 15:10:00,483 INFO [org.ovirt.engine.core.bll.storage.RecoveryStoragePoolCommand] (pool-4-thread-44) [58baf64e] Running command: RecoveryStoragePoolCommand internal: false. Entities aff
2012-08-05 15:09:50,127 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (QuartzScheduler_Worker-58) [417a872b] START, MarkPoolInReconstructModeVDSCommand(storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, ignoreFailoverLimit = false, compatabilityVersion = null, reconstructMarkAction = ClearCache), log id: 10aa801c
2012-08-05 15:09:50,128 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-58) [417a872b] clearing cache for problematic entities in pool ddc2020c-c909-4214-89ff-b3cdde436183
2012-08-05 15:09:50,128 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.MarkPoolInReconstructModeVDSCommand] (QuartzScheduler_Worker-58) [417a872b] FINISH, MarkPoolInReconstructModeVDSCommand, log id: 10aa801c

This means that it finished:

    finally {
        // reset cache and mark reconstruct for pool as finished
        Backend.getInstance()
                .getResourceManager()
                .RunVdsCommand(
                        VDSCommandType.MarkPoolInReconstructMode,
                        new MarkPoolInReconstructModeVDSCommandParameters(
                                getStoragePoolId().getValue(),
                                ReconstructMarkAction.ClearCache));
    }

Where is the second one running?
mkublin, the storage pool is loaded within the canDoAction() method, and the print is done after canDoAction() has finished. So basically, while thread A is executing the reconstruct, thread B can load the storage pool with stale values - so a race can occur and cause this bug.
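To illustrate the direction of the fix - a minimal sketch only, with hypothetical names (the actual change is the gerrit patch above): serialize reconstruct per storage pool, so a second caller is rejected instead of racing on stale pool state.

    // Sketch: at most one in-flight reconstruct per pool; hypothetical names.
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.locks.ReentrantLock;

    class ReconstructGuard {
        private static final ConcurrentMap<String, ReentrantLock> POOL_LOCKS =
                new ConcurrentHashMap<String, ReentrantLock>();

        // Runs the action only if no other reconstruct is in flight for this
        // pool; returns false (instead of racing) otherwise.
        static boolean tryReconstruct(String storagePoolId, Runnable reconstruct) {
            ReentrantLock fresh = new ReentrantLock();
            ReentrantLock existing = POOL_LOCKS.putIfAbsent(storagePoolId, fresh);
            ReentrantLock lock = (existing != null) ? existing : fresh;
            if (!lock.tryLock()) {
                return false; // another reconstruct already holds the pool lock
            }
            try {
                reconstruct.run();
                return true;
            } finally {
                lock.unlock();
            }
        }
    }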
(In reply to comment #12)
> mkublin, the storage pool is loaded within the canDoAction() method, and the
> print is done after canDoAction() has finished. So basically, while thread A
> is executing the reconstruct, thread B can load the storage pool with stale
> values - so a race can occur and cause this bug.

Like I said, a race can occur, but that was not the reason for this bug. Preventing the race will not solve the problem described in this bug. You cannot provide some random solution for a bug; you should solve the problem described in the bug.
After reconstruct completes on the two different threads, connect and refresh commands are sent to the hosts (from the log) - note the different master domain id and different version:

2012-08-05 15:10:41,245 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.RefreshStoragePoolVDSCommand] (pool-4-thread-44) [58baf64e] START, RefreshStoragePoolVDSCommand(vdsId = f0e829f6-dcbe-11e1-b93d-001a4a169741, storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, masterStorageDomainId=7a4c3779-ff28-49ea-9b75-76e34c1c0fc7, msterVersion=2), log id: 2fdb5609
2012-08-05 15:10:46,172 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.RefreshStoragePoolVDSCommand] (pool-4-thread-45) [35d639c1] START, RefreshStoragePoolVDSCommand(vdsId = f0e829f6-dcbe-11e1-b93d-001a4a169741, storagePoolId = ddc2020c-c909-4214-89ff-b3cdde436183, masterStorageDomainId=7a4c3779-ff28-49ea-9b75-76e34c1c0fc7, msterVersion=1), log id: 77a4c559

What causes the UpdateVM to fail? We need to verify on the vdsm side, as we pass the storagePoolId, but I'm pretty sure this mixture causes it, since a single reconstruct doesn't cause the issue.
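One way to picture why the second, older refresh is harmful - a sketch only, under the assumption that masterVersion should only move forward (hypothetical names, not engine or vdsm code): a consumer could simply drop any refresh carrying a masterVersion older than the last one it applied.

    // Sketch: reject out-of-order master versions; hypothetical names.
    import java.util.concurrent.atomic.AtomicInteger;

    class MasterVersionTracker {
        private final AtomicInteger lastApplied = new AtomicInteger(0);

        // Accepts a refresh only if it carries a newer masterVersion than the
        // last one applied; e.g. version 1 arriving after version 2 is dropped,
        // as in the second log line above.
        boolean acceptRefresh(int masterVersion) {
            while (true) {
                int current = lastApplied.get();
                if (masterVersion <= current) {
                    return false; // stale refresh
                }
                if (lastApplied.compareAndSet(current, masterVersion)) {
                    return true;
                }
            }
        }
    }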
(In reply to comment #14)
> After reconstruct completes on the two different threads, connect and
> refresh commands are sent to the hosts (from the log) - note the different
> master domain id and different version

2012-08-05 15:10:41,245 != 2012-08-05 15:10:46,172 (41 != 46). One finished, then the next started.
http://gerrit.ovirt.org/#/c/8390/
Merged Ida51b027b8078555ffab5b80d2ffe54572fa6cca
Verified on si24.