Bug 844306

Summary: [engine-core] locking mechanism: host is stuck in unassigned state for 10 minutes if activate and remove takes place (vds_proxy for this host is lost)
Product: Red Hat Enterprise Virtualization Manager Reporter: Haim <hateya>
Component: ovirt-engineAssignee: mkublin <mkublin>
Status: CLOSED CURRENTRELEASE QA Contact: Pavel Stehlik <pstehlik>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.1.0CC: acathrow, bazulay, dyasny, iheim, lpeer, mkublin, Rhev-m-bugs, yeylon, ykaul, yzaslavs
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: infra
Fixed In Version: si13 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-04 20:07:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine.log none

Description Haim 2012-07-30 09:18:58 UTC
Description of problem:

I have the following case were host is stuck in unassigned state till transaction is dead (rolled back), afterwards, host moves to active state, but engine deletes its vdsm-proxy, and bad things happen (reconstruct fails for example).

later on, I get the following log: 


2012-07-30 15:38:40,106 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (QuartzScheduler_Worker-87) [50059c70] Cannot get vdsManager for vdsid=78e0bf24-d979-11e1-9e8e-001a4a16970e
2012-07-30 15:38:40,106 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-87) [50059c70] ERROR, org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand, exception: RuntimeException: VdcBLLException: Vds with id: 78e0bf24-d979-11e1-9e8e-001a4a16970e was not found, log id: 2d159d5e: java.lang.RuntimeException: VdcBLLException: Vds with id: 78e0bf24-d979-11e1-9e8e-001a4a16970e was not found
        at org.ovirt.engine.core.vdsbroker.ResourceManager.CreateCommand(ResourceManager.java:349) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:387) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.gethostFromVds(IrsBrokerCommand.java:652) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.access$400(IrsBrokerCommand.java:124) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData$3.runInTransaction(IrsBrokerCommand.java:550) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData$3.runInTransaction(IrsBrokerCommand.java:547) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:168) [engine-utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:107) [engine-utils.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.getIrsProxy(IrsBrokerCommand.java:547) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.getIrsProxy(IrsBrokerCommand.java:1511) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.ExecuteVDSCommand(IrsBrokerCommand.java:1519) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.ExecuteCommand(VDSCommandBase.java:60) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.Execute(VdcCommandBase.java:41) [engine-dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:390) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.ProceedStoragePoolStats(IrsBrokerCommand.java:300) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.access$200(IrsBrokerCommand.java:124) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData$1.runInTransaction(IrsBrokerCommand.java:205) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:168) [engine-utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:107) [engine-utils.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData._updatingTimer_Elapsed(IrsBrokerCommand.java:195) [engine-vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) [:1.7.0_05-icedtea]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_05-icedtea]
        at java.lang.reflect.Method.invoke(Method.java:601) [rt.jar:1.7.0_05-icedtea]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:64) [engine-scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz-2.1.2.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz-2.1.2.jar:]
Caused by: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: Vds with id: 78e0bf24-d979-11e1-9e8e-001a4a16970e was not found
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.initializeVdsBroker(VdsBrokerCommand.java:52) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.<init>(VdsBrokerCommand.java:45) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand.<init>(SpmStatusVDSCommand.java:20) [engine-vdsbroker.jar:]
        at sun.reflect.GeneratedConstructorAccessor66.newInstance(Unknown Source) [:1.7.0_05-icedtea]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [rt.jar:1.7.0_05-icedtea]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:525) [rt.jar:1.7.0_05-icedtea]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.CreateCommand(ResourceManager.java:344) [engine-vdsbroker.jar:]


[root@hateya-rhevm ~]# grep 50d11f7f  /var/log/ovirt-engine/engine.log
2012-07-30 15:25:00,005 INFO  [org.ovirt.engine.core.bll.ActivateVdsCommand] (QuartzScheduler_Worker-7) [50d11f7f] Lock Acquired to object EngineLock [exclusiveLocks= key: 16cbcf58-d97a-11e1-bba6-001a4a16970e value: VDS
2012-07-30 15:25:00,006 INFO  [org.ovirt.engine.core.bll.ActivateVdsCommand] (QuartzScheduler_Worker-7) [50d11f7f] Running command: ActivateVdsCommand internal: true. Entities affected :  ID: 16cbcf58-d97a-11e1-bba6-001a4a16970e Type: VDS
2012-07-30 15:25:00,009 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (QuartzScheduler_Worker-7) [50d11f7f] Cannot get vdsManager for vdsid=16cbcf58-d97a-11e1-bba6-001a4a16970e
2012-07-30 15:25:00,009 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-7) [50d11f7f] START, SetVdsStatusVDSCommand(vdsId = 16cbcf58-d97a-11e1-bba6-001a4a16970e, status=Unassigned, nonOperationalReason=NONE), log id: 6eeb761
2012-07-30 15:25:00,009 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-7) [50d11f7f] FINISH, SetVdsStatusVDSCommand, log id: 6eeb761
2012-07-30 15:25:00,182 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (QuartzScheduler_Worker-7) [50d11f7f] Cannot get vdsManager for vdsid=16cbcf58-d97a-11e1-bba6-001a4a16970e
2012-07-30 15:25:00,182 INFO  [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (QuartzScheduler_Worker-7) [50d11f7f] START, ActivateVdsVDSCommand(vdsId = 16cbcf58-d97a-11e1-bba6-001a4a16970e), log id: 78859b39
2012-07-30 15:25:00,182 INFO  [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (QuartzScheduler_Worker-7) [50d11f7f] FINISH, ActivateVdsVDSCommand, log id: 78859b39
2012-07-30 15:25:00,186 INFO  [org.ovirt.engine.core.bll.ActivateVdsCommand] (QuartzScheduler_Worker-7) [50d11f7f] Command [id=ca65ffe2-3b9c-47ed-bc38-1e44569dc5d1]: Compensating CHANGED_STATUS_ONLY of org.ovirt.engine.core.common.businessentities.VdsDynamic; snapshot: EntityStatusSnapshot [id=16cbcf58-d97a-11e1-bba6-001a4a16970e, status=NonOperational].
2012-07-30 15:25:00,223 ERROR [org.ovirt.engine.core.bll.ActivateVdsCommand] (QuartzScheduler_Worker-7) [50d11f7f] Transaction rolled-back for command: org.ovirt.engine.core.bll.ActivateVdsCommand.
2012-07-30 15:25:00,223 INFO  [org.ovirt.engine.core.bll.ActivateVdsCommand] (QuartzScheduler_Worker-7) [50d11f7f] Lock freed to object EngineLock [exclusiveLocks= key: 16cbcf58-d97a-11e1-bba6-001a4a16970e value: VDS
2012-07-30 15:25:00,232 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-7) [50d11f7f] Checking autorecoverable hosts done
2012-07-30 15:25:00,232 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-7) [50d11f7f] Checking autorecoverable storage domains
2012-07-30 15:25:00,232 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-7) [50d11f7f] Autorecovering 0 storage domains
2012-07-30 15:25:00,232 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-7) [50d11f7f] Checking autorecoverable storage domains done

Comment 1 Haim 2012-07-30 09:21:51 UTC
Created attachment 601162 [details]
engine.log

Comment 2 mkublin 2012-07-30 11:31:56 UTC
*** Bug 844182 has been marked as a duplicate of this bug. ***

Comment 3 mkublin 2012-07-31 12:39:12 UTC
http://gerrit.ovirt.org/#/c/6778/

Remove will not remove proxy with out entry at DB