Bug 795105

Summary: [ovirt] [engine-core] deadlock on reconstruct master flow
Product: [Retired] oVirt
Reporter: Haim <hateya>
Component: ovirt-engine-core
Assignee: mkublin <mkublin>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: acathrow, bazulay, iheim, mgoldboi, mkublin, yeylon, ykaul
Target Milestone: ---
Target Release: 3.1
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-08-09 08:05:37 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
engine logs. none

Description Haim 2012-02-19 11:51:03 UTC
Created attachment 564174
engine logs.

Description of problem:


overview:
---------
I noticed the system behaving inconsistently: in one data center the status was active and the host was SPM, but the storage domains were 'locked', even though no commands had been sent to vdsm. Using 'jstack', I dumped the stack traces to a file and found the following:

Found one Java-level deadlock:
=============================
"QuartzScheduler_Worker-98":
  waiting to lock monitor 0x00007f448400df30 (object 0x00000000e3ba3b50, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-38"
"QuartzScheduler_Worker-38":
  waiting to lock monitor 0x00000000025346d8 (object 0x00000000e3b973c0, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-47"
"QuartzScheduler_Worker-47":
  waiting to lock monitor 0x00007f448400df30 (object 0x00000000e3ba3b50, a java.lang.Object),
  which is held by "QuartzScheduler_Worker-38"

        at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72)
        at org.ovirt.engine.core.common.businessentities.IVdsEventListener$$$view7.MasterDomainNotOperational(Unknown Source)
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.ExecuteVDSCommand(IrsBrokerCommand.java:1607)
        - locked <0x00000000e3b973c0> (a java.lang.Object)
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.ExecuteCommand(VDSCommandBase.java:60)
        at org.ovirt.engine.core.dal.VdcCommandBase.Execute(VdcCommandBase.java:41)
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:414)
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.ProceedStoragePoolStats(IrsBrokerCommand.java:312)
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.access$200(IrsBrokerCommand.java:132)
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData$1.runInTransaction(IrsBrokerCommand.java:213)
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:168)
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:107)
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData._updatingTimer_Elapsed(IrsBrokerCommand.java:203)
        - locked <0x00000000e3b973c0> (a java.lang.Object)
        at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:64)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)

Found 1 deadlock.
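
In the dump above, "QuartzScheduler_Worker-38" and "QuartzScheduler_Worker-47" each wait for a monitor the other holds, and "QuartzScheduler_Worker-98" is queued behind Worker-38. The following is a minimal, self-contained sketch of that pattern (two monitors taken in opposite order) and of how the same condition can be detected from inside the JVM with the standard ThreadMXBean API instead of an external jstack run. It is not engine code: the class name, the thread names, and the two lock objects are hypothetical stand-ins, and it says nothing about how the gerrit change addresses the problem.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {

    // Starts a daemon thread that takes `first`, waits until its peer also
    // holds its own first lock, and then blocks forever trying to take `second`.
    static void lockInOrder(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Not reached: the peer holds `second` and waits for `first`.
                }
            }
        }, name);
        t.setDaemon(true); // deadlocked threads must not keep the JVM alive
        t.start();
    }

    // Provokes a lock-order inversion between two hypothetical monitors and
    // returns how many threads the JVM reports as deadlocked (0 if none found).
    static int countDeadlockedThreads() {
        Object lockA = new Object(); // stand-in for one java.lang.Object monitor
        Object lockB = new Object(); // stand-in for the other
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        lockInOrder("sketch-worker-1", lockA, lockB, bothHoldFirst);
        lockInOrder("sketch-worker-2", lockB, lockA, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null) {
                        System.out.println(info.getThreadName()
                                + " waits for " + info.getLockName()
                                + " held by " + info.getLockOwnerName());
                    }
                }
                return ids.length;
            }
            try {
                Thread.sleep(50); // give the workers time to block on each other
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

A periodic check of this kind could log the same "waits for ... held by ..." chain that jstack printed here, which may help when a deadlock only shows up as storage domains stuck in 'locked' with no traffic to vdsm.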

Comment 1 mkublin 2012-06-19 14:20:40 UTC
http://gerrit.ovirt.org/#/c/5482/ - possible solution

Comment 2 mkublin 2012-06-21 07:29:52 UTC
pushed http://gerrit.ovirt.org/#/c/5482/

Comment 3 Itamar Heim 2012-08-09 08:05:37 UTC
closing ON_QA bugs as oVirt 3.1 was released:
http://www.ovirt.org/get-ovirt/