Date of First Response: 2009-09-10 00:43:16
securitylevel_name: Public

I believe I have discovered a deadlock that can occur in Drools 4.0.7 if the RuleAgent refreshes its associated RuleBase while another thread is creating a new stateful session on the same RuleBase. The thread dump below shows the two deadlocked threads, "Timer-15" and "DataSource(com.acme.Source)-2".

Thread "Timer-15" is the timer thread created by the RuleAgent rule refresh mechanism to check whether the rules files have changed, and to refresh the rules when a change is found. If it finds changes to the rules, it obtains a lock (<0x189a8d88> (a java.util.HashMap)) and proceeds to remove the old version of the changed package from the RuleBase. To do this it needs to obtain another lock (<0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase)) before removeRule can run.

However, in another thread, "DataSource(com.acme.Source)-2", a new stateful session is being created on the same RuleBase. This thread already holds the lock <0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase) that the Timer thread is waiting for, and is itself waiting for the lock <0x189a8d88> (a java.util.HashMap) that the Timer thread already holds, hence the deadlock.

In my own application I worked around this by not using the RuleAgent's built-in refresh mechanism, and instead periodically calling refreshRuleBase() on the RuleAgent in the SAME thread that creates the stateful session, thus avoiding any deadlock.
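The workaround described above could look roughly like the following. This is only a sketch: the `Agent` interface and the `SameThreadRefresh` class are hypothetical stand-ins, not Drools types. In a real application `refreshRuleBase()` would map to RuleAgent.refreshRuleBase() and `newSession()` to RuleBase.newStatefulSession(), both of which appear in the thread dump. The refresh interval and the caller-supplied clock value are assumptions made so the example is self-contained.

```java
public class SameThreadRefresh {
    /** Hypothetical stand-in for the two Drools calls the workaround needs. */
    public interface Agent {
        void refreshRuleBase(); // assumed to map to RuleAgent.refreshRuleBase()
        Object newSession();    // assumed to map to RuleBase.newStatefulSession()
    }

    private final Agent agent;
    private final long refreshIntervalMillis;
    private long lastRefreshMillis;

    public SameThreadRefresh(Agent agent, long refreshIntervalMillis, long nowMillis) {
        this.agent = agent;
        this.refreshIntervalMillis = refreshIntervalMillis;
        this.lastRefreshMillis = nowMillis;
    }

    /**
     * Must only be called from the one thread that creates sessions.
     * Because the refresh and the session creation run one after the other
     * on that thread, they can never hold the two locks at the same time.
     */
    public Object newSession(long nowMillis) {
        if (nowMillis - lastRefreshMillis >= refreshIntervalMillis) {
            agent.refreshRuleBase();
            lastRefreshMillis = nowMillis;
        }
        return agent.newSession();
    }
}
```

A caller would pass System.currentTimeMillis() as the clock value and leave the RuleAgent's own polling timer disabled, so that no second thread ever touches the RuleBase.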
"Timer-15" daemon prio=6 tid=0x02f4b0e8 nid=0x1864 waiting for monitor entry [0x038cf000..0x038cfa68]
    at org.drools.reteoo.ReteooRuleBase.removeRule(ReteooRuleBase.java:270)
    - waiting to lock <0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase)
    at org.drools.common.AbstractRuleBase.removeRule(AbstractRuleBase.java:656)
    at org.drools.common.AbstractRuleBase.removePackage(AbstractRuleBase.java:570)
    - locked <0x189a8d88> (a java.util.HashMap)
    at org.drools.agent.PackageProvider.removePackage(PackageProvider.java:45)
    at org.drools.agent.PackageProvider.applyChanges(PackageProvider.java:63)
    at org.drools.agent.RuleAgent.refreshRuleBase(RuleAgent.java:320)
    at org.drools.agent.RuleAgent$2.run(RuleAgent.java:438)
    at java.util.TimerThread.mainLoop(Unknown Source)
    at java.util.TimerThread.run(Unknown Source)

"DataSource(com.acme.Source)-2" daemon prio=4 tid=0x03027610 nid=0xfac waiting for monitor entry [0x0378f000..0x0378fb68]
    at org.drools.reteoo.ReteooRuleBase.newStatefulSession(ReteooRuleBase.java:225)
    - waiting to lock <0x189a8d88> (a java.util.HashMap)
    - locked <0x189a2ba0> (a org.drools.reteoo.ReteooRuleBase)
    at org.drools.common.AbstractRuleBase.newStatefulSession(AbstractRuleBase.java:284)
    at com.acme.RunRules.flush(RunRules.java:3337)
    at com.acme.ControlThread.run(ControlThread.java:465)
    at java.lang.Thread.run(Unknown Source)
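The lock-order inversion visible in the dump can be modelled with two plain locks. The sketch below is not Drools code: `LockOrderDemo`, `packages` and `ruleBase` are hypothetical names standing in for the HashMap and ReteooRuleBase monitors, and timed `tryLock` calls are used instead of `synchronized` so that the demonstration reports the conflict rather than hanging. It also does not claim to show how the Drools fix was implemented; it only illustrates that a consistent acquisition order removes the conflict.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {
    // Takes `first`, waits (briefly) for the other thread to take its first
    // lock too, then makes a timed attempt on `second`.
    static Thread worker(ReentrantLock first, ReentrantLock second,
                         CountDownLatch holdingFirst, CountDownLatch attempted,
                         AtomicInteger blocked) {
        return new Thread(() -> {
            first.lock();
            try {
                holdingFirst.countDown();
                holdingFirst.await(300, TimeUnit.MILLISECONDS);
                if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                    second.unlock();
                } else {
                    blocked.incrementAndGet(); // would block forever with synchronized
                }
                // Keep holding `first` until the other thread has tried as well.
                attempted.countDown();
                attempted.await(300, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                first.unlock();
            }
        });
    }

    /** Returns how many of the two threads could not obtain their second lock. */
    public static int blockedThreads(boolean sameOrder) {
        ReentrantLock packages = new ReentrantLock(); // models <0x189a8d88> (HashMap)
        ReentrantLock ruleBase = new ReentrantLock(); // models <0x189a2ba0> (ReteooRuleBase)
        CountDownLatch holdingFirst = new CountDownLatch(2);
        CountDownLatch attempted = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // "Timer-15": packages map first, then the rule base.
        Thread refresh = worker(packages, ruleBase, holdingFirst, attempted, blocked);
        // "DataSource...-2": rule base first, then the packages map (unless sameOrder).
        Thread session = sameOrder
                ? worker(packages, ruleBase, holdingFirst, attempted, blocked)
                : worker(ruleBase, packages, holdingFirst, attempted, blocked);

        refresh.start();
        session.start();
        try {
            refresh.join();
            session.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked.get();
    }
}
```

With opposite acquisition orders, as in the dump, both threads fail to get their second lock. With the same order, one thread simply waits for the other to finish and neither is blocked.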
Link: Added: This issue is related to JBRULES-1876
Fix in place.
For documenting this in the Release Notes, can you please confirm the following and fill in the missing information. Dot point explanations are fine:

The CAUSE (what was actually broken)
* A deadlock would occur in Drools 4.0.7 if the RuleAgent refreshed its associated RuleBase whilst a new stateful session was being created by another thread on that same RuleBase.

CONSEQUENCES of the bug (how it impacts users.)
* This would result in a deadlock.

The FIX (what was changed to eliminate this bug) and
RESULTS of the fix (what now happens for users.)
* The error no longer occurs???
We are still awaiting the outstanding information for the Release Notes on this one. Please provide it as soon as possible. Thanks.
Added to the 5.0.CP01 release notes as resolved:

JBRULES-1876 The RuleAgent can now safely refresh its associated RuleBase whilst a new stateful session is being created on the RuleBase by another thread. Previously this could result in a deadlock.